Research
My research sits at the intersection of applied mathematics (dynamical systems, mathematical biology) and scientific machine learning (sciML). I develop mathematical and statistical tools to understand complex systems, ranging from intracellular biochemical networks to human-scale epidemics. Central questions I pursue include:
- How can nonlinear dynamical systems be represented in a computationally tractable way?
- How do biological systems maintain robust function despite fluctuating conditions?
- How can we accurately infer parameters of models with realistic, non-Markovian assumptions?
The techniques I develop and apply span Koopman operator theory, chemical reaction network theory, Bayesian inference, stochastic processes, and machine learning. I also enjoy problem-driven collaboration with biologists and clinicians.
The Extended Dynamic Mode Decomposition (EDMD) approximates the Koopman operator using a finite dictionary of observable functions. A central challenge is choosing this dictionary in a principled way. I am developing an algorithm based on Personalized PageRank that systematically constructs and refines dictionaries, providing provable approximation guarantees for the finite-dimensional Koopman representation.
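As a point of reference, the basic EDMD step can be sketched in a few lines. This is a minimal illustration of standard least-squares EDMD with a hand-picked dictionary, not the PageRank-based dictionary construction described above. The toy map, its coefficients (`a`, `b`, `c`), and the function names are assumptions made for the example; the dictionary {x1, x2, x1^2} is chosen because it happens to span a Koopman-invariant subspace for this map.

```python
import numpy as np

def edmd(X, Y, dictionary):
    """Least-squares EDMD: find K such that dictionary(Y) ~= dictionary(X) @ K."""
    PsiX = dictionary(X)
    PsiY = dictionary(Y)
    K, *_ = np.linalg.lstsq(PsiX, PsiY, rcond=None)
    return K

def step(X, a=0.9, b=0.5, c=0.3):
    """Hypothetical toy map: x1 -> a*x1, x2 -> b*x2 + c*x1**2."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([a * x1, b * x2 + c * x1**2])

def dictionary(X):
    """Hand-picked observables (x1, x2, x1**2), evaluated row by row."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1, x2, x1**2])

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))  # sampled states
Y = step(X)                                # their images under the map

K = edmd(X, Y, dictionary)
eigs = np.sort(np.linalg.eigvals(K).real)
print(eigs)  # approximately [b, a**2, a]
```

Because the dictionary is invariant here, the fit is exact up to round-off and the eigenvalues of `K` are `a`, `b`, and `a**2`. For a generic dictionary the same code returns only a projection of the Koopman operator, which is why the choice of dictionary matters.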
While EDMD yields approximate representations, a more fundamental question is: when does a nonlinear dynamical system admit an exact finite-dimensional Koopman-invariant subspace? I am characterizing purely algebraic necessary and sufficient conditions under which polynomial ODEs admit such a subspace.
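A commonly used illustration of an exact finite-dimensional Koopman-invariant subspace is the following two-dimensional polynomial system. It is included only to make the question concrete; the symbols \mu and \lambda are generic constants assumed for the example, and the system is not taken from the work described here.

```latex
\begin{aligned}
\dot{x}_1 &= \mu x_1, \\
\dot{x}_2 &= \lambda \left( x_2 - x_1^2 \right).
\end{aligned}
% Lifting with the observables (y_1, y_2, y_3) = (x_1, x_2, x_1^2)
% gives a closed linear system, since d/dt (x_1^2) = 2 \mu x_1^2:
\frac{d}{dt}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
=
\begin{pmatrix}
\mu & 0 & 0 \\
0 & \lambda & -\lambda \\
0 & 0 & 2\mu
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
```

The span of x_1, x_2, and x_1^2 is therefore invariant under the Koopman generator, and the nonlinear dynamics are represented exactly by a three-dimensional linear system.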
Classical Koopman theory targets autonomous systems, but many real-world systems are non-autonomous due to time-varying inputs or environmental changes. I am developing a neural-network-based framework to learn Koopman representations for such systems from data, extending the reach of operator-theoretic methods to a broader class of dynamical systems.
Stochastic biochemical systems are naturally modeled as continuous-time Markov chains (CTMCs). I work on deriving explicit, closed-form stationary distributions for such systems via a technique called network translation, which transforms a given reaction network into one with a more tractable structure. These analytic formulas unlock sensitivity analysis, robustness quantification, and Bayesian likelihood functions that would otherwise be computationally intractable.
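To illustrate what a closed-form stationary distribution buys, the sketch below uses the simplest possible case: a birth-death process (production at constant rate `k`, degradation at rate `d` per molecule), whose stationary distribution is known to be Poisson with mean `k/d`. It does not implement network translation. The rates, the truncation level `N`, and the function names are assumptions chosen for the example.

```python
import math
import numpy as np

def generator(k, d, N):
    """Generator matrix of the birth-death CTMC, truncated to states 0..N."""
    Q = np.zeros((N + 1, N + 1))
    for n in range(N + 1):
        if n < N:
            Q[n, n + 1] = k      # production: n -> n + 1
        if n > 0:
            Q[n, n - 1] = d * n  # degradation: n -> n - 1
        Q[n, n] = -Q[n].sum()    # rows of a generator sum to zero
    return Q

def stationary_numeric(Q):
    """Solve pi @ Q = 0 with sum(pi) = 1 as a least-squares problem."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def stationary_poisson(k, d, N):
    """Closed-form Poisson(k/d) probabilities for states 0..N."""
    lam = k / d
    return np.array([math.exp(-lam) * lam**n / math.factorial(n)
                     for n in range(N + 1)])

k, d, N = 5.0, 1.0, 60  # assumed rates and truncation level
pi_num = stationary_numeric(generator(k, d, N))
pi_exact = stationary_poisson(k, d, N)
print(np.max(np.abs(pi_num - pi_exact)))  # small: the two agree
```

The numerical route requires building and solving a linear system whose size grows with the state space, while the formula can be evaluated pointwise and differentiated with respect to the rate constants. That difference is what makes analytic stationary distributions useful for sensitivity analysis and likelihood evaluation.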
In the deterministic setting, biochemical networks are described by ODEs, and a central goal is to understand steady-state behavior — particularly absolute concentration robustness (ACR) and robust perfect adaptation (RPA), whereby certain species concentrations remain invariant to perturbations. I develop structural and algebraic criteria that guarantee such behaviors directly from the network topology, without requiring explicit solutions.
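Absolute concentration robustness can be seen in a standard two-species example from the ACR literature, A + B -> 2B and B -> A under mass-action kinetics. The sketch below integrates the ODEs with a plain forward-Euler scheme for several total concentrations. The rate constants, initial condition, step size, and function name are assumptions chosen for illustration, and the example is not one of the structural criteria mentioned above.

```python
def simulate(total, k1=1.0, k2=2.0, a0=1.0, dt=1e-3, n_steps=20000):
    """Forward-Euler integration of A + B -> 2B (rate k1), B -> A (rate k2).

    `total` is the conserved quantity a + b; returns the final (a, b).
    """
    a, b = a0, total - a0
    for _ in range(n_steps):
        flux = k1 * a * b - k2 * b  # net rate of converting A into B
        a, b = a - dt * flux, b + dt * flux
    return a, b

# Vary the total amount of material; each total exceeds k2 / k1,
# so a positive steady state exists.
results = {total: simulate(total) for total in (3.0, 5.0, 10.0)}
for total, (a, b) in results.items():
    print(total, round(a, 6), round(b, 6))
```

In every run the concentration of A settles at k2/k1 regardless of the total, while the concentration of B absorbs the difference. The steady-state equations make this visible without simulation: b(k1 a - k2) = 0 forces a = k2/k1 whenever b > 0.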
Biochemical models often involve species operating on vastly different timescales. Quasi-steady-state approximation (QSSA) exploits this separation to reduce model complexity. I have worked on the validity and universality of such reductions in stochastic settings, establishing conditions under which simplified propensities yield accurate approximations of the full system.
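The textbook instance of such a reduction is Michaelis-Menten enzyme kinetics, shown here only to fix ideas. The rate constants k_f, k_b, k_cat and the total enzyme count E_T are generic symbols assumed for the example.

```latex
% Full model: enzyme E binds substrate S reversibly to form complex C,
% which releases product P.
E + S \;\underset{k_b}{\overset{k_f}{\rightleftharpoons}}\; C
\;\overset{k_{\mathrm{cat}}}{\longrightarrow}\; E + P

% Standard QSSA: treating C as fast and eliminating it leaves a single
% reaction S -> P whose propensity depends on the substrate count s:
a(s) = \frac{k_{\mathrm{cat}} \, E_T \, s}{K_M + s},
\qquad
K_M = \frac{k_b + k_{\mathrm{cat}}}{k_f}
```

In the stochastic setting, this reduced propensity replaces the three elementary reactions in the simulation or the master equation. Whether the replacement is accurate depends on the parameter regime, which is the question the validity conditions address.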
Many biological processes involve unobserved intermediate steps that introduce effective time delays, rendering the system non-Markovian. Drawing on tools from queueing theory, I developed Bayesian MCMC methods to jointly infer kinetic and delay parameters from single-cell data. These methods have been applied to gene regulatory networks and cell signaling pathways, enabling accurate parameter estimation under realistic experimental constraints.
Standard compartmental epidemic models assume Markovian transitions, which can introduce systematic bias in parameter estimates. We have developed history-dependent epidemic models that account for realistic waiting-time distributions. Applying these models to 2020 COVID-19 data from Seoul, we showed that they yield more accurate estimates of key epidemiological parameters.
Cell-to-cell heterogeneity in signaling responses is a ubiquitous phenomenon in biological systems, yet its sources are often difficult to disentangle. In collaboration with Hyeontae Jo, we developed density physics-informed neural networks (Density-PINNs) that directly learn the distribution of parameters from population-level data, identifying key sources of cell-to-cell heterogeneity in antibiotic responses.
"If the only tool you have is a hammer, everything starts to look like a nail."
To address meaningful scientific questions raised by field experts, I believe the mathematical framework should be chosen to fit the problem, not the other way around. Beyond developing mathematical theory, I actively collaborate with biologists and clinicians. These projects are motivated by concrete scientific questions, such as identifying digital biomarkers of cognitive impairment from wearable device data or modeling the endemic transition of COVID-19, and they often require adapting or extending existing methods in unexpected ways.
I am contributing to a project on (auto)formalizing graduate-level algebra in the Lean proof assistant, specifically targeting problems from the textbook Abstract Algebra by Dummit and Foote. We are building a dataset of formalized graduate-level algebra problems called LEAN-GAP.