
Citation: Ziyu Zhao, Yi Zhou, Jinxing Guan, Yan Yan, Jing Zhao, Zhihang Peng, Feng Chen, Yang Zhao, Fang Shao. The relationship between compartment models and their stochastic counterparts: A comparative study with examples of the COVID-19 epidemic modeling[J]. The Journal of Biomedical Research, 2024, 38(2): 175-188. DOI: 10.7555/JBR.37.20230137
Deterministic compartment models (CMs) and stochastic models, including stochastic CMs and agent-based models, are widely utilized in epidemic modeling. However, the relationship between CMs and their corresponding stochastic models is not well understood. The present study aimed to address this gap by conducting a comparative study using the susceptible, exposed, infectious, and recovered (SEIR) model and its extended CMs from the coronavirus disease 2019 modeling literature. We demonstrated the equivalence of the numerical solution of CMs using the Euler scheme and their stochastic counterparts through theoretical analysis and simulations. Based on this equivalence, we proposed an efficient model calibration method that could replicate the exact solution of CMs in the corresponding stochastic models through parameter adjustment. The advancement in calibration techniques enhanced the accuracy of stochastic modeling in capturing the dynamics of epidemics. However, it should be noted that discrete-time stochastic models cannot perfectly reproduce the exact solution of continuous-time CMs. Additionally, we proposed a new stochastic compartment and agent mixed model as an alternative to agent-based models for large-scale population simulations with a limited number of agents. This model offered a balance between computational efficiency and accuracy. The results of this research contributed to the comparison and unification of deterministic CMs and stochastic models in epidemic modeling. Furthermore, the results had implications for the development of hybrid models that integrated the strengths of both frameworks. Overall, the present study has provided valuable epidemic modeling techniques and their practical applications for understanding and controlling the spread of infectious diseases.
In December 2019, a novel enveloped RNA beta-coronavirus that causes coronavirus disease 2019 (COVID-19) emerged in Wuhan, China[1]. Over the past few years, the COVID-19 pandemic has had significant impacts on the global economy, society, and public health, including school closures, industry collapses, and millions of job losses[2]. There are individual data-based approaches to modeling and analyzing the course of COVID-19 infections[3–4], and macro data-based epidemiological modeling approaches to interpreting and controlling the spread of COVID-19. Compartment models (CMs) and agent (individual)-based models (ABMs) are two representative frameworks to investigate the dynamics of epidemics and the efficiency of prevention strategies[5]. Although systematic comparisons of these two models are available in the literature[6–9], the relationship between a deterministic CM and its corresponding stochastic ABM version has not been well studied. This knowledge gap hinders the ability to effectively compare and integrate these modeling approaches.
CM is a classic approach to epidemic modeling that can be traced back about 100 years[10–11]. Classical CMs are continuous-time dynamic systems based on nonlinear differential equations that are conventionally solved by numerical methods. CMs assume homogeneous mixing within populations and are computationally efficient, but may not accurately capture individual-level behavior and heterogeneous disease transmission rates[6]. In infectious disease epidemiology, the SEIR model[12] is one of the best-known CMs, in which the population is divided into susceptible (S), exposed (E), infectious (I), and recovered (R) compartments. The SEIR model is an extension of the classic SIR model[11]. The SEIR model and its extensions have been widely used in studies of the COVID-19 pandemic[12–20].
Stochastic compartment models (SCMs) have been developed to address the limitations of CMs by incorporating stochastic events and discrete-time transitions between compartments[21–23]. They provide a more accurate representation of individual-level behavior and avoid the assumption of homogeneous mixing. SCMs have been widely used in the study of infectious diseases[24]. Similar to CMs, SCMs are computationally efficient, but their results may vary widely between runs because of the variance of the underlying random variables.
ABM is a relatively new stochastic approach to modeling complex systems by representing individual agents with their characteristics and interactions. Agents represent individuals, households, governments, or any other entities of interest, and adapt their behaviors in response to interactions with other agents and their environment. The use of ABMs in public health has been advocated by Rutter et al[25], and the use of ABMs for COVID-19 epidemic modeling has also been proposed in Australia, Luxembourg, and Switzerland[26–28].
CM is a top-down modeling method, while ABM is a bottom-up modeling method. ABM is based on many individual agents with their own actions and the ability to interact with each other, while CM models subpopulations of different states as a few compartments. Therefore, CM is much less computationally intensive than ABM. On the other hand, ABM is more versatile and flexible than CM. ABM easily involves the spatial movement of agents, but CM requires a dynamic system with much more complicated partial differential equations than ordinary differential equations to achieve this. CM can be converted to its corresponding ABM like SCM[26], but ABM for a complex system may not have its CM counterpart.
The similarities between CMs, SCMs, and ABMs have been noted by many investigators[7–9]. However, there is a lack of comprehensive understanding regarding the relationship between CMs and their corresponding stochastic models, which hinders the ability to effectively compare and integrate these modeling approaches. This gap motivated us to conduct a comparative study of CM and its corresponding stochastic models to illustrate the relationship among them. In the remainder of this manuscript, we presented the relationship with the selected CMs by theory and simulation. Three representative CMs (SEIR, susceptible, exposed, infected, recovered and deceased [SEIRD], and susceptible, exposed, infected, hospitalized and removed [SEIHR]) in COVID-19 pandemic studies[17–19] were considered. By establishing the equivalence between numerical solutions of CMs using the Euler scheme[29] and their stochastic counterparts, we aimed to enhance our understanding of the relationships among these models. Furthermore, we acknowledged the limitations of discrete-time stochastic models in perfectly reproducing the exact solutions of continuous-time CMs. To overcome this challenge, we proposed an efficient model-calibration method allowing the replication of CM exact solutions in corresponding stochastic models through parameter adjustment, which minimized the differences between the CM exact solutions and CM numerical solutions by the Euler scheme. Additionally, we introduced a novel stochastic compartment-agent mixed model (CAMM) as an alternative approach to ABM, which offered a promising solution for conducting large-scale population simulations with a limited number of agents. By bridging the gap between deterministic CMs and stochastic models, we explored advanced epidemic modeling to facilitate the comparison, unification, and hybridization of these modeling approaches, ultimately improving our ability to understand and control the spread of infectious diseases.
The SEIR model and its extensions are classical CMs widely used in COVID-19 pandemic studies[12–20]. These extended models incorporated more compartments[16,18–20] and the effects of important factors, such as vaccination and social distancing[20,30–34], government action and public response[35–36], media[37–38], and environments[34,36,39]. In the present study, the SEIR model and its selected extensions were used as representative CMs for comparison with their corresponding stochastic models to demonstrate the relationships among them, with more details shown afterward.
CMs are conventionally solved by numerical methods. In the present study, numerical solutions by the Euler scheme[29] were considered for comparisons. However, the Euler scheme, also known as the first-order Runge-Kutta method, is the simplest numerical scheme and in many situations does not closely approximate the exact solution. Therefore, numerical solutions by more sophisticated methods, such as LSODA[40] and the fourth-order Runge-Kutta method[29], were treated as the exact solutions of CMs for comparisons.
In the literature[8,24], CMs were converted to their corresponding SCMs and ABMs using Bernoulli, binomial, and multinomial distributed random variables. Solutions of SCMs and ABMs can be verified to be equivalent to numerical solutions of the corresponding CMs by the Euler scheme.
While CMs and ABMs had their advantages and disadvantages, some investigators proposed hybrid models (HMs) that combined both approaches[41]. HMs can switch between CM and ABM under certain conditions to balance computational efficiency and modeling accuracy.
Because of the limitation of computation resources, it is difficult for ABMs to simulate the activity of individuals in a large population. This is because ABMs require more computational power and become computationally expensive as the population size grows. HM can be treated as a solution to this issue by switching ABM to CM, when the number of agents exceeds the maximum number that can be simulated. However, HM is a bridge between CM and ABM, and it is neither a purely deterministic nor a purely stochastic model. To compare CMs with their stochastic counterparts, a novel model, the CAMM, was proposed as an alternative to ABM to simulate large populations with a limited number of agents in the present study. Different from HM, CAMM directly models compartments as agents under the ABM framework, and merges agents with the same state, when the number exceeds a predefined maximum, which allows for the efficient modeling of complex systems with large populations under the ABM framework with more details shown afterward.
In the present study, CMs were compared with their stochastic counterparts, including SCMs, ABMs, and CAMMs, which were proposed by the authors and illustrated in the following sections. Four scenarios with three compartment models (SEIR, SEIRD, and SEIHR) and their stochastic counterparts are as follows.
The SEIR model was utilized in a study by Zhou and Liu et al[17], in which they estimated the basic reproduction number of COVID-19 in Wuhan by using the SEIR model, thus concluding that the early transmission of COVID-19 was close to, or slightly higher than, that of severe acute respiratory syndrome (SARS). The SEIR model divided the population into four compartments (Fig. 1A): susceptible (S), exposed (E), infected (I), and recovered (R). It can be expressed by the following nonlinear system of ordinary differential equations.
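$$
\begin{cases}
\dfrac{dS(t)}{dt} = -\beta \dfrac{I(t)}{N} S(t) \\[4pt]
\dfrac{dE(t)}{dt} = \beta \dfrac{I(t)}{N} S(t) - \gamma E(t) \\[4pt]
\dfrac{dI(t)}{dt} = \gamma E(t) - \sigma I(t) \\[4pt]
\dfrac{dR(t)}{dt} = \sigma I(t)
\end{cases}
\tag{1}
$$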
The basic reproduction number,
The SEIR model is a set of four ordinary differential equations that describe how the number of individuals in each compartment changes over time. The equations are based on the assumption that the population is closed, meaning that there is no migration of individuals into or out of the population during the study period.
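Differential equations can be solved by numerical methods. The following discrete-time model is equivalent to the basic Euler numerical scheme (the Euler method or the first-order Runge-Kutta method[29]) for solving the SEIR model. If we set equally spaced discrete time step
$$
\begin{cases}
\hat{S}_{t+1} = \hat{S}_t - \beta \dfrac{\hat{S}_t \hat{I}_t}{N} \\[4pt]
\hat{E}_{t+1} = \hat{E}_t + \beta \dfrac{\hat{S}_t \hat{I}_t}{N} - \gamma \hat{E}_t \\[4pt]
\hat{I}_{t+1} = \hat{I}_t + \gamma \hat{E}_t - \sigma \hat{I}_t \\[4pt]
\hat{R}_{t+1} = \hat{R}_t + \sigma \hat{I}_t
\end{cases}
\tag{2}
$$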
It is important to note that the numerical solution of ordinary differential equations using the above Euler method is different from the exact solution and the numerical solutions by other more sophisticated and accurate numerical schemes, such as Runge-Kutta or LSODA, in general.
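The recursion in Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code (their sample codes are in R, in the Supplementary materials). It assumes a unit time step, as the form of Eq. (2) suggests, and the function name `seir_euler` and all numerical values are hypothetical choices for demonstration only.

```python
def seir_euler(s0, e0, i0, r0, beta, gamma, sigma, n_steps):
    """Iterate the discrete-time SEIR recursion of Eq. (2) with a unit time step."""
    n = s0 + e0 + i0 + r0  # closed population, so N stays constant
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    path = [(s, e, i, r)]
    for _ in range(n_steps):
        new_exposed = beta * s * i / n   # flow S -> E
        new_infectious = gamma * e       # flow E -> I
        new_recovered = sigma * i        # flow I -> R
        s, e, i, r = (
            s - new_exposed,
            e + new_exposed - new_infectious,
            i + new_infectious - new_recovered,
            r + new_recovered,
        )
        path.append((s, e, i, r))
    return path


# Illustrative values only (not the parameters used in the paper).
euler_path = seir_euler(s0=990, e0=0, i0=10, r0=0,
                        beta=0.5, gamma=0.2, sigma=0.1, n_steps=100)
```

Because every outflow from one compartment is added to the next, the total population is conserved at each step, which is a quick sanity check on any implementation of this scheme.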
Based on the above Euler scheme, the corresponding SCM can be derived as follows[8].
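$$
\begin{cases}
\hat{S}_{t+1} = \hat{S}_t - s_t \\
\hat{E}_{t+1} = \hat{E}_t + s_t - e_t \\
\hat{I}_{t+1} = \hat{I}_t + e_t - i_t \\
\hat{R}_{t+1} = \hat{R}_t + i_t \\
s_t \sim \mathrm{Binomial}\left(\hat{S}_t, \beta \dfrac{\hat{I}_t}{N}\right) \\
e_t \sim \mathrm{Binomial}\left(\hat{E}_t, \gamma\right) \\
i_t \sim \mathrm{Binomial}\left(\hat{I}_t, \sigma\right)
\end{cases}
\tag{3}
$$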
In SCM, individuals in a compartment except R will be transformed into the next compartment with a certain probability, which is the same as the rate of change in CM.
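A single replication of the SCM in Eq. (3) might look as follows. This is a hedged sketch using only the Python standard library; the helper `binomial` draws a binomial variate as a sum of Bernoulli trials, which is simple but slow for large populations, and all names and values are assumptions made for illustration.

```python
import random


def binomial(rng, n, p):
    """Draw Binomial(n, p) as a sum of n Bernoulli trials (simple, O(n))."""
    return sum(1 for _ in range(n) if rng.random() < p)


def seir_scm(s0, e0, i0, r0, beta, gamma, sigma, n_steps, seed=0):
    """Simulate one replication of the stochastic compartment model of Eq. (3)."""
    rng = random.Random(seed)
    n = s0 + e0 + i0 + r0
    s, e, i, r = s0, e0, i0, r0
    path = [(s, e, i, r)]
    for _ in range(n_steps):
        # Transition probabilities equal the per-step rates of the Euler scheme,
        # so they must lie in [0, 1] for the chosen parameters.
        s_t = binomial(rng, s, beta * i / n)
        e_t = binomial(rng, e, gamma)
        i_t = binomial(rng, i, sigma)
        s, e, i, r = s - s_t, e + s_t - e_t, i + e_t - i_t, r + i_t
        path.append((s, e, i, r))
    return path


# Illustrative values only (not the parameters used in the paper).
scm_path = seir_scm(990, 0, 10, 0, beta=0.5, gamma=0.2, sigma=0.1, n_steps=100)
```

Averaging many such replications with different seeds gives the stochastic model estimate that is compared with the Euler solution.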
The corresponding ABM is as follows.
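$$
a_i(t+1) =
\begin{cases}
a_i(t) + \mathrm{Bernoulli}\left(\beta \dfrac{\hat{I}_t}{N}\right), & \text{if } a_i(t) = 1 \\
a_i(t) + \mathrm{Bernoulli}(\gamma), & \text{if } a_i(t) = 2 \\
a_i(t) + \mathrm{Bernoulli}(\sigma), & \text{if } a_i(t) = 3 \\
a_i(t), & \text{if } a_i(t) = 4
\end{cases}
\tag{4}
$$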
Here,
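The agent update in Eq. (4) can be illustrated with the sketch below, in which each agent carries an integer state coded 1 (S), 2 (E), 3 (I), or 4 (R), following the coding in Eq. (4). The function name `seir_abm`, the synchronous update order, and the numerical values are assumptions made for this illustration.

```python
import random


def seir_abm(states, beta, gamma, sigma, n_steps, seed=0):
    """Simulate one replication of the agent-based SEIR model of Eq. (4).

    `states` holds one integer per agent: 1 = S, 2 = E, 3 = I, 4 = R.
    Returns the compartment counts (S, E, I, R) at every time step.
    """
    rng = random.Random(seed)
    states = list(states)
    n = len(states)

    def counts(current):
        return tuple(current.count(k) for k in (1, 2, 3, 4))

    path = [counts(states)]
    for _ in range(n_steps):
        infectious = path[-1][2]  # number of infectious agents at time t
        prob = {1: beta * infectious / n, 2: gamma, 3: sigma, 4: 0.0}
        # Synchronous update: every agent advances by one state with its
        # Bernoulli probability, evaluated on the counts at time t.
        states = [a + (1 if rng.random() < prob[a] else 0) for a in states]
        path.append(counts(states))
    return path


# Illustrative values only: 190 susceptible and 10 infectious agents.
initial_states = [1] * 190 + [3] * 10
abm_path = seir_abm(initial_states, beta=0.5, gamma=0.2, sigma=0.1, n_steps=50)
```

The cost of each step grows linearly with the number of agents, which is why one agent per person becomes impractical for large N.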
When the population size N is large, the number of agents is also N, which makes the simulation of ABM intractable. Therefore, to overcome this difficulty, we proposed to combine compartments and agents to construct a CAMM with the advantages of both CM and ABM. Compartments are treated as union sets of agents with states and sizes, and agents in the same state can be concatenated to a compartment by summing up their sizes. In this manner, ABM-type simulations can be implemented with a predefined limited maximum number of agents. The algorithm for CAMM is shown as follows.
For the time step
Step 0. Generate only four agents,
For each discrete time step
Step 1. For each agent
Next, if
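$$
as_{i,t} \sim
\begin{cases}
\mathrm{Binomial}\left(Si_{a_{i,t}}, \beta \dfrac{\hat{I}_t}{N}\right), & \text{if } \mathrm{State}(a_{i,t}) = 1 \\
\mathrm{Binomial}\left(Si_{a_{i,t}}, \gamma\right), & \text{if } \mathrm{State}(a_{i,t}) = 2 \\
\mathrm{Binomial}\left(Si_{a_{i,t}}, \sigma\right), & \text{if } \mathrm{State}(a_{i,t}) = 3 \\
0, & \text{if } \mathrm{State}(a_{i,t}) = 4
\end{cases}
\tag{5}
$$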
Here, the values of
Step 2. If the total number of agents at the time step
Remove all other agents
CAMM combined the advantages of both CM and ABM with high efficiency. CAMM was somewhat similar to HM, which switched between CM and ABM. However, unlike HM, CAMM directly treated compartments as agents and modeled them under the ABM framework. CAMM was proposed as an alternative modeling method to ABM for large population simulations with a limited number of agents.
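The idea of treating compartments as sized agents can be sketched as follows. This is only one possible reading of the description above, not the authors' algorithm: each agent is a hypothetical `(state, size)` pair, a binomial number of its members moves to a new agent in the next state at each step, and agents sharing a state are merged once the agent count exceeds `max_agents`. All names and values are assumptions.

```python
import random


def binomial(rng, n, p):
    """Draw Binomial(n, p) as a sum of Bernoulli trials (simple, O(n))."""
    return sum(1 for _ in range(n) if rng.random() < p)


def camm_step(agents, beta, gamma, sigma, n_total, max_agents, rng):
    """Advance a list of (state, size) agents by one time step."""
    infectious = sum(size for state, size in agents if state == 3)
    prob = {1: beta * infectious / n_total, 2: gamma, 3: sigma}
    updated = []
    for state, size in agents:
        # Number of members leaving this agent for the next state.
        moved = binomial(rng, size, prob[state]) if state in prob else 0
        if size - moved > 0:
            updated.append((state, size - moved))
        if moved > 0:
            updated.append((state + 1, moved))  # new agent in the next state
    if len(updated) > max_agents:
        # Merge agents with the same state by summing their sizes.
        totals = {}
        for state, size in updated:
            totals[state] = totals.get(state, 0) + size
        updated = sorted(totals.items())
    return updated


# Illustrative run: start from four agents, one per compartment.
camm_rng = random.Random(1)
camm_agents = [(1, 990), (2, 0), (3, 10), (4, 0)]
for _ in range(50):
    camm_agents = camm_step(camm_agents, 0.5, 0.2, 0.1,
                            n_total=1000, max_agents=8, rng=camm_rng)
```

In this sketch, merging does not change the distribution of the compartment totals, because a sum of independent binomial draws with a common probability is itself binomial. For large populations, the Bernoulli-sum sampler would need to be replaced by a proper binomial sampler.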
Many epidemiological models are derived from SEIR. In this scenario, we used a SEIRD model to compare CM, SCM, ABM, and CAMM. The model was from Shin et al[18], who analyzed the time-varying transmission dynamics of the COVID-19 epidemic in the Republic of Korea over multiple stages of development, demonstrating that the model offered a better model fit and could show how the infection pattern of COVID-19 changes over time. The SEIRD model divided the population into five compartments (Fig. 1B): susceptible (S), exposed (E), infected (I), recovered (R), and deceased (D).
$$
\begin{cases}
\dfrac{dS(t)}{dt} = -\alpha \dfrac{I(t)}{N} S(t) \\[4pt]
\dfrac{dE(t)}{dt} = \alpha \dfrac{I(t)}{N} S(t) - \beta E(t) \\[4pt]
\dfrac{dI(t)}{dt} = \beta E(t) - \gamma I(t) - \delta I(t) \\[4pt]
\dfrac{dR(t)}{dt} = \gamma I(t) \\[4pt]
\dfrac{dD(t)}{dt} = \delta I(t)
\end{cases}
\tag{6}
$$
The usage of each parameter in its original study was retained, which differed slightly from the SEIR model.
Our comparison involved the CM, SCM, ABM, and CAMM. Corresponding stochastic models of SEIRD were derived using the same method as scenarios 1 and 2. Details are shown in Supplementary Section 1.1, available online.
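For orientation, applying the same pattern as Eqs. (2) and (3) to the SEIRD system in Eq. (6) would give something like the following. This is a sketch derived by analogy, assuming a unit time step; the authors' own derivation is in the Supplementary materials and may differ in notation. The multinomial split of the infected compartment into recovered and deceased is an assumption consistent with the use of multinomial random variables mentioned earlier.

```latex
% Euler scheme for SEIRD (unit time step, by analogy with Eq. (2))
\begin{cases}
\hat{S}_{t+1} = \hat{S}_t - \alpha \hat{S}_t \hat{I}_t / N \\
\hat{E}_{t+1} = \hat{E}_t + \alpha \hat{S}_t \hat{I}_t / N - \beta \hat{E}_t \\
\hat{I}_{t+1} = \hat{I}_t + \beta \hat{E}_t - (\gamma + \delta) \hat{I}_t \\
\hat{R}_{t+1} = \hat{R}_t + \gamma \hat{I}_t \\
\hat{D}_{t+1} = \hat{D}_t + \delta \hat{I}_t
\end{cases}

% Stochastic counterpart (by analogy with Eq. (3))
\begin{cases}
s_t \sim \mathrm{Binomial}(\hat{S}_t, \alpha \hat{I}_t / N) \\
e_t \sim \mathrm{Binomial}(\hat{E}_t, \beta) \\
(r_t, d_t, \hat{I}_t - r_t - d_t) \sim
  \mathrm{Multinomial}(\hat{I}_t;\ \gamma,\ \delta,\ 1 - \gamma - \delta)
\end{cases}
```

Here $r_t$ and $d_t$ are hypothetical names for the numbers of infected individuals who recover or die in step $t$, so that $\hat{I}_{t+1} = \hat{I}_t + e_t - r_t - d_t$.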
The SEIHR model is another extension of the SEIR model. It was proposed by Wang et al[19] to study the COVID-19 epidemic in Wuhan after the blockade, in the case of no population inflow or outflow and certain control of COVID-19 in China. We cited this SEIHR model to compare CM, SCM, and CAMM.
The SEIHR model divided the population into five compartments (Fig. 1C): susceptible (S), exposed (E), infected (I), hospitalized (H), and removed (R). The infected class included symptomatic and asymptomatic infections, and the removed class included individuals who had recovered or died; natural births and deaths were excluded. In this model, a hospitalized class was introduced for the number of daily hospitalizations, which was obtained from public data. When a state change occurred in the infected compartment, individuals who had been confirmed were transferred to the hospitalized state, while those who had not been confirmed were transferred to the removed state. The SEIHR model can be represented by the following nonlinear ordinary differential equations.
$$
\begin{cases}
\dfrac{dS(t)}{dt} = -\beta \left[1 - \dfrac{I(t)}{N}\right] \dfrac{I(t)}{N} S(t) \\[4pt]
\dfrac{dE(t)}{dt} = \beta \left[1 - \dfrac{I(t)}{N}\right] \dfrac{I(t)}{N} S(t) - \lambda E(t) \\[4pt]
\dfrac{dI(t)}{dt} = \lambda E(t) - \alpha I(t) - \gamma I(t) \\[4pt]
\dfrac{dH(t)}{dt} = \alpha I(t) - \mu H(t) \\[4pt]
\dfrac{dR(t)}{dt} = \gamma I(t) + \mu H(t)
\end{cases}
\tag{7}
$$
The parameters also differed slightly from the SEIR model.
In this scenario, CM, SCM, and CAMM were used to simulate the population change. ABM was omitted in this scenario because of the large population size N. Corresponding stochastic models of SEIHR were derived using the same method as scenarios 1 and 2. Details are shown in Supplementary Section 1.2, available online.
Gallagher[8] has proved the equivalence of the numerical solution of CM by the Euler method, the corresponding SCM, and the corresponding ABM, i.e., the expected state sizes of the SCM and of the ABM at each time step coincide with the solution of the Euler method. Note, however, that the proofs in her thesis[8] do not distinguish the numerical solution by the Euler scheme from the exact solution. The equivalence of the CM solution by the Euler method and the corresponding SCM, ABM, and CAMM can be verified in the same manner.
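A one-step check makes the link to the Euler scheme concrete. This is an illustration conditional on the state at time $t$, not a reproduction of the cited proof. Using the SCM in Eq. (3) and the mean of a binomial variable:

```latex
\begin{aligned}
\mathbb{E}\left[\hat{S}_{t+1} \mid \hat{S}_t, \hat{I}_t\right]
  &= \hat{S}_t - \mathbb{E}\left[s_t \mid \hat{S}_t, \hat{I}_t\right]
   = \hat{S}_t - \beta \frac{\hat{S}_t \hat{I}_t}{N}, \\
\mathbb{E}\left[\hat{E}_{t+1} \mid \hat{S}_t, \hat{E}_t, \hat{I}_t\right]
  &= \hat{E}_t + \beta \frac{\hat{S}_t \hat{I}_t}{N} - \gamma \hat{E}_t, \\
\mathbb{E}\left[\hat{I}_{t+1} \mid \hat{E}_t, \hat{I}_t\right]
  &= \hat{I}_t + \gamma \hat{E}_t - \sigma \hat{I}_t, \\
\mathbb{E}\left[\hat{R}_{t+1} \mid \hat{R}_t, \hat{I}_t\right]
  &= \hat{R}_t + \sigma \hat{I}_t .
\end{aligned}
```

The right-hand sides are exactly the Euler updates in Eq. (2). For the ABM in Eq. (4), the number of agents leaving state 1 in one step is a sum of $\hat{S}_t$ independent Bernoulli variables with probability $\beta \hat{I}_t / N$, which is $\mathrm{Binomial}(\hat{S}_t, \beta \hat{I}_t / N)$, so its one-step compartment counts have the same distribution as those of the SCM.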
As mentioned before, however, the numerical solutions by more sophisticated methods differed from those by the Euler method. To mimic the solutions of other numerical methods with the stochastic counterparts, SCM, ABM, and CAMM first had to be calibrated. Directly estimating the parameters by fitting SCM, ABM, and CAMM to the exact numerical solutions was time-consuming, because each evaluation required averaging many replications to obtain the stochastic model estimates. Therefore, we proposed to estimate the parameters using the numerical solutions by the Euler method instead of those of SCM, ABM, and CAMM, relying on their equivalence. This calibration strategy was much more efficient than the traditional estimation method.
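The calibration idea can be sketched as follows: adjust the parameters fed to the cheap Euler recursion until its path is as close as possible to a reference solution, then reuse those parameters in the stochastic models. The sketch below makes several assumptions that are not from the paper: a fine-step fourth-order Runge-Kutta solution stands in for the LSODA reference, a simple coordinate search stands in for the optimizer, the error is computed on raw counts rather than percentages, and all names and values are illustrative.

```python
import math


def seir_rhs(y, beta, gamma, sigma):
    """Right-hand side of the SEIR system in Eq. (1)."""
    s, e, i, r = y
    n = s + e + i + r
    return (-beta * s * i / n, beta * s * i / n - gamma * e,
            gamma * e - sigma * i, sigma * i)


def shift(y, k, h):
    return tuple(a + h * b for a, b in zip(y, k))


def rk4_reference(y0, params, n_days, substeps=100):
    """Fine-step RK4 path sampled once per day (stand-in for an exact solver)."""
    h = 1.0 / substeps
    y = tuple(float(v) for v in y0)
    out = [y]
    for _ in range(n_days):
        for _ in range(substeps):
            k1 = seir_rhs(y, *params)
            k2 = seir_rhs(shift(y, k1, h / 2), *params)
            k3 = seir_rhs(shift(y, k2, h / 2), *params)
            k4 = seir_rhs(shift(y, k3, h), *params)
            y = tuple(v + h * (a + 2 * b + 2 * c + d) / 6
                      for v, a, b, c, d in zip(y, k1, k2, k3, k4))
        out.append(y)
    return out


def euler_path(y0, params, n_days):
    """Euler recursion of Eq. (2) with a unit time step."""
    y = tuple(float(v) for v in y0)
    out = [y]
    for _ in range(n_days):
        y = shift(y, seir_rhs(y, *params), 1.0)
        out.append(y)
    return out


def rmse(path_a, path_b):
    sq = [(a - b) ** 2 for ya, yb in zip(path_a, path_b) for a, b in zip(ya, yb)]
    return math.sqrt(sum(sq) / len(sq))


def calibrate(y0, params, reference, n_days, rounds=40):
    """Coordinate search: rescale one parameter at a time, keep improvements."""
    best = list(params)
    best_err = rmse(euler_path(y0, best, n_days), reference)
    step = 0.2
    for _ in range(rounds):
        improved = False
        for k in range(len(best)):
            for factor in (1 + step, 1 - step):
                trial = list(best)
                trial[k] *= factor
                err = rmse(euler_path(y0, trial, n_days), reference)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            step /= 2  # refine the search once no rescaling helps
    return tuple(best), best_err


y0 = (990, 0, 10, 0)
true_params = (0.5, 0.2, 0.1)  # illustrative values only
reference = rk4_reference(y0, true_params, 60)
err_before = rmse(euler_path(y0, true_params, 60), reference)
fitted_params, err_after = calibrate(y0, true_params, reference, 60)
```

Each evaluation of the objective costs a single deterministic Euler run, whereas calibrating a stochastic model directly would require averaging many replications per evaluation.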
CM and its stochastic counterparts, including SCM, ABM, and CAMM, were compared for three CMs (SEIR, SEIRD, and SEIHR) in four scenarios. Scenarios 1 and 2 were SEIR models with different initial values. Scenarios 3 and 4 were for SEIRD and SEIHR models, respectively. The parameter settings of these models were derived from the results of the corresponding literature for these models[17–19], respectively.
For stochastic models including SCM, ABM, and CAMM, repeated simulations were performed. The estimates for states were calculated as the means of the replications of the stochastic models at each time point and state. One thousand replications were set for simulations. The exact solutions of CM were calculated by the LSODA numerical scheme, which was the default solver for the R package "deSolve"[44] for numerical solutions of CM. To compare the results of CM, SCM, ABM, and CAMM, the exact solution of CM was treated as the benchmark to obtain the root mean squared error (RMSE) and mean absolute error (MAE) of the calculated state percentages of the initial population N by other methods. Parameter estimations for model calibration were also implemented to minimize the RMSE of the calculated state percentages of the initial population.
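The error measures described above could be computed along these lines. This is a hedged sketch: the exact aggregation used in the paper is not spelled out in this section, so the code assumes that errors are pooled over all time points and compartments after converting counts to percentages of N. The function names and the toy data are hypothetical.

```python
import math


def mean_path(replications):
    """Average replicated paths element-wise; replications[k][t][c] is a count."""
    n_rep = len(replications)
    n_times = len(replications[0])
    n_comp = len(replications[0][0])
    return [[sum(rep[t][c] for rep in replications) / n_rep
             for c in range(n_comp)]
            for t in range(n_times)]


def rmse_mae_percent(estimate, benchmark, n_total):
    """RMSE and MAE of state sizes expressed as percentages of N."""
    diffs = [100.0 * (a - b) / n_total
             for ya, yb in zip(estimate, benchmark)
             for a, b in zip(ya, yb)]
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return rmse, mae


# Toy data: two replications, two time points, two compartments, N = 100.
toy_replications = [[(90, 10), (80, 20)], [(90, 10), (84, 16)]]
toy_benchmark = [(90, 10), (80, 20)]
toy_rmse, toy_mae = rmse_mae_percent(mean_path(toy_replications),
                                     toy_benchmark, n_total=100)
```

In the toy example the replication mean at the second time point is (82, 18), so the percentage errors are 0, 0, 2, and -2.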
All statistical analyses were performed using R software[45], version 4.2.2 with R package "deSolve" (version 1.34)[44]. Sample codes for simulations and analyses in the present study are provided with details in Supplementary Section 2, available online.
The results of the proposed three CMs and their stochastic counterparts in four scenarios were demonstrated in this section.
The SEIR model was constructed based on the "Materials and methods" section. The equivalence of CM using the Euler method, the corresponding SCM, ABM, and CAMM, and the difference between the CM exact solution by LSODA and the former ones were demonstrated by simulations. Model calibration was conducted to adjust the parameter values of the stochastic methods so that their solutions were as consistent as possible with the CM solutions by LSODA.
The formulas for CM were the same as the above with derived parameters
Based on RMSEs and MAEs in scenario 1 (Table 1), the Euler method solutions were close to those of SCM, ABM, and CAMM, which demonstrated their equivalence. However, RMSE0 and MAE0 were larger than RMSE1 and MAE1, respectively. Therefore, model calibration was needed to mimic the CM solutions by other sophisticated numerical methods using the Euler method, SCM, ABM, and CAMM. Model calibration for these four methods was conducted as stated before, and the calibrated parameters were efficiently obtained using the Euler method solutions, which were
| Scenario | Model | RMSE0 (before calibration) | MAE0 (before calibration) | RMSE1 (before calibration) | MAE1 (before calibration) | RMSE0 (after calibration) | MAE0 (after calibration) | RMSE1 (after calibration) | MAE1 (after calibration) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Euler | 1.528 | 1.041 | 0.000 | 0.000 | 0.781 | 0.610 | 0.000 | 0.000 |
| 1 | SCM | 1.951 | 1.296 | 0.781 | 0.502 | 1.082 | 0.747 | 0.879 | 0.573 |
| 1 | ABM | 2.060 | 1.370 | 0.873 | 0.568 | 1.031 | 0.715 | 0.798 | 0.523 |
| 1 | CAMM | 2.247 | 1.502 | 1.051 | 0.700 | 1.125 | 0.774 | 0.940 | 0.612 |
| 2 | Euler | 1.377 | 0.955 | 0.000 | 0.000 | 0.409 | 0.301 | 0.000 | 0.000 |
| 2 | SCM | 1.377 | 0.955 | 0.001 | 0.001 | 0.408 | 0.300 | 0.003 | 0.002 |
| 2 | CAMM | 1.376 | 0.954 | 0.001 | 0.000 | 0.408 | 0.300 | 0.002 | 0.001 |
| 3 | Euler | 1.132 | 0.637 | 0.000 | 0.000 | 0.418 | 0.305 | 0.000 | 0.000 |
| 3 | SCM | 1.283 | 0.726 | 0.179 | 0.102 | 0.396 | 0.286 | 0.098 | 0.053 |
| 3 | ABM | 1.187 | 0.670 | 0.092 | 0.053 | 0.387 | 0.281 | 0.068 | 0.039 |
| 3 | CAMM | 1.188 | 0.671 | 0.095 | 0.052 | 0.410 | 0.291 | 0.145 | 0.081 |
| 4 | Euler | 0.551 | 0.277 | 0.000 | 0.000 | 0.367 | 0.183 | 0.000 | 0.000 |
| 4 | SCM | 0.550 | 0.276 | 0.001 | 0.000 | 0.369 | 0.184 | 0.003 | 0.002 |
| 4 | CAMM | 0.550 | 0.276 | 0.000 | 0.000 | 0.365 | 0.182 | 0.002 | 0.001 |
RMSE0 and MAE0 used the numerical solution of the CM by the LSODA method as the benchmark, and RMSE1 and MAE1 used the numerical solution of the CM by the Euler method as the benchmark. Scenario 1 simulated the SEIR model with experimental data. Scenario 2 simulated the SEIR model with real-world data. Scenario 3 simulated the SEIRD model with real-world data. Scenario 4 simulated the SEIHR model with real-world data. Abbreviations: RMSE, root mean squared error; MAE, mean absolute error; CM, compartment model; SCM, stochastic compartment model; CAMM, compartment-agent mixed model; ABM, agent-based model.
Fig. 2 demonstrates these results graphically. The curves of CM by the Euler method and SCM as well as CAMM almost overlapped before and after calibration, verifying the equivalence of CM by the Euler method, SCM and CAMM. However, there was a clear difference between CM by the LSODA method and CM by the Euler method before calibration, which was significantly reduced after calibration.
For scenario 2, the same SEIR model and parameters of scenario 1 were implemented with different initial values
The results for scenario 2 were similar to those for scenario 1 (Table 1). RMSE1 and MAE1 for SCM and CAMM were small, showing the equivalence of the Euler method solutions, SCM, and CAMM. However, even after calibration, the difference between numerical solutions of CM by the LSODA and Euler methods still existed. Therefore, the exact solution of continuous-time CM cannot be perfectly reproduced by the corresponding discrete-time stochastic models. Fig. 3 visualizes these results. 95% reference ranges were too narrow to be displayed in the figure because of the large population N.
For scenario 3, the SEIRD model was constructed to describe the COVID-19 epidemic in the Republic of Korea[18]. The initial values of SEIRD were
The results for scenario 3 were similar to those for scenario 2 (Table 1). RMSE1 and MAE1 showed the equivalence of the Euler method solutions, SCM, ABM, and CAMM. The difference between the exact solutions of CM by the LSODA and its stochastic models still existed after calibration. These results are shown in Fig. 4. 95% reference ranges of stochastic models enclosed the curves of the CM numerical solutions.
For scenario 4, the utilized SEIHR model was constructed to study the COVID-19 epidemic after the blockade in Wuhan[19]. Because of the large population base in the real data, ABM was difficult to handle, so we only demonstrated the equivalence between CM by the Euler method, SCM and CAMM. The total population was set to be
The models of Euler, SCM, and CAMM were calibrated as described in the previous section, and the parameters after calibration were
As presented in Table 1, the RMSE1 and MAE1 values of SCM and CAMM, which measure their differences from CM using the Euler method, were negligible both before and after calibration in scenario 4. Hence, it was concluded that the CM by the Euler method was equivalent to both SCM and CAMM. However, the RMSE0 and MAE0 of CM by the Euler method, SCM, and CAMM were relatively large, indicating that there were differences between them and CM by the LSODA method, even after the model calibration. Therefore, the numerical solutions of CM obtained by the Euler method and its corresponding stochastic models, SCM and CAMM, cannot exactly reproduce the CM exact solutions. Fig. 5 is a visualization of these results. For clarity, S and R were hidden in the visualization, and the population proportions of the E, I, and H states were shown.
Infectious disease modeling plays a crucial role in public health research[11]. While continuous-time deterministic CMs have long been the foundation of epidemic models, discrete-time stochastic models, such as SCMs and ABMs, have emerged to address some limitations of CMs[46]. Each modeling approach has its strengths and limitations. CMs are simpler and computationally less demanding, compared with stochastic models, but may lack the realism of more complex models because of the assumption of homogeneous populations within compartments. SCMs introduce randomness in inter-compartment transitions, while ABMs simulate individual-level interactions, providing highly detailed simulations. However, ABMs often require more computational resources and can be challenging to validate. Investigators should carefully choose the most appropriate model for their specific research problem, considering factors such as model assumptions, data availability, and computational resources.
The challenge in epidemiological studies lies in bridging the macroscopic and microscopic aspects. In the present study, we proposed a novel model, CAMM, which combined the macroscopic compartment of CM with the microscopic simulation of ABM. CAMM has integrated the advantages of both CM and ABM, and can serve as an alternative to ABM, when the number of simulation agents is limited. For instance, when simulating a large population, using ABM with one agent per person may become computationally intractable. In such cases, CAMM may offer a tractable simulation with a limited number of agents.
While CMs can be converted into corresponding stochastic counterparts including SCMs, ABMs, and CAMMs, it is important to note that the exact solutions of continuous-time CMs cannot perfectly match with discrete-time stochastic counterparts using the same parameter settings. The equivalence of CM numerical solutions using the Euler scheme, SCMs, ABMs, and CAMMs can be verified through existing theorems in the literature[26]. However, the Euler scheme is a basic numerical method for solving CMs, and its solutions differ from exact solutions and solutions obtained using more sophisticated and accurate schemes, such as LSODA and Runge-Kutta. We have demonstrated the differences between the exact CM solutions and solutions obtained from the four equivalent models with the same parameter settings. Therefore, caution should be exercised when calibrating stochastic models to reproduce the exact results of CMs. Direct model calibration of stochastic models can be time-consuming because of the need for averaging solutions from multiple simulation replications. To address this, we have proposed an efficient model calibration method based on CMs using the Euler scheme. This method minimizes the differences between the exact CM solutions and solutions obtained from stochastic methods, although slight discrepancies persist. It is important to note that discrete-time stochastic models cannot perfectly reproduce the exact solutions of continuous-time CMs.
Deterministic CMs are computationally efficient, but can only estimate the average values for each compartment. On the other hand, stochastic models are computationally less efficient, but because of the introduction of randomness, the interval estimates for each compartment can be calculated. Our findings can be applied to construct and compare deterministic CMs and corresponding stochastic models. This allows efficient model calibration of stochastic models, thereby creating a unified modeling framework that can be flexibly selected according to the practical application requirements of infectious disease prediction and control. Stochastic models with complex structures, such as ABMs, can be fully or partially converted to CMs, based on the equivalence between CMs using the Euler scheme and their corresponding stochastic models. Our proposed model calibration method enables efficient parameter estimation, improving the efficiency of stochastic model prediction. This, in turn, enhances the efficiency of comparing stochastic models and CMs. Furthermore, by bridging CMs and stochastic models under this unified framework, we have provided an efficient tool for HM construction and parameter estimation.
In conclusion, CMs are highly related to their stochastic counterparts (SCMs, ABMs, and CAMMs). We have verified the equivalence between CMs using the Euler scheme and their corresponding stochastic models. With limited computational resources, the proposed CAMM offers scalability and has the potential to serve as a substitute for ABM in simulating various infectious diseases in large-scale populations. Model calibration is necessary to reproduce the exact solutions of CMs using SCMs, ABMs, and CAMM. Here, we propose an efficient model calibration method based on the equivalence of these models, which can be extended to HMs. Our findings have contributed to the comparison and unification of deterministic CMs and stochastic models in the application of epidemic prediction and control.
We thank the editors and reviewers for their insightful comments and suggestions, which significantly enhanced the quality of our manuscript.
This study was supported by the National Natural Science Foundation of China (Grant Nos. 82173620 to Yang Zhao and 82041024 to Feng Chen). This study was also partially supported by the Bill & Melinda Gates Foundation (Grant No. INV-006371 to Feng Chen) and Priority Academic Program Development of Jiangsu Higher Education Institutions.
CLC number: R181, Document code: A
The authors reported no conflict of interests.
[1] Abebe EC, Dejenie TA, Shiferaw MY, et al. The newly emerged COVID-19 disease: a systemic review[J]. Virol J, 2020, 17(1): 96. doi: 10.1186/s12985-020-01363-5
[2] Nicola M, Alsafi Z, Sohrabi C, et al. The socio-economic implications of the coronavirus pandemic (COVID-19): A review[J]. Int J Surg, 2020, 78: 185–193. doi: 10.1016/j.ijsu.2020.04.018
[3] Ghosh S, Samanta GP, Mubayi A. Comparison of regression approaches for analyzing survival data in the presence of competing risks[J]. Lett Biomath, 2021, 8(1): 29–47. doi: 10.30707/LiB8.1.1647878866.022689
[4] Ghosh S, Samanta G, Nieto JJ. Application of non-parametric models for analyzing survival data of COVID-19 patients[J]. J Infect Public Health, 2021, 14(10): 1328–1333. doi: 10.1016/j.jiph.2021.08.025
[5] Tunc H, Sari FZ, Darendeli BN, et al. Analytical solution of equivalent SEIR and agent-based model of COVID-19; showing the bounds of contact tracing[EB/OL]. [2023-06-01]. https://www.medrxiv.org/content/10.1101/2020.10.20.20212522v1.
[6] Rahmandad H, Sterman J. Heterogeneity and network structure in the dynamics of diffusion: Comparing agent-based and differential equation models[J]. Manag Sci, 2008, 54(5): 998–1014. doi: 10.1287/mnsc.1070.0787
[7] Cassidy R, Singh NS, Schiratti PR, et al. Mathematical modelling for health systems research: a systematic review of system dynamics and agent-based models[J]. BMC Health Serv Res, 2019, 19(1): 845. doi: 10.1186/s12913-019-4627-7
[8] Gallagher SK. Catalyst: Agents of change—Integration of compartment and agent-based models for use in infectious disease epidemiology[D]. Pittsburgh: Carnegie Mellon University, 2019.
[9] Truong VT, Baverel PG, Lythe GD, et al. Step-by-step comparison of ordinary differential equation and agent-based approaches to pharmacokinetic-pharmacodynamic models[J]. CPT Pharmacomet Syst Pharmacol, 2022, 11(2): 133–148. doi: 10.1002/psp4.12703
[10] Kermack WO, McKendrick AG. A contribution to the mathematical theory of epidemics[J]. Proc R Soc A Math Phys Eng Sci, 1927, 115(772): 700–721. doi: 10.1098/rspa.1927.0118
[11] Bjørnstad ON, Shea K, Krzywinski M, et al. Modeling infectious epidemics[J]. Nat Methods, 2020, 17(5): 455–456. doi: 10.1038/s41592-020-0822-z
[12] He S, Peng Y, Sun K. SEIR modeling of the COVID-19 and its dynamics[J]. Nonlinear Dyn, 2020, 101(3): 1667–1680. doi: 10.1007/s11071-020-05743-y
[13] Guan J, Wei Y, Zhao Y, et al. Modeling the transmission dynamics of COVID-19 epidemic: a systematic review[J]. J Biomed Res, 2020, 34(6): 422–430. doi: 10.7555/JBR.34.20200119
[14] Khajanchi S, Sarkar K. Forecasting the daily and cumulative number of cases for the COVID-19 pandemic in India[J]. Chaos, 2020, 30(7): 071101. doi: 10.1063/5.0016240
[15] |
Sarkar K, Khajanchi S, Nieto JJ. Modeling and forecasting the COVID-19 pandemic in India[J]. Chaos Solitons Fractals, 2020, 139: 110049. doi: 10.1016/j.chaos.2020.110049
|
[16] |
Samui P, Mondal J, Khajanchi S. A mathematical model for COVID-19 transmission dynamics with a case study of India[J]. Chaos Solitons Fractals, 2020, 140: 110173. doi: 10.1016/j.chaos.2020.110173
|
[17] |
Zhou T, Liu Q, Yang Z, et al. Preliminary prediction of the basic reproduction number of the Wuhan novel coronavirus 2019-nCoV[J]. J Evid Based Med, 2020, 13(1): 3–7. doi: 10.1111/jebm.12376
|
[18] |
Shin HY. A multi-stage SEIR(D) model of the COVID-19 epidemic in Korea[J]. Ann Med, 2021, 53(1): 1160–1170. doi: 10.1080/07853890.2021.1949490
|
[19] |
Wang YJ, Wang P, Zhang SD, et al. Uncertainty modeling of a modified SEIR epidemic model for COVID-19[J]. Biology, 2022, 11(8): 1157. doi: 10.3390/biology11081157
|
[20] |
Poonia RC, Saudagar AKJ, Altameem A, et al. An enhanced SEIR model for prediction of COVID-19 with vaccination effect[J]. Life, 2022, 12(5): 647. doi: 10.3390/life12050647
|
[21] |
Abbey H. An examination of the Reed-Frost theory of epidemics[J]. Hum Biol, 1952, 24(3): 201–233. https://pubmed.ncbi.nlm.nih.gov/12990130/
|
[22] |
Abrams S, Wambua J, Santermans E, et al. Modelling the early phase of the Belgian COVID-19 epidemic using a stochastic compartmental model and studying its implied future trajectories[J]. Epidemics, 2021, 35: 100449. doi: 10.1016/j.epidem.2021.100449
|
[23] |
Mamis K, Farazmand M. Stochastic compartmental models of the COVID-19 pandemic must have temporally correlated uncertainties[J]. Proc R Soc Math Phys Eng Sci, 2023, 479(2269): 20220568. doi: 10.1098/rspa.2022.0568
|
[24] |
Getz WM, Salter R, Muellerklein O, et al. Modeling epidemics: A primer and Numerus Model Builder implementation[J]. Epidemics, 2018, 25: 9–19. doi: 10.1016/j.epidem.2018.06.001
|
[25] |
Rutter H, Savona N, Glonti K, et al. The need for a complex systems model of evidence for public health[J]. Lancet, 2017, 390(10112): 2602–2604. doi: 10.1016/S0140-6736(17)31267-9
|
[26] |
Chang SL, Harding N, Zachreson C, et al. Modelling transmission and control of the COVID-19 pandemic in Australia[J]. Nat Commun, 2020, 11(1): 5710. doi: 10.1038/s41467-020-19393-6
|
[27] |
Thompson J, Wattam S. Estimating the impact of interventions against COVID-19: From lockdown to vaccination[J]. PLoS One, 2021, 16(12): e0261330. doi: 10.1371/journal.pone.0261330
|
[28] |
Shattock AJ, Le Rutte EA, Dünner RP, et al. Impact of vaccination and non-pharmaceutical interventions on SARS-CoV-2 dynamics in Switzerland[J]. Epidemics, 2022, 38: 100535. doi: 10.1016/j.epidem.2021.100535
|
[29] |
Butcher JC. The numerical analysis of ordinary differential equations: Runge-Kutta and general linear methods[M]. Chichester: Wiley, 1987.
|
[30] |
Grimm V, Mengel F, Schmidt M. Extensions of the SEIR model for the analysis of tailored social distancing and tracing approaches to cope with COVID-19[J]. Sci Rep, 2021, 11(1): 4214. doi: 10.1038/s41598-021-83540-2
|
[31] |
Khajanchi S, Sarkar K, Mondal J, et al. Mathematical modeling of the COVID-19 pandemic with intervention strategies[J]. Results Phys, 2021, 25: 104285. doi: 10.1016/j.rinp.2021.104285
|
[32] |
Saha S, Samanta GP. Modelling the role of optimal social distancing on disease prevalence of COVID-19 epidemic[J]. Int J Dyn Control, 2021, 9(3): 1053–1077. doi: 10.1007/s40435-020-00721-z
|
[33] |
Saha S, Samanta G, Nieto JJ. Impact of optimal vaccination and social distancing on COVID-19 pandemic[J]. Math Comput Simul, 2022, 200: 285–314. doi: 10.1016/j.matcom.2022.04.025
|
[34] |
Rai RK, Tiwari PK, Khajanchi S. Modeling the influence of vaccination coverage on the dynamics of COVID-19 pandemic with the effect of environmental contamination[J]. Math Methods Appl Sci, 2023, 46(12): 12425–12453. doi: 10.1002/mma.9185
|
[35] |
Saha S, Samanta GP, Nieto JJ. Epidemic model of COVID-19 outbreak by inducing behavioural response in population[J]. Nonlinear Dyn, 2020, 102(1): 455–487. doi: 10.1007/s11071-020-05896-w
|
[36] |
Saha S, Dutta P, Samanta G. Dynamical behavior of SIRS model incorporating government action and public response in presence of deterministic and fluctuating environments[J]. Chaos Solitons Fractals, 2022, 164: 112643. doi: 10.1016/j.chaos.2022.112643
|
[37] |
Khajanchi S, Sarkar K, Mondal J. Dynamics of the COVID-19 pandemic in India[EB/OL]. [2023-06-01]. https://arxiv.org/abs/2005.06286.
|
[38] |
Rai RK, Khajanchi S, Tiwari PK, et al. Impact of social media advertisements on the transmission dynamics of COVID-19 pandemic in India[J]. J Appl Math Comput, 2022, 68(1): 19–44. doi: 10.1007/s12190-021-01507-y
|
[39] |
Sarkar K, Mondal J, Khajanchi S. How do the contaminated environment influence the transmission dynamics of COVID-19 pandemic?[J]. Eur Phys J Spec Top, 2022, 231(18): 3697–3716. doi: 10.1140/epjs/s11734-022-00648-w
|
[40] |
Petzold L. Automatic selection of methods for solving stiff and nonstiff systems of ordinary differential equations[J]. SIAM J Sci Stat Comput, 1983, 4(1): 136–148. doi: 10.1137/0904010
|
[41] |
Hunter E, Namee BM, Kelleher J. A hybrid agent-based and equation based model for the spread of infectious diseases[J]. J Artif Soc Soc Simul, 2020, 23(4): 14. doi: 10.18564/jasss.4421
|
[42] |
van den Driessche P. Reproduction numbers of infectious disease models[J]. Infect Dis Model, 2017, 2(3): 288–303. doi: 10.1016/j.idm.2017.06.002
|
[43] |
Khajanchi S, Bera S, Roy TK. Mathematical analysis of the global dynamics of a HTLV-I infection model, considering the role of cytotoxic T-lymphocytes[J]. Math Comput Simul, 2021, 180: 354–378. doi: 10.1016/j.matcom.2020.09.009
|
[44] |
Soetaert K, Petzoldt T, Setzer RW. Solving differential equations in R: Package deSolve[J]. J Stat Softw, 2010, 33(9): 1–25. doi: 10.18637/jss.v033.i09
|
[45] |
R Core Team. R: a language and environment for statistical computing[EB/OL]. [2022-06-01]. https://www.gbif.org/zh/tool/81287/r-a-language-and-environment-for-statistical-computing.
|
[46] |
Allen LJS. An introduction to stochastic epidemic models[M]//Brauer F, Driessche P, Wu J. Mathematical Epidemiology. Berlin: Springer, 2008: 81–130.
|
| Scenario | Model | Before calibration: RMSE0 | Before calibration: MAE0 | Before calibration: RMSE1 | Before calibration: MAE1 | After calibration: RMSE0 | After calibration: MAE0 | After calibration: RMSE1 | After calibration: MAE1 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Euler | 1.528 | 1.041 | 0.000 | 0.000 | 0.781 | 0.610 | 0.000 | 0.000 |
| 1 | SCM | 1.951 | 1.296 | 0.781 | 0.502 | 1.082 | 0.747 | 0.879 | 0.573 |
| 1 | ABM | 2.060 | 1.370 | 0.873 | 0.568 | 1.031 | 0.715 | 0.798 | 0.523 |
| 1 | CAMM | 2.247 | 1.502 | 1.051 | 0.700 | 1.125 | 0.774 | 0.940 | 0.612 |
| 2 | Euler | 1.377 | 0.955 | 0.000 | 0.000 | 0.409 | 0.301 | 0.000 | 0.000 |
| 2 | SCM | 1.377 | 0.955 | 0.001 | 0.001 | 0.408 | 0.300 | 0.003 | 0.002 |
| 2 | CAMM | 1.376 | 0.954 | 0.001 | 0.000 | 0.408 | 0.300 | 0.002 | 0.001 |
| 3 | Euler | 1.132 | 0.637 | 0.000 | 0.000 | 0.418 | 0.305 | 0.000 | 0.000 |
| 3 | SCM | 1.283 | 0.726 | 0.179 | 0.102 | 0.396 | 0.286 | 0.098 | 0.053 |
| 3 | ABM | 1.187 | 0.670 | 0.092 | 0.053 | 0.387 | 0.281 | 0.068 | 0.039 |
| 3 | CAMM | 1.188 | 0.671 | 0.095 | 0.052 | 0.410 | 0.291 | 0.145 | 0.081 |
| 4 | Euler | 0.551 | 0.277 | 0.000 | 0.000 | 0.367 | 0.183 | 0.000 | 0.000 |
| 4 | SCM | 0.550 | 0.276 | 0.001 | 0.000 | 0.369 | 0.184 | 0.003 | 0.002 |
| 4 | CAMM | 0.550 | 0.276 | 0.000 | 0.000 | 0.365 | 0.182 | 0.002 | 0.001 |
RMSE0 and MAE0 use the numerical solution of the CM obtained with the LSODA method as the benchmark; RMSE1 and MAE1 use the numerical solution of the CM obtained with the Euler method as the benchmark. Scenario 1 simulates the SEIR model with experimental data. Scenarios 2, 3, and 4 simulate the SEIR, SEIRD, and SEIHR models, respectively, with real-world data. Abbreviations: RMSE, root mean squared error; MAE, mean absolute error; CM, compartment model; SCM, stochastic compartment model; CAMM, compartment-agent mixed model; ABM, agent-based model.
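The two error measures in the table can be computed as in the following minimal Python sketch. It assumes that both trajectories are sampled at the same time points and that errors are averaged over those points. How errors were aggregated across compartments and replicates is not stated in the table note, so the sketch does not address it. The sample values are hypothetical.

```python
import math


def rmse(estimate, benchmark):
    """Root mean squared error between two equally sampled trajectories."""
    if len(estimate) != len(benchmark):
        raise ValueError("trajectories must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(estimate, benchmark)]
    return math.sqrt(sum(squared) / len(squared))


def mae(estimate, benchmark):
    """Mean absolute error between two equally sampled trajectories."""
    if len(estimate) != len(benchmark):
        raise ValueError("trajectories must have the same length")
    absolute = [abs(a - b) for a, b in zip(estimate, benchmark)]
    return sum(absolute) / len(absolute)


# Hypothetical values: a benchmark solution and a model output.
benchmark = [10.0, 12.0, 15.0, 19.0]
estimate = [10.0, 13.0, 14.0, 21.0]

print("RMSE:", rmse(estimate, benchmark))
print("MAE: ", mae(estimate, benchmark))
```

Comparing a trajectory with itself gives zero for both measures, which is why the Euler rows report 0.000 for RMSE1 and MAE1: there the benchmark is the Euler solution itself.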