On COVID-19 outbreaks predictions: Issues on stability, parameter sensitivity, and precision

This is a recently published collaborative article in Stochastic Analysis & Applications, Aug 2020, on the problems of calculating infection rates, outbreaks, etc., of COVID-19, by colleagues from Austria, Chile, Slovakia and the USA, namely M. Stehlik, J. Kiselak, M. Alejandro Dinamarca, Y. Li and Y. Ying.

1. Introduction

Coronavirus Disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is a disease first identified in late 2019 and declared a pandemic on March 11, 2020. COVID-19 is an international, national and public health emergency [1, 2]. Other important routes of contagion, such as fecal-oral transmission, have also been reported [3]. Fever, cough, sore throat, fatigue, and shortness of breath are characteristic symptoms of COVID-19; other symptoms associated with other organs and systems, such as diarrhea, loss of smell and taste, and headache, have been accepted more recently. Symptoms may appear 2–14 days after exposure. According to [4], the experts extracted data regarding 1,099 patients with laboratory-confirmed COVID-19 from 552 hospitals in 30 provinces. Of these infected people, 1.18% had direct contact with wild animals, 31.30% had been to Wuhan, and 71.80% had contact with people from Wuhan. The median incubation period for the virus is 3 days (range 0–24 days). In addition, studies have found that COVID-19 spreads rapidly from person to person.

Here we formulate selected statistical, mathematical, and real-life challenges of COVID-19 outbreak prediction. In particular, we justify an exponential curve from a microbiological point of view as a reasonable model for the outbreak of a COVID-19 epidemic. We need to point out that information criteria for estimation and prediction do not necessarily reach their optima on the same sampling schemes. Such ill-posed information relationships can be formulated in the form of Fredholm equations of the first kind. Under reasonable regularity and simplicity of the underlying process, e.g., autoregressive statistical models, one can apply the FIRCEP methodology [5] to obtain such relationships. Ill-posed information relationships of this kind can also be formulated from the perspective of information divergences, and ill-posedness can be translated into plain language as an unavoidable imprecision of any model with respect to the underlying parameter estimation/prediction.

The manuscript is organized as follows. In Section 2 we introduce some important information about ill-posed problems. We illustrate ill-posedness on the sensitivity of the parameter b in a simple exponential growth model by using reported data from Iowa State, USA. From an economic point of view, the correct prediction and estimation of the exponential shape of the COVID-19 curve plays an important role for the “restart of a country”; this problem has been clearly visible in Chile. Chile’s economic model follows a neoliberal doctrine with an important role for the market. The pension system is managed by private operators, and the economy depends on exports of raw materials such as copper, on fishing and agriculture, and on medium, small, and micro (family) enterprises. Thus facing the pandemic with strong quarantine measures (e.g., restricted mobility and limited public and private economic activities) is generally a complex problem. According to PAHO (Pan American Health Organization), since the pandemic reached Chile there have been 319,493 positive cases and 7,069 deaths recorded as of today (July 1, 2020). Chile has close to 18 million inhabitants. In this scenario the exponential slope of Chilean COVID-19 growth (accumulated positive cases) can be used as an analytic tool for the proper scaling of governmental policies. On the other hand, the behavior of the exponential growth of COVID-19 in Chile (positive cases), and in particular the slope or rate of contagion, can be used as one of the indices of COVID-19 impact on the country’s economy. Since entering the exponential phase, the government has issued actions and policies of economic aid to face the rise of unemployment to a historic 11.2% (Chilean Institute of Statistics, INE) and the fall in the monthly index of economic activity (IMACEC) to −15.3 in May 2020 (Banco Central de Chile). IMACEC is a predictive statistic of the per capita gross domestic product (PIB) in Chile.

The above-mentioned societal issues naturally justify the importance of studying the stability and sensitivity of the underlying dynamics of individual outbreak models with respect to the input data and estimated parameters. We also still have to analyze some special situations; for example, more samples may be tested daily in the later period, which will also cause growth in the number of detected infected cases. Moreover, the following observations should be pointed out: each country has a different COVID-19 approach and different modeling. Some countries use a discretization of the SIR (Susceptible, Infectious, Removed) model, but not every discretization converges to the same solution of the continuous SIR model. Moreover, several effects on the equilibrium and stability of SIR have been found, see e.g., [6]. Data from COVID-19 outbreaks are briefly discussed in Section 3. Virological backgrounds for exponentially shaped growth curves of COVID-19 outbreaks are given in Section 4. In Section 5, we provide a parameter sensitivity study, from both a theoretical and an empirical perspective, for the SIR model without vital dynamics. In the last Section 6, we give concluding remarks and an overview of selected important issues for the proper modeling of COVID-19 outbreaks.

2. A sensitivity of exponential model to the input/starting parameter

Ill-posed problems and parameter estimation are difficult issues for COVID-19 growth models. Random perturbations of parameters can have serious effects on the quality of modeling. As said by Paul Krée in the Preface of [7]: “Random phenomena has increasing importance in Engineering and Physics, therefore theoretical results are strongly needed. But there is a gap between the probability theory used by mathematicians and practitioners. Two very different languages have been generated in this way…”

One can indeed observe several discrepancies between COVID-19 policy makers and modelers, possibly caused by the usage of different languages. In principle, simple growth models (like the exponential one, $a e^{bt}$, or the SIR model) look attractive for a straightforward implementation with ad hoc discretization schemes and various estimation techniques. Thus the sensitivity of these models to their principal parameters, e.g., b in the case of exponential growth or β in the case of SIR, may deceive the user. An independent observer may wonder why the same mistakes in the estimation of outbreaks have been repeated again and again in various countries by using the same models, even though the time shift has allowed some possibilities to learn from the mistakes of others. On the other hand, we do not want to oversimplify the whole situation and overemphasize the theory of calibration, estimation and regularization. But more caution is needed in these areas. In the next subsection, we introduce distributed dynamical systems from the stability perspective.

2.1. Distributed dynamical systems

Distributed parameter systems are everywhere. Because they are difficult to deal with, engineers generally avoid partial differential equations. They reason that lumped parameter models will generally suffice, and in recent years finite element analysis has provided a real verification of that idea and the tools to work with it. However, there are still some benefits from thinking in terms of continuum mechanics. Distributed dynamical systems can be very useful for so-called inverse problems. Estimation of various flow and mass transport parameters can be seen as the inverse problem of groundwater modeling (see e.g., [8]).

The mathematical term well-posed problem stems from a definition given by Hadamard (see [9]). He believed that mathematical models of physical phenomena should have the properties that

A solution exists
The solution is unique
The solution depends continuously on the data, in some reasonable topology.

Examples of archetypal well-posed problems include the Dirichlet problem for Laplace’s equation, and the heat equation with specified initial conditions. These might be regarded as “natural” problems in that there are physical processes that solve these problems. By contrast, the backwards heat equation, i.e., deducing a previous distribution of temperature from final data, is not well-posed in that the solution is highly sensitive to changes in the final data. Problems that are not well-posed in the sense of Hadamard are termed ill-posed. Inverse problems are often ill-posed. Such continuum problems must often be discretized in order to obtain a numerical solution. While in terms of functional analysis such problems are typically continuous, they may suffer from numerical instability when solved with finite precision, or from errors in the data. A measure of well-posedness of a discrete linear problem is the condition number. If a problem is well-posed, then it stands a good chance of solution on a computer using a stable algorithm. If it is not well-posed, it needs to be re-formulated for a numerical treatment. Typically this involves adding some additional assumptions, such as smoothness of the solution. Such a process is known as regularization.
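The role of the condition number can be made concrete with a small sketch. The two 2×2 matrices below are hypothetical examples chosen only for illustration (they do not come from the article), and the infinity norm is an arbitrary choice of norm. The nearly singular matrix has a large condition number, and a change of $10^{-4}$ in the data moves the solution by about 1.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def norm_inf(m):
    """Matrix infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in m)


def cond_inf(m):
    """Condition number ||A|| * ||A^-1|| in the infinity norm."""
    return norm_inf(m) * norm_inf(inverse_2x2(m))


def solve(m, rhs):
    """Solve the 2x2 system m x = rhs via the explicit inverse."""
    inv = inverse_2x2(m)
    return [sum(inv[i][j] * rhs[j] for j in range(2)) for i in range(2)]


# Hypothetical illustration matrices (not taken from the article).
well = [[2.0, 0.0], [0.0, 1.0]]
ill = [[1.0, 1.0], [1.0, 1.0001]]

x_exact = solve(ill, [2.0, 2.0001])   # unperturbed data
x_noisy = solve(ill, [2.0, 2.0002])   # second entry perturbed by 1e-4

print("cond(well) =", cond_inf(well))
print("cond(ill)  =", cond_inf(ill))
print("solution change:", [n - e for n, e in zip(x_noisy, x_exact)])
```

The same data perturbation applied to the well-conditioned matrix changes the solution only by an amount of the order of the perturbation itself.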

To illustrate mathematical inverse problems, let us consider differentiation. We can construct a simple example with the sequence $f_{n,\Delta}(x) = f(x) + \Delta \sin(nx/\Delta)$, where $f$ and $f_{n,\Delta}$ are the exact and perturbed data. The data error is at most $\Delta$ in the uniform norm, whereas the derivatives differ by $n \cos(nx/\Delta)$, whose uniform norm is $n$. Hence, for an arbitrarily small data error $\Delta$, the error in the result can be arbitrarily large: the derivative does not depend continuously on the data with respect to the uniform norm. Following [10], one can demonstrate some effects that are typical for ill-posed problems, i.e.,

amplification of high frequency errors;
restoration of stability by using a priori information;
two error terms of different nature, one for the approximation error, the other one for the propagation of the data error, adding up to a total error;
loss of information even under optimal circumstances;
the appearance of an optimal discretization parameter, whose choice depends on a priori information.
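The first effect, amplification of high frequency errors, can be checked numerically for the differentiation example. The sketch below assumes the perturbation $\Delta \sin(nx/\Delta)$ on the interval $[0, 1]$; the interval, the grid, and the $(n, \Delta)$ pairs are illustrative assumptions. The exact data $f$ cancels in both differences, so it does not need to be specified.

```python
import math


def errors(n, delta, grid_points=10001):
    """Uniform-norm errors on [0, 1] for the perturbation delta*sin(n*x/delta).

    Returns (data_error, derivative_error), where the derivative error is the
    uniform norm of n*cos(n*x/delta), the derivative of the perturbation.
    """
    xs = [i / (grid_points - 1) for i in range(grid_points)]
    data_error = max(abs(delta * math.sin(n * x / delta)) for x in xs)
    derivative_error = max(abs(n * math.cos(n * x / delta)) for x in xs)
    return data_error, derivative_error


# Illustrative pairs: the data error shrinks while the derivative error grows.
for n, delta in [(1, 1e-1), (10, 1e-2), (100, 1e-3)]:
    data_error, derivative_error = errors(n, delta)
    print(f"n={n:4d}  delta={delta:.0e}  "
          f"data error={data_error:.2e}  derivative error={derivative_error:.2e}")
```

Because the grid contains $x = 0$, where the cosine equals 1, the computed derivative error equals $n$, while the data error never exceeds $\Delta$.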

2.2. Parameter stability and sensitivity

Modeling the evolution of an infectious disease is important since it can help to predict the future course of an outbreak and to evaluate strategies to control an epidemic. Naturally, models are only as good as the assumptions on which they are based. We believe that parametric modeling is the most common form of modeling used (often represented by a distributed dynamical system). While it is rather straightforward to test the appropriateness of parameters, it can be more difficult to test the validity of the general mathematical form of a model. What is worse, even in the positive case, the sensitivity of the solution to parameter changes (initial conditions included) must be taken into account. In general, it does not really matter whether the model is deterministic or stochastic, continuous or discrete. Similar considerations hold for essentially all of them.

Let $X$ be a smooth manifold. The mapping $\phi: X \times \mathbb{R} \to X$ ($\phi: X \times \mathbb{Z} \to X$) is called a continuous (discrete) $C^k$-dynamical system on $X$ if

$\phi(x, 0) = x$ for all $x \in X$;
the mapping $\phi_t: X \to X$, $\phi_t(x) = \phi(x, t)$, is a $C^k$-diffeomorphism for all $t \in \mathbb{R}$ ($\mathbb{Z}$);
$\phi_{t+s} = \phi_t \circ \phi_s$ for all $t, s \in \mathbb{R}$ ($\mathbb{Z}$).

$\phi$ is often called the evolution function of the dynamical system and $X$ the phase (state) space. The most common construction of dynamical systems is given by initial value problems of ordinary differential equations (or difference equations in the discrete case). We illustrate it in what follows on a typical (evolving in time $t$) parametrized ODE system
$$\dot{x} = f(x, \theta, t), \tag{1}$$
where $x \in \mathbb{R}^n$ is the state vector, $\theta \in \mathbb{R}^p$ is the vector of parameters and $f: \mathbb{R}^{n+p+1} \to \mathbb{R}^n$ represents the dynamics. Naturally, the initial state is represented by the initial condition ¹
$$x(t_0) = x_0. \tag{2}$$

An indisputable fact here is that this condition may depend on the parameters, i.e., $x_0 = x_0(\theta)$. The above representation subsumes the case where the initial condition may itself be seen as a parameter. We assume here that all dependencies are smooth enough to do analysis. Notice that the solution of (1) with (2) is a parametrized evolution function $x(x_0, t_0; \theta; t) =: \phi_t(x_0; \theta)$, i.e., a parametrized dynamical system. Notice that in some literature a dynamical system is a triple $(T, X, \phi)$, where $T$ is a monoid (usually $\mathbb{R}$ or $\mathbb{Z}$).

2.2.1. Stability

Here we assume that $x_0$ does not depend on $\theta$ (however, it might depend on another parameter $\beta$). A solution $x(t; \theta, t_0, x_0)$ of a given model is stable ² if for every (small) $\epsilon > 0$ there exists a $\delta > 0$ such that any solution with initial condition within distance $\delta$, i.e., $\|x_0 - x_1\| < \delta$, remains within distance $\epsilon$, i.e., $\|x(t; \theta, t_0, x_0) - x(t; \theta, t_0, x_1)\| < \epsilon$ for all $t \ge t_0$. Notice that $\delta$ can depend only on $\epsilon$. Stability thus means a resistance to change in time (the trajectories do not change too much under small perturbations). The opposite situation means instability. A typical example of an unstable system is exponential growth, which is a natural model of a COVID-19 outbreak. The main purpose of developing stability theory is to examine dynamic responses of a system to disturbances as time approaches infinity. But in practical situations this is not the goal. The predictions we are interested in for COVID-19 outbreak models are short-term predictions, e.g., over 2 weeks. These short-term predictions motivate the next notion, sensitivity. Here a natural question arises: how does instability relate to sensitivity?
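The contrast between the long-term notion of instability and a short-term horizon can be sketched for exponential growth $\dot{x} = b x$, whose solutions started $\delta$ apart are $\delta e^{b (t - t_0)}$ apart at time $t$. The rate $b = 0.12$ per day and the tolerance $\epsilon = 1$ below are assumed values for illustration only.

```python
import math


def gap(delta, b, elapsed):
    """Distance after `elapsed` time units between two solutions of x' = b*x
    whose initial values differ by `delta`."""
    return delta * math.exp(b * elapsed)


def first_time_exceeding(delta, b, eps):
    """Elapsed time at which the gap reaches `eps`; finite for every delta > 0
    when b > 0, which is exactly why no delta satisfies the stability definition."""
    return math.log(eps / delta) / b


b = 0.12     # assumed growth rate per day (illustrative)
eps = 1.0    # assumed tolerance (illustrative)

for delta in (1e-1, 1e-3, 1e-6):
    print(f"delta={delta:.0e}  gap reaches eps after "
          f"{first_time_exceeding(delta, b, eps):6.1f} days,  "
          f"gap after 14 days = {gap(delta, b, 14.0):.3e}")
```

However small $\delta$ is, the gap eventually exceeds $\epsilon$, yet over a two-week horizon it is only amplified by the fixed factor $e^{14 b}$, which is the quantity a short-term sensitivity analysis addresses.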

2.2.2. Sensitivity

Sensitivity analysis [11] is used to determine how the parameters of a model influence its outputs, i.e., to study how the uncertainty in the output can depend on different sources of uncertainty in its inputs. If the observables are highly sensitive to perturbations in certain parameters, then these parameters are likely to be identifiable. It is sometimes hard to answer the question of how the magnitude of the sensitivities can be interpreted. It is also important to mention that in the linear case a nonzero right-hand side (nonhomogeneous system) might influence sensitivity, in contrast to stability. Thanks to smoothness, in sensitivity analysis methods we use the so-called $n \times p$ matrix of sensitivity functions
$$S = \nabla_\theta x, \tag{3}$$
which can be understood as a local sensitivity measure, i.e., $S_{ij}$ is related to the sensitivity of $x_i$ to the parameter $\theta_j$. It can be shown by the chain rule that it satisfies the following matrix ODE system
$$\dot{S} = (\nabla_x f)\, S + \nabla_\theta f \tag{4}$$
with initial condition $S(t_0) = \nabla_\theta x_0(\theta)$. This is very helpful especially when the explicit form of the solution is not known, i.e., we do not need to solve the dynamical system explicitly to study its sensitivity. $S$ can also be used to study the evolution of the state covariance matrix of the joint vector of the state and the parameters under the assumption of a multivariate Normal distribution and a first-order discretization of Equation (4).
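The chain-rule step behind the matrix system for $S$ can be written out as follows. This is a sketch under the assumption that $x(t; \theta)$ is smooth enough for the derivatives with respect to $t$ and $\theta$ to be interchanged.

```latex
\begin{aligned}
\dot{S}
  &= \frac{d}{dt}\,\nabla_\theta x
   = \nabla_\theta \dot{x}
   = \nabla_\theta\, f\bigl(x(t;\theta),\,\theta,\,t\bigr) \\
  &= (\nabla_x f)\,\nabla_\theta x + \nabla_\theta f
   = (\nabla_x f)\,S + \nabla_\theta f ,
\qquad
S(t_0) = \nabla_\theta x_0(\theta).
\end{aligned}
```

The first term collects the dependence of $f$ on $\theta$ through the state, the second its direct dependence on $\theta$; the initial condition follows by differentiating $x(t_0) = x_0(\theta)$.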

It is often better to explain arising differences as a percentage. Here the elasticity can be used. For simplicity, now assume $f: \mathbb{R} \to \mathbb{R}$. The ratio of the relative (percentage) change in the function’s output with respect to the relative change in its input is called elasticity, and when considering a smooth function $f$ of a variable at a point $a$ it is defined as
$$E_f(a) = \frac{a}{f(a)}\, f'(a) = \frac{d \ln f(a)}{d \ln a} = \lim_{x \to a} \frac{1 - \frac{f(x)}{f(a)}}{1 - \frac{x}{a}} \approx \frac{\%\Delta f(a)}{\%\Delta a}.$$

Clearly, the elasticity ³ can also be defined if the input and/or output is consistently negative (away from zero). The elasticity of a function $f$ is a constant $\alpha$ if and only if the function has the form $f(x) = C x^{\alpha}$, i.e., it is a power function. There also exist generalizations to multi-input-multi-output cases in the literature. We also use the notation $E_c f$ for the elasticity of the function $f$ w.r.t. the parameter $c$.
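The characterization of power functions by constant elasticity can be checked with a small numerical sketch. The central-difference estimate, the step size, and the two test functions (with $C = 3$, $\alpha = 2.5$ and rate $0.12$) are assumptions made for illustration.

```python
import math


def elasticity(f, a, rel_step=1e-6):
    """Central-difference estimate of E_f(a) = a * f'(a) / f(a)."""
    step = rel_step * abs(a)
    derivative = (f(a + step) - f(a - step)) / (2.0 * step)
    return a * derivative / f(a)


def power_function(x):
    """C * x**alpha with the illustrative values C = 3, alpha = 2.5."""
    return 3.0 * x ** 2.5


def exponential_function(x):
    """exp(0.12 * x): not a power function, so its elasticity varies with x."""
    return math.exp(0.12 * x)


for point in (0.5, 2.0, 10.0):
    print(f"a={point:5.1f}  "
          f"E_power={elasticity(power_function, point):.6f}  "
          f"E_exp={elasticity(exponential_function, point):.6f}")
```

The estimate for the power function stays at $\alpha$ at every point, while for the exponential it grows proportionally to the evaluation point.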

2.3. A Malthusian growth model (a simple exponential growth model)

This model is the unique solution of the initial value problem (1) with (2) taken in the form
$$\dot{x} = b\,x, \tag{5}$$
$$x(t_0) = a, \tag{6}$$

i.e., $n = 1$, $p = 2$, and $\theta = (a, b)^T$. We admit here only $b > 0$ and $a > 0$.

Remark 2.1. Of course, in this case one can obtain all forthcoming information directly from the explicit form of the solution
$$x(t) = a\,e^{b (t - t_0)}, \tag{7}$$
since it is known.

From the stability point of view it is clear that here we deal with instability. This follows directly from the fact that the derivative of the right-hand side of (5) with respect to $x$, namely $b$, is positive. But we are more interested in sensitivity. This is because for COVID-19 outbreak estimation and prediction we have to know how the evolution behaves, e.g., in two weeks, not over long-term periods like twelve months. The sensitivity system (4) for the matrix (3), here $S = (S_1, S_2)$, has the form
$$\dot{S}_1 = b\,S_1, \qquad \dot{S}_2 = b\,S_2 + x, \tag{8}$$
with $S_1(t_0) = \frac{\partial x}{\partial a}(t_0) = 1$ and $S_2(t_0) = \frac{\partial x}{\partial b}(t_0) = 0$ following from (6).
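As a numerical cross-check, the state equation and the two sensitivity equations can be integrated together. The sketch below uses a hand-written classical Runge-Kutta scheme (our choice, not the article's) and the assumed values $a = 122$, $b = 0.12$ over 14 days, taken from the centre of the design grid quoted later in this section. The result is compared with the closed forms $S_1 = x/a$ and $S_2 = x\,(t - t_0)$ that follow from the explicit solution (7).

```python
import math


def rhs(state, b):
    """Right-hand sides of x' = b*x, S1' = b*S1, S2' = b*S2 + x."""
    x, s1, s2 = state
    return (b * x, b * s1, b * s2 + x)


def integrate(a, b, duration, steps=1000):
    """Classical RK4 for (x, S1, S2) from x(t0) = a, S1(t0) = 1, S2(t0) = 0."""
    state = (a, 1.0, 0.0)
    h = duration / steps
    for _ in range(steps):
        k1 = rhs(state, b)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), b)
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), b)
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)), b)
        state = tuple(
            s + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)
        )
    return state


a, b, duration = 122.0, 0.12, 14.0   # assumed illustrative values
x_num, s1_num, s2_num = integrate(a, b, duration)

x_exact = a * math.exp(b * duration)         # explicit solution (7)
print("x :", x_num, "vs", x_exact)
print("S1:", s1_num, "vs", x_exact / a)
print("S2:", s2_num, "vs", x_exact * duration)
```

For models without a known explicit solution the same loop applies unchanged; only `rhs` has to be replaced by the model's own right-hand side and its derivatives.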

We do not need to solve system (8) either. Clearly, one can find a first integral as follows:
$$\frac{dx}{x} = \frac{dS_1}{S_1},$$
which is equivalent to $S_1 = k\,x$ for a constant $k$. Moreover, after appropriate multiplication and addition of the equations we get $\dot{S}_2\, x - S_2\, \dot{x} = x^2$, which yields
$$\frac{d}{dt}\left(\frac{S_2}{x}\right) = 1.$$

Thus, using the initial states, we have $S_1 = \frac{x}{a}$ and $S_2 = x\,(t - t_0)$. Now we want to express the change in the output quantity as a percentage of the nominal value of the parameters. Often we can compute
$$E_a x = \frac{a}{x}\,\frac{\partial x}{\partial a} = \frac{a\,S_1}{x}, \qquad E_b x = \frac{b}{x}\,\frac{\partial x}{\partial b} = \frac{b\,S_2}{x}$$
even if we do not know the explicit form of a solution. Indeed, thanks to the result above, we have $E_a x = 1$ and $E_b x = b\,(t - t_0)$. From the percentage sensitivity functions we can conclude:

When parameter a is changed by 1%, the state change is also permanently 1%.
When parameter b is changed by p%, the percentage change of the state x increases linearly with time. To first order, the relative change of the state is $\frac{p}{100}\, b\,(t - t_0)$ of its nominal value, i.e., $p\, b\,(t - t_0)\%$.
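The two conclusions above can be compared with the exact change computed from the explicit solution (7). In the sketch below the values $a = 122$, $b = 0.12$ and the 14-day horizon are assumed for illustration; the linear estimate uses only the elasticities $E_a x = 1$ and $E_b x = b\,(t - t_0)$.

```python
import math


def state(a, b, elapsed):
    """Explicit solution x = a * exp(b * (t - t0))."""
    return a * math.exp(b * elapsed)


def percent_change_exact(a, b, elapsed, da_pct=0.0, db_pct=0.0):
    """Exact percentage change of x when a and b change by the given percentages."""
    base = state(a, b, elapsed)
    new = state(a * (1.0 + da_pct / 100.0), b * (1.0 + db_pct / 100.0), elapsed)
    return 100.0 * (new - base) / base


def percent_change_linear(b, elapsed, da_pct=0.0, db_pct=0.0):
    """First-order estimate from the elasticities E_a x = 1, E_b x = b * (t - t0)."""
    return 1.0 * da_pct + b * elapsed * db_pct


a, b, elapsed = 122.0, 0.12, 14.0   # assumed illustrative values

for pct in (1.0, 10.0):
    print(f"a changed by {pct:4.1f}%: exact "
          f"{percent_change_exact(a, b, elapsed, da_pct=pct):7.3f}%, "
          f"linear {percent_change_linear(b, elapsed, da_pct=pct):7.3f}%")
    print(f"b changed by {pct:4.1f}%: exact "
          f"{percent_change_exact(a, b, elapsed, db_pct=pct):7.3f}%, "
          f"linear {percent_change_linear(b, elapsed, db_pct=pct):7.3f}%")
```

The estimate for a is exact because x is linear in a. For b it is a first-order approximation that underestimates the exact change as the perturbation or the horizon grows, since x is convex in b.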

For a better illustration see also Figures 1–4. To perform sensitivity analyses on the population sizes with respect to the uncertain parameters one can use a full factorial design. One can read a lot from a graphical representation of sensitivity indices. In the lower subplot of Figure 5, the sensitivity indices (from the package multisensi with design.args = list(b = c(0.08,0.12,0.16),a = c(82,122,162))) for the main effects and the first-order interactions at time t are given. Their lengths are normalized and differentiated by colors along the vertical bar. One might deduce that in the first week the population size is sensitive to the main effect of a. However, the upper subplot illustrates how the output quantiles (the extreme (dashed), inter-quartile (grey) and median (bold line) output values at all time steps) vary along the time steps. Thus, we can avoid over-interpretation of the sensitivity indices, since the variability between simulations is low at these times.
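The idea of main-effect indices on a full factorial design can be sketched without the R package. The code below is not a reproduction of multisensi or of Figure 5: it is a plain sum-of-squares decomposition, written by us, on the same 3 × 3 grid of a and b quoted above, with the model $x = a\,e^{b t}$ and $t_0 = 0$ assumed.

```python
import math
from itertools import product
from statistics import mean

A_LEVELS = (82, 122, 162)        # levels of a from the design quoted in the text
B_LEVELS = (0.08, 0.12, 0.16)    # levels of b from the design quoted in the text


def population(a, b, t):
    """Exponential growth x = a * exp(b * t), with t0 = 0 assumed."""
    return a * math.exp(b * t)


def effect_shares(t):
    """Shares of the output sum of squares at time t attributed to the main
    effect of a, the main effect of b, and their interaction."""
    grid = {(a, b): population(a, b, t) for a, b in product(A_LEVELS, B_LEVELS)}
    grand = mean(grid.values())
    total = sum((y - grand) ** 2 for y in grid.values())
    ss_a = len(B_LEVELS) * sum(
        (mean(grid[a, b] for b in B_LEVELS) - grand) ** 2 for a in A_LEVELS
    )
    ss_b = len(A_LEVELS) * sum(
        (mean(grid[a, b] for a in A_LEVELS) - grand) ** 2 for b in B_LEVELS
    )
    return ss_a / total, ss_b / total, (total - ss_a - ss_b) / total


for t in (0, 7, 14, 30, 60):
    share_a, share_b, share_ab = effect_shares(t)
    outputs = [population(a, b, t) for a, b in product(A_LEVELS, B_LEVELS)]
    print(f"t={t:3d}  a: {share_a:.3f}  b: {share_b:.3f}  a:b: {share_ab:.3f}  "
          f"output range: {min(outputs):.1f} to {max(outputs):.1f}")
```

On this grid the main effect of a carries the whole variation at $t = 0$ and still dominates after one week, while b takes over at later times. Printing the output range next to the shares serves the same purpose as the quantile subplot: a large normalized index at early times sits on top of a small absolute spread.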

Figure 1. A difference in parameter implies more than a double difference in output.

3. On COVID-19 outbreaks

Table 1. Age groups with confirmed cases.

Table 2. Age groups with deaths.

3.1. Chile: Heterogeneity rules the country

4. Virological backgrounds for exponential shaped growth curves for COVID-19 outbreak

4.1. Considerations and assumptions for predictive models of the evolution of COVID-19 in a population

4.1.1. The SARS-COV-2 contagion curves reflect microbial growth under controlled conditions

5. The SIR model without vital dynamics

6. Conclusions and discussion

Acknowledgments

Notes