Introduction. Biokinetic models are used to elucidate the characteristics of chemical and physical processes occurring in living organisms. In various fields of biology, including ecology, formal methods for constructing biokinetic mathematical models are well developed. The most common models comprise systems of differential equations. Two types of problems can be formulated for such systems: direct and inverse. The direct problem involves finding a solution to the system given specific parameter values, initial conditions, or boundary conditions. The inverse problem most often consists of model parameter identification: finding numerical parameter values for which the system's solution best fits the available experimental (observed) data. Typically, realistic and practically relevant systems of equations (mathematical models) lack analytical solutions. While numerical methods for solving direct problems are well developed, universal and effective methods for solving inverse problems apparently do not yet exist. The aim of this lecture is to introduce students, through concrete examples, to some frequently used methods for the parametric identification of biokinetic mathematical models.

Problem Statement. A typical biokinetic problem is the parametric identification of a system of several interacting components whose concentrations change over time. Consider, for example, a system of microorganisms consuming a substrate. As a result of the interaction between the microorganisms and the substrate, the microbial concentration increases over time, while the substrate concentration decreases. If the component concentrations at various time points are known, the unknown parameters can be determined by minimizing a function that expresses the sum of squared deviations between observed and predicted values. The primary requirement for the sought model parameters is that their variation must have a noticeable effect on this function, called the residual function.
Problems that satisfy this requirement are called well-conditioned and are the ones considered in this lecture. There are several types of residual functions. For example, they may incorporate a scaling parameter, also called a weighting factor or weight. Weights determine how strongly each deviation between the experimental data and the calculated values of the variables contributes to the overall measure of disagreement. The larger the weight, the more important the corresponding deviation (i.e. data point) is considered, and the more it influences the result of the identification. Weights are often set to the inverse of the variance of the observed values. Another type of scaling parameter is the concentration of a characteristic component. In this case, the sum of squared relative deviations, rather than simple deviations, is minimized. Besides the least squares criterion, other distance measures between observed data and predicted values can be used, such as the criterion of absolute deviations or the Chebyshev (minimax) criterion.

Methods for Solving Parameter Identification Problems. The main difficulty of parameter identification is that in real-world problems the residual function can have several local minima. Unfortunately, there are currently no universal and effective methods for finding the global minimum of a function that has multiple local minima. Consequently, several types of parametric identification problems are distinguished, and specific computational methods are applied to each. The two main types are distinguished by whether analytical expressions exist for the functions describing the time-dependent properties of the system. If such expressions exist, linearization of the input data is applied in the simplest cases, while nonlinear approaches are used in more complex ones.

1. Linearization is the conversion of a nonlinear formula into a linear expression by means of a suitable transformation.
Models that can be transformed into linear ones are called intrinsically linear models, as opposed to intrinsically nonlinear models. The first example is an intrinsically linear function representing the simplest mathematical model, one expressed by a first-order linear differential equation. A logarithmic transformation allows this equation to be rewritten in linear form. Special weights must be used to account for the transformation applied to the initial nonlinear equation. The second example is a system of equations describing microbial biomass growth during the consumption of 4-chloroaniline. The method of approximate linearizing transformation is applied for parameter identification in this model.

2. Substitution method. Using the aforementioned 4-chloroaniline consumption model as an example, it is shown how the substitution method yields a fully satisfactory description of the experimental results by the model within a certain concentration range. Here, only two judiciously selected points from the original data are used in the calculations. Furthermore, it is demonstrated how the substitution method can be implemented to describe 4-chloroaniline consumption using a more realistic function, specifically the Monod equation. In this case, the function describing the dependence of 4-chloroaniline concentration on time becomes implicit, and three points from the original data are required to identify the model parameters. Notably, the parameters obtained yield satisfactory model predictions across the entire range of 4-chloroaniline concentrations. The substitution method can also be implemented in cases where the differential equations cannot be solved analytically. In such cases, extrema of the functions, where the derivatives are zero, can be used. Substituting experimental data at extremum points into the system equations allows at least some model parameters, or their ratios, to be determined.
This approach is illustrated using a model of biomass growth in a continuous bioreactor (chemostat) that accounts for the lag due to ribosome synthesis.

3. Elimination of the independent variable. In biokinetic models, the eliminated independent variable is most often time. It is demonstrated how eliminating time by dividing one differential equation of the system by another allows one parameter to be computed for the aforementioned model of biomass growth in a continuous bioreactor. Another fairly simple method of limited applicability for solving inverse problems is model simplification. In some cases, a complex model can be well approximated by a simple model that allows analytical integration in a form to which a straightforward linearizing transformation can be applied. It is shown how this approach can be applied to the aforementioned model of biomass growth in a continuous bioreactor.

4. Representing the model equations as a Taylor series. This approach is used when the dependent variables change almost linearly or exponentially over a certain time interval. It is demonstrated how this approach is applied to the aforementioned model of biomass growth in a continuous bioreactor, yielding two dependencies between model coefficients.

5. Linearization can also be applied to complex systems of nonlinear differential equations. There are no universal algorithms for applying this method; a creative approach is necessary. It is shown how linearization is applied to the aforementioned model of biomass growth in a continuous bioreactor. All the variables are divided by their maximum experimental values, after which new notation for the variables and parameters is introduced. The concentration equation is then rewritten to enable partial integration with the new variables. Linear regression is then applied to the left-hand side (normalized concentration) to identify the newly introduced parameters, while the right-hand side is evaluated by numerical integration of the observed concentrations.
As a result, another model parameter is found.

6. Using the two parameters and three parameter ratios already found, the remaining parameter of the model of biomass growth in a continuous bioreactor is determined by one-dimensional nonlinear minimization of the residual function. The minimum of the residual function is found using a MATLAB program. The sum of absolute deviations between the observed and predicted values is used as the residual function, with the corresponding average concentration values as weights. It is shown that the obtained parameter is determined with some ambiguity related to the ill-posed nature of the problem. Comparing the results of identifying different parameters shows that parameters can be determined with significant error, reaching several tens of percent. Solving optimization problems in the MATLAB environment is described in detail, including the various minimization functions and the settings that regulate search accuracy, output of results, and other operating parameters.

7. Finally, it is shown how all parameters in the model of biomass growth in a continuous bioreactor can be identified simultaneously from the experimental data. This inverse problem is solved using multidimensional optimization via the fminsearch function in MATLAB. The search is conducted in the region of non-negative parameter values. Parameter values obtained earlier by other methods are used as the initial guess. It is demonstrated that the experimental data can be fitted equally well by different sets of model parameters, which differ widely, with relative errors of tens to hundreds of percent. This indicates the inherent ill-posed nature of the formulated problem (i.e. of the particular model used to describe the experimental data).
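As a rough illustration of the first linearization example (item 1), the sketch below fits a first-order model dx/dt = k·x, whose solution is x(t) = x0·exp(k·t), by weighted linear regression on ln x. Everything specific here is an assumption made for illustration and is not taken from the lecture: the function name `fit_exponential`, the synthetic data, and in particular the choice of weights w_i = x_i², which is one common way to compensate for the logarithmic transformation when the measurement error in x is roughly constant.

```python
import math

def fit_exponential(t, x, use_weights=True):
    """Fit x(t) = x0 * exp(k * t) by (weighted) linear regression on ln(x).

    Assumption: weights w_i = x_i**2 compensate for the log transform,
    since an absolute error s in x becomes roughly s / x in ln(x).
    """
    y = [math.log(v) for v in x]                      # linearized data: ln x = ln x0 + k t
    w = [v * v if use_weights else 1.0 for v in x]    # transformation weights (assumed form)
    sw = sum(w)
    t_mean = sum(wi * ti for wi, ti in zip(w, t)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (ti - t_mean) * (yi - y_mean) for wi, ti, yi in zip(w, t, y))
    sxx = sum(wi * (ti - t_mean) ** 2 for wi, ti in zip(w, t))
    k = sxy / sxx                                     # slope of the regression line
    x0 = math.exp(y_mean - k * t_mean)                # intercept, transformed back
    return x0, k

# Synthetic, noise-free data generated with hypothetical values x0 = 2.0, k = 0.5
t_obs = [0.0, 1.0, 2.0, 3.0, 4.0]
x_obs = [2.0 * math.exp(0.5 * ti) for ti in t_obs]

x0_fit, k_fit = fit_exponential(t_obs, x_obs)
print(x0_fit, k_fit)
```

With noise-free data the weighted and unweighted fits coincide; the weights only matter once the observations carry measurement error, where an unweighted fit on ln x would overemphasize the small concentrations.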
Published in: Environmental Dynamics and Global Climate Change
Volume 17, Issue 1, pp. 4-44
DOI: 10.18822/edgcc703758