Exploring Digital Twins of Nonlinear Systems through Meta-Modeling with Echo State Networks
Explorando Gemelos Digitales de Sistemas No Lineales vía Meta-Modelado con Redes de Estado de Eco
Laisa Cristina Juffo Campos
Departamento de Engenharia Rural
Universidade Federal do Espírito Santo
Alegre, Brasil
https://orcid.org/0009-0003-4427-0395
Ana Carolina Spindola Rangel Dias
Serviço Nacional de Aprendizagem Industrial (SENAI)
Rio de Janeiro, Brasil
https://orcid.org/0000-0001-7376-0703
Wellington Betencurte da Silva
Departamento de Engenharia Rural
Universidade Federal do Espírito Santo
Alegre, Brasil
https://orcid.org/0000-0003-2242-7825
Julio Cesar Sampaio Dutra
Departamento de Engenharia Rural
Universidade Federal do Espírito Santo
Alegre, Brasil
Abstract—Effective process monitoring and control rely on precise dynamic models that can capture the inherent nonlinearities of chemical systems. However, rigorous modeling of complex industrial processes can be computationally demanding. Meta-modeling with machine learning methods offers a viable way to generate computationally efficient surrogate representations. Specifically, Echo State Networks (ESNs) are a promising neural network approach for meta-modeling nonlinear dynamical systems. ESNs simplify training by keeping the input and recurrent weights fixed and learning only the output weights. This study explores the development of ESN-based digital twins for a nonlinear dynamic process. An ESN is employed to construct a meta-model of a simulated continuous stirred-tank reactor with biochemical kinetics. The network was trained on input-output data obtained by simulating a system of ordinary differential equations, and its performance was evaluated both in-sample and out-of-sample. The results indicate that the ESN meta-model can successfully approximate the underlying dynamics and accurately capture their temporal evolution. A closed-loop digital twin deployment using the ESN surrogate also showed reliable behavior. This work presents initial steps toward developing digital twins of chemical processes using ESN-driven meta-modeling. The findings suggest that ESNs can effectively generate computationally efficient surrogate representations of nonlinear dynamical systems. Such digital twins hold promise for online process monitoring and optimized control of industrial plants.
Keywords— Echo State Networks, Dynamic systems, Digital twins.
Resumo—El monitoreo y control efectivos de procesos dependen de modelos dinámicos precisos capaces de capturar las inherentes no linealidades de los sistemas químicos. Sin embargo, la modelización rigurosa de procesos industriales complejos puede ser computacionalmente exigente. La meta-modelización ofrece una estrategia viable para generar representaciones de sustitución computacionalmente eficientes. Las Redes de Estado de Eco (ESNs, por sus siglas en inglés) se distinguen como un enfoque prometedor de redes neuronales para la meta-modelización de sistemas dinámicos no lineales. Las ESNs simplifican el entrenamiento mediante pesos de entrada fijos, mientras se enfocan en el aprendizaje de los pesos de salida. Este estudio explora el desarrollo de gemelos digitales basados en ESN para procesos dinámicos no lineales. Se emplea una ESN para construir meta-modelos de una simulación de reactor de tanque agitado continuamente. La red se entrena con datos de entrada y salida del modelo riguroso y su rendimiento se evalúa tanto en muestra como fuera de muestra. Los resultados indican que el meta-modelo ESN puede aproximar con éxito la dinámica subyacente, capturando la evolución temporal con alta precisión. Además, una implementación de gemelo digital en bucle cerrado utilizando el sustituto ESN también mostró un comportamiento confiable. Este trabajo presenta los primeros pasos hacia el desarrollo de gemelos digitales de procesos químicos utilizando la meta-modelización impulsada por ESN. Los hallazgos sugieren que las ESN pueden generar eficazmente representaciones de sustitución computacionalmente eficientes de sistemas dinámicos no lineales. Tales gemelos digitales prometen para el monitoreo en línea de procesos y el control optimizado de plantas industriales.
Palabras clave—Redes de Estado de Eco, Sistemas Dinámicos, Gemelos Digitales.
In recent years, rapid technological progress has brought substantial improvements across diverse sectors, notably in the quality and safety of chemical processes. The widespread use of computers in process management has enabled control over variables such as temperature, pressure, and chemical composition, and has generated extensive and diverse data archives [1]. Design challenges that require intensive computational resources are increasingly prevalent in manufacturing industries [2]. Moreover, tools capable of analyzing data and constructing predictive mathematical models have become imperative for real-time process monitoring and control.
Creating rigorous models that accurately capture the dynamics and nonlinearity of real systems may be impractical at plant sites, where rapid responses are crucial. One practical approach is to utilize metamodeling strategies [2][3] to tackle the challenges inherent in process systems. Widely utilized across engineering, computer science, and optimization, these strategies involve developing simplified models that approximate the behavior of complex systems or processes [4]. These simplified representations, named meta-models or surrogate models, aim to balance accuracy and computational efficiency.
In this context, digital twins emerge as virtual representations capable of reflecting the behavior of physical systems in real-time, showing potential for online monitoring and process optimization [5]. By generating simplified yet computationally efficient models, digital twins enable dynamic data analytics and rapid decision-making to optimize industrial plant control and performance.
Expanding on recent data science research, metamodeling can draw upon various machine learning techniques [2][6]. Artificial Neural Networks (ANNs) are widely recognized for their ability to approximate complex functions [7]. Modeled after the functioning of biological neurons, ANNs comprise an input layer, a hidden layer with as many artificial neurons as needed to represent the data, and an output layer. Additionally, ANNs possess memory storage and learning capabilities, making them particularly suitable for dynamic and nonlinear systems. This work investigates that capability by applying neural meta-models to generate digital twins of complex chemical processes [8][9]. The aim is to develop computationally efficient representations that approximately capture the underlying dynamics of these systems.
Depending on the network architecture, various types of neural networks exist, including Feedforward Neural Networks (FNNs) and Recurrent Neural Networks (RNNs). RNNs offer computational advantages for dynamic process systems owing to their inherent feedback loops. However, training traditional RNNs can be complicated due to issues like the "vanishing gradient" problem [10]. To address this, [11] introduced the Echo State Network (ESN). Unlike traditional RNNs that adjust all synaptic weights, ESNs maintain fixed input and recurrent connections, focusing solely on training output connections through a relatively simple linear regression process. This approach circumvents the complexities of training recurrent connections and mitigates gradient-related challenges. Consequently, ESNs present an effective solution for harnessing the power of RNNs while mitigating training complexities, particularly in scenarios where efficient learning is essential.
This article proposes using an Echo State Network as a meta-model to approximate a dynamic nonlinear model and evaluates its performance in a closed-loop application. The potential of the approach is assessed by constructing a digital twin of a continuous stirred-tank reactor (CSTR). Section 2 presents a brief background on the metamodeling problem. Section 3 elaborates on the case study based on a simulated bioreactor and details the data acquisition procedure. The theory, rationale, and construction of the Echo State Network are described in Section 4, followed by the discussion of simulation results.
The contribution of this article lies in presenting initial steps towards developing digital twins of chemical processes using ESN-driven meta-modeling. By demonstrating the efficacy of ESNs in generating computationally efficient surrogate representations of a classical nonlinear dynamical system, this work opens space for online process monitoring and optimized control of industrial plants.
A meta-model (or surrogate model) can be conceived as a "model of a model" [6], functioning as a simplified representation of a high-fidelity simulation model [12]. It emulates the response by describing the relationship between inputs ($u$) and outputs ($y$) based on data acquired with known precision or uncertainty [13]. The importance of metamodeling lies in its ability to balance accuracy and computational efficiency. Hence, it is an essential approach for handling the intricacies of real-world systems, especially those characterized by nonlinear relationships, numerous variables, and complex behaviors.
In industrial settings, meta-models are employed for tasks necessitating the establishment of a (complex) relationship between the inputs and outputs of a process system. This relationship can be encapsulated by an extended meta-model equation that incorporates the feedback signal (1):
$$y(k) = f\big(u(k),\, y(k-1)\big) + \varepsilon \tag{1}$$
where $y(k)$ represents the current output, $u(k)$ denotes the current inputs, $y(k-1)$ is the previous output (feedback signal), $f$ is the relationship incorporating inputs and feedback, and $\varepsilon$ represents error or uncertainty in the meta-model prediction.
By offering a simplified representation of burdensome simulations, meta-models enable quicker evaluations and decision-making, which are crucial in industries that demand real-time solutions. Complex systems can thus be studied without resource-intensive full-scale simulations, which are computationally demanding and time-consuming. Commonly used metamodeling techniques include polynomial response surface models, Kriging, Radial Basis Functions, Support Vector Regression, and Artificial Neural Networks [13][14]. These techniques generate approximate mappings from inputs to outputs. The choice depends on the problem characteristics, the available data, and the required predictions.
Metamodeling using neural networks adopts a data-driven approach that harnesses the principles of ANNs to construct efficient approximations of complex systems. This methodology entails training the neural network on a dataset that reflects the system's behavior under scrutiny. This dataset consists of input variables paired with corresponding output values, facilitating the network's identification of underlying patterns and correlations. Following training, the neural network can provide predictions for new input data, substantially alleviating computational burdens compared to resource-intensive full-scale simulations.
The increased processing speed has dramatically expanded the applicability of neural network-based metamodeling. For example, [15] employed a neural network as a meta-model to approximate a copper porphyry mine comminution circuit, leading to a significant acceleration of simulations compared to traditional phenomenological models. Additionally, [16] utilized neural networks in the metamodeling of reactive transport, reducing computational time for scenarios requiring multiple realizations. These studies highlight the versatility of neural network-based metamodeling in improving efficiency, accuracy, and computational performance across various domains.
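The mathematical model employed to generate the data was adapted from [17] and describes the dynamic behavior of a bioreactor. The substrate balance, $S$, and the cell balance, $X$, are expressed by (2) and (3), respectively, while the specific reaction rate, $\mu(S)$, is defined by (4). Here $D$ is the dilution rate, i.e., the ratio between the volumetric feed flow rate and the reactor volume, and $S_f$ stands for the substrate's feed concentration. In (2) and (4), $Y_{X/S}$ is the cell yield coefficient, $\mu_{max}$ the maximum specific growth rate, and $K_S$ the saturation constant.
$$\frac{dS}{dt} = D\,(S_f - S) - \frac{\mu(S)\,X}{Y_{X/S}} \tag{2}$$
$$\frac{dX}{dt} = -D\,X + \mu(S)\,X \tag{3}$$
$$\mu(S) = \frac{\mu_{max}\,S}{K_S + S} \tag{4}$$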
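To make the balances concrete, the following is a minimal sketch of a continuous bioreactor with Monod kinetics, integrated with `solve_ivp`. The kinetic form and the parameter values `MU_MAX`, `K_S`, and `Y_XS` are illustrative assumptions, not values listed in the text; they were chosen because, with D = 0.1 h⁻¹ and Sf = 10.0 g L⁻¹, they settle at the steady state Xs = 4.5 g L⁻¹ and Ss = 1.0 g L⁻¹ reported for the step test. The function names are hypothetical.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Assumed illustrative kinetic parameters (not listed in the text).
MU_MAX = 0.2  # maximum specific growth rate, 1/h
K_S = 1.0     # saturation constant, g/L
Y_XS = 0.5    # cell yield, g cells per g substrate


def monod(s):
    """Specific growth rate as a function of substrate concentration."""
    return MU_MAX * s / (K_S + s)


def bioreactor_rhs(t, state, d, s_f):
    """Substrate and cell balances for constant inputs d and s_f."""
    s, x = state
    mu = monod(s)
    ds_dt = d * (s_f - s) - mu * x / Y_XS
    dx_dt = -d * x + mu * x
    return [ds_dt, dx_dt]


def simulate(state0, d, s_f, t_end, n_points=200):
    """Integrate the balances from state0 = [S0, X0] up to t_end hours."""
    t_eval = np.linspace(0.0, t_end, n_points)
    sol = solve_ivp(bioreactor_rhs, (0.0, t_end), state0, args=(d, s_f),
                    t_eval=t_eval, rtol=1e-8, atol=1e-10)
    return sol.t, sol.y[0], sol.y[1]


# Long run at the nominal inputs to reach the steady state.
t, s, x = simulate(state0=[2.0, 3.0], d=0.1, s_f=10.0, t_end=300.0)
s_final, x_final = s[-1], x[-1]
```

With other kinetic expressions (for example, substrate inhibition), only `monod` would need to change.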
All code was implemented in Python using the free Spyder development environment (version 3.9.16). The code was executed on a computer with 128 GB of DDR4 RAM and an Intel® Core i7-12700K processor operating at 5.00 GHz.
This specific case study adopted a supervised training strategy to construct the neural model. This approach required the generation of input and output data. The input data was synthesized using a Random Gaussian Signal (RGS) algorithm [18]. The RGS technique is widely utilized for dynamic systems identification, enabling a thorough exploration of the input space. Consequently, it effectively stimulates the process response across diverse conditions.
The input variables were the dilution rate and the substrate feed concentration, with mean values of 0.1 h⁻¹ and 10.0 g L⁻¹, respectively, and variations of ± 0.1 h⁻¹ and ± 2.5 g L⁻¹. A total of 2500 samples were generated and collected at intervals of 0.25 h. To generate the second dataset, the sampling interval was changed to 8 h while the other parameters were kept constant. The output data, 𝑆 and 𝑋, were obtained by solving the system of ordinary differential equations in (2) and (3) with the solve_ivp function from the scipy.integrate library. Gaussian random noise with a standard deviation of 5% was added to the simulated results to make the output data more realistic. The noise also pushes the meta-model to learn the underlying patterns rather than individual samples, which improves its robustness against noise and variability when transferred to actual operation. Subsequently, all datasets were organized and stored in a spreadsheet.
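The data generation procedure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' script: the kinetic parameters are the same assumed Monod values as before, the excitation is taken to be piecewise constant with Gaussian levels held for a fixed number of samples (`hold`), the quoted variations are treated as clipping bounds, and the 5% noise is applied relative to each simulated value. All function names are hypothetical.

```python
import numpy as np
from scipy.integrate import solve_ivp

MU_MAX, K_S, Y_XS = 0.2, 1.0, 0.5  # assumed kinetic parameters


def rhs(t, state, d, s_f):
    """Substrate and cell balances with Monod kinetics."""
    s, x = state
    mu = MU_MAX * s / (K_S + s)
    return [d * (s_f - s) - mu * x / Y_XS, (mu - d) * x]


def gaussian_steps(rng, n_samples, mean, half_range, hold):
    """Piecewise-constant Gaussian levels, each held for `hold` samples."""
    n_levels = -(-n_samples // hold)  # ceiling division
    levels = rng.normal(mean, half_range / 3.0, n_levels)
    levels = np.clip(levels, mean - half_range, mean + half_range)
    return np.repeat(levels, hold)[:n_samples]


def generate_dataset(n_samples=2500, dt=0.25, hold=20, noise=0.05, seed=0):
    """Return inputs [D, Sf], clean outputs [S, X], and noisy outputs."""
    rng = np.random.default_rng(seed)
    d = gaussian_steps(rng, n_samples, 0.1, 0.1, hold)
    s_f = gaussian_steps(rng, n_samples, 10.0, 2.5, hold)
    state = np.array([1.0, 4.5])  # start at the nominal steady state
    clean = np.empty((n_samples, 2))
    for k in range(n_samples):
        # Inputs are held constant over each sampling interval.
        sol = solve_ivp(rhs, (0.0, dt), state, args=(d[k], s_f[k]), rtol=1e-6)
        state = np.maximum(sol.y[:, -1], 0.0)  # guard against tiny negatives
        clean[k] = state
    noisy = clean * (1.0 + noise * rng.standard_normal(clean.shape))
    return np.column_stack([d, s_f]), clean, noisy


inputs, clean, noisy = generate_dataset(n_samples=200)
```

Calling `generate_dataset(dt=8.0)` would correspond to the slower second dataset under the same assumptions.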
The generated data is showcased in Figs. 1-4 which illustrate the obtained data with higher (Figs. 1-2) and lower frequency (Figs. 3-4). The red data points indicate outputs with the addition of measurement noise, which was introduced to better approximate reality and attenuate potential overfitting.
Acknowledging the potential of RNNs, [11] introduced a neural network architecture called the Echo State Network (ESN). The primary aim of this architecture is to harness the capability of RNNs to address complex problems while simplifying the learning process. In the conventional training of ANNs, adjusting the synaptic weights across input, output, and feedback layers can impose substantial computational demands. Jaeger's network design, in contrast, trains only the output weights, which is accomplished through a relatively straightforward linear regression. This offers significant advantages in computational efficiency and avoids the intricate task of fine-tuning complex feedback loops.
The ESN simplifies training by splitting it into two distinct stages: the input and recurrent weights are generated once and kept fixed, and only the output weights are then trained. This streamlined approach improves computational efficiency and speeds up the training phase. The design is therefore well suited to scenarios that demand both efficient learning and good predictive performance.
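The two training stages can be illustrated with a compact leaky-integrator ESN. This is a minimal sketch, not the implementation used in the study: the class name, the uniform weight initialization, the washout length, and the reading of "sparsity" as the fraction of recurrent weights set to zero are all assumptions, and state noise is omitted. The toy target (a first-order filter of a random input) only serves to show that a ridge-regression readout on a fixed reservoir can learn a dynamic mapping.

```python
import numpy as np


class EchoStateNetwork:
    """Leaky-integrator ESN with a ridge-regression readout (sketch)."""

    def __init__(self, n_inputs, n_reservoir=100, spectral_radius=0.7,
                 sparsity=0.35, leaking_rate=0.7, ridge=4e-4, seed=0):
        rng = np.random.default_rng(seed)
        self.leak = leaking_rate
        self.ridge = ridge
        # Stage 1: random input and recurrent weights, fixed after creation.
        self.w_in = rng.uniform(-1.0, 1.0, (n_reservoir, n_inputs + 1))
        w = rng.uniform(-1.0, 1.0, (n_reservoir, n_reservoir))
        w[rng.random(w.shape) < sparsity] = 0.0  # assumed: fraction of zeros
        w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
        self.w = w
        self.w_out = None
        self.state = np.zeros(n_reservoir)

    def _step(self, u):
        """Update the reservoir state and return the readout features."""
        pre = self.w_in @ np.concatenate(([1.0], u)) + self.w @ self.state
        self.state = (1.0 - self.leak) * self.state + self.leak * np.tanh(pre)
        return np.concatenate(([1.0], u, self.state))

    def _collect(self, inputs):
        return np.array([self._step(u) for u in inputs])

    def fit(self, inputs, targets, washout=50):
        """Stage 2: solve a ridge regression for the output weights only."""
        self.state[:] = 0.0
        feats = self._collect(inputs)[washout:]
        y = targets[washout:]
        a = feats.T @ feats + self.ridge * np.eye(feats.shape[1])
        self.w_out = np.linalg.solve(a, feats.T @ y)
        return self

    def predict(self, inputs):
        """Continue from the current reservoir state."""
        return self._collect(inputs) @ self.w_out


# Toy task: learn y(k) = 0.6 y(k-1) + 0.4 u(k) from input-output data.
rng = np.random.default_rng(1)
u = rng.uniform(-1.0, 1.0, (1200, 1))
y = np.zeros((1200, 1))
for k in range(1, 1200):
    y[k] = 0.6 * y[k - 1] + 0.4 * u[k]

esn = EchoStateNetwork(n_inputs=1).fit(u[:800], y[:800])
pred = esn.predict(u[800:])
y_test = y[800:]
r2 = 1.0 - np.sum((y_test - pred) ** 2) / np.sum((y_test - y_test.mean()) ** 2)
```

Because only `w_out` is solved for, training reduces to one linear solve regardless of the reservoir size.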
In this implementation, the ESN algorithm was coded following the equations outlined by [11]. Some hyperparameters were kept at fixed, empirically determined values, while an optimization method implemented in Python was used to identify the remaining ones: neuron count, sparsity, and leaking rate (Table I). The resulting network was then validated using the tuned hyperparameters.
TABLE I. Network Hyperparameters.
Hyperparameter | Value
Reservoir size | 1222
Leaking rate | 0.6964
Sparsity | 0.3536
Spectral radius | 0.70
Train fraction | 0.35
Ridge | 4E-4
Noise level | 1E-5
Random seed | 13042023
Another test was applied to evaluate the performance in a closed-loop simulation, allowing for the assessment of the feasibility of applying the trained network as a meta-model (that is, the digital twin). The control objective was to maintain cell concentration (X) around desired values, considering the substrate concentration in the feed (Sf) as the disturbance and the dilution rate (D) as the manipulated variable. For this purpose, we used a PI controller with the velocity algorithm.
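The velocity (incremental) form of the PI algorithm computes a change in the manipulated variable at each sample instead of its absolute value. A minimal sketch is given below; the class name is hypothetical, and the toy loop uses a first-order process in deviation variables with the gain and time constant of Table II and the manual tuning of Table III, neglecting the dead time for brevity.

```python
class VelocityPI:
    """Discrete PI controller in velocity (incremental) form."""

    def __init__(self, kc, tau_i, dt, u_init):
        self.kc = kc
        self.tau_i = tau_i
        self.dt = dt
        self.u = u_init
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        """Return the new controller output for the current sample."""
        error = setpoint - measurement
        # Increment: proportional part on the error change, integral part
        # on the current error.
        delta_u = self.kc * ((error - self.prev_error)
                             + self.dt / self.tau_i * error)
        self.u += delta_u
        self.prev_error = error
        return self.u


# Toy closed loop (deviation variables, Euler integration, no dead time).
KP, TAU, DT = -6.6642, 1.0050, 0.25
controller = VelocityPI(kc=-0.06123, tau_i=6.7, dt=DT, u_init=0.0)
y, setpoint = 0.0, 0.5
for _ in range(1200):
    u = controller.update(setpoint, y)
    y += DT / TAU * (KP * u - y)
```

Since the controller stores only the last output and the last error, it needs no explicit integral sum.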
A transfer function of the reactor dynamics was obtained to tune the controller, employing a step test of -5% on D, performed on the differential model from its initial conditions. The steady-state response obtained was Xs = 4.5 g L⁻¹ and Ss = 1.0 g L⁻¹. With the approach of [19], it was possible to approximate the process with a first-order plus dead time (FOPDT) system. Fig. 5 comparatively illustrates the original process (differential model), represented by red points, and the approximated process. The parameters obtained through such an approach are shown in Table II.
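The two-point method commonly attributed to Sundaresan and Krishnaswamy estimates the FOPDT parameters from the times at which the step response reaches 35.3% and 85.3% of its final change. The sketch below applies it to a synthetic FOPDT response built from the values in Table II; the function names are hypothetical, and the step is assumed to be applied at t = 0.

```python
import numpy as np


def crossing_time(t, fraction, level):
    """Linearly interpolated time at which `fraction` first reaches `level`."""
    i = int(np.argmax(fraction >= level))
    f0, f1 = fraction[i - 1], fraction[i]
    return t[i - 1] + (level - f0) * (t[i] - t[i - 1]) / (f1 - f0)


def fit_fopdt(t, y, step_size):
    """Estimate gain, time constant, and dead time from a step response."""
    y0, y_end = y[0], y[-1]
    fraction = (y - y0) / (y_end - y0)
    t1 = crossing_time(t, fraction, 0.353)
    t2 = crossing_time(t, fraction, 0.853)
    tau = 0.67 * (t2 - t1)
    theta = 1.3 * t1 - 0.29 * t2
    gain = (y_end - y0) / step_size
    return gain, tau, theta


# Synthetic FOPDT response for a -5% step on D = 0.1 1/h.
t = np.linspace(0.0, 10.0, 4001)
step = -0.005
y = np.where(t >= 0.07,
             -6.6642 * step * (1.0 - np.exp(-(t - 0.07) / 1.005)),
             0.0)
gain, tau, theta = fit_fopdt(t, y, step)
```

The estimates are approximate by construction, so small deviations from the generating parameters are expected.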
TABLE II. Process parameters
Parameter | Value
KP (g h L⁻¹) | -6.6642
θ (h) | 0.0700
𝜏 (h) | 1.0050
Three tuning techniques were compared: Internal Model Control (IMC), Integral of Time multiplied by Absolute Error for servo test (ITAE), and manual fine-tuning [17]. The parameters for each technique are listed in Table III. The manually tuned controller was judged the best choice for this study, even though it is the more conservative option. It produced less oscillation in the manipulated variable during closed-loop tests and showed only a slight difference in response time compared to the other controllers. Its gain margin was 56.8437, significantly higher than the gain margins of the IMC (22.9541) and ITAE servo (3.0869) tunings, which suggests that it is more robust than the other methods. The manually fine-tuned controller was therefore chosen for its stable, oscillation-free response.
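As a rough cross-check of the robustness ranking, the gain margin of a PI controller in series with the FOPDT model can be computed from the loop frequency response. The sketch below uses the analytical phase and magnitude of the loop with the exact dead time, the parameters of Tables II and III, and the fact that the product of KC and KP is positive. The helper name is hypothetical, the margin is expressed as a ratio, and the numbers need not coincide to the last digit with those quoted above, since they depend on how the dead time is treated.

```python
import numpy as np
from scipy.optimize import brentq

KP, TAU, THETA = -6.6642, 1.0050, 0.0700  # FOPDT parameters (Table II)


def gain_margin(kc, tau_i):
    """Gain margin (ratio) of the PI + FOPDT loop with exact dead time."""

    def phase_plus_pi(w):
        # Loop phase is -pi/2 + atan(tau_i w) - atan(tau w) - theta w.
        return (np.pi / 2 + np.arctan(tau_i * w)
                - np.arctan(TAU * w) - THETA * w)

    w_c = brentq(phase_plus_pi, 1e-3, 1e3)  # phase crossover frequency
    magnitude = (abs(kc * KP) * np.sqrt(1.0 + 1.0 / (tau_i * w_c) ** 2)
                 / np.sqrt(1.0 + (TAU * w_c) ** 2))
    return 1.0 / magnitude


tunings = {
    "IMC": (-0.15164, 1.00500),
    "ITAE": (-1.12761, 1.00752),
    "Manual": (-0.06123, 6.70000),
}
margins = {name: gain_margin(kc, tau_i) for name, (kc, tau_i) in tunings.items()}
```

Under these assumptions the ordering Manual > IMC > ITAE is reproduced, consistent with the robustness argument in the text.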
The results of the closed-loop simulation using the selected controller are presented in Figs. 6-7. Fig. 6 illustrates the behavior of the manipulated and disturbance variables, while Fig. 7 depicts the controlled variable with its setpoint, along with the other output.
TABLE III. Tuning Methods and Controller Parameters
Parameter | Tuning method
 | IMC | ITAE | Manual
KC (L g⁻¹ h⁻¹) | -0.15164 | -1.12761 | -0.06123
𝜏I (h) | 1.00500 | 1.00752 | 6.70000
To assess the neural network's efficacy in accurately representing the behavior of the simulated system, as required for a digital twin, its response was evaluated within a closed-loop control framework. Within this framework, the control actions computed for the original process (based on the differential model) using the tuned proportional-integral (PI) controller were integrated as one of the network's inputs. Moreover, these inputs encompassed process disturbance information and a feedback signal generated by the network's predictions rather than simulated measurements from the differential model simulation. Consequently, the neural network can autonomously adapt over time, dynamically responding to the evolving process inputs.
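The difference between one-step prediction and this autonomous mode is that the fed-back output is the model's own previous prediction. A generic sketch of such a free-running rollout is shown below; `free_run` and `toy_model` are hypothetical names, and the toy one-step model merely stands in for the trained network.

```python
import numpy as np


def free_run(one_step_model, inputs, y_init):
    """Roll a one-step model forward, feeding back its own predictions."""
    y_prev = np.asarray(y_init, dtype=float)
    outputs = []
    for u in inputs:
        # The previous prediction, not a measurement, is the feedback signal.
        y_prev = np.asarray(one_step_model(u, y_prev), dtype=float)
        outputs.append(y_prev)
    return np.array(outputs)


def toy_model(u, y_prev):
    """Stand-in for the trained network: y(k) = 0.5 y(k-1) + u(k)."""
    return 0.5 * y_prev + u


trajectory = free_run(toy_model, inputs=np.ones((40, 1)), y_init=[0.0])
```

In this mode any one-step error is propagated to later steps, which is why a systematic offset can build up.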
After fine-tuning the hyperparameters, the network's performance was evaluated on both datasets. The higher-frequency dataset was used to assess the network's predictive capacity. The network fitted the training data closely and predicted the test data accurately, capturing the patterns and relationships in the dataset (Fig. 8). This indicates that the model generalizes from the training examples to unseen data and has learned the main features of the system dynamics.
An autocorrelation analysis of the training modeling errors (residuals) indicated significant autocorrelation only at lag = 0, resembling a Dirac delta function (Fig. 9), which confirms that the residuals follow a white noise correlogram pattern. This result indicates the absence of systematic errors or patterns in the model's predictions. A white noise correlogram also suggests that the model has captured the systematic information in the data, so that what remains in the residuals is essentially measurement noise.
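A residual whiteness check of this kind can be sketched in a few lines. The function names are hypothetical, and the ±1.96/√N band is the usual approximate 95% bound for white noise; the synthetic residuals below only contrast a white sequence with a correlated one.

```python
import numpy as np


def autocorrelation(residuals, max_lag):
    """Sample autocorrelation of the residuals for lags 0..max_lag."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    denom = np.dot(r, r)
    return np.array([np.dot(r[: len(r) - k], r[k:]) / denom
                     for k in range(max_lag + 1)])


def white_noise_bound(n_samples):
    """Approximate 95% confidence bound for a white noise correlogram."""
    return 1.96 / np.sqrt(n_samples)


rng = np.random.default_rng(0)
white = rng.standard_normal(5000)

# Correlated residuals: first-order autoregressive sequence.
colored = np.empty(5000)
colored[0] = white[0]
for k in range(1, 5000):
    colored[k] = 0.8 * colored[k - 1] + white[k]

acf_white = autocorrelation(white, 20)
acf_colored = autocorrelation(colored, 20)
bound = white_noise_bound(len(white))
```

For white residuals the values beyond lag 0 should mostly stay inside `bound`, whereas correlated residuals decay slowly from lag 0.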
The following run evaluates the pre-trained network's adaptability to a distinct scenario (second dataset), as illustrated in Figs. 10-11. The second test dataset was predicted successfully, and the residual distribution again follows a white noise correlogram pattern. Despite being trained with higher-frequency data, the model accurately represents lower-frequency data, which underscores its robustness and versatility in capturing the system's dynamics across different temporal scales.
In the closed-loop control scenario, the neural network functioned autonomously, providing its own feedback signal based on the predicted outputs. However, Fig. 12 reveals a systematic deviation between the predicted and actual responses, likely stemming from the absence of feedback control dynamics in the training data. This discrepancy highlights the challenge of accurately capturing real-time system behavior under closed-loop conditions.
A bias term was introduced to mitigate this issue, representing the disparity between the simulated process measurements, $y$, and the predicted outputs, $\hat{y}$. Adding this bias to the predicted outputs yielded a maximum relative error of just 1.1%, compared to the 2.7% observed without bias. The predictions without and with bias correction are presented in Figs. 12 and 13, respectively.
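One possible form of such a correction is sketched below. The timing is an assumption, since it is not fully specified here: the bias at sample k is taken as the mismatch observed at the previous sample, which is a common choice when the current measurement is not yet available. The function name and the synthetic signals are hypothetical.

```python
import numpy as np


def bias_corrected(predictions, measurements):
    """Add the previous-sample mismatch to each prediction (assumed timing)."""
    predictions = np.asarray(predictions, dtype=float)
    measurements = np.asarray(measurements, dtype=float)
    bias = np.zeros_like(predictions)
    # bias(k) = y(k-1) - y_hat(k-1); no correction is possible at k = 0.
    bias[1:] = measurements[:-1] - predictions[:-1]
    return predictions + bias


# Synthetic example: a prediction with a constant offset of -0.1.
measured = np.linspace(4.5, 5.0, 50)
predicted = measured - 0.1
corrected = bias_corrected(predicted, measured)

max_rel_error_raw = np.max(np.abs(predicted - measured) / measured)
max_rel_error_corrected = np.max(np.abs(corrected[1:] - measured[1:]) / measured[1:])
```

A constant offset is removed completely by this scheme, while a slowly varying mismatch is only reduced.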
Detailed performance metrics for the training, testing, and closed-loop application phases are provided in Table IV. The network predicted the output data well despite being trained on a comparatively small fraction of the dataset, in contrast with the higher training percentages commonly used in the literature. It captured the output dynamics of the first dataset accurately, and the successful modeling of the second dataset, which has lower variability, suggests versatility and robustness. It is therefore reasonable to infer that the meta-model fits both scenarios. Moreover, the closed-loop results show the network's potential as a virtual representation that reflects process responses in real time.
TABLE IV. Network performance metrics.
Metrics | Dataset 1, training | Dataset 1, test | Dataset 2, test | Closed loop, without bias | Closed loop, with bias
R² | 0.9790 | 0.9490 | 0.9812 | 0.9930 | 0.9996
MSE | 2.6676E-02 | 3.14347E-02 | 4.6535E-02 | 0.0007 | 0.0001
ExpVar | 0.9790 | 0.9491 | 0.9812 | 0.9979 | 0.9998
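For reference, the three metrics can be computed as sketched below, using their standard definitions (the function names are hypothetical). The small example also shows why R² can fall below the explained variance when predictions carry a constant offset, as in the closed-loop case without bias correction: the explained variance ignores the mean of the residuals, whereas R² and MSE do not.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def explained_variance(y_true, y_pred):
    """Explained variance score; insensitive to a constant offset."""
    return float(1.0 - np.var(y_true - y_pred) / np.var(y_true))


# Predictions that are perfect except for a constant offset of 0.5.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = y_true + 0.5

example_mse = mse(y_true, y_pred)
example_r2 = r2_score(y_true, y_pred)
example_expvar = explained_variance(y_true, y_pred)
```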
This study employed an Echo State Network (ESN) as a meta-model to tackle the complexities of a classical nonlinear bioreactor. Unlike traditional Recurrent Neural Networks, ESNs simplify learning by maintaining fixed input and recurrent connections, while training only output connections through linear regression. This approach mitigates the challenges associated with training recurrent connections.
The outcomes of our study show the robust predictive capabilities of the ESN, which handled noisy data and limited samples across a broad spectrum of oscillations. These results underscore the ESN's adaptability to the diverse scenarios commonly encountered in industrial contexts. The closed-loop test supports the efficacy of ESNs, with maximum relative errors below 3%. This motivates further exploration of ESNs for constructing digital twins, moving from traditional models towards real-time control and monitoring contexts.
Moreover, the findings support the practical utility of the ESN for metamodeling of industrial processes. Integrating ESNs into process control and monitoring practices can provide fast simulations and streamline optimization procedures, thereby improving the efficiency of industrial processes. However, alternative strategies to enhance the network's predictive accuracy still need to be evaluated and discussed, given the complexity and challenges inherent in industrial process control. Continued research in this area is expected to advance ESN applications further and support optimization within industrial processes.
This study was funded in part by the Fundação de Amparo à Pesquisa e Inovação do Espírito Santo – FAPES.
[1] L. Chiang, B. Lu, and I. Castillo, "Big data analytics in chemical engineering," Annual Review of Chemical and Biomolecular Engineering, vol. 8, pp. 63–85, 2017.
[2] G. G. Wang and S. Shan, "Review of Metamodeling Techniques in Support of Engineering Design Optimization," ASME Journal of Mechanical Design, vol. 129, no. 4, pp. 370–380, 2006.
[3] T. W. Simpson et al., "Meta-models for computer-based engineering design: Survey and recommendations," Engineering with Computers, vol. 17, no. 2, pp. 129–150, 2001.
[4] Y. F. Li, S. H. Ng, M. Xie, and T. N. Goh, "A systematic comparison of metamodeling techniques for simulation optimization in decision support systems," Applied Soft Computing, vol. 10, no. 4, pp. 1257–1273, 2010.
[5] R. He, G. Chen, C. Dong, S. Sun, and X. Shen, "Data-driven digital twin technology for optimized control in process systems," ISA Transactions, vol. 95, pp. 221-234, 2019.
[6] R. R. Barton, "Tutorial: Metamodeling for Simulation," Institute of Electrical and Electronics Engineers Inc., 2020.
[7] S. Haykin, "Neural networks and learning machines," 3rd ed. Pearson, 2009.
[8] A. Rasheed, O. San, and T. Kvamsdal, "Digital twin: Values, challenges and enablers from a modeling perspective," IEEE Access, vol. 8, pp. 21980-22012, 2020.
[9] L. Wright and S. Davidson, "How to tell the difference between a model and a digital twin," Advanced Modeling and Simulation in Engineering Sciences, vol. 7, no. 1, pp. 1-13, 2020.
[10] J. Zupan, "Basics of artificial neural networks," Data Handling in Science and Technology, vol. 23, no. C, pp. 199–229, 2003.
[11] H. Jaeger, "The 'echo state' approach to analysing and training recurrent neural networks—with an erratum note," GMD Technical Report, no. 148, 2001.
[12] C. Wang et al., "An evaluation of adaptive surrogate modeling-based optimization with two benchmark problems," Environmental Modelling and Software, vol. 60, pp. 167–179, 2014.
[13] A. Eriçok et al., "Gaussian process and design of experiments for surrogate modeling of optical properties of fractal aggregates," Journal of Quantitative Spectroscopy and Radiative Transfer, vol. 239, p. 106643, 2019.
[14] X. Li et al., "Online dynamic prediction of potassium concentration in biomass fuels through flame spectroscopic analysis and recurrent neural network modelling," Fuel, vol. 304, p. 121376, 2021.
[15] E. J. Y. Koh et al., "Utilising a deep neural network as a surrogate model to approximate phenomenological models of a comminution circuit for faster simulations," Minerals Engineering, vol. 170, p. 107026, 2021.
[16] Y. Li, P. Lu, and G. Zhang, "An artificial-neural-network-based surrogate modeling workflow for reactive transport modeling," Petroleum Research, 2021.
[17] D. E. Seborg et al., "Process dynamics and control," John Wiley & Sons, 2016.
[18] L. Ljung, "System Identification: Theory for the User," 2nd ed. Prentice Hall Information and System Sciences Series, 1999.
[19] K. R. Sundaresan and P. R. Krishnaswamy, "Estimation of time delay time constant parameters in time, frequency, and Laplace domains," The Canadian Journal of Chemical Engineering, vol. 56, no. 2, pp. 257-262, 1978.