Artificial Intelligence-Based Parametrization of Next Generation Systems Biology Models

Authors 

Karakoltzidis, A. - Presenter, Aristotle University of Thessaloniki
Karakitsios, S., Aristotle University of Thessaloniki
Sarigiannis, D., Aristotle University of Thessaloniki
Next-generation systems biology models (NGSB models) are built on a large scale, comprising hundreds of differential equations with millions of interactions and metabolic transformations. These models aim to approximate biological reality as closely as possible by incorporating all endogenous metabolites involved in a biological process, as well as those whose concentrations change concurrently and may be indirectly affected by a disruptor or other biological events. Such applications provide the basis for developing systems biology-based adverse outcome pathways (AOPs) and quantitative AOPs, tools that are valuable in 21st century risk assessment and decision-making procedures.

The construction of such a mathematical model requires a high level of expertise; in addition, a very large number of kinetic parameters, such as turnover numbers or Michaelis-Menten constants, must be properly estimated for the models to be feasible. It is well known that most turnover numbers have not yet been quantified by wet-lab experiments. The turnover numbers on which training and test sets rely are currently taken from the BRENDA (Jeske et al., 2019) and SABIO-RK (Wittig et al., 2012) databases. Besides providing the training basis, these databases feed the NGSB models with experimentally measured turnover numbers, leading to more accurate predictions. Experimental determination of unknown turnover numbers is time-consuming, which often prevents the completion of this task.

Consequently, training and test sets are built from the information in these databases, and Deep Neural Network (DNN) models are then trained and fine-tuned to estimate turnover numbers. The preprocessed, integrated dataset provides the basis for describing each enzyme and reaction component, including enzyme sequences, molecular fingerprints, and other chemical attributes derived from mol files. The endogenous metabolites participating in each reaction are described to the model through MACCS and PubChem fingerprints combined with Tanimoto similarity indexes. Mol files are obtained from the KEGG database (Kanehisa, 2002). Construction of the dataset is followed by the training process.
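As a minimal sketch of this featurization step (not the authors' actual pipeline), the snippet below computes MACCS fingerprints and a Tanimoto similarity for two metabolites using RDKit; the mol file names are hypothetical placeholders for files downloaded from KEGG, and PubChem fingerprints would require an additional tool not shown here.

```python
# Sketch: MACCS fingerprints and Tanimoto similarity for reaction metabolites.
# File names are hypothetical placeholders for KEGG mol files.
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import MACCSkeys

def maccs_fingerprint(mol_path):
    """Load a mol file and return its 167-bit MACCS fingerprint."""
    mol = Chem.MolFromMolFile(mol_path)
    if mol is None:
        raise ValueError(f"Could not parse {mol_path}")
    return MACCSkeys.GenMACCSKeys(mol)

# Hypothetical substrate/product mol files for one reaction
fps = [maccs_fingerprint(p) for p in ["C00031.mol", "C00668.mol"]]

# Tanimoto similarity between the two metabolites
sim = DataStructs.TanimotoSimilarity(fps[0], fps[1])

# Convert one fingerprint to a dense vector usable as DNN input features
features = np.zeros((fps[0].GetNumBits(),), dtype=np.float32)
DataStructs.ConvertToNumpyArray(fps[0], features)

print(f"Tanimoto similarity: {sim:.3f}, feature length: {features.shape[0]}")
```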

TensorFlow (Abadi et al., 2016) and Keras (Chollet, 2015) are currently used for DNN model development. These tools provide a user-friendly interface and integrate well with other machine learning and deep learning libraries. Hyperparameter optimization was carried out by trial and error combined with a multicore search process developed in-house, which allowed a well-optimized model to be built with a proper set of parameters. The biological information of each enzyme is captured by producing FASTA files from its sequence; by applying the algorithm of Alley et al. (2019) and Natural Language Processing (NLP) techniques, numerical vectors representing the structure of each enzyme were produced. The model we developed predicts turnover numbers with an R² of 0.56. The methodology is independent of the organism for which kinetic parameters are predicted.
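For illustration, a Keras feed-forward regressor of the kind described above might look like the sketch below. The layer sizes, dropout rate, feature dimensions, and training settings are assumptions for demonstration only and are not the hyperparameters selected by the in-house optimization.

```python
# Illustrative sketch of a Keras DNN regressor for turnover numbers.
# Architecture and hyperparameters are assumed, not the authors' reported values.
import numpy as np
from tensorflow import keras

# Assumed input size: MACCS bits plus a sequence embedding vector
n_features = 167 + 1900

# Placeholder data: rows are enzyme-substrate pairs, target is log10(kcat)
X_train = np.random.rand(1000, n_features).astype(np.float32)
y_train = np.random.rand(1000, 1).astype(np.float32)

model = keras.Sequential([
    keras.layers.Input(shape=(n_features,)),
    keras.layers.Dense(512, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(256, activation="relu"),
    keras.layers.Dense(1),   # regression output: predicted log10(kcat)
])

model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="mse", metrics=["mae"])

model.fit(X_train, y_train, validation_split=0.1,
          epochs=20, batch_size=64, verbose=0)
```

In practice, a hyperparameter search (layer widths, dropout, learning rate) would replace the fixed values shown here, mirroring the trial-and-error plus multicore search described above.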

The procedure described is expected to support the parametrization of systems biology and NGSB models, as well as industrial and pharmaceutical applications involving enzymatic processes that can benefit from these types of models. An online version of these models will soon be made available, allowing users to run their own applications by submitting the sequences of the enzymes they are interested in to the already trained models; a GitHub repository will also give users access to the pre-trained models so they can incorporate them into their own code.

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., & Devin, M. (2016). TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467.

Alley, E. C., Khimulya, G., Biswas, S., AlQuraishi, M., & Church, G. M. (2019). Unified rational protein engineering with sequence-based deep representation learning. Nature Methods, 16(12), 1315-1322.

Chollet, F. (2015). Keras. GitHub. https://keras.io

Jeske, L., Placzek, S., Schomburg, I., Chang, A., & Schomburg, D. (2019). BRENDA in 2019: a European ELIXIR core data resource. Nucleic Acids Res, 47(D1), D542-D549.

Kanehisa, M. (2002). The KEGG database. 'In Silico' Simulation of Biological Processes: Novartis Foundation Symposium 247.

Wittig, U., Kania, R., Golebiewski, M., Rey, M., Shi, L., Jong, L., Algaa, E., Weidemann, A., Sauer-Danzwith, H., & Mir, S. (2012). SABIO-RK—database for biochemical reaction kinetics. Nucleic Acids Res, 40(D1), D790-D796. https://doi.org/10.1093/nar/gkr1046