(575e) Predicting the Substrate Specificity and Regioselectivity of Halogenases Using Deep Learning
AIChE Annual Meeting
2023
Food, Pharmaceutical & Bioengineering Division
Machine Learning Based Protein Engineering
Thursday, November 9, 2023 - 4:42pm to 5:00pm
Millions of enzymes in databases lack reliable substrate specificity and regioselectivity information, which impedes their industrial application and a comprehensive understanding of biocatalytic diversity in nature. In this study, we present an integrated pipeline that combines artificial intelligence with the Illinois Biological Foundry for Advanced Biomanufacturing (iBioFAB), a biofoundry, to predict enzyme substrate specificity and regioselectivity. Our approach involves semi-automatic data extraction from the literature, a combined sequence- and structure-based graph neural network that makes the predictions, and high-throughput validation of representative enzymes. As a proof of concept, we focused on halogenases across four protein families, and the model achieved above 85% accuracy in validation. Our results indicate that the functional diversity of halogenases may be considerably underestimated. Extending this pipeline to other protein families could enhance our understanding of the enzyme landscape and enable the discovery of novel enzymes for industrial applications.
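
The abstract does not describe the model architecture, so the following is only an illustrative sketch of how sequence and structure information might be combined in a graph representation of an enzyme. It assumes residues are nodes with one-hot amino acid features (the sequence part), edges connect residues whose coordinates lie within a distance cutoff (the structure part), and a single round of mean-aggregation message passing is followed by a mean readout. The function names, the 8.0 angstrom cutoff, and the toy coordinates are all hypothetical and are not taken from the study.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence):
    """Sequence-derived node features: one 20-dimensional one-hot vector per residue."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in sequence]


def contact_graph(coords, cutoff=8.0):
    """Structure-derived edges: link residues i and j when their coordinates
    are within `cutoff` angstroms (an assumed, illustrative threshold)."""
    n = len(coords)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def message_pass(features, neighbors):
    """One round of mean aggregation: each node's new feature vector is the
    average of its own vector and those of its neighbors. A trained model
    would also apply learned weights and a nonlinearity here."""
    updated = []
    for i, nbrs in enumerate(neighbors):
        group = [features[i]] + [features[j] for j in nbrs]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return updated


def readout(features):
    """Mean-pool node features into one fixed-length vector for the whole protein."""
    return [sum(col) / len(features) for col in zip(*features)]


# Toy input: a three-residue "protein" with made-up coordinates (in angstroms).
sequence = "ACD"
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]

graph = contact_graph(coords)
embedding = readout(message_pass(one_hot(sequence), graph))
```

In a full model, an embedding of this kind would feed a classifier over substrates or halogenation sites; that step is omitted here because the abstract gives no details about it.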