(117e) Effects of Sequence Features on Machine-Learned Enzyme Classification Fidelity and Implications on De Novo Enzyme Design

Conference

AIChE Annual Meeting

Year

2022

Proceeding

2022 Annual Meeting

Group

Food, Pharmaceutical & Bioengineering Division

Session

Biocatalysis and Enzyme Engineering

Time

Monday, November 14, 2022 - 1:42pm to 2:00pm

Authors

Ferdous, M. S. - Presenter, Iowa State University

Reuel, N., Iowa State University

Classification algorithms that assign enzyme commission (EC) numbers from sequence information alone rely on statistical, homology-based, and machine-learning methods. This work benchmarks the performance of four recent algorithms and maps their fidelity as a function of sequence features such as chain length and amino acid composition (AAC). The resulting maps identify optimal classification windows for de novo sequence generation and enzyme design. We developed a parallelization workflow that efficiently processes >500,000 annotated sequences through each candidate algorithm, and a visualization workflow that shows classifier performance across enzyme length, main EC class, and AAC. We applied these workflows to the entire SwissProt database to date (n = 565,254) using two locally installable classifiers, ECpred and DeepEC, and collected results from two webserver-based tools, Deepre and BENZ-ws. All classifiers exhibit peak performance for sequences 300 to 500 amino acids in length. By main EC class, the classifiers were most accurate at predicting translocases (EC-6) and least accurate at predicting hydrolases (EC-3) and oxidoreductases (EC-1). We also identified the AAC ranges most common among the annotated enzymes and found that all classifiers perform best within these ranges. Among the four classifiers, ECpred was the most consistent across the feature space. These workflows can be used to benchmark new algorithms as they are developed and to find optimal design spaces for generating new, synthetic enzymes. Preliminary work is presented on generating candidate sequences for therapeutic enzyme design, with a classifier used in its optimal prediction space.
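As an illustration of the kind of feature mapping described in the abstract, the following is a minimal Python sketch, not the authors' workflow. It computes the AAC of a sequence and tabulates main-EC-class accuracy per chain-length bin. The function names, the 100-residue bin width, the rule that a missing prediction counts as incorrect, and the toy records are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# The 20 standard amino acids; other symbols (e.g. X, B, Z) are ignored here,
# which is an assumption of this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def main_ec_class(ec_number):
    """Return the first field of an EC number, e.g. '3' for '3.1.1.4'."""
    return ec_number.split(".")[0]


def accuracy_by_length(records, bin_width=100):
    """Compute main-class accuracy per chain-length bin.

    Each record is a (sequence, true_ec, predicted_ec) tuple; predicted_ec
    may be None when a classifier returns no prediction, which this sketch
    counts as incorrect. Returns {bin_start: (accuracy, n_sequences)}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for sequence, true_ec, predicted_ec in records:
        bin_start = (len(sequence) // bin_width) * bin_width
        totals[bin_start] += 1
        if predicted_ec is not None and (
            main_ec_class(true_ec) == main_ec_class(predicted_ec)
        ):
            hits[bin_start] += 1
    return {b: (hits[b] / totals[b], totals[b]) for b in sorted(totals)}


# Toy records, invented for illustration only (not results from the study).
toy_records = [
    ("M" * 350, "3.1.1.4", "3.1.1.1"),  # main class matches
    ("M" * 420, "1.1.1.1", "2.7.1.1"),  # main class differs
    ("M" * 80, "2.7.11.1", None),       # no prediction returned
]

if __name__ == "__main__":
    for bin_start, (acc, n) in accuracy_by_length(toy_records).items():
        print(f"{bin_start}-{bin_start + 99} aa: accuracy={acc:.2f} (n={n})")
```

The same grouping could be applied to AAC ranges or to main EC class instead of length. For a database-scale run, the per-sequence work could be distributed across processes, though the parallelization scheme used in the study is not described in the abstract.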

Topics

Protein Engineering
