A Role for Biocuration in Building Useful in silico Models for Plant Synthetic Biology and All Types of Genome-Scale Analyses | AIChE

Authors 

Naithani, S. - Presenter, Oregon State University
Typically, after genome sequencing, biologists rely on automated pipelines to annotate the function of individual genes based on Gene Ontology assignments and/or projections from well-characterized orthologs in model organisms. Although this approach is powerful for assigning functions to conserved genes, it falls short in assigning correct annotations to the members of large gene families. Often, most members of a gene family or its subclass bear the same annotation, with no consideration of the evolutionary fate of gene duplicates (such as sub-functionalization, functional diversification, neo-functionalization, and pseudogenization) or of their biological relevance for a species. This problem is of particular concern to plant biologists because plant genomes harbor unique complexities such as polyploidy and extensive duplication of genomic regions and genes. As a result, the individual members of gene families are poorly annotated. This affects all types of large-scale omics data analysis, metabolic networks, and pathway models, all of which provide the basis for conducting synthetic biology experiments in plants. Thus, we advocate manual biocuration of large gene families by experts, based on the published literature and analysis of publicly available genomic, transcriptomic, metabolomic, and proteomic datasets. Such an approach can increase the visibility of potential candidate genes for further experimental analysis, improve the contents of genomic and pathway databases, and benefit the plant genomics and synthetic biology communities, which rely on the accuracy of gene annotations. As an example, we will present a thorough functional annotation of 144 rice genes belonging to the S-domain related receptor-like kinase (SDRLK) gene family and their association with biotic and abiotic stress responses.
We will also discuss the challenges, opportunities, and strategies implemented in our ongoing biocuration projects. We expect that our study sets an example of the re-use and re-analysis of genomic data for improving the annotation of large gene families.