Mining Curated Plant Biosynthetic Gene Clusters in Coffea Spp Genomes | AIChE

Specialized metabolites from plants constitute the majority of natural products used as medicines and agrochemicals. Some of the enzymes responsible for specialized metabolite biosynthesis are encoded by genes that are physically co-located in the genome, forming biosynthetic gene clusters (BGCs). In coffee (Coffea spp.), several of the compounds responsible for the complexity of flavor and aroma are specialized metabolites. Identifying homologous gene clusters aids the study of their function and evolution. The MIBiG database is a repository of curated BGC data and currently contains 20 plant BGCs. Using BGC data from MIBiG and a preliminary inventory of putative BGCs in Coffea, we assessed whether homologues of curated BGCs could be identified in the genomic sequences of three Coffea species: C. arabica, C. canephora and C. eugenioides. We used cblaster to run searches with the protein sequences of each curated plant BGC as queries. Eight of the known plant BGCs had potentially homologous regions in Coffea species: one related to saccharide biosynthesis (dhurrin), one related to alkaloid biosynthesis (thebaine) and six related to terpenoid biosynthesis. As a case study, we identified gene clusters partially homologous to the triterpenoid cucurbitacin BGC from Cucumis sativus, which is composed of six genes: one cyclase, one acyltransferase and four P450s. In C. arabica, the partially similar gene cluster contains a putative acyltransferase and three P450s. In C. eugenioides, a second similar genomic region was identified, containing a putative acyltransferase and two P450s. These regions are interesting candidates for further investigation of similar metabolites across different plant species. This approach is an important step in prioritizing BGCs for functional analysis and for synthetic biology studies aimed at producing Coffea specialized compounds.
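The core idea behind this kind of search, finding genomic regions where hits to several different query proteins lie close together, can be illustrated with a minimal Python sketch. This is not cblaster's implementation and not the workflow used in the study. The `Hit` class, the `find_clusters` function, the 20 kb gap, the three-gene threshold and all coordinates and scaffold names below are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hit:
    """One homology hit: a query protein matched to a gene on a scaffold."""
    query: str     # name of the query protein from the curated BGC
    scaffold: str  # scaffold or chromosome carrying the matched gene
    start: int     # start coordinate of the matched gene (bp)
    end: int       # end coordinate of the matched gene (bp)


def find_clusters(hits: List[Hit], max_gap: int = 20_000,
                  min_unique_queries: int = 3) -> List[List[Hit]]:
    """Group hits on the same scaffold that lie within max_gap bp of each
    other, then keep groups matched by enough distinct query proteins.
    Both thresholds are illustrative assumptions."""
    by_scaffold: Dict[str, List[Hit]] = {}
    for hit in hits:
        by_scaffold.setdefault(hit.scaffold, []).append(hit)

    groups: List[List[Hit]] = []
    for scaffold_hits in by_scaffold.values():
        ordered = sorted(scaffold_hits, key=lambda h: h.start)
        group = [ordered[0]]
        group_end = ordered[0].end  # rightmost coordinate seen in the group
        for hit in ordered[1:]:
            if hit.start - group_end <= max_gap:
                group.append(hit)
                group_end = max(group_end, hit.end)
            else:
                groups.append(group)
                group = [hit]
                group_end = hit.end
        groups.append(group)

    # A candidate cluster must be hit by several *different* query proteins.
    return [g for g in groups
            if len({h.query for h in g}) >= min_unique_queries]


# Made-up hits, loosely shaped like an acyltransferase plus P450s region.
hits = [
    Hit("acyltransferase", "scaffold_demo_1", 10_000, 12_000),
    Hit("P450_a", "scaffold_demo_1", 15_000, 17_000),
    Hit("P450_b", "scaffold_demo_1", 30_000, 32_000),
    Hit("P450_c", "scaffold_demo_1", 500_000, 502_000),  # too far away
    Hit("cyclase", "scaffold_demo_2", 1_000, 3_000),     # isolated hit
]

clusters = find_clusters(hits)
for cluster in clusters:
    names = [h.query for h in cluster]
    print(cluster[0].scaffold, cluster[0].start, cluster[-1].end, names)
```

With these made-up inputs, only the three neighbouring hits on `scaffold_demo_1` pass both thresholds. A partial match of this kind, where only some of the query genes are recovered, corresponds to the partially homologous regions described above.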