(312c) Automated Detection of Yeast Genetic Engineering in Whole Genomes and Metagenomes with Prymetime | AIChE

Authors 

Young, E. - Presenter, Worcester Polytechnic Institute
Collins, J., Worcester Polytechnic Institute
Keating, K., Worcester Polytechnic Institute
Roehner, N., Raytheon BBN Technologies
Adler, A., SBOL Visual Working Group
Jones, T., Worcester Polytechnic Institute
Balaji, S., Worcester Polytechnic Institute
Como, M., Worcester Polytechnic Institute
Newlon, Z., Worcester Polytechnic Institute
Mitchell, T., Raytheon BBN Technologies
Yeast genomes can be assembled from sequencing data, but genetic engineering changes are often not resolved with sufficient accuracy, completeness, and contiguity. Further, searching for engineered sequences in sequence data is currently a manual process. To overcome these challenges, we combined nanopore assembly with short-read error correction in an integrated workflow that produces accurate whole-genome and plasmid sequences of engineered yeasts and automatically annotates synthetic biology parts. We named this workflow Prymetime, "Pipeline for Recombinant Yeast genoMEs That Illuminates Markers of Engineering."
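To make the idea of annotating parts against a collection concrete, the following is a minimal sketch of an exact-match search on both strands of an assembled contig. It is not Prymetime's annotation method, which the abstract does not describe; the function names `reverse_complement` and `find_parts`, the part name `demo_part`, and the toy sequences are all hypothetical. A real workflow would likely need an alignment tool that tolerates mismatches.

```python
from typing import Dict, List, Tuple

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_parts(contig: str, parts: Dict[str, str]) -> List[Tuple[str, int, int, str]]:
    """Find exact occurrences of each part on either strand of a contig.

    Returns (part_name, start, end, strand) tuples sorted by start,
    using 0-based, end-exclusive coordinates on the forward strand.
    Hypothetical helper: exact matching only, no mismatch tolerance.
    """
    contig = contig.upper()
    hits = []
    for name, seq in parts.items():
        seq = seq.upper()
        for strand, query in (("+", seq), ("-", reverse_complement(seq))):
            if strand == "-" and query == seq:
                continue  # palindromic part: already reported on "+"
            start = contig.find(query)
            while start != -1:
                hits.append((name, start, start + len(query), strand))
                start = contig.find(query, start + 1)  # allow overlapping hits
    return sorted(hits, key=lambda hit: hit[1])


# Toy data (assumption, not from the abstract): one part present on each strand.
toy_contig = "AAAA" + "ATGGCC" + "TTTT" + "GGCCAT" + "AA"
for hit in find_parts(toy_contig, {"demo_part": "ATGGCC"}):
    print(hit)
```

Reporting every occurrence, rather than only the first, matters for the case the abstract highlights: strains in which the same part is repeated.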

Using Prymetime and a user-specified parts collection, we annotated the engineering within whole genomes of 15 engineered yeasts built from several Saccharomyces cerevisiae laboratory strains, Yarrowia lipolytica Po1f, and Komagataella phaffii CBS 7435. We show that each pathway and plasmid can be correctly assembled and annotated, even in strains with repeated parts and multiple similar plasmids. Furthermore, the whole genomes are accurate, complete, and contiguous. We found that 40X sequencing depth of both nanopore and short reads was sufficient for accurate assembly, which reduces the resources needed for a complete yeast genome.
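As a rough illustration of what 40X depth implies, sequencing depth is total sequenced bases divided by genome size. The sketch below works through that arithmetic. The genome size (about 12 Mb, a rounded figure for S. cerevisiae) and the mean read lengths (150 bp for short reads, 10 kb for nanopore reads) are assumptions chosen for illustration and are not taken from the abstract.

```python
import math


def bases_for_depth(genome_size_bp: int, depth: float) -> int:
    """Total sequenced bases needed to reach a target mean depth."""
    return math.ceil(genome_size_bp * depth)


def reads_for_depth(genome_size_bp: int, depth: float, mean_read_length_bp: float) -> int:
    """Approximate number of reads needed, given a mean read length."""
    return math.ceil(genome_size_bp * depth / mean_read_length_bp)


GENOME_SIZE_BP = 12_000_000   # assumption: rounded S. cerevisiae genome size
TARGET_DEPTH = 40             # from the abstract: 40X was sufficient
SHORT_READ_BP = 150           # assumption: typical short-read length
NANOPORE_READ_BP = 10_000     # assumption: illustrative mean nanopore read length

print(bases_for_depth(GENOME_SIZE_BP, TARGET_DEPTH))                    # 480000000
print(reads_for_depth(GENOME_SIZE_BP, TARGET_DEPTH, SHORT_READ_BP))     # 3200000
print(reads_for_depth(GENOME_SIZE_BP, TARGET_DEPTH, NANOPORE_READ_BP))  # 48000
```

Under these assumptions, 40X corresponds to roughly 0.48 Gb of sequence per technology. The figure scales linearly with genome size, so larger yeast genomes need proportionally more data.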

Finally, we used our validated workflow to identify engineering in metagenomic data. We combined reads from engineered yeast with a standard metagenomic NGS read set. We were able to correctly assemble and annotate the synthetic biology parts up to a 1:1000 dilution of the engineered yeast reads. Therefore, Prymetime can obtain high-quality de novo genome assemblies of engineered yeasts and label the engineering signatures within them, facilitating verification of yeast engineering and identification of engineering within isolates and metagenomes.
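The spike-in experiment can be pictured as subsampling engineered-yeast reads and mixing them into a background read set. The sketch below assumes the 1:1000 dilution is defined by read count (one engineered read per 1000 background reads); the abstract does not state whether the ratio refers to reads or bases, and `spike_in` is a hypothetical helper, not part of Prymetime.

```python
import random
from typing import List, Sequence


def spike_in(
    engineered_reads: Sequence[str],
    background_reads: Sequence[str],
    dilution: int,
    seed: int = 0,
) -> List[str]:
    """Mix reads so that engineered:background is about 1:dilution by count.

    Assumption: dilution is measured in number of reads, not bases.
    """
    n_engineered = max(1, len(background_reads) // dilution)
    if n_engineered > len(engineered_reads):
        raise ValueError("not enough engineered reads for the requested dilution")
    rng = random.Random(seed)  # fixed seed keeps the mixture reproducible
    sampled = rng.sample(list(engineered_reads), n_engineered)
    mixed = sampled + list(background_reads)
    rng.shuffle(mixed)  # avoid leaving the spiked reads grouped at the front
    return mixed


# Toy read identifiers stand in for real sequencing records.
background = [f"bg{i}" for i in range(5000)]
engineered = [f"eng{i}" for i in range(100)]
mixture = spike_in(engineered, background, dilution=1000)
print(len(mixture), sum(read.startswith("eng") for read in mixture))
```

With real data the same logic would apply to FASTQ records rather than strings, and paired-end mates would need to be sampled together.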

Engineered yeasts are becoming ever more prevalent as cell factories for fuels, chemicals, and pharmaceuticals. Additionally, evolutionary and combinatorial strain engineering strategies produce more strains than can reasonably be sequenced and manually validated. Prymetime is a step towards overcoming these challenges, as it can be used to both verify and detect engineering. This capability applies to quality control of strain construction, monitoring of waste streams for release of engineered organisms, and detection of synthetic biology parts in unknown samples.