(281f) Resolving Genetic Engineering Signatures in Yeast with the iSeq and MinION On-Site | AIChE

Authors 

Collins, J. - Presenter, Worcester Polytechnic Institute
Young, E., Worcester Polytechnic Institute
Generating an assembly that captures all of the genome and plasmid modifications resulting from metabolic engineering is essential for quality control, connecting genotype to phenotype, establishing and protecting intellectual property, and generating “ground truth” for monitoring potential release events. Furthermore, high-quality de novo assemblies can be used to accurately determine the presence and function of metabolic engineering signatures in an unknown sample. Here, we use two new inexpensive sequencers, the Oxford Nanopore MinION and the Illumina iSeq, to enable fast acquisition of sequence on-site. We also use new processing algorithms to enable fast transformation of sequence information into a high-quality genome assembly. With these, we devise an integrated nanopore+Illumina sequencing, assembly, and polishing pipeline that can resolve low-copy genome and plasmid modifications in Saccharomyces cerevisiae. Key to this pipeline is the blending of long-read scaffolds from nanopore sequencing with high-accuracy Illumina reads. Using this pipeline, we generate high-quality de novo assemblies for a dozen engineered yeast strains with a variety of engineering signatures. Finally, we extend the pipeline to resequence six nonconventional yeasts of interest as platforms for metabolic engineering.

To establish the most accurate workflow, we evaluated four nanopore de novo assemblers and three polishing algorithms at varying genome coverage depths for the lab strain S. cerevisiae CEN.PK113-7D. Our results show that (1) nanopore genome coverage depth must be at least 40X, (2) Flye and Canu are currently the best assemblers due to their combination of structure, completeness, and accuracy, and (3) Illumina data is essential for polishing. Our final pipeline generated a better S. cerevisiae CEN.PK113-7D assembly than the publicly available reference genome.
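
The 40X coverage threshold above can be turned into a quick sequencing-yield check. The following is a minimal sketch, not part of the authors' pipeline: it assumes mean coverage depth is estimated as total sequenced bases divided by genome size, and it uses an approximate haploid S. cerevisiae genome size of 12 Mb as an illustrative assumption. The function names are hypothetical.

```python
# Sketch: estimate nanopore coverage depth and the yield needed for a target depth.
# Assumption: depth = total sequenced bases / genome size (mean depth, ignoring
# unmapped or contaminant reads).

def coverage_depth(read_lengths, genome_size_bp):
    """Return mean coverage depth for a collection of read lengths (in bp)."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return sum(read_lengths) / genome_size_bp


def bases_needed(target_depth, genome_size_bp):
    """Return the total number of sequenced bases needed for a target depth."""
    return target_depth * genome_size_bp


# Assumption: roughly 12 Mb haploid S. cerevisiae genome (approximate value).
GENOME_SIZE_BP = 12_000_000
MIN_DEPTH = 40  # minimum nanopore depth reported in the text

required_bases = bases_needed(MIN_DEPTH, GENOME_SIZE_BP)

# Hypothetical run: 48,000 reads of 10 kb each.
example_reads = [10_000] * 48_000
example_depth = coverage_depth(example_reads, GENOME_SIZE_BP)
meets_threshold = example_depth >= MIN_DEPTH
```

Under these assumptions, a run would need on the order of 480 Mb of nanopore sequence before assembly is attempted; a real check would use the read lengths from the actual run and the expected genome size of the strain.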

We then applied this pipeline to 12 engineered S. cerevisiae strains of varying genetic background, including strains from S288C, BY4741, BY4742, CEN.PK, W303a, and brewery lineages. Interestingly, the nanopore assembler Flye was the only software able to resolve both chromosomally integrated pathways and complete plasmids, and it did so even when presented with mixtures of plasmids. The more widely used nanopore assembler Canu was unable to resolve complete plasmids. We then extended the pipeline to resequence nonconventional yeasts. A high-quality genome is key to metabolic engineering, systems and synthetic biology, and connecting genotype to phenotype. Thus, we sequenced Pichia pastoris, Hansenula polymorpha, Yarrowia lipolytica, Kluyveromyces marxianus, Debaryomyces hansenii, and Xanthophyllomyces dendrorhous. The resulting de novo genomes show significant improvement over the respective references, achieving chromosomal resolution, closing large gaps, and revealing previously omitted genes. Thus, the engineered and “ground truth” assemblies we have created represent an advance in the ability to detect signatures of metabolic engineering and support further metabolic engineering of nonconventional yeasts.
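
One way to screen an assembly for complete plasmids, such as those Flye resolved here, is to look for short circular contigs. The sketch below is illustrative only and is not the authors' method. It assumes a tab-separated contig summary whose first four columns are contig name, length, coverage, and a circularity flag, which is how Flye's `assembly_info.txt` is commonly laid out; the exact column order and flag values should be checked against the Flye version in use. The function name, the 50 kb length cutoff, and the example contigs are hypothetical.

```python
import csv
import io

# Sketch: flag short circular contigs as candidate plasmids.
# Assumption: tab-separated rows of name, length, coverage, circular flag,
# with "#"-prefixed header lines. "Y" (or "+") is assumed to mean circular.


def candidate_plasmids(assembly_info_text, max_length_bp=50_000):
    """Return (name, length, coverage) for circular contigs up to max_length_bp."""
    rows = csv.reader(io.StringIO(assembly_info_text), delimiter="\t")
    hits = []
    for row in rows:
        # Skip blank lines and header/comment lines.
        if not row or row[0].startswith("#"):
            continue
        name, length, cov, circ = row[0], int(row[1]), float(row[2]), row[3]
        if circ in ("Y", "+") and length <= max_length_bp:
            hits.append((name, length, cov))
    return hits


# Hypothetical summary: one chromosome-scale contig, one small circular contig,
# and one larger circular contig that exceeds the length cutoff.
EXAMPLE = (
    "#seq_name\tlength\tcov.\tcirc.\n"
    "contig_1\t1500000\t45\tN\n"
    "contig_7\t6500\t320\tY\n"
    "contig_9\t85000\t900\tY\n"
)

plasmid_hits = candidate_plasmids(EXAMPLE)
```

A length cutoff alone cannot distinguish engineered plasmids from native circular elements, so candidates flagged this way would still need to be checked against known vector sequences or annotated.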

Sequencing is becoming ever more prevalent in research workflows across disciplines, including metabolic engineering. We provide here a pipeline that can accurately determine genotype and resolve complete engineering signatures in unknown samples. This technology can be inexpensively implemented on-site in the many distributed locations where organisms are engineered, yielding high-quality de novo genome assemblies. Thus, this pipeline can be widely applied in academic, government, and industry settings to study and monitor engineered organisms without the high capital costs and deep coverage depths characteristic of alternative sequencing platforms and algorithms.