Phylogenomics of Cas4 Family Nucleases | AIChE

The Cas4 family endonuclease is a component of the adaptation module in many variants of CRISPR-Cas adaptive immunity systems. Unlike most other cas genes, cas4 is often encoded outside CRISPR-cas loci (solo-Cas4) and is also found in mobile genetic elements (MGE-Cas4), particularly archaeal viruses.

As part of our ongoing investigation of CRISPR-Cas evolution, we explored the phylogenomics of the Cas4 family. About 90% of archaeal genomes encode Cas4, compared with only about 20% of bacterial genomes. Many archaea encode both the CRISPR-associated form (Cas-Cas4) and solo-Cas4, whereas in bacteria this combination is extremely rare. The solo-cas4 genes are over-represented in environmental bacteria of the Candidate Phyla Radiation (CPR), which typically lack CRISPR-Cas systems, suggesting that Cas4 could be a component of uncharacterized defense systems in the CPR bacteria. The results of the phylogenomic analysis indicate that the CRISPR-associated cas4 genes are often transferred horizontally, but almost exclusively as part of the adaptation module, i.e., jointly with the cas1 and cas2 genes. The evolutionary integrity of the adaptation module contrasts sharply with the rampant shuffling observed among CRISPR-cas modules, whereby a given variant of the adaptation module can combine with virtually any variety of effector module. The solo-cas4 genes evolve primarily via vertical inheritance and are subject only to occasional horizontal gene transfer. The estimated selection pressure on cas4 genes does not differ substantially between the CRISPR-associated family members and solo-cas4, and is close to the genomic median. Thus, in terms of their evolutionary regime, cas4 genes, like cas1 and cas2, behave like ‘regular’ microbial genes involved in various cellular functions and show no evidence of direct involvement in virus-host arms races. A notable feature of Cas4 family evolution is the frequent recruitment of cas4 genes by various mobile genetic elements (MGE), particularly archaeal viruses. The functions of Cas4 in these elements are unknown but might involve anti-defense roles.

Unlike most other Cas proteins, Cas4 family members are encoded by stand-alone genes as often as they are incorporated into CRISPR-Cas systems. In addition, cas4 genes were repeatedly recruited by MGE, perhaps for anti-defense functions. Experimental characterization of the solo and MGE-encoded Cas4 nucleases is expected to reveal currently uncharacterized defense and anti-defense systems and their interactions with CRISPR-Cas systems.