Introduction to CRISPR-Cas9 and Genome Editing

The twentieth century was one of scientific revolutions: advances in agriculture saved millions of lives, the discovery and mass production of antibiotics made routine medical procedures safer, vaccines became more widespread, and plastics made products more affordable for the general populace.

It would be easy to go on, but to do so would gloss over the challenges that now face us. The agricultural revolution is no longer increasing yields, and widespread use of nitrogen fertiliser is damaging already-strained ecosystems. Many antibiotics are becoming worryingly ineffective, and the rate of discovery of new ones (and, indeed, of vaccines) is not keeping pace with pathogen emergence and evolution. The plastics that brought cheap products to the masses are now polluting the earth.

However, there is a suite of technologies that may help to resolve these issues, and many others. These are the “genome editing” technologies (also termed “gene editing” technologies), which allow the user to introduce an array of changes into host cells precisely and efficiently (Gupta and Musunuru, 2014). This is a vast improvement on homologous recombination, which was the most common way of introducing changes to cells during the late twentieth century (Strong and Musunuru, 2017). Because that method suffers from low efficiency and is costly in both money and time, it has largely been superseded by the genome-editing technologies.

Until recently, the two main genome-editing techniques were zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs). ZFNs combine zinc finger proteins (ZFPs, which can bind specific regions of DNA) with a modified catalytic domain from the FokI nuclease (Urnov et al., 2010). Each ZFP binds a three-base-pair region, and several can be combined modularly to give higher specificity (Weber and Pabo, 1994). ZFNs work as dimers, with each DNA-binding domain recognising one of two adjacent sites and the FokI domains cleaving between them.

TALENs work on a similar principle, also utilising an adapted FokI enzyme. In this instance, however, it is fused to a transcription activator-like effector (TALE) domain (Bogdanove and Voytas, 2011). The TALE domains are highly conserved DNA-binding proteins that differ only in the repeat variable di-residue (RVD). This variability allows each TALE to recognise different regions of DNA. Like ZFNs, TALENs operate as dimers (Strong and Musunuru, 2017).

Although both techniques allowed higher levels of manipulation, each had disadvantages that limited its use. ZFNs are small but hard to engineer; consequently, if a suitable recognition site is lacking in a genome, they may be of little use (Urnov et al., 2010). This issue is partially overcome by using TALENs, which are easier to design but also much larger, and thus harder to deliver to host cells (Gupta and Musunuru, 2014).

Whilst there are developments that could potentially overcome the above issues (for example, making the FokI nuclease domain an obligate heterodimer, thereby reducing off-target effects (Doyon et al., 2011)), it was the advent of an entirely new method of genome editing that has allowed rapid advancements in the field: CRISPR-Cas9.

Before we consider the applications of CRISPR-Cas9 technology or the ethical issues surrounding its introduction, it is important to understand how it works. The technique is adapted from a system found in bacteria and archaea; CRISPR motifs have thus far been characterised in ~40% of bacterial genomes and a clear majority (~90%) of archaeal genomes (Horvath and Barrangou, 2010). The CRISPR loci comprise alternating repeat and spacer sequences, which are in turn flanked by CRISPR-associated (cas) genes and have an upstream leader sequence (Barrangou and Marraffini, 2014). The repeat sequences are highly conserved and often palindromic, facilitating the formation of stable secondary structures such as hairpin loops.
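The alternating repeat-spacer layout described above can be sketched in a few lines of Python. This is a minimal illustration rather than an annotation tool: the repeat and spacer sequences are invented, the function names are hypothetical, and the sketch assumes every copy of the repeat is identical, whereas real repeats may vary slightly and are often only partially palindromic.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def extract_spacers(array: str, repeat: str) -> list[str]:
    """Split a repeat-spacer array on its repeat and return the spacers."""
    # Splitting on the repeat yields the spacers plus whatever flanks the
    # array; the flanks are the first and last pieces, so they are dropped.
    pieces = array.split(repeat)
    return [piece for piece in pieces[1:-1] if piece]


def is_palindromic(seq: str) -> bool:
    """A DNA palindrome is identical to its own reverse complement."""
    return seq == reverse_complement(seq)


# Toy data, invented for illustration only.
REPEAT = "GTTCCGGAAC"
SPACERS = ["ATGCATGC", "TTGACCAG", "CCGTAAGT"]
ARRAY = REPEAT + "".join(spacer + REPEAT for spacer in SPACERS)

print(extract_spacers(ARRAY, REPEAT))
print(is_palindromic(REPEAT))
```

A palindromic repeat can base-pair with itself once transcribed, which is what allows the hairpin structures mentioned in the text to form.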

Despite the identification of these CRISPR loci in the 1980s, their purpose remained unknown for nearly two decades, until it was noticed that the spacer-region sequences had homology to viral and plasmid DNA (Bolotin et al., 2005). This correlation did not establish causation, however, until it was shown that Streptococcus thermophilus and other bacteria could acquire defence against viral infection by incorporating short sequences homologous to viral DNA between the CRISPR repeats. These sequences were then transcribed and provided RNA-guided digestion of invading DNA by Cas nucleases (Barrangou et al., 2007).

As the incorporation of the spacer sequences provided the cell with a mechanism of recognising and targeting foreign DNA of the same sequence, this CRISPR-Cas system was the first recorded mechanism of acquired immunity in prokaryotes (Barrangou et al., 2007).

Although three types of CRISPR-Cas system have been characterised, the new CRISPR-Cas9 technology is based on a Type II system, which is therefore the one discussed here. There are three primary steps in this pathway: adaptation, crRNA biogenesis, and targeting. In the adaptation step, exogenous DNA is detected by the cell and new spacer elements are incorporated into the CRISPR locus, along with associated repeats (Horvath and Barrangou, 2010).

If the cell is then challenged again by the same virus or plasmid, it can produce CRISPR RNA (crRNA) by transcribing the spacer sequences. However, as there are many spacer sequences in the locus, and only some will be complementary to the target nucleic acid, the pre-crRNA molecule must first be matured. Another short RNA molecule, tracrRNA, mediates this process by guiding RNase III and the Cas9 protein (Deltcheva et al., 2011). The mature crRNA then associates with tracrRNA and the Cas9 protein to form an active, RNA-guided complex.

In the final step, targeting, the crRNA guides the RNA-protein complex to the invasive DNA, where it introduces a double-stranded break. This is achieved through two cuts: the HNH domain of the Cas9 protein cleaves the strand complementary to the crRNA, whilst the RuvC-like domain of Cas9 cleaves the opposite strand (Jinek et al., 2012).
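The division of labour between the two nuclease domains can be made concrete with a toy model. The sketch below is illustrative only: the class and function names are hypothetical, the sequences are invented, matching is exact, and the blunt cut 3 bp from the 3' end of the matched site is an assumption based on commonly reported behaviour of Streptococcus pyogenes Cas9 rather than anything stated in the text.

```python
from dataclasses import dataclass
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


@dataclass
class Cleavage:
    hnh_strand: str   # strand complementary to the crRNA: "top" or "bottom"
    ruvc_strand: str  # the opposite strand
    left: str         # top-strand sequence 5' of the break
    right: str        # top-strand sequence 3' of the break


def cleave(top: str, crrna_spacer: str, cut_offset: int = 3) -> Optional[Cleavage]:
    """Model a blunt double-stranded break at the site matching the crRNA.

    The crRNA spacer has the same sequence (U for T) as the strand it does
    NOT pair with, so finding it on one strand means the other strand is
    the complementary one, which is the strand cut by HNH.
    This sketch ignores any further sequence requirements for targeting.
    """
    site = crrna_spacer.replace("U", "T")

    index = top.find(site)
    if index != -1:
        # Site reads along the top strand: bottom strand pairs with the crRNA.
        cut = index + len(site) - cut_offset
        return Cleavage("bottom", "top", top[:cut], top[cut:])

    index = reverse_complement(top).find(site)
    if index != -1:
        # Site reads along the bottom strand: top strand pairs with the crRNA.
        cut_on_bottom = index + len(site) - cut_offset
        cut = len(top) - cut_on_bottom  # convert to top-strand coordinates
        return Cleavage("top", "bottom", top[:cut], top[cut:])

    return None  # no matching site, so nothing is cut


# Toy example with an invented 20-nt target site.
TOP = "CCC" + "GATTACAGATTACAGATTAC" + "TGG" + "AAA"
result = cleave(TOP, "GAUUACAGAUUACAGAUUAC")
print(result)
```

Because both strands are cut at the same position in this model, rejoining `left` and `right` always reproduces the original top strand.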

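However, as with all immune systems, there is a risk of mistaking “self” for “non-self” (van Erp et al., 2015): the spacers stored in the host's own CRISPR locus match the crRNA just as well as the invading DNA does. Type II CRISPR-Cas systems avoid this danger by requiring a protospacer adjacent motif (PAM) immediately downstream of the protospacer in the exogenous DNA (Mojica et al., 2009). The motif is absent from the CRISPR locus itself, so the host's own spacers are not cleaved.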
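A short sketch shows how a PAM requirement separates invading DNA from the host's own CRISPR locus. The details are assumptions for illustration: the PAM is taken to be NGG (the motif commonly associated with Streptococcus pyogenes Cas9, not specified in the text), the sequences are invented, the function names are hypothetical, and only the strand given is searched.

```python
import re


def pam_to_regex(pam: str) -> str:
    """Translate a PAM such as 'NGG' into a regex; N matches any base."""
    return "".join("[ACGT]" if base == "N" else base for base in pam)


def has_pam_flanked_match(dna: str, protospacer: str, pam: str = "NGG") -> bool:
    """True if the protospacer occurs in dna immediately followed by a PAM."""
    pattern = re.escape(protospacer) + pam_to_regex(pam)
    return re.search(pattern, dna) is not None


# Toy data, invented for illustration only.
SPACER = "GATTACAGATTACAGATTAC"
REPEAT = "GTTTTAGAGC"

# In the host CRISPR locus the spacer is followed by a repeat, not a PAM.
HOST_LOCUS = REPEAT + SPACER + REPEAT
# In the invading DNA the same sequence is followed by TGG, which fits NGG.
PHAGE_DNA = "AAAC" + SPACER + "TGG" + "CATA"

print(has_pam_flanked_match(PHAGE_DNA, SPACER))
print(has_pam_flanked_match(HOST_LOCUS, SPACER))
```

Both sequences contain an exact copy of the spacer; only the flanking bases differ, and that difference alone decides whether the site counts as a target in this model.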

By designing their own guide RNAs (gRNAs, which serve the same function as the prokaryotic crRNAs and tracrRNAs) and complexing them with Cas endonucleases, researchers have adapted this native prokaryotic mechanism as a means of editing the genomes of eukaryotic cells. The primary advantage of CRISPR-Cas9 over the aforementioned editing techniques is its ease of use: designing a gRNA is much easier and more flexible than engineering a new DNA-binding protein, which makes the technique more straightforward to implement in the laboratory.
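The ease of guide design can be illustrated with a naive search for candidate target sites. This is a sketch under stated assumptions, not a design tool: a 20-nt guide and an NGG PAM are assumed (typical values for Streptococcus pyogenes Cas9, not given in the text), the function names and the example sequence are hypothetical, and no attempt is made to score off-target risk.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_guides(dna: str, guide_len: int = 20, pam: str = "NGG"):
    """List candidate sites as (strand, start, site) tuples.

    'start' is the 0-based offset on the strand that was scanned; for the
    '-' strand this means an offset into the reverse complement of dna.
    """
    pam_regex = "".join("[ACGT]" if base == "N" else base for base in pam)
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})%s)" % (guide_len, pam_regex))
    guides = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            guides.append((strand, match.start(), match.group(1)))
    return guides


def to_grna_spacer(site: str) -> str:
    """Write a DNA target site as the matching RNA spacer (T becomes U)."""
    return site.replace("T", "U")


# Toy example with an invented sequence.
GENE = "CCC" + "GATTACAGATTACAGATTAC" + "TGG" + "AAA"
for strand, start, site in find_guides(GENE):
    print(strand, start, to_grna_spacer(site))
```

Retargeting the nuclease amounts to choosing a different string from this list, whereas ZFNs and TALENs require a new protein to be engineered for each site.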

CRISPR-Cas9 has already revolutionised research and is extensively applied in disease modelling, agricultural research, vaccine development and disease control. The technique has also moved into clinical trials, where it is currently being investigated in anti-cancer therapies (Su et al., 2015). It is likely that, as off-target effects diminish and editing becomes more precise, more trials will be approved in the near future.

About the Author

Rachel Murray-Watson is currently pursuing a PhD at Cambridge University. Rachel obtained a first-class honours degree (BSc) in Biological Sciences from Imperial College, London. Her thesis was on “Modelling the Spatial Spread of Gene Drives” and she won the Howarth Prize for excellence in plant sciences. Rachel also won the Institute of Biology’s prize for 1st place in biology in the national examinations in Ireland. Her current area of research is mitigating the impact of communicable agricultural diseases by developing effective control strategies.

BioMed Advances
