Massively targeted evaluation of therapeutic CRISPR off-targets in cells
Methods for sensitive and high-throughput evaluation of the off-targets (OTs) of CRISPR RNA-guided nucleases (RGNs) are essential for advancing RGN-based gene therapies. Here we report SURRO-seq for simultaneously evaluating thousands of therapeutic RGN OTs in cells. SURRO-seq captures RGN-induced indels in cells using pooled lentiviral OT libraries and deep sequencing, an approach comparable and complementary to OT detection by T7 endonuclease 1, GUIDE-seq, and CIRCLE-seq.
Application of SURRO-seq to 8150 OTs from 110 therapeutic RGNs identifies significantly detectable indels in 783 OTs, of which 37 OTs are found in cancer genes and 23 OTs are further validated in five human cell lines by targeted amplicon sequencing.
Finally, SURRO-seq reveals that the thermodynamically stable wobble base pair (rG•dT) and free binding energy strongly affect RGN specificity. Our study emphasizes the necessity of thoroughly evaluating therapeutic RGN OTs to minimize inevitable off-target effects.
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) RNA-guided nucleases (RGNs) have been used in the therapy of several inherited human diseases1,2,3,4. Major efforts have focused on improving RGN editing efficiency via stabilization of the small guide RNA (sgRNA) thermodynamics5, modification of the RGNs6,7,8, utilization of homology-independent targeted integration (HITI)9 and optimization of RGN delivery10,11.
The inevitable adverse effects caused by nonspecific RGN editing of cancer genes are major concerns for the clinical application of RGN-based therapies. Improving RGN specificity and developing methods for identifying and evaluating the potential off-targets (OTs) introduced by RGNs are equally essential to advance RGN-based gene therapy. Several experimental RGN OT identification/quantification methods have been developed (Supplementary Data 1), which can be grouped into three categories (Supplementary Fig. S1).
Category One contains genome-wide cell-free biochemical methods, which rely on sequencing to capture RGN-induced OT cleavage on either naked DNA or fixed chromatin fibers. Examples are CIRCLE-seq (cell-free)12, Digenome-seq (cell-free)13, SITE-seq (cell-free)14, BLISS (ex vivo)15 and DIG-seq (ex vivo)16. Category Two contains methods that depend on genome-wide in-cell capture of RGN-induced off-target cleavage by sequencing. These include GUIDE-seq and IDLV-capture, which rely on insertion of double-stranded DNA and an IDLV vector, respectively, into DNA double-strand breaks17,18; HTGTS and PEM-seq, which rely on translocation between on-target and off-target sites19,20; and DISCOVER-seq, which relies on immunoprecipitation of the DNA repair protein MRE11 to capture DNA double-strand break (DSB) sites21. While cell-free biochemical methods are rapid, conventional, and do not depend on reference genomes, they inevitably capture many pseudo off-target sites. In-cell methods (e.g., GUIDE-seq) capture bona fide RGN off-targets more faithfully than cell-free methods.
However, spontaneous DSBs can lead to the capture of pseudo off-targets independent of RGNs17. To complement this, Category Three is composed of targeted in-cell RGN OT validation methods, such as T7 endonuclease 1 (T7E1), targeted deep sequencing, TIDE and CUT-PCR22,23. However, current targeted in-cell RGN off-target evaluation methods are greatly limited in scale. Only a few sites can be evaluated for each RGN in a single study because of their high labor and time costs. A modified targeted amplicon sequencing method based on rhAmpSeq has thus been reported for simultaneous and targeted analysis of several CRISPR gRNAs and hundreds of selected off-target sites in a single reaction24,25. This method has greatly improved the scale of targeted analysis of CRISPR off-targets by deep sequencing.
Here we introduce and apply SURRO-seq, a high-throughput method for targeted in-cell capture of RGN off-targets based on a pooled lentiviral vector library encoding gRNAs and barcoded surrogate off-target sites, to evaluate therapeutic RGN off-targets in cells. In an evaluation of 170 previously investigated OTs from 11 RGNs in HEK293T cells, SURRO-seq exhibited high sensitivity and accuracy compared to GUIDE-seq and CIRCLE-seq. We then applied SURRO-seq to evaluate 8150 OTs from 110 therapeutic RGNs and identified 783 OTs showing significantly detectable indels. Of these, 37 OTs are found in cancer genes, highlighting the clinical significance of, and great need for, pre-assessing RGN OTs with SURRO-seq. The OTs identified by SURRO-seq were further validated by targeted deep sequencing of five RGN-edited human cell lines. Analyses of OTs with high indel frequencies revealed that mismatch types leading to a thermodynamically stable wobble base pair strongly increase the RGN OT effect. We further performed benchmark analyses of the latest RGN OT prediction tools with SURRO-seq OT data. The energy-based predictors, which incorporate gRNA and DNA binding energies, gave the best performance.
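To make the mismatch notation concrete, the following minimal Python sketch labels each gRNA/off-target mismatch by the RNA base and the DNA base it faces, so that rG•dT wobble pairs can be flagged. It is an illustration only, not the analysis pipeline used in this study. The function names, the assumption that both sequences are written 5'→3' in the same sense (so that the gRNA pairs with the complement of the listed protospacer), and the 1-based position numbering are all hypothetical choices made for this example.

```python
# Illustrative sketch only; names and conventions are assumptions, not the authors' code.

# Complement of each DNA base, used to derive the target-strand base facing the gRNA.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Canonical RNA:DNA Watson-Crick pairs (RNA base, DNA target-strand base).
WATSON_CRICK = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def mismatch_pairs(grna, protospacer):
    """Return [(position, 'rX•dY'), ...] for every non-Watson-Crick pair.

    grna:        spacer sequence, written as RNA or DNA (T is read as U).
    protospacer: off-target site on the non-target strand, same length and sense.
    Positions are 1-based from the 5' end of the sequences as written.
    """
    if len(grna) != len(protospacer):
        raise ValueError("gRNA and protospacer must have the same length")
    rna = grna.upper().replace("T", "U")
    pairs = []
    for pos, (r, p) in enumerate(zip(rna, protospacer.upper()), start=1):
        d = DNA_COMPLEMENT[p]  # base on the target strand that faces the gRNA
        if (r, d) not in WATSON_CRICK:
            pairs.append((pos, f"r{r}•d{d}"))
    return pairs


def has_rg_dt_wobble(grna, protospacer):
    """True if any mismatch is an rG•dT pair (gRNA G facing target-strand T)."""
    return any(label == "rG•dT" for _, label in mismatch_pairs(grna, protospacer))


if __name__ == "__main__":
    # Toy 5-nt sequences, for illustration only (real spacers are ~20 nt).
    print(mismatch_pairs("GACGT", "AACGT"))
```

Under these conventions, a gRNA G aligned to a protospacer A corresponds to rG•dT, because the target strand carries T at that position.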