CRISPR/Cas9 Sp. Pyog. target sites tracks   (All Genes and Gene Prediction Tracks)


CRISPR Regions  Genome regions processed to find CRISPR/Cas9 target sites (exons +/- 200 bp)  
CRISPR Targets  CRISPR/Cas9 -NGG Targets  


This track shows the regions of the genome that were searched for target sites (transcribed regions plus 200 bp on both sides) and, within them, the DNA sequences potentially targetable by CRISPR RNA guides using the Cas9 enzyme from S. pyogenes (PAM: NGG). CRISPR target sites were annotated with predicted specificity (off-target effects) and predicted efficiency (on-target cleavage) by various algorithms through the CRISPOR tool.

Display Conventions and Configuration

The track "CRISPR Regions" shows the regions of the genome where target sites were analyzed, i.e. within transcribed regions plus 200 bp on both sides as annotated by Other RefSeq transcript models.

The track "CRISPR Targets" shows the potential target sites in these regions. The target sequence of the guide is shown with a thick bar (same thickness as exons in gene tracks). The PAM motif match (NGG) is shown with a thinner bar (same thickness as UTRs in gene tracks). Guides are colored to reflect both predicted specificity and efficiency. Specificity reflects the "uniqueness" of a 20-mer sequence in the genome; the less unique a sequence is, the more likely it is to cleave other locations of the genome (off-target effects). Shades of gray are used for items that are not very specific (MIT Specificity Score <= 50). Guides that are specific (MIT Specificity > 50) are then color-coded by their efficiency: the frequency of cleavage at the target site as determined by the Doench/Fusi 2016 score (on-target efficiency).

Shades of gray represent the sites that are hard to target specifically, as the 20-mer is common throughout the genome:

impossible to target: target site has at least one identical copy in the genome and was not scored
hard to target: so many similar sequences in the genome that the alignment was stopped (possibly a repeat)
hard to target: target site was aligned but results in a low specificity score <= 50 (see below)

The track supports filtering on the configuration page using MIT Specificity score, but not the Doench/Fusi 2016 score.

Colors highlight targets that are specific in the genome (MIT specificity > 50) but have different predicted efficiencies:

low predicted cleavage: Doench/Fusi 2016 Efficiency percentile <= 30
medium predicted cleavage: Doench/Fusi 2016 Efficiency percentile > 30 and < 55
high predicted cleavage: Doench/Fusi 2016 Efficiency percentile > 55
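
The coloring rules above can be sketched as a small function. This is only an illustration of the thresholds listed on this page, not the code used to build the track; the function name and the returned labels are made up for the example, and the handling of a percentile of exactly 55 is an assumption because the text does not assign that value to a class.

```python
def classify_guide(mit_specificity, doench_percentile):
    """Return a display category for one guide (illustrative only).

    mit_specificity: MIT Specificity score (0-100), or None if the guide
        was not scored (for example, an identical copy exists in the genome).
    doench_percentile: Doench/Fusi 2016 efficiency percentile (0-100).
    """
    # Gray shades: unscored guides or guides with specificity <= 50.
    if mit_specificity is None or mit_specificity <= 50:
        return "gray: hard or impossible to target specifically"
    # Specific guides (> 50) are split by predicted efficiency.
    if doench_percentile <= 30:
        return "low predicted cleavage"
    # ASSUMPTION: the text leaves a percentile of exactly 55 unassigned;
    # this sketch places it in the medium class.
    if doench_percentile <= 55:
        return "medium predicted cleavage"
    return "high predicted cleavage"


if __name__ == "__main__":
    print(classify_guide(82, 61))
    print(classify_guide(35, 90))
```

Note that specificity is checked first: a guide with low specificity is shown in gray regardless of how efficient it is predicted to be.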

Mouse over a target site to show predicted specificity and efficiency scores:

  1. The MIT Specificity score summarizes all off-targets into a single number from 0-100. The higher the number, the fewer off-target effects are expected. We recommend guides with an MIT specificity > 50.
  2. The efficiency score tries to predict whether a guide leads to strong or weak cleavage. According to Haeussler et al. 2016, the Doench 2016 Efficiency score should be used to select the guide with the highest cleavage efficiency when expressing guides from RNA Pol III promoters such as U6. Scores are given as percentiles, e.g. "70%" means that 70% of mammalian guides have a score equal to or lower than this guide. The raw score is also shown in parentheses after the percentile.
  3. The Moreno-Mateos 2015 Efficiency score should be used instead of the Doench 2016 score when transcribing the guide in vitro with a T7 promoter, e.g. for injections in mouse, zebrafish or Xenopus embryos. The Moreno-Mateos score is given as a percentile with the raw value in parentheses, in the same way as the Doench 2016 score in the previous item.
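
To make the percentile convention concrete, here is a minimal sketch of how a raw score could be converted to a percentile against a reference set of guide scores, together with the promoter-based choice of efficiency score described above. The function names and the reference list are hypothetical; the reference distribution actually used for the track is not given on this page.

```python
from bisect import bisect_right


def score_percentile(raw_score, reference_scores):
    """Percentage of reference guides whose score is <= raw_score."""
    ordered = sorted(reference_scores)
    if not ordered:
        raise ValueError("reference_scores must not be empty")
    # bisect_right counts the reference values that are <= raw_score.
    return 100.0 * bisect_right(ordered, raw_score) / len(ordered)


def efficiency_score_for(promoter):
    """Pick the efficiency score recommended in the text for a promoter."""
    recommended = {
        "U6": "Doench 2016",         # guides expressed from RNA Pol III promoters
        "T7": "Moreno-Mateos 2015",  # guides transcribed in vitro
    }
    return recommended[promoter]


if __name__ == "__main__":
    # Hypothetical reference scores, for illustration only.
    reference = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    print(score_percentile(70, reference), efficiency_score_for("T7"))
```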

Click on a feature to show all scores and predicted off-targets with up to four mismatches. The Out-of-Frame score by Bae et al. 2014 is correlated with the probability that mutations induced by the guide RNA will disrupt the open reading frame. The authors recommend out-of-frame scores > 66 to create knock-outs with a single guide efficiently.

Off-target sites are sorted by the CFD score (Doench et al. 2016). The higher the CFD score, the more likely there is off-target cleavage at that site. The large majority of predicted off-targets with CFD scores < 0.02 were false-positives.
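
As an illustration of this ordering, the following sketch sorts a list of predicted off-targets by CFD score and optionally drops those below 0.02. The data layout (location, score) and the function name are assumptions made for the example, and discarding low-scoring sites is a choice a user might make based on the observation above, not something this page says the track does.

```python
def rank_off_targets(off_targets, min_cfd=0.02):
    """Sort (location, cfd_score) pairs by descending CFD score.

    Pairs with a CFD score below min_cfd are dropped; pass min_cfd=0.0
    to keep every predicted off-target.
    """
    kept = [ot for ot in off_targets if ot[1] >= min_cfd]
    # Highest CFD first: these are the most likely off-target cleavage sites.
    return sorted(kept, key=lambda ot: ot[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical off-target predictions, for illustration only.
    predicted = [("siteA", 0.01), ("siteB", 0.45), ("siteC", 0.12)]
    for location, cfd in rank_off_targets(predicted):
        print(location, cfd)
```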


Relationship between predictions and experimental data

Like most algorithms, the MIT specificity score is not always a perfect predictor of off-target effects. Despite low scores, many tested guides caused little or only weak off-target cleavage when tested with whole-genome assays (Figure 2 from Haeussler et al. 2016), and the published data contains few data points with high specificity scores. Overall though, the assays showed that the higher the specificity score, the lower the off-target effects.

Similarly, efficiency scoring is not very accurate: guides with low scores can be efficient and vice versa. As a general rule, however, the higher the score, the less likely that a guide is very inefficient. The following histograms illustrate, for each type of score, how the share of inefficient guides drops with increasing efficiency scores:

When reading this plot, keep in mind that both scores were evaluated on their own training data. Especially for the Moreno-Mateos score, the results are too optimistic, due to overfitting. When evaluated on independent datasets, the correlation of the prediction with other assays was around 25% lower, see Haeussler et al. 2016. At the time of writing, there is no independent dataset available yet to determine the Moreno-Mateos accuracy for each score percentile range.

Track methods

Exons as predicted by Other RefSeq gene models were used, extended by 200 base pairs on each side, and searched for the -NGG motif. The flanking 20-mer guide sequences were aligned to the genome with BWA and scored with MIT Specificity scores using the command-line version of CRISPOR. Non-unique guide sequences were skipped. Flanking sequences were extracted from the genome and used as input for the CRISPOR efficiency scoring code, available from the CRISPOR downloads page, which includes the Doench 2016, Moreno-Mateos 2015 and Bae 2014 algorithms, among others.
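
The motif search step can be illustrated with a short sketch that scans a DNA sequence for 20-mers followed by NGG on either strand. It is not the CRISPOR implementation and covers only the motif search, not the BWA alignment or any scoring. The function names and the tuple layout are assumptions for this example; coordinates are 0-based, half-open, on the forward strand, and cover guide plus PAM.

```python
import re

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ngg_targets(seq, guide_len=20):
    """Return sorted (start, end, strand, guide, pam) tuples for NGG sites."""
    seq = seq.upper()
    hits = []
    # Forward strand: guide followed by NGG. The lookahead lets
    # overlapping sites be reported as separate hits.
    fwd = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    for m in fwd.finditer(seq):
        start = m.start()
        hits.append((start, start + guide_len + 3, "+", m.group(1), m.group(2)))
    # Reverse strand: seen on the forward strand as CCN followed by the
    # reverse complement of the guide.
    rev = re.compile(rf"(?=(CC[ACGT])([ACGT]{{{guide_len}}}))")
    for m in rev.finditer(seq):
        start = m.start()
        hits.append((start, start + 3 + guide_len, "-",
                     reverse_complement(m.group(2)),
                     reverse_complement(m.group(1))))
    return sorted(hits)


if __name__ == "__main__":
    for hit in find_ngg_targets("A" * 20 + "TGG"):
        print(hit)
```

Sites that contain an N in the guide or PAM are skipped in this sketch, since they cannot be matched unambiguously.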

Data Access

The raw data can be explored interactively with the Table Browser. For automated analysis, the genome annotation is stored in a bigBed file that can be downloaded from our download server. The files for this track are called … and …. Individual regions or the whole genome annotation can be obtained using our tool bigBedToBed, which can be compiled from the source code or downloaded as a precompiled binary for your system. Instructions for downloading source code and binaries can be found here. The tool can also be used to obtain only features within a given range, e.g. bigBedToBed -chrom=KE382941 -start=0 -end=1000000 stdout
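
For scripted access, a region query like the one above could be wrapped as follows. This is a sketch under assumptions: bigBedToBed must be installed and on the PATH, and bigbed_path is a placeholder for the track's bigBed file or URL, which is not given in this text. The helper names are made up for the example.

```python
import subprocess


def bigbed_region_command(bigbed_path, chrom, start, end):
    """Build the bigBedToBed argument list for one genomic range."""
    return [
        "bigBedToBed",
        bigbed_path,        # placeholder: path or URL of the track's bigBed file
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        "stdout",           # write BED lines to standard output
    ]


def fetch_region(bigbed_path, chrom, start, end):
    """Run bigBedToBed and return the BED rows as lists of fields."""
    cmd = bigbed_region_command(bigbed_path, chrom, start, end)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return [line.split("\t") for line in result.stdout.splitlines()]


if __name__ == "__main__":
    # Only prints the command; "example.bb" is a hypothetical file name.
    print(" ".join(bigbed_region_command("example.bb", "KE382941", 0, 1000000)))
```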


Track created by Maximilian Haeussler, with helpful input from Jean-Paul Concordet (MNHN Paris) and Alberto Stolfi (NYU).


Bae S, Kweon J, Kim HS, Kim JS. Microhomology-based choice of Cas9 nuclease target sites. Nat Methods. 2014 Jul;11(7):705-6. PMID: 24972169

Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, Smith I, Tothova Z, Wilen C, Orchard R et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat Biotechnol. 2016 Feb;34(2):184-91. PMID: 26780180; PMC: PMC4744125

Haeussler M, Schönig K, Eckert H, Eschstruth A, Mianné J, Renaud JB, Schneider-Maunoury S, Shkumatava A, Teboul L, Kent J et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol. 2016 Jul 5;17(1):148. PMID: 27380939; PMC: PMC4934014

Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol. 2013 Sep;31(9):827-32. PMID: 23873081; PMC: PMC3969858

Moreno-Mateos MA, Vejnar CE, Beaudoin JD, Fernandez JP, Mis EK, Khokha MK, Giraldez AJ. CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo. Nat Methods. 2015 Oct;12(10):982-8. PMID: 26322839; PMC: PMC4589495