Mappability Tracks
 
Hoffman Lab Umap and Bismap Mappability tracks

Bismap: Single-read and multi-read mappability after bisulfite conversion
Umap: Single-read and multi-read mappability by Umap
Related tracks
  • Problematic Regions: The problematic regions track contains various gene clusters and the ENCODE blacklist
Assembly: Human Dec. 2013 (GRCh38/hg38)

Description

These tracks indicate regions with uniquely mappable reads of particular lengths before and after bisulfite conversion. Both Umap and Bismap tracks contain single-read mappability and multi-read mappability tracks for four different read lengths: 24 bp, 36 bp, 50 bp, and 100 bp.

You can use these tracks for many purposes, including filtering unreliable signal from sequencing assays. The Bismap track can help filter unreliable signal from sequencing assays involving bisulfite conversion, such as whole-genome bisulfite sequencing or reduced representation bisulfite sequencing.

Bismap single-read and multi-read mappability

Bismap single-read mappability

These tracks mark any region of the bisulfite-converted genome that is uniquely mappable by at least one k-mer on the specified strand. Mappability of the forward strand was generated by converting all instances of cytosine to thymine. Similarly, mappability of the reverse strand was generated by converting all instances of guanine to adenine.
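The in-silico conversion described above can be illustrated with a minimal Python sketch. This is not the Umap or Bismap implementation; the function name bisulfite_convert and the example sequence are hypothetical and chosen only for illustration.

```python
def bisulfite_convert(seq, strand="+"):
    """Return a fully converted copy of seq.

    '+' replaces every cytosine with thymine (forward strand);
    '-' replaces every guanine with adenine (reverse strand).
    """
    seq = seq.upper()
    if strand == "+":
        return seq.replace("C", "T")
    if strand == "-":
        return seq.replace("G", "A")
    raise ValueError("strand must be '+' or '-'")


# Made-up example sequence, not taken from any genome.
example = "ACGTCCGA"
forward = bisulfite_convert(example, "+")
reverse = bisulfite_convert(example, "-")
print(forward, reverse)
```

Because conversion collapses two of the four bases into one on each strand, k-mers that were distinct in the unconverted genome can become identical, which is why mappability after conversion is reported separately.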

To calculate the single-read mappability, you must find the overlap of a given region with the region that is uniquely mappable on both strands. Regions that are not uniquely mappable on both strands, or that have low multi-read mappability, might bias the downstream analysis.
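As a rough sketch of that overlap calculation, the Python below intersects the forward-strand and reverse-strand intervals and then measures how much of a query region they cover. It assumes sorted, non-overlapping, half-open (start, end) intervals on a single chromosome. The function names and coordinates are hypothetical, and reporting the result as a fraction of the region is one possible summary rather than a definition taken from the track documentation.

```python
def intersect(a, b):
    """Intersect two sorted, non-overlapping lists of half-open (start, end) intervals."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def mappable_bases(region, forward, reverse):
    """Count bases of region that are uniquely mappable on both strands."""
    both_strands = intersect(forward, reverse)
    hits = intersect([region], both_strands)
    return sum(end - start for start, end in hits)


# Made-up intervals for illustration only.
forward = [(0, 100), (150, 300)]
reverse = [(50, 200), (250, 400)]
region = (80, 260)

covered = mappable_bases(region, forward, reverse)
fraction = covered / (region[1] - region[0])
print(covered, round(fraction, 3))
```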

Bismap multi-read mappability

These tracks represent the probability that a randomly selected k-mer which overlaps with a given position is uniquely mappable. The multi-read mappability track is calculated only for k-mers that are uniquely mappable on both strands, so there is no strand specification.

Umap single-read and multi-read mappability

Umap single-read mappability

These tracks mark any region of the genome that is uniquely mappable by at least one k-mer. To calculate the single-read mappability, you must find the overlap of a given region with this track.

Umap multi-read mappability

These tracks represent the probability that a randomly selected k-mer which overlaps with a given position is uniquely mappable.
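To make the definition concrete, here is a small Python sketch that derives a per-position score from a list of per-k-mer uniqueness flags. It assumes that unique_starts[i] says whether the k-mer starting at 0-based position i is uniquely mappable. The function name and the toy flags are hypothetical; this illustrates the definition and is not the Umap code.

```python
def multi_read_mappability(unique_starts, k):
    """For each position, return the fraction of overlapping k-mers that are unique."""
    if not unique_starts:
        raise ValueError("need at least one k-mer")
    n_kmers = len(unique_starts)
    seq_len = n_kmers + k - 1

    # Prefix sums let us count unique k-mers in any window of start positions.
    prefix = [0]
    for flag in unique_starts:
        prefix.append(prefix[-1] + int(flag))

    scores = []
    for pos in range(seq_len):
        first = max(0, pos - k + 1)    # earliest k-mer start that covers pos
        last = min(n_kmers - 1, pos)   # latest k-mer start that covers pos
        total = last - first + 1
        unique = prefix[last + 1] - prefix[first]
        scores.append(unique / total)
    return scores


# Toy input: four 3-mers over a 6 bp sequence; the second one is not unique.
flags = [True, False, True, True]
scores = multi_read_mappability(flags, k=3)
print([round(s, 3) for s in scores])
```

Positions near the ends of the sequence are covered by fewer than k k-mers, so the sketch divides by the number of k-mers that actually overlap each position.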

For greater detail and explanatory diagrams, see the preprint, the Umap and Bismap project website, or the Umap and Bismap software documentation.

Data Access

The raw data can be explored interactively with the Table Browser or the Data Integrator. For automated analysis, the genome annotation is stored in bigBed or bigWig files that can be downloaded from the download server. Individual regions or the whole genome annotation can be obtained using our tools bigBedToBed or bigWigToWig, which can be compiled from the source code or downloaded as precompiled binaries for your system. Instructions for downloading source code and binaries can be found here. The tools can also be used to obtain only features within a given range, for example:

bigBedToBed -chrom=chr6 -start=0 -end=1000000 http://hgdownload.soe.ucsc.edu/gbdb/hg38/hoffmanMappability/k24.Unique.Mappability.bb stdout
bigWigToWig -chrom=chr6 -start=0 -end=1000000 http://hgdownload.soe.ucsc.edu/gbdb/hg38/hoffmanMappability/k24.Umap.MultiTrackMappability.bw stdout
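For scripted use, the bigBedToBed command above could be wrapped along the following lines. This is a sketch under assumptions: bigBedToBed must be installed and on your PATH, the helper names are hypothetical, and the parser reads only the first three BED columns (chromosome, start, end). The file name and flags are the ones shown in the command above.

```python
import subprocess

BASE_URL = "http://hgdownload.soe.ucsc.edu/gbdb/hg38/hoffmanMappability/"


def build_command(file_name, chrom, start, end):
    """Build the bigBedToBed argument list for one genomic range."""
    return [
        "bigBedToBed",
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        BASE_URL + file_name,
        "stdout",
    ]


def parse_bed(text):
    """Parse BED text into (chrom, start, end) tuples, skipping header lines."""
    intervals = []
    for line in text.splitlines():
        if line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        intervals.append((fields[0], int(fields[1]), int(fields[2])))
    return intervals


def fetch_intervals(file_name, chrom, start, end):
    """Run bigBedToBed and return the parsed intervals (requires the binary)."""
    result = subprocess.run(
        build_command(file_name, chrom, start, end),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_bed(result.stdout)
```

A call such as fetch_intervals("k24.Unique.Mappability.bb", "chr6", 0, 1000000) would then return the single-read mappability intervals for that range as Python tuples.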

Please refer to our mailing list archives for questions, or our Data Access FAQ for more information.

Credits

Anshul Kundaje (Stanford University) created the original Umap software in MATLAB. The original Umap repository is available here. Mehran Karimzadeh (Michael Hoffman lab, Princess Margaret Cancer Centre) implemented the Python version of Umap and added features, including Bismap.

References

Karimzadeh M, Ernst C, Kundaje A, Hoffman MM. Umap and Bismap: quantifying genome and methylome mappability. bioRxiv. 2016; p. 095463. doi: https://doi.org/10.1101/095463.