Schema for Recomb. Breakpoints - Recombination Breakpoints from Turakhia et al. 2022
  Database: wuhCor1    Primary Table: recomb    Data last updated: 2022-09-27
Big Bed File Download: /gbdb/wuhCor1/recomb/ripples_breakpoints.bb
Item Count: 864
The data is stored in the binary BigBed format.

Format description: Browser Extensible Data
field       example                   description
chrom       NC_045512v2               Reference sequence chromosome or scaffold
chromStart  7011                      Start position in chromosome
chromEnd    20262                     End position in chromosome
name        199585_199574_168400_bp1  Name of item.
score       1000                      Score (0-1000)
strand      .                         + or - for strand
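
The six fields above are the standard BED6 layout, so a row exported as whitespace-separated text can be read with a few lines of Python. The sketch below is illustrative only: the `BedRecord` type and `parse_bed6` helper are hypothetical names, not part of any UCSC tool, and the code assumes the usual BED convention of 0-based, half-open coordinates.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    """One BED6 row (hypothetical helper type, fields as in the schema)."""
    chrom: str
    chrom_start: int  # 0-based start, inclusive
    chrom_end: int    # end, exclusive
    name: str
    score: int
    strand: str


def parse_bed6(line: str) -> BedRecord:
    """Parse one whitespace-separated BED6 line into a BedRecord."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    chrom, start, end, name, score, strand = fields
    return BedRecord(chrom, int(start), int(end), name, int(score), strand)


# The example values from the schema table above.
record = parse_bed6("NC_045512v2\t7011\t20262\t199585_199574_168400_bp1\t1000\t.")
print(record.name, record.chrom_end - record.chrom_start)
```

Because the interval is half-open, the item length is simply `chrom_end - chrom_start`.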

Sample Rows
 
chrom        chromStart  chromEnd  name                      score  strand
NC_045512v2  7011        20262     199585_199574_168400_bp1  1000   .
NC_045512v2  7831        25907     94354_102299_94353_bp1    1000   .
NC_045512v2  9967        20962     255596_255581_255541_bp1  1000   .
NC_045512v2  11296       19983     324163_321690_324162_bp1  1000   .
NC_045512v2  11296       24914     219196_275515_219195_bp1  1000   .
NC_045512v2  11514       21255     40685_91082_40888_bp2     1000   .
NC_045512v2  12030       20262     144321_101506_144320_bp1  1000   .
NC_045512v2  12789       19974     354074_352705_351323_bp1  1000   .
NC_045512v2  12970       23604     346018_327345_346017_bp1  1000   .
NC_045512v2  13329       23709     166493_164712_166487_bp1  1000   .

Recomb. Breakpoints (recomb) Track Description
 

Description

This track shows recombination breakpoints inferred by the RIPPLES software from a phylogenetic tree of 1.6 million SARS-CoV-2 sequences, as described by Turakhia et al., Nature 2022.

By default, the track is displayed in "density" mode, which shows the density of recombinant sequences per nucleotide. To show every recombination individually, uncheck the "Density plot" checkbox on the configuration page.
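
As a rough illustration of what a per-nucleotide density means, the sketch below counts how many breakpoint intervals overlap each base, using a difference array. It is an assumption about how such a density could be computed, not a description of the browser's own rendering code. The `coverage` helper is a hypothetical name, the three intervals are the first three sample rows, and 29903 is the length of the NC_045512.2 reference genome.

```python
from typing import Iterable, List, Tuple


def coverage(intervals: Iterable[Tuple[int, int]], genome_length: int) -> List[int]:
    """Return the number of intervals covering each base.

    Intervals are 0-based and half-open (start inclusive, end exclusive),
    following the BED convention.
    """
    diff = [0] * (genome_length + 1)
    for start, end in intervals:
        if not 0 <= start <= end <= genome_length:
            raise ValueError(f"interval out of range: {(start, end)}")
        diff[start] += 1   # coverage rises at the start
        diff[end] -= 1     # and falls again at the end
    depth = []
    running = 0
    for delta in diff[:genome_length]:
        running += delta   # prefix sum turns the deltas into depths
        depth.append(running)
    return depth


GENOME_LENGTH = 29903  # NC_045512.2 reference length
sample = [(7011, 20262), (7831, 25907), (9967, 20962)]
depth = coverage(sample, GENOME_LENGTH)
print([depth[p] for p in (7000, 7011, 8000, 10000, 20262, 21000, 26000)])
# → [0, 1, 2, 3, 2, 1, 0]
```

The difference array keeps the cost linear in the genome length plus the number of intervals, rather than incrementing every base of every interval.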

Methods

From Turakhia et al., Nature 2022: "We developed a new method for detecting recombination in pandemic-scale phylogenies, Recombination Inference using Phylogenetic PLacEmentS, RIPPLES. Because recombination violates the central assumption of many phylogenetic methods, that is, that a single evolutionary history is shared across the genome, recombinant lineages arising from diverse genomes will often be found on 'long branches', which result from accommodating the divergent evolutionary histories of the two parental haplotypes. Note that as long as recombination is relatively uncommon, phylogenetic inference is expected to remain accurate even when branch lengths are artifactually expanded. RIPPLES exploits that signal by first identifying long branches on a comprehensive SARS-CoV-2 mutation-annotated tree. RIPPLES then exhaustively breaks the potential recombinant sequence into distinct segments and replaces each onto a global phylogeny using maximum parsimony. RIPPLES reports the two parental nodes - hereafter termed donor and acceptor - that result in the highest parsimony score improvement relative to the original placement on the global phylogeny. Our approach therefore leverages phylogenetic signals for each parental lineage and the spatial correlation of markers along the genome. We establish significance using a null model conditioned on the inferred site-specific rates of de novo mutation."
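
The idea of splitting a sequence into segments and scoring each segment against a different parent can be illustrated with a deliberately simplified toy. The sketch below is not RIPPLES: it does not use a mutation-annotated tree or phylogenetic placement, it considers only a single breakpoint, and it uses a plain mismatch count between aligned sequences as a stand-in for a parsimony score. The function name `best_single_breakpoint` and the parameters `parent_a` and `parent_b` are hypothetical.

```python
from typing import Tuple


def best_single_breakpoint(query: str, parent_a: str, parent_b: str) -> Tuple[int, int, int]:
    """Toy single-breakpoint search on three aligned, equal-length sequences.

    The query is modelled as parent_a to the left of the breakpoint and
    parent_b to the right. Returns (breakpoint, split_cost, improvement),
    where improvement is the mismatch count saved compared with explaining
    the whole query by the better single parent.
    """
    n = len(query)
    if not (len(parent_a) == len(parent_b) == n):
        raise ValueError("sequences must be aligned to the same length")

    # prefix[b]: mismatches between query[:b] and parent_a[:b]
    prefix = [0] * (n + 1)
    for i in range(n):
        prefix[i + 1] = prefix[i] + (query[i] != parent_a[i])

    # suffix[b]: mismatches between query[b:] and parent_b[b:]
    suffix = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + (query[i] != parent_b[i])

    # Try every breakpoint position exhaustively and keep the cheapest.
    best = min(range(n + 1), key=lambda b: prefix[b] + suffix[b])
    split_cost = prefix[best] + suffix[best]
    unsplit_cost = min(prefix[n], suffix[0])  # best single-parent explanation
    return best, split_cost, unsplit_cost - split_cost


print(best_single_breakpoint("AAAAGGGG", "AAAAAAAA", "GGGGGGGG"))
# → (4, 0, 4)
```

In this toy, a large improvement plays the role of the "parsimony score improvement" mentioned in the quote; the statistical significance test against a null model is not represented at all.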

Data Access

You can download the bigBed file underlying this track (recomb) from our Download Server. The data can be explored interactively with the Table Browser or the Data Integrator. The data can also be accessed from scripts through our API.
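
For script access, a request URL for the UCSC REST API can be assembled as in the sketch below. It assumes the public `https://api.genome.ucsc.edu/getData/track` endpoint with its `genome`, `track`, `chrom`, `start` and `end` parameters, and it assumes the track name matches the primary table name `recomb`; the helper `track_data_url` is a hypothetical name. The block only builds the URL and does not contact the server.

```python
from typing import Optional
from urllib.parse import quote

API_BASE = "https://api.genome.ucsc.edu"


def track_data_url(genome: str, track: str, chrom: Optional[str] = None,
                   start: Optional[int] = None, end: Optional[int] = None) -> str:
    """Build a getData/track request URL (hypothetical helper)."""
    params = [("genome", genome), ("track", track)]
    if chrom is not None:
        params.append(("chrom", chrom))
    if start is not None and end is not None:
        params.append(("start", str(start)))
        params.append(("end", str(end)))
    # Parameters are joined with semicolons, as in the UCSC API examples.
    query = ";".join(f"{key}={quote(value)}" for key, value in params)
    return f"{API_BASE}/getData/track?{query}"


# Track name "recomb" is assumed from the primary table name.
print(track_data_url("wuhCor1", "recomb", "NC_045512v2", 7000, 21000))
```

The resulting URL can then be fetched with any HTTP client, for example `urllib.request.urlopen`, and the response parsed as JSON.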

Credits

Thanks to Bryan Thornlow for sharing the data.

References

Turakhia Y, Thornlow B, Hinrichs A, McBroome J, Ayala N, Ye C, Smith K, De Maio N, Haussler D, Lanfear R et al. Pandemic-scale phylogenomics reveals the SARS-CoV-2 recombination landscape. Nature. 2022 Sep;609(7929):994-997. PMID: 35952714; PMC: PMC9519458