Schema for Segmental Dups - Duplications of >1000 Bases of Non-RepeatMasked Sequence
  Database: hg19    Primary Table: genomicSuperDups    Row Count: 51,599   Data last updated: 2011-09-26
Format description: Summary of large genomic Duplications (>1KB >90% similar)
field           example                     SQL type          info    description
bin             585                         smallint(6)       range   Indexing field to speed chromosome range queries.
chrom           chr1                        varchar(255)      values  Reference sequence chromosome or scaffold
chromStart      10000                       int(10) unsigned  range   Start position in chromosome
chromEnd        87112                       int(10) unsigned  range   End position in chromosome
name            chr15:102446355             varchar(255)      values  Other chromosome involved
score           0                           int(10) unsigned  range   Score based on the raw BLAST alignment score. Set to 0 and not used in later versions.
strand          -                           char(1)           values  Value should be + or -
otherChrom      chr15                       varchar(255)      values  Other chromosome or scaffold
otherStart      102446355                   int(10) unsigned  range   Start in other sequence
otherEnd        102521392                   int(10) unsigned  range   End in other sequence
otherSize       102531392                   int(10) unsigned  range   Total size of other chromosome
uid             9119                        int(10) unsigned  range   Unique id shared by the query and subject
posBasesHit     1000                        int(10) unsigned  range   For future use
testResult      N/A                         varchar(255)      values  For future use
verdict         N/A                         varchar(255)      values  For future use
chits           N/A                         varchar(255)      values  For future use
ccov            N/A                         varchar(255)      values  For future use
alignfile       align_both/0008/both041194  varchar(255)      values  alignment file path
alignL          77880                       int(10) unsigned  range   spaces/positions in alignment
indelN          71                          int(10) unsigned  range   number of indels
indelS          3611                        int(10) unsigned  range   indel spaces
alignB          74269                       int(10) unsigned  range   bases Aligned
matchB          73742                       int(10) unsigned  range   aligned bases that match
mismatchB       527                         int(10) unsigned  range   aligned bases that do not match
transitionsB    332                         int(10) unsigned  range   number of transitions
transversionsB  195                         int(10) unsigned  range   number of transversions
fracMatch       0.992904                    float             range   fraction of matching bases
fracMatchIndel  0.991956                    float             range   fraction of matching bases with indels
jcK             0.00712961                  float             range   K-value calculated with Jukes-Cantor
k2K             0.00713299                  float             range   Kimura K
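
The alignment statistics at the end of the schema appear to be derived from the raw counts in the same row. The sketch below is an illustration only: the formulas are not stated on this page and are inferred from the example values, using the standard Jukes-Cantor and Kimura two-parameter distance formulas. The function name `alignment_stats` and its parameter names are hypothetical.

```python
import math


def alignment_stats(align_b, match_b, transitions_b, transversions_b,
                    indel_n, indel_s):
    """Recompute derived alignment fields from raw counts (inferred formulas)."""
    mismatch_b = transitions_b + transversions_b
    p = mismatch_b / align_b           # fraction of aligned bases that differ
    ts = transitions_b / align_b       # transition fraction (Kimura's P)
    tv = transversions_b / align_b     # transversion fraction (Kimura's Q)
    return {
        # Assumption: alignment length = aligned bases + gap positions.
        "alignL": align_b + indel_s,
        "mismatchB": mismatch_b,
        "fracMatch": match_b / align_b,
        # Assumption: each indel event counts as one extra mismatched position.
        "fracMatchIndel": match_b / (align_b + indel_n),
        # Jukes-Cantor distance: K = -3/4 * ln(1 - 4p/3)
        "jcK": -0.75 * math.log(1.0 - 4.0 * p / 3.0),
        # Kimura two-parameter distance: K = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))
        "k2K": -0.5 * math.log((1.0 - 2.0 * ts - tv) * math.sqrt(1.0 - 2.0 * tv)),
    }


# Raw counts taken from the example column of the schema above.
stats = alignment_stats(align_b=74269, match_b=73742, transitions_b=332,
                        transversions_b=195, indel_n=71, indel_s=3611)
for field, value in stats.items():
    print(field, value)
```

With the example row as input, these formulas agree with the listed alignL, mismatchB, fracMatch, fracMatchIndel, jcK and k2K values to the displayed precision, which is the only evidence offered here that they match how the table was built.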

Sample Rows

Note: all start coordinates in our database are 0-based, not 1-based.
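
As a small illustration of the note above, the sketch below converts a stored row to a 1-based position string and recomputes the bin value from the stored coordinates. It assumes that end coordinates are exclusive (half-open intervals) and that bin follows the standard UCSC binning scheme (smallest bins of 128 kb, each higher level 8 times larger); neither detail is spelled out on this page. The function names are hypothetical.

```python
# Offsets of each bin level in the assumed standard UCSC binning scheme,
# from the smallest bins (128 kb) up to the single top-level bin.
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
BIN_FIRST_SHIFT = 17   # 2**17 = 128 kb, size of the smallest bins
BIN_NEXT_SHIFT = 3     # each level up is 2**3 = 8 times larger


def to_one_based(chrom, chrom_start, chrom_end):
    """Format a 0-based, half-open interval as a 1-based, closed position."""
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def region_length(chrom_start, chrom_end):
    """With a 0-based start and exclusive end, length is a plain difference."""
    return chrom_end - chrom_start


def bin_from_range(chrom_start, chrom_end):
    """Return the smallest bin that fully contains [chrom_start, chrom_end)."""
    start_bin = chrom_start >> BIN_FIRST_SHIFT
    end_bin = (chrom_end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("range is too large for the standard binning scheme")


# Example row from the schema: chr1, chromStart 10000, chromEnd 87112.
print(to_one_based("chr1", 10000, 87112))   # chr1:10001-87112
print(region_length(10000, 87112))          # 77112
print(bin_from_range(10000, 87112))         # 585
```

The computed bin of 585 matches the example value in the schema, which is consistent with the assumed scheme.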

Segmental Dups (genomicSuperDups) Track Description


This track shows regions detected as putative genomic duplications within the golden path. The following display conventions are used to distinguish levels of similarity:

  • Light to dark gray: 90 - 98% similarity
  • Light to dark yellow: 98 - 99% similarity
  • Light to dark orange: greater than 99% similarity
  • Red: duplications of greater than 98% similarity that lack sufficient Segmental Duplication Database evidence (most likely missed overlaps)

For a region to be included in the track, at least 1 kb of the total sequence (containing at least 500 bp of non-RepeatMasked sequence) had to align with a sequence identity of at least 90%.
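
The display conventions and inclusion criteria above can be sketched as two small functions. This is an illustration under stated assumptions, not the browser's own logic: the handling of the exact 98% and 99% boundaries is a guess, the `has_sdd_evidence` flag is hypothetical (the page does not say which field, if any, records Segmental Duplication Database evidence), and shading within each color family is not modeled.

```python
def passes_inclusion(aligned_bp, non_repeat_bp, identity_pct):
    """Inclusion test as described: >= 1 kb aligned, >= 500 bp of it
    non-RepeatMasked, and >= 90% sequence identity."""
    return aligned_bp >= 1000 and non_repeat_bp >= 500 and identity_pct >= 90.0


def color_family(similarity_pct, has_sdd_evidence=True):
    """Map percent similarity to the color family used by the track.

    has_sdd_evidence is a hypothetical flag for illustration only.
    Returns None for similarity below the 90% inclusion threshold.
    """
    if similarity_pct < 90.0:
        return None
    if similarity_pct > 98.0 and not has_sdd_evidence:
        return "red"
    if similarity_pct > 99.0:
        return "orange"
    if similarity_pct >= 98.0:   # assumption: exactly 98% is shown as yellow
        return "yellow"
    return "gray"


# The example row has fracMatch 0.992904, i.e. about 99.29% similarity.
print(color_family(0.992904 * 100))
```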


Segmental duplications play an important role in both genomic disease and gene evolution. This track displays an analysis of the global organization of these long-range segments of identity in genomic sequence.

Large recent duplications (>= 1 kb and >= 90% identity) were detected by identifying high-copy repeats, removing these repeats from the genomic sequence ("fuguization") and searching all sequence for similarity. The repeats were then reinserted into the pairwise alignments, the ends of the alignments were trimmed, and global alignments were generated. For a full description of the "fuguization" detection method, see Bailey et al., 2001. This method has become known as WGAC (whole-genome assembly comparison); for example, see Bailey et al., 2002.
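
The repeat-removal and reinsertion idea can be illustrated with a toy sketch. This is not the authors' implementation (see Bailey et al., 2001 for the actual method). It assumes repeats are marked as lowercase bases (soft-masking), strips them, and keeps a block map so that positions found in the stripped sequence can be translated back to the original coordinates. The names `fuguize` and `to_original` are hypothetical.

```python
from bisect import bisect_right


def fuguize(seq):
    """Strip lowercase (assumed repeat-masked) bases from seq.

    Returns the stripped sequence and a list of blocks, one per retained run:
    (start in stripped sequence, start in original sequence, run length).
    """
    kept, blocks = [], []
    stripped_pos = 0
    i, n = 0, len(seq)
    while i < n:
        if seq[i].isupper():
            j = i
            while j < n and seq[j].isupper():
                j += 1
            blocks.append((stripped_pos, i, j - i))
            kept.append(seq[i:j])
            stripped_pos += j - i
            i = j
        else:
            i += 1   # skip a masked (repeat) base
    return "".join(kept), blocks


def to_original(pos, blocks):
    """Map a 0-based position in the stripped sequence back to the original."""
    starts = [b[0] for b in blocks]
    k = bisect_right(starts, pos) - 1
    if pos < 0 or k < 0:
        raise IndexError("position outside the stripped sequence")
    stripped_start, original_start, length = blocks[k]
    if pos >= stripped_start + length:
        raise IndexError("position outside the stripped sequence")
    return original_start + (pos - stripped_start)


# Toy input: two unique runs separated by an eight-base masked repeat.
masked = "ACGTacgtacgtTTGA"
stripped, blocks = fuguize(masked)
print(stripped)                 # ACGTTTGA
print(blocks)                   # [(0, 0, 4), (4, 12, 4)]
print(to_original(5, blocks))   # 13
```

Keeping the block map is what makes it possible to place alignments found in the stripped sequence back onto the full sequence, which is the role reinsertion plays in the description above.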


The data were provided by Saba Sajjadian, Arthur Ko and Evan Eichler at the University of Washington.


Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li PW, Eichler EE. Recent segmental duplications in the human genome. Science. 2002 Aug 9;297(5583):1003-7. PMID: 12169732

Bailey JA, Yavor AM, Massa HF, Trask BJ, Eichler EE. Segmental duplications: organization and impact within the current human genome project assembly. Genome Res. 2001 Jun;11(6):1005-17. PMID: 11381028; PMC: PMC311093