Schema for HGSV Discordant - HGSV Discordant Clone End Alignments
  Database: hg17    Primary Table: kiddEichlerDiscAbc12    Row Count: 21,908   Data last updated: 2008-06-10
Format description: Browser extensible data
On download server: MariaDB table dump directory
field        example                         SQL type              info    description
bin          585                             smallint(5) unsigned  range   Indexing field to speed chromosome range queries.
chrom        chr1                            varchar(255)          values  Reference sequence chromosome or scaffold
chromStart   0                               int(10) unsigned      range   Start position in chromosome
chromEnd     416                             int(10) unsigned      range   End position in chromosome
name         ABC12_49070700_I14,transchr...  varchar(255)          values  Name of item
score        2446                            int(10) unsigned      range   Optional score, nominal range 0-1000
strand       +                               char(1)               values  + or -
thickStart   0                               int(10) unsigned      range   Start of where display should be thick (start codon)
thickEnd     416                             int(10) unsigned      range   End of where display should be thick (stop codon)
itemRgb      0                               int(10) unsigned      range   Used as itemRgb as of 2004-11-22
blockCount   1                               int(10) unsigned      range   Number of blocks
blockSizes   416,                            longblob                      Comma separated list of block sizes
chromStarts  0,                              longblob                      Start positions relative to chromStart
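
The bin values can be reproduced from chromStart and chromEnd. The following is a minimal Python sketch, assuming this table uses the standard UCSC binning scheme (five levels, with 128 kb bins at the finest level and each coarser level 8 times larger). The function name bin_from_range is illustrative and not part of any UCSC library.

```python
# Offsets of the first bin at each level, finest level first (standard scheme).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # finest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each level up spans 8 times more


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the 0-based, half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # Range crosses a bin boundary at this level, so try the next coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the standard binning scheme")


print(bin_from_range(0, 416))
```

For the schema example (chromStart 0, chromEnd 416) this yields 585, the value shown in the bin field.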

Connected Tables and Joining Fields
      hg17.kiddEichlerDiscAbc10.name (via kiddEichlerDiscAbc12.name)
      hg17.kiddEichlerDiscAbc11.name (via kiddEichlerDiscAbc12.name)
      hg17.kiddEichlerDiscAbc13.name (via kiddEichlerDiscAbc12.name)
      hg17.kiddEichlerDiscAbc14.name (via kiddEichlerDiscAbc12.name)
      hg17.kiddEichlerDiscAbc7.name (via kiddEichlerDiscAbc12.name)
      hg17.kiddEichlerDiscAbc8.name (via kiddEichlerDiscAbc12.name)
      hg17.kiddEichlerDiscAbc9.name (via kiddEichlerDiscAbc12.name)
      hg17.kiddEichlerDiscG248.name (via kiddEichlerDiscAbc12.name)
      hg17.kiddEichlerToNcbi.name (via kiddEichlerDiscAbc12.name)

Sample Rows
 
bin  chrom  chromStart  chromEnd  name                                       score  strand  thickStart  thickEnd  itemRgb  blockCount  blockSizes  chromStarts
585  chr1   0           416       ABC12_49070700_I14,transchrm_chr5          2446   +       0           416       0,0,0    1           416,        0,
585  chr1   79301       79987     ABC12_46964400_N11,transchrm_chr2          3008   +       79301       79987     0,0,0    1           686,        0,
586  chr1   155025      155889    ABC12_49237900_H2,transchrm_chr16          3161   -       155025      155889    0,0,0    1           864,        0,
586  chr1   225902      226739    ABC12_49222200_K5,transchrm_chr16          3174   +       225902      226739    0,0,0    1           837,        0,
586  chr1   227121      227892    ABC12_49138200_O17,transchrm_chr7          3046   +       227121      227892    0,0,0    1           771,        0,
586  chr1   233304      233989    ABC12_49072300_P5,transchrm_chr7           3126   +       233304      233989    0,0,0    1           685,        0,
586  chr1   233919      234707    ABC12_49228600_F19,transchrm_chr17_random  3068   +       233919      234707    0,0,0    1           788,        0,
586  chr1   236948      237681    ABC12_46887000_O3,transchrm_chr7           2912   +       236948      237681    0,0,0    1           733,        0,
586  chr1   239740      240092    ABC12_49215100_I23,transchrm_chr10         1330   -       239740      240092    0,0,0    1           352,        0,
589  chr1   566714      601690    ABC12_46413200_I24,insertion               1968   +       566714      601690    0,0,224  2           359,626,    0,34350,
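
The blockSizes and chromStarts fields hold comma-terminated lists, with chromStarts given relative to chromStart. As a minimal sketch of how a row's blocks translate to chromosome coordinates (the function name block_intervals is hypothetical, chosen here for illustration):

```python
def block_intervals(chrom_start: int, block_sizes: str, chrom_starts: str) -> list:
    """Return (start, end) chromosome coordinates for each block of one row.

    block_sizes and chrom_starts are the raw comma-separated strings from the
    table, including the trailing comma.
    """
    sizes = [int(x) for x in block_sizes.split(",") if x]
    offsets = [int(x) for x in chrom_starts.split(",") if x]
    if len(sizes) != len(offsets):
        raise ValueError("blockSizes and chromStarts have different lengths")
    return [(chrom_start + off, chrom_start + off + size)
            for off, size in zip(offsets, sizes)]


# Last sample row: the insertion clone with two blocks.
print(block_intervals(566714, "359,626,", "0,34350,"))
# [(566714, 567073), (601064, 601690)]
```

In the two-block sample row, the second block ends at 601690, which equals chromEnd, as expected for a well-formed row.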

Note: all start coordinates in our database are 0-based, not 1-based.
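
To illustrate the 0-based convention, the sketch below converts a row's chromStart and chromEnd into a 1-based position string. It assumes the usual half-open interpretation, in which chromEnd is not part of the 0-based range, so only the start needs adjusting. The helper name is hypothetical.

```python
def to_one_based_position(chrom: str, chrom_start: int, chrom_end: int) -> str:
    """Format a 0-based, half-open table range as a 1-based, closed position string."""
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


# First sample row: chromStart 0, chromEnd 416.
print(to_one_based_position("chr1", 0, 416))  # chr1:1-416
```

Under this convention the item length is simply chromEnd minus chromStart (416 bases for the first sample row).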

HGSV Discordant (kiddEichlerDisc) Track Description
 

Description

This track shows data from the Human Genome Structural Variation Project. Clone ends from nine individuals from Kidd, et al. were mapped to the human reference genome. This track shows clones whose end mappings were discordant with the reference genome in one of the following ways:

  • deletion: Clone mapping too large relative to reference
  • insertion: Clone mapping too small relative to reference
  • inversion: Inappropriate orientation, clone mapping spans potential inversion breakpoint
  • OEA: One End Anchored clones (only one end could be mapped to reference)
  • transchrm: Clone ends map to different chromosomes (name indicates identity of other chromosome after the underscore).
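
The category is carried in the name field. A minimal sketch of splitting a name into its parts follows, assuming the layout seen in the sample rows: a clone ID, a comma, a category label, and for transchrm items an underscore followed by the other chromosome. Only transchrm and insertion names appear in the sample rows, so the handling of the other categories is an assumption, and the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class DiscordantName(NamedTuple):
    clone: str                  # e.g. "ABC12_49237900_H2"
    category: str               # e.g. "transchrm" or "insertion"
    other_chrom: Optional[str]  # set only when the label has a chromosome suffix


def parse_name(name: str) -> DiscordantName:
    """Split a name such as 'ABC12_49237900_H2,transchrm_chr16' into its parts."""
    clone, _, label = name.partition(",")
    # Split on the first underscore only, so 'chr17_random' stays intact.
    category, _, other = label.partition("_")
    return DiscordantName(clone, category, other or None)


print(parse_name("ABC12_49228600_F19,transchrm_chr17_random"))
print(parse_name("ABC12_46413200_I24,insertion"))
```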

Each individual's discordant clone end mappings are in a different subtrack. The nine individuals' labels used in Kidd, et al., populations of origin, and Coriell Cell Repository catalog IDs are shown here:

Individual  Population  Coriell ID
ABC14       CEPH        NA12156
ABC13       Yoruba      NA19129
ABC12       CEPH        NA12878
ABC11       China       NA18555
ABC10       Yoruba      NA19240
ABC9        Japan       NA18956
ABC8        Yoruba      NA18507
ABC7        Yoruba      NA18517
G248        Unknown     NA15510

Methods

Excerpted from Kidd, et al.:

We selected eight individuals as part of the first phase of the Human Genome Structural Variation Project. This included four individuals of Yoruba Nigerian ethnicity and four individuals of non-African ethnicity. For each individual we constructed a whole genomic library of about 1 million clones, using a fosmid subcloning strategy. Each library was arrayed and both ends of each clone insert were sequenced to generate a pair of high-quality end sequences (termed an end-sequence pair (ESP)). The overall approach generated a physical clone map for each individual human genome, flagging regions discrepant by size or orientation on the basis of the placement of end sequences against the reference assembly. Across all eight libraries, we mapped 6.1 million clones to distinct locations against the reference sequence (http://hgsv.washington.edu). Of these, 76,767 were discordant by length and/or orientation, indicating potential sites of structural variation. About 0.4% (23,742) of the ESPs mapped with only one end to the reference assembly despite the presence of high-quality sequence at the other end (termed one-end anchored (OEA) clones).

Note: This track contains many more than the 76,767 + 23,742 items mentioned above because it also includes clones whose ends map to different chromosomes (transchrm).

References

Kidd JM, Cooper GM, Donahue WF, Hayden HS, Sampas N, Graves T, Hansen N, Teague B, Alkan C, Antonacci F, et al. Mapping and sequencing of structural variation from eight human genomes. Nature. 2008 May 1;453(7191):56-64.