Schema for 1000G Ph1 Vars - 1000 Genomes Phase 1 Integrated Variant Calls: SNVs, Indels, SVs
  Database: hg19    Primary Table: tgpPhase1
VCF File: /gbdb/hg19/1000Genomes/ALL.chr1.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz
Format description: The fields of a Variant Call Format data line
See the Variant Call Format specification for more details
field      description
chrom      An identifier from the reference genome
pos        The reference position, with the first base having position 1
id         Semicolon-separated list of unique identifiers where available
ref        Reference base(s)
alt        Comma-separated list of alternate non-reference alleles called on at least one of the samples
qual       Phred-scaled quality score for the assertion made in ALT, i.e. -10 * log10 of the probability that the call in ALT is wrong
filter     PASS if this position has passed all filters; otherwise, a semicolon-separated list of codes for the filters that failed
info       Additional information encoded as a semicolon-separated series of short keys with optional comma-separated values
format     If genotype columns are specified in the header, a colon-separated list of short keys starting with GT
genotypes  If genotype columns are specified in the header, a tab-separated set of genotype column values; each value is a colon-separated list of values corresponding to the keys in the format column
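
The field list above maps directly onto the tab-separated columns of a VCF data line. The following is a minimal sketch of that mapping in Python, not the browser's own parser. The function names `parse_vcf_line` and `phred_to_error_probability` are hypothetical, and the example line is an abridged version of the first sample row (only three INFO keys and two genotype columns are kept).

```python
VCF_FIXED_COLUMNS = ["chrom", "pos", "id", "ref", "alt",
                     "qual", "filter", "info", "format"]


def parse_vcf_line(line):
    """Split one tab-separated VCF data line into the fields listed above."""
    cols = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_FIXED_COLUMNS, cols[:9]))
    record["pos"] = int(record["pos"])            # 1-based reference position
    record["id"] = record["id"].split(";")        # semicolon-separated identifiers
    record["alt"] = record["alt"].split(",")      # comma-separated alternate alleles
    record["qual"] = None if record["qual"] == "." else float(record["qual"])
    record["filter"] = record["filter"].split(";")
    info = {}
    for entry in record["info"].split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        # Keys without "=" are flags; other values may be comma-separated lists.
        info[key] = value.split(",") if sep else True
    record["info"] = info
    # The format and genotype columns are present only if the header declares samples.
    record["format"] = record["format"].split(":") if "format" in record else []
    record["genotypes"] = cols[9:]
    return record


def phred_to_error_probability(qual):
    """Invert qual = -10 * log10(p) to recover p, the probability the call is wrong."""
    return 10 ** (-qual / 10)


# Abridged from the first sample row (assumption: tabs restored between columns).
example = ("1\t10583\trs58108140\tG\tA\t100\tPASS\tAN=2184;VT=SNP;AC=314\t"
           "GT:DS:GL\t0|0:0.200:-0.18,-0.47,-2.42\t0|1:0.600:-0.48,-0.48,-0.48")
record = parse_vcf_line(example)
print(record["chrom"], record["pos"], record["ref"], record["alt"], record["info"]["AC"])
```

INFO values are left as strings here because their types are declared in the VCF header, which this page does not show.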

Sample Rows
 
chrom  pos    id           ref  alt   qual  filter  info  format  genotypes
1      10583  rs58108140   G    A     100   PASS    AVGPOST=0.7707;RSQ=0.4319;LDAF=0.2327;ERATE=0.0161;AN=2184;VT=SNP;AA=.;THETA=0.0046;AC=314;SNPSOURCE=LOWCOV;AF=0.14;ASN_AF=0.13; ...  GT:DS:GL  0|0:0.200:-0.18,-0.47,-2.42 0|0:0.150:-0.24,-0.44,-1.16 0|0:0.150:-0.15,-0.54,-3.12 0|1:0.600:-0.48,-0.48,-0.48 0|0:0.550:-0.48,-0.48,-0.48 0|1:0.950:-1.92,-0.01,-2.50 0|0:0.050:-0.05,-0.93,-5.00 0|0:0.100:-0.11,-0.66,-4.22 0|1:0.550:-0.28,-0.43,-0.96 0|0:0.450:-0.48,-0.48,-0.48...
1      10611  rs189107123  C    G     100   PASS    AN=2184;THETA=0.0077;VT=SNP;AA=.;AC=41;ERATE=0.0048;SNPSOURCE=LOWCOV;AVGPOST=0.9330;LDAF=0.0479;RSQ=0.3475;AF=0.02;ASN_AF=0.01;A ...  GT:DS:GL  0|0:0.050:-0.48,-0.48,-0.48 0|1:0.750:-0.24,-0.44,-1.16 0|0:0.000:-0.22,-0.42,-1.80 0|0:0.000:-0.48,-0.48,-0.48 0|0:0.150:-0.48,-0.48,-0.48 0|0:0.200:-0.48,-0.48,-0.48 0|0:0.050:-0.18,-0.47,-2.46 0|0:0.000:-0.15,-0.55,-3.19 0|0:0.050:-0.28,-0.43,-0.95 0|0:0.000:-0.48,-0.48,-0.48...
1      13302  rs180734498  C    T     100   PASS    THETA=0.0048;AN=2184;AC=249;VT=SNP;AA=.;RSQ=0.6281;LDAF=0.1573;SNPSOURCE=LOWCOV;AVGPOST=0.8895;ERATE=0.0058;AF=0.11;ASN_AF=0.02; ...  GT:DS:GL  0|0:0.050:-0.13,-0.58,-3.62 0|1:1.000:-2.45,-0.00,-5.00 0|0:0.400:-0.29,-0.32,-5.00 0|0:0.250:-0.48,-0.48,-0.48 0|0:0.500:-0.48,-0.48,-0.48 1|0:1.000:-2.73,-0.00,-3.85 0|0:0.050:-0.06,-0.92,-5.00 0|1:1.000:-4.40,-0.00,-5.00 0|0:0.000:-0.03,-1.21,-5.00 0|0:0.000:-0.01,-1.57,-5.00...
1      13327  rs144762171  G    C     100   PASS    AVGPOST=0.9698;AN=2184;VT=SNP;AA=.;RSQ=0.6482;AC=59;SNPSOURCE=LOWCOV;ERATE=0.0012;LDAF=0.0359;THETA=0.0204;AF=0.03;ASN_AF=0.02;A ...  GT:DS:GL  0|0:0.000:-0.03,-1.11,-5.00 0|1:0.950:-1.97,-0.01,-2.51 0|0:0.000:-0.01,-1.69,-5.00 0|0:0.050:-0.48,-0.48,-0.48 0|0:0.100:-0.48,-0.48,-0.48 1|0:0.900:-1.38,-0.02,-5.00 0|0:0.000:-0.02,-1.38,-5.00 0|0:0.000:-0.03,-1.12,-5.00 0|0:0.000:-0.02,-1.42,-5.00 0|0:0.000:-0.01,-1.65,-5.00...
1      13957  rs201747181  TC   T     28    PASS    AA=TC;AC=35;AF=0.02;AFR_AF=0.02;AMR_AF=0.02;AN=2184;ASN_AF=0.01;AVGPOST=0.8711;ERATE=0.0065;EUR_AF=0.02;LDAF=0.0788;RSQ=0.2501;T ...  GT:DS:GL  0|0:0.050:0,0,0 0|1:0.650:0,0,0 0|0:0.100:0,0,0 0|0:0.350:0,0,0 0|0:0.050:0.00,-0.30,-4.10 0|0:0.650:0,0,0 0|0:0.150:0,0,0 0|0:0.150:0.00,-0.30,-4.10 0|0:0.100:0,0,0 0|0:0.300:0,0,0...
1      13980  rs151276478  T    C     100   PASS    AN=2184;AC=45;ERATE=0.0034;THETA=0.0139;RSQ=0.3603;LDAF=0.0525;VT=SNP;AA=.;AVGPOST=0.9221;SNPSOURCE=LOWCOV;AF=0.02;ASN_AF=0.02;A ...  GT:DS:GL  0|0:0.050:-0.48,-0.48,-0.48 0|0:0.600:-0.48,-0.48,-0.48 0|0:0.000:-0.48,-0.48,-0.48 0|0:0.150:-0.48,-0.48,-0.48 0|0:0.100:-0.48,-0.48,-0.48 0|0:0.500:-0.48,-0.48,-0.48 0|0:0.050:-0.48,-0.48,-0.48 0|0:0.050:-0.48,-0.48,-0.48 0|0:0.000:-0.21,-0.46,-1.46 0|0:0.100:-0.48,-0.48,-0.48...
1      30923  rs140337953  G    T     100   PASS    AC=1584;AA=T;AN=2184;RSQ=0.5481;VT=SNP;THETA=0.0162;SNPSOURCE=LOWCOV;ERATE=0.0183;LDAF=0.6576;AVGPOST=0.7335;AF=0.73;ASN_AF=0.89 ...  GT:DS:GL  1|1:1.750:-5.00,-0.61,-0.12 0|0:0.350:-0.10,-0.69,-2.81 0|0:0.150:-0.11,-0.64,-3.49 1|1:1.350:-0.48,-0.48,-0.48 1|0:1.100:-0.48,-0.48,-0.48 0|0:0.600:-0.22,-0.42,-1.74 1|1:1.750:-0.48,-0.48,-0.48 1|0:0.800:-0.12,-0.64,-2.62 1|1:1.550:-2.71,-0.35,-0.26 1|1:1.550:-0.48,-0.48,-0.48...
1      46402  rs199681827  C    CTGT  31    PASS    AA=.;AC=8;AF=0.0037;AFR_AF=0.01;AN=2184;ASN_AF=0.0017;AVGPOST=0.8325;ERATE=0.0072;LDAF=0.0903;RSQ=0.0960;THETA=0.0121;VT=INDEL  GT:DS:GL  0|0:0.050:0,0,0 0|0:0.150:0,0,0 0|0:0.350:0,0,0 0|0:0.400:0,0,0 0|0:0.150:0,0,0 0|0:0.200:0,0,0 0|0:0.100:0,0,0 0|0:0.100:0,0,0 0|0:0.050:0.00,-0.30,-5.20 0|0:0.150:0,0,0...
1      47190  rs200430748  G    GA    192   PASS    AA=G;AC=29;AF=0.01;AFR_AF=0.06;AMR_AF=0.0028;AN=2184;AVGPOST=0.9041;ERATE=0.0041;LDAF=0.0628;RSQ=0.2883;THETA=0.0153;VT=INDEL  GT:DS:GL  0|0:0.150:0,0,0 0|0:0.000:0,0,0 0|0:0.150:0.00,-0.30,-3.60 0|0:0.150:0,0,0 0|0:0.000:0,0,0 0|0:0.050:0,0,0 0|0:0.000:0,0,0 0|0:0.000:0,0,0 0|0:0.050:0,0,0 0|0:0.050:0,0,0...
1      51476  rs187298206  T    C     100   PASS    ERATE=0.0021;AA=C;AC=18;AN=2184;VT=SNP;THETA=0.0103;LDAF=0.0157;SNPSOURCE=LOWCOV;AVGPOST=0.9819;RSQ=0.5258;AF=0.01;ASN_AF=0.01;A ...  GT:DS:GL  0|0:0.000:-0.05,-0.93,-5.00 0|0:0.400:-0.23,-0.45,-1.27 0|0:0.000:-0.01,-1.69,-5.00 0|0:0.000:-0.48,-0.48,-0.48 0|0:0.000:-0.59,-0.13,-5.00 0|0:0.150:-0.48,-0.48,-0.48 0|0:0.000:-0.10,-0.68,-4.70 0|0:0.050:-0.15,-0.54,-2.68 0|0:0.000:-0.10,-0.68,-4.40 0|0:0.000:-0.01,-1.47,-5.00...
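
Each per-sample value in the genotypes column pairs up, position by position, with the keys in the format column (GT:DS:GL in these rows). The sketch below decodes one such value. The function name `decode_sample` is hypothetical. Reading DS as a genotype dosage and GL as log10-scaled genotype likelihoods is an assumption based on how the 1000 Genomes Phase 1 VCF headers describe those keys, since the header itself is not shown on this page.

```python
def decode_sample(format_keys, sample_value):
    """Pair colon-separated format keys with one sample's colon-separated values."""
    fields = dict(zip(format_keys.split(":"), sample_value.split(":")))
    decoded = {}
    if "GT" in fields:
        gt = fields["GT"]
        decoded["phased"] = "|" in gt  # "|" separates phased alleles, "/" unphased
        decoded["alleles"] = [
            None if allele == "." else int(allele)  # "." marks a missing allele
            for allele in gt.replace("/", "|").split("|")
        ]
    if "DS" in fields:
        # Assumed meaning: expected count of non-reference alleles.
        decoded["dosage"] = float(fields["DS"])
    if "GL" in fields:
        # Assumed meaning: log10 likelihoods, one per possible genotype.
        decoded["log10_likelihoods"] = [float(x) for x in fields["GL"].split(",")]
    return decoded


# Fourth sample of the first row (position 10583).
sample = decode_sample("GT:DS:GL", "0|1:0.600:-0.48,-0.48,-0.48")
print(sample["alleles"], sample["phased"], sample["dosage"])
```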

1000G Ph1 Vars (tgpPhase1) Track Description
 

Description

This track shows ~38,200,000 single nucleotide variants (SNVs), ~3,900,000 short insertion/deletion variants (indels), and ~14,000 large deletions (also called structural variants, or SVs) discovered by the 1000 Genomes Project through its Phase 1 sequencing of 1,092 genomes from 14 populations in Africa, Europe, East Asia and the Americas.

The variant genotypes have been phased by the 1000 Genomes Project (i.e., the two alleles of each diploid genotype have been assigned to two haplotypes, one inherited from each parent). This extra information enables a clustering of independent haplotypes by local similarity for display.
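
As an illustration of what phasing makes possible, the sketch below splits phased diploid genotypes into two haplotypes per sample. It is a simplified example rather than the browser's code. The function name `split_into_haplotypes` is hypothetical, and it assumes that every genotype is diploid and phased (written with "|").

```python
def split_into_haplotypes(genotypes_by_variant):
    """Turn per-variant lists of phased GT strings into per-haplotype allele lists.

    genotypes_by_variant[v][s] is the GT string of sample s at variant v.
    The result holds two haplotypes per sample (indices 2*s and 2*s + 1),
    each a list with one allele index per variant.
    """
    n_samples = len(genotypes_by_variant[0])
    haplotypes = [[] for _ in range(2 * n_samples)]
    for variant_gts in genotypes_by_variant:
        for s, gt in enumerate(variant_gts):
            if "|" not in gt:
                raise ValueError(f"genotype {gt!r} is not phased")
            first, second = gt.split("|")
            haplotypes[2 * s].append(int(first))
            haplotypes[2 * s + 1].append(int(second))
    return haplotypes


# GT values of the first four samples at positions 10583, 10611 and 13302.
gts = [
    ["0|0", "0|0", "0|0", "0|1"],
    ["0|0", "0|1", "0|0", "0|0"],
    ["0|0", "0|1", "0|0", "0|0"],
]
for hap in split_into_haplotypes(gts):
    print(hap)
```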

Display Conventions

In "dense" mode, a vertical line is drawn at the position of each variant. In "pack" mode, since these variants have been phased, the display shows a clustering of haplotypes in the viewed range, sorted by similarity of alleles weighted by proximity to a central variant. The clustering view can highlight local patterns of linkage.
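
One way to picture "similarity of alleles weighted by proximity to a central variant" is a Hamming distance in which each mismatch is scaled by a weight that shrinks with distance from the central variant. The sketch below is only an illustration of that idea. The weighting formula and the function names `proximity_weights` and `weighted_distance` are assumptions, not the browser's actual clustering code.

```python
def proximity_weights(n_variants, center_index):
    """Hypothetical weighting: 1 at the central variant, decaying with index distance."""
    return [1.0 / (1 + abs(i - center_index)) for i in range(n_variants)]


def weighted_distance(hap_a, hap_b, weights):
    """Sum the weights of the positions where two haplotypes carry different alleles."""
    return sum(w for a, b, w in zip(hap_a, hap_b, weights) if a != b)


weights = proximity_weights(5, center_index=2)
reference = [0, 0, 0, 0, 0]
differs_at_center = [0, 0, 1, 0, 0]
differs_at_edge = [1, 0, 0, 0, 0]

# A mismatch at the central variant separates haplotypes more than one at the edge.
print(weighted_distance(reference, differs_at_center, weights))
print(weighted_distance(reference, differs_at_edge, weights))
```

A distance of this kind could then be fed to any hierarchical clustering routine to order the haplotypes.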

In the clustering display, each sample's phased diploid genotype is split into two independent haplotypes. Each haplotype is placed in a horizontal row of pixels; when the number of haplotypes exceeds the number of vertical pixels for the track, multiple haplotypes fall in the same pixel row and pixels are averaged across haplotypes.
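
The averaging step can be sketched as follows: when there are more haplotypes than pixel rows, consecutive haplotypes are binned into the same row, and each pixel takes the mean of the alleles in its bin. This is a minimal sketch under stated assumptions. The function name `average_into_pixel_rows` and the even binning rule are hypothetical, and every non-reference allele is treated as 1.

```python
def average_into_pixel_rows(haplotypes, n_pixel_rows):
    """Return one row of non-reference fractions (0.0 to 1.0) per pixel row."""
    n_haps = len(haplotypes)
    if n_haps <= n_pixel_rows:
        # Enough vertical room: each haplotype keeps its own row.
        return [[1.0 if allele else 0.0 for allele in hap] for hap in haplotypes]
    n_variants = len(haplotypes[0])
    sums = [[0.0] * n_variants for _ in range(n_pixel_rows)]
    counts = [0] * n_pixel_rows
    for i, hap in enumerate(haplotypes):
        row = i * n_pixel_rows // n_haps  # even binning of haplotypes into rows
        counts[row] += 1
        for j, allele in enumerate(hap):
            sums[row][j] += 1.0 if allele else 0.0
    return [[total / counts[r] for total in sums[r]] for r in range(n_pixel_rows)]


# Four haplotypes squeezed into two pixel rows.
rows = average_into_pixel_rows([[0, 1], [1, 1], [0, 0], [0, 0]], n_pixel_rows=2)
print(rows)
```

With the white-for-reference, black-for-non-reference convention, a value between 0 and 1 would correspond to a shade of gray.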

Each variant is a vertical bar with white (invisible) representing the reference allele and black representing the non-reference allele(s). Tick marks are drawn at the top and bottom of each variant's vertical bar to make the bar more visible when most alleles are reference alleles. The vertical bar for the central variant used in clustering is outlined in purple. In order to avoid long compute times, the range of alleles used in clustering may be limited; alleles used in clustering have purple tick marks at the top and bottom.

The clustering tree is displayed to the left of the main image. It does not represent relatedness of individuals; it simply shows the arrangement of local haplotypes by similarity. When a rightmost branch is purple, it means that all haplotypes in that branch are identical, at least within the range of variants used in clustering.
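
The condition for a purple branch, that all of its haplotypes are identical within the range of variants used in clustering, amounts to grouping haplotypes by their alleles over that range. The sketch below shows such a grouping. The function name `group_identical_haplotypes` and the half-open `start`/`end` index range are hypothetical.

```python
from collections import defaultdict


def group_identical_haplotypes(haplotypes, start, end):
    """Group haplotype indices whose alleles match at variant indices start..end-1."""
    groups = defaultdict(list)
    for index, hap in enumerate(haplotypes):
        groups[tuple(hap[start:end])].append(index)
    return list(groups.values())


haplotypes = [
    [0, 0, 1],
    [0, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
]
# Only the first two variants are used, so haplotypes 0, 1 and 2 fall in one group
# even though haplotype 1 differs from the others at the third variant.
print(group_identical_haplotypes(haplotypes, start=0, end=2))
```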

Methods

Single-nucleotide variants, short insertions/deletions, and larger deletions were called from alignments of 1,092 individuals' low-coverage genomes and high-coverage exomes. For each type of variant, the results of multiple variant-calling methods were merged and filtered in order to provide high-confidence variant calls. For more details, see the Phase 1 publication listed under References.

Credits

Thanks to the 1000 Genomes Project for making these data available in advance of publication.

References

1000 Genomes Pilot Project:
1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010 Oct 28;467(7319):1061-73.

Phase 1 of the 1000 Genomes Project:
1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012 Nov 1;491(7422):56-65.

1000 Genomes Frequently Asked Questions (FAQ)