Description
The HapMap Project identified a set of approximately four million
common SNPs and genotyped them in four populations.
The data are intended to serve as a reference for future studies
of human disease. This track displays the genotype counts and allele
frequencies of those SNPs. The data displayed are from release 21a (HapMap
Phase II), based on dbSNP build 125. This track also provides orthologous
alleles from chimp and macaque.
The HapMap data are from these four human populations:
- Yoruba in Ibadan, Nigeria (YRI)
- Japanese in Tokyo, Japan (JPT)
- Han Chinese in Beijing, China (CHB)
- CEPH (Utah residents with ancestry from northern and western Europe) (CEU)
Each of the populations is displayed in a separate subtrack.
The CEU and YRI data consist of 90 individuals in parent-child trios. The UCSC display
removes the data for the children, leaving 60 individuals in each population.
The CHB and JPT data consist of 45 individuals. Over 12% of HapMap SNPs
are available for only a subset (1-3) of the populations. When
available, the CHB and JPT SNPs were assayed in a minimum of 18 individuals,
with over 97% of SNPs assayed in 45 or more individuals. The minimums for CEU
and YRI are 26 and 24 respectively, with over 94% of SNPs assayed in 55 or more
individuals.
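The trio filtering described above can be sketched as follows. This is a minimal illustration, not the UCSC pipeline: the record layout `(individual_id, father_id, mother_id)` and the function name `drop_children` are assumptions made for the example, and a child is assumed to be any individual with at least one listed parent.

```python
from typing import List, Optional, Tuple

# Hypothetical pedigree record: (individual_id, father_id, mother_id).
# Parents (founders) are assumed to have no listed father or mother.
Individual = Tuple[str, Optional[str], Optional[str]]


def drop_children(individuals: List[Individual]) -> List[Individual]:
    """Keep only individuals with no listed parents, i.e. drop trio children."""
    return [ind for ind in individuals if ind[1] is None and ind[2] is None]


# Build 30 synthetic trios (90 individuals) to mirror the CEU/YRI sample layout.
trios: List[Individual] = []
for i in range(30):
    father, mother, child = f"F{i}", f"M{i}", f"C{i}"
    trios.append((father, None, None))
    trios.append((mother, None, None))
    trios.append((child, father, mother))

parents_only = drop_children(trios)
```

With 30 trios as input, 60 unrelated parents remain, matching the per-population count stated above.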
The HapMap assays provide biallelic results. Over 99.9% of HapMap SNPs are
included in dbSNP build 125 as biallelic; approximately 3,000 are more complex.
Two-thirds of the HapMap SNPs are transitions: one-third are A/G, one-third
are C/T.
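As an illustration of the transition classes mentioned above, a biallelic SNP can be classified from its two alleles. This is a small sketch; the function name `is_transition` is hypothetical and not part of any HapMap or UCSC tool.

```python
# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes;
# every other pair of distinct bases is a transversion.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def is_transition(allele1: str, allele2: str) -> bool:
    """Return True if the two alleles of a biallelic SNP form a transition."""
    return frozenset((allele1.upper(), allele2.upper())) in TRANSITIONS
```

For example, `is_transition("A", "G")` is true, while `is_transition("A", "C")` is false.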
The orthologous alleles in chimp (panTro2) and macaque (rheMac2)
were derived using
liftOver.
Chimp alleles are available for over 96%
of the human HapMap SNPs; macaque alleles are available for 88%.
15% of HapMap SNPs are monomorphic in all individuals in all populations.
Within single populations, 21.5% of the SNPs are monomorphic in YRI
and 38% of the SNPs are monomorphic in JPT individuals.
Approximately 20% of HapMap SNPs have a different major allele in different
populations.
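The two properties described above (monomorphic SNPs and SNPs whose major allele differs between populations) can be checked from per-population allele counts. The sketch below assumes a hypothetical input of one `{allele: count}` dictionary per population; how the track itself resolves exact ties between alleles is not stated, so ties are treated here as "no major allele".

```python
from typing import Dict, Optional

AlleleCounts = Dict[str, int]  # hypothetical layout, e.g. {"A": 100, "G": 20}


def is_monomorphic(counts: AlleleCounts) -> bool:
    """True if at most one allele was observed in this population."""
    return sum(1 for n in counts.values() if n > 0) <= 1


def major_allele(counts: AlleleCounts) -> Optional[str]:
    """Most frequent observed allele, or None if nothing observed or tied."""
    observed = {a: n for a, n in counts.items() if n > 0}
    if not observed:
        return None
    best = max(observed.values())
    winners = [a for a, n in observed.items() if n == best]
    return winners[0] if len(winners) == 1 else None  # assumption: tie -> None


def major_allele_differs(per_population: Dict[str, AlleleCounts]) -> bool:
    """True if at least two populations have different major alleles."""
    majors = {major_allele(c) for c in per_population.values()}
    majors.discard(None)
    return len(majors) > 1


example = {
    "YRI": {"A": 100, "G": 20},
    "JPT": {"A": 90, "G": 0},
    "CEU": {"A": 40, "G": 80},
}
```

In this made-up example the SNP is monomorphic in JPT only, and the major allele is A in YRI and JPT but G in CEU.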
No two HapMap SNPs occupy the same position. Aside from seven SNPs from the
pseudoautosomal region of chrX, no rsIds are included more than once. No HapMap SNPs
occur on chrM or on "random" chromosomes.
Display Conventions and Configuration
Note: calculation of heterozygosity has changed since this version of
the track.
In this track, expected heterozygosity is calculated
as follows: the allele counts from all populations are summed
(not normalized for population size)
and used to determine overall major and minor allele frequencies.
Assuming Hardy-Weinberg equilibrium, overall expected heterozygosity
is calculated as two times the product of major and minor allele
frequencies
(see Modern Genetic Analysis, section 17-2).
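The expected heterozygosity calculation described above can be written out as a short sketch. The input layout (one `{allele: count}` dictionary per population) and the function name are assumptions for illustration; only the arithmetic (summed counts, then 2pq) comes from the text.

```python
from typing import Dict


def expected_heterozygosity(per_population: Dict[str, Dict[str, int]]) -> float:
    """Pool allele counts over all populations and return 2 * p * q."""
    pooled: Dict[str, int] = {}
    for counts in per_population.values():
        for allele, n in counts.items():
            # Counts are summed directly, not normalized for population size.
            pooled[allele] = pooled.get(allele, 0) + n

    total = sum(pooled.values())
    if total == 0:
        raise ValueError("no alleles observed")

    p = max(pooled.values()) / total  # overall major allele frequency
    q = 1.0 - p                       # overall minor allele frequency (biallelic SNP)
    return 2.0 * p * q


example = {
    "YRI": {"A": 90, "G": 30},
    "CEU": {"A": 60, "G": 60},
}
het = expected_heterozygosity(example)
```

In this made-up example the pooled counts are 150 A and 90 G, so p = 0.625, q = 0.375, and the expected heterozygosity is 0.46875. A SNP that is monomorphic everywhere gives 0.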
[In the HapMap SNPs track in the Mar. 2006 (hg18) assembly,
observed heterozygosity is calculated as follows: each population's
heterozygosity is computed as the proportion of heterozygous individuals in
the population. The population heterozygosities are averaged to determine the
overall observed heterozygosity.]
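For comparison, the observed heterozygosity described for the hg18 track can be sketched the same way. The genotype-count tuple layout and the choice to skip populations with no assayed individuals are assumptions of this example, not documented behavior.

```python
from typing import Dict, Tuple

# Hypothetical layout: (homozygous allele 1, heterozygous, homozygous allele 2).
GenotypeCounts = Tuple[int, int, int]


def observed_heterozygosity(per_population: Dict[str, GenotypeCounts]) -> float:
    """Average, over populations, of the proportion of heterozygous individuals."""
    rates = []
    for hom1, het, hom2 in per_population.values():
        n = hom1 + het + hom2
        if n > 0:  # assumption: populations without data are skipped
            rates.append(het / n)
    if not rates:
        raise ValueError("no genotyped individuals in any population")
    return sum(rates) / len(rates)


example = {"CEU": (30, 20, 10), "YRI": (45, 15, 0)}
obs = observed_heterozygosity(example)
```

Here each population contributes equally to the average regardless of its sample size, which is the main difference from the pooled 2pq calculation.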
The human SNPs are displayed in gray using a color gradient based on minor allele
frequency. The higher the minor allele frequency, the darker the display.
By definition, the maximum minor allele frequency is 50%.
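The idea of shading by minor allele frequency can be illustrated with a simple linear mapping. This is only a sketch: the browser's actual gradient steps are not described here, and the function name and the `lightest` and `darkest` gray levels are arbitrary assumptions.

```python
def gray_level(maf: float, lightest: int = 220, darkest: int = 0) -> int:
    """Map a minor allele frequency in [0, 0.5] to a gray level (0 = black).

    Assumption: a linear ramp; higher frequency gives a darker shade.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    fraction = maf / 0.5  # 0.0 for a monomorphic SNP, 1.0 at the 50% maximum
    return round(lightest + (darkest - lightest) * fraction)
```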
When zoomed to base level, the major allele is displayed for each population.
Reversing the base position track
will cause the HapMap display to reverse as
well. This is the recommended configuration for SNPs on the negative strand.
The orthologous alleles from chimp and macaque are displayed in brown using a color
gradient based on quality score.
Quality scores range from 0 to 100 representing low to high quality. For
orthologous alleles, the higher the quality, the darker the display. Quality
scores are not available for chimp chromosomes chr21 and chrY; these were set to
98, consistent with the panTro2 browser quality track.
Filters are provided for the data attributes described above. Additionally,
a filter is provided for heterozygosity over all populations. The measure of
heterozygosity used is 2pq (from Hardy-Weinberg equilibrium).
Filters are applied to all six subtracks, even if a subtrack
is not displayed.
Notes on orthologous allele filters:
- If the major allele is different between populations, no overall major
allele for human is determined, thus the "matching" filters for
orthologous alleles do not apply to these SNPs.
- If a SNP is monomorphic in all populations, the minor allele is not
verified in the HapMap dataset. In these cases, the filter to match
orthologous alleles to the minor human allele will yield no results.
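The two notes above amount to the following logic, shown as a sketch. The function names and inputs are hypothetical and do not reflect the browser's internal code: the major allele of each population is passed in as a dictionary, and the minor human allele is `None` when the SNP is monomorphic in all populations.

```python
from typing import Dict, Optional


def overall_major_allele(majors_by_population: Dict[str, str]) -> Optional[str]:
    """Return the shared major allele, or None if populations disagree."""
    distinct = set(majors_by_population.values())
    return next(iter(distinct)) if len(distinct) == 1 else None


def ortholog_matches_major(ortho_allele: str,
                           majors_by_population: Dict[str, str]) -> bool:
    """The 'matching' filter cannot apply when no overall major allele exists."""
    major = overall_major_allele(majors_by_population)
    return major is not None and ortho_allele == major


def ortholog_matches_minor(ortho_allele: str,
                           minor_allele: Optional[str]) -> bool:
    """With no verified minor allele (monomorphic SNP), nothing can match."""
    return minor_allele is not None and ortho_allele == minor_allele
```

For instance, if the major allele is A in YRI but G in CEU, `ortholog_matches_major` returns false for every orthologous allele, so such SNPs never pass a "matching" filter.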
Credits
This track is based on International HapMap Project release 21a data, provided by the HapMap Data Coordination Center.
References
HapMap Project
The International HapMap Consortium.
A second generation human haplotype map of over 3.1 million SNPs.
Nature. 2007 Oct 18;449(7164):851-61.
The International HapMap Consortium.
A haplotype map of the human genome.
Nature. 2005 Oct 27;437(7063):1299-320.
The International HapMap Consortium.
The International HapMap Project.
Nature. 2003 Dec 18;426(6968):789-96.
HapMap Data Coordination Center
Thorisson GA, Smith AV, Krishnan L, Stein LD.
The International HapMap Project Web site.
Genome Res. 2005 Nov;15(11):1592-3.
A Sampling of HapMap Literature
Gibson J, Morton NE, Collins A.
Extended tracts of homozygosity in outbred human populations.
Hum Mol Genet. 2006 Mar 1;15(5):789-95.
Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero
MH, Carson AR, Chen W et al.
Global variation in copy number in the human genome.
Nature. 2006 Nov 23;444(7118):444-454.
Spielman RS, Bastone LA, Burdick JT, Morley M, Ewens WJ, Cheung VG.
Common genetic variants account for differences in gene expression among ethnic groups.
Nat Genet. 2007 Feb;39(2):226-31.
Tenesa A, Navarro P, Hayes BJ, Duffy DL, Clarke GM, Goddard ME, Visscher PM.
Recent human effective population size estimated from linkage disequilibrium.
Genome Res. 2007 Apr;17(4):520-6.
Voight BF, Kudaravalli S, Wen X, Pritchard JK.
A Map of Recent Positive Selection in the Human Genome.
PLoS Biol. 2006 Mar;4(3):e72.
Weir BS, Cardon LR, Anderson AD, Nielsen DM, Hill WG.
Measures of human population structure show heterogeneity among genomic regions.
Genome Res. 2005 Nov;15(11):1468-76.
Data Source
The sources for this track are the genotypes_chr*_*_r21a_nr.txt.gz files from
http://www.hapmap.org/downloads/genotypes/2007-01/rs_strand/non-redundant.
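When working with these files, the file name pattern can be parsed with a regular expression. The sketch below assumes that the first wildcard is the chromosome and the second is the population code; that interpretation, the example file name, and the function name are assumptions rather than something stated by the source.

```python
import re
from typing import Optional, Tuple

# Mirrors the glob genotypes_chr*_*_r21a_nr.txt.gz.
# Assumption: first wildcard = chromosome, second wildcard = population code.
_GENOTYPE_FILE = re.compile(r"genotypes_(chr[^_]+)_([^_]+)_r21a_nr\.txt\.gz")


def parse_genotype_filename(name: str) -> Optional[Tuple[str, str]]:
    """Return (chromosome, population) for a matching file name, else None."""
    match = _GENOTYPE_FILE.fullmatch(name)
    return (match.group(1), match.group(2)) if match else None


# Example file name constructed to fit the pattern above.
parsed = parse_genotype_filename("genotypes_chr1_CEU_r21a_nr.txt.gz")
```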