SNPs Track Settings
 
Simple Nucleotide Polymorphisms (SNPs)   (All Variation and Repeats tracks)


Assembly: Human July 2003 (NCBI34/hg16)
Data last updated at UCSC: 2005-03-07

Description

This track consolidates all Simple Nucleotide Polymorphisms (SNPs) into a single track. It combines data from dbSnp with SNPs from commercially available genotyping arrays.

Please be aware that some mapping inconsistencies are known to exist in the dbSnp data set. If you encounter information that seems incorrect on the details page for a variant, we advise you to verify the record information on the dbSnp website using the provided link. In some known instances, the size of the variant does not match the size of its genomic location; UCSC is working with dbSnp to correct these errors in the data set.

Interpreting and Configuring the Graphical Display

Variants are shown as single tick marks at most zoom levels. When viewing the track at or near base-level resolution, the displayed width of the SNP corresponds to the width of the variant in the reference sequence. Insertions are indicated by a single tick mark displayed between two nucleotides, single nucleotide polymorphisms are displayed with the width of a single base, and multiple nucleotide variants are represented by a block that spans two or more bases.

When the start coordinate for a SNP is shown as chromStart = chromEnd+1 on the SNP's details page, this is generally not an error; rather, it indicates that the variant is an insertion at this genomic position. In these instances, the location type will be set to "between". Note that insertions are represented as chromStart = chromEnd in the snp table accessible from the Table Browser or downloads server, due to the half-open zero-based representation of data in the underlying database.
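The relationship between the two coordinate conventions can be sketched in a few lines of Python. This is an illustration only, not UCSC code: the function names are hypothetical, and inferring the location type from the span length is an assumption based on the definitions given on this page (records with known mapping inconsistencies may not follow it).

```python
from typing import Tuple


def to_details_page(chrom_start: int, chrom_end: int) -> Tuple[int, int]:
    """Convert zero-based, half-open table coordinates to the one-based
    start and end shown on a details page."""
    return chrom_start + 1, chrom_end


def is_insertion(chrom_start: int, chrom_end: int) -> bool:
    """In the table form, an insertion covers zero reference bases."""
    return chrom_start == chrom_end


def location_type(chrom_start: int, chrom_end: int) -> str:
    """Guess the location type from the number of reference bases covered.
    Assumption: 0 bases -> between, 1 base -> exact, 2+ bases -> range."""
    span = chrom_end - chrom_start
    if span == 0:
        return "between"
    if span == 1:
        return "exact"
    if span >= 2:
        return "range"
    return "unknown"  # negative span: undefined or error


# An insertion stored as chromStart = chromEnd = 100 in the table
# is displayed with start = end + 1 on the details page.
start, end = to_details_page(100, 100)
print(start, end, location_type(100, 100))
```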

The colors of variants in the display may be changed to highlight their source, molecule type, variant class, validation status, or functional classification. Variants can be excluded from the display based on these same criteria or if they fall below the user-specified minimum average heterozygosity. The track configuration options are located at the top of the SNPs track description page. By default, variants are colored by functional classification, with SNPs likely to cause a phenotype (non-synonymous and splice site mutations) shown in red.
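The exclusion logic described above could be approximated as follows when working with downloaded records outside the browser. This is a minimal sketch, not the browser's implementation: the record keys ("avHet", "source", "func") are assumed names, and treating a variant as excluded when any one of its values in a category is excluded is also an assumption.

```python
from typing import Dict, Set


def is_displayed(variant: dict,
                 excluded: Dict[str, Set[str]],
                 min_avg_het: float = 0.0) -> bool:
    """Return True if the variant passes the heterozygosity threshold
    and has no value in any excluded category."""
    if variant.get("avHet", 0.0) < min_avg_het:
        return False
    for category, banned in excluded.items():
        values = variant.get(category, [])
        if isinstance(values, str):
            values = [values]  # single-valued categories
        if any(value in banned for value in values):
            return False
    return True


# Hypothetical records; a variant may carry several functional classes.
variants = [
    {"name": "rsA", "avHet": 0.30, "source": "dbSnp", "func": ["intron"]},
    {"name": "rsB", "avHet": 0.05, "source": "dbSnp", "func": ["coding-nonsynon"]},
    {"name": "rsC", "avHet": 0.40, "source": "dbSnp", "func": ["unknown"]},
]
excluded = {"func": {"unknown"}}
kept = [v["name"] for v in variants if is_displayed(v, excluded, min_avg_het=0.1)]
print(kept)
```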

The configuration categories below follow the definitions in the document type definition (DTD) that describes the dbSnp XML format.

  • Source: Origin of this data
    • dbSnp - From the current build of dbSnp
    • Affymetrix Genotyping Array 10K - SNPs on the commercial array
    • Affymetrix Genotyping Array 10K v2 - SNPs on the commercial array
    • Affymetrix Genotyping Array 50K HindIII - SNPs on the commercial array
    • Affymetrix Genotyping Array 50K XbaI - SNPs on the commercial array
  • Molecule Type: Sample used to find this variant
    • Unknown - sample type not known
    • Genomic - variant discovered using a genomic template
    • cDNA - variant discovered using a cDNA template
    • Mitochondrial - variant discovered using a mitochondrial template
    • Chloroplast - variant discovered using a chloroplast template
  • Variant Class: Variant classification
    • Unknown - no classification provided by data contributor
    • Single Nucleotide Polymorphism - single nucleotide variation: alleles of length = 1 and from set of {A,T,C,G}
    • Insertion/deletion - insertion/deletion variation: alleles of different length or include '-' character
    • Heterozygous - heterozygous (undetermined) variation: allele contains string '(heterozygous)'
    • Microsatellite - microsatellite variation: allele string contains numbers and '(motif)' pattern
    • Named - insertion/deletion of named object (length unknown)
    • No Variation - no variation asserted for sequence
    • Mixed - mixed class
    • Multiple Nucleotide Polymorphism - alleles of the same length, length > 1, and from set of {A,T,C,G}
  • Validation Status: Method used to validate the variant (each variant may be validated by more than one method)
    • Unknown - no validation has been reported for this refSNP
    • Other Population - at least one submitted SNP (ss) in cluster was validated by independent assay
    • By Frequency - at least one submitted SNP (ss) in cluster has frequency data submitted
    • By Cluster - cluster has 2+ submissions, with 1+ submissions assayed with a non-computational method
    • By 2 Hit/2 Allele - all alleles have been observed in 2+ chromosomes
    • By HapMap - validated by HapMap project
    • By Genotype - at least one genotype reported for this refSNP
  • Function: Predicted functional role (each variant may have more than one functional role)
    • Unknown - no known functional classification
    • Locus Region - variation in region of gene, but not in transcript
    • Coding - variation in coding region of gene, assigned if allele-specific class unknown
    • Coding - Synonymous - no change in peptide for allele with respect to contig seq
    • Coding - Non-Synonymous - change in peptide with respect to contig sequence
    • mRNA/UTR - variation in transcript, but not in coding region interval
    • Intron - variation in intron, but not in first two or last two bases of intron
    • Splice Site - variation in first two or last two bases of intron
    • Reference - allele observed in reference contig sequence
    • Exception - variation in coding region with exception raised on alignment. This occurs when protein with gap in sequence is aligned back to contig sequence. Variations that are on the 3' side of the gap have undefined functional inference.
  • Location Type: Describes how a segment of the reference assembly must be altered to represent the variant SNP allele
    • Unknown - undefined or error
    • Range - a range of two or more bases in the reference assembly must be altered. This occurs, for example, when the variant allele is a deletion of two or more bases relative to the allele represented by the reference assembly.
    • Exact - one base in the reference assembly must be altered. This occurs when the variant allele is a single-base substitution relative to the reference genome or when the variant allele is a deletion of a single base.
    • Between - no reference assembly bases must be altered. This occurs when the variant allele is an insertion of one or more bases relative to the allele represented by the reference assembly.
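The allele-based rules listed under Variant Class can be illustrated with a small classifier. This is a sketch under stated assumptions, not the procedure dbSnp uses: it assumes alleles arrive as a single slash-separated string (for example "A/G"), the order in which the rules are checked is a choice made here, and the Named, Mixed, and No Variation classes are not derived because the definitions above do not give allele-level rules for them.

```python
import re

BASES = set("ACGT")


def classify_variant(observed: str) -> str:
    """Classify a slash-separated allele string using the rules above."""
    alleles = observed.split("/")
    # Heterozygous: allele contains the string '(heterozygous)'.
    if any("(heterozygous)" in allele.lower() for allele in alleles):
        return "heterozygous"
    # Microsatellite: a parenthesized motif plus repeat counts, e.g. (CA)12/13.
    if re.search(r"\([ACGT]+\)", observed) and re.search(r"\d", observed):
        return "microsatellite"
    if len(alleles) < 2:
        return "unknown"
    # Insertion/deletion: an allele includes the '-' character.
    if any("-" in allele for allele in alleles):
        return "insertion/deletion"
    if not all(allele and set(allele) <= BASES for allele in alleles):
        return "unknown"
    lengths = {len(allele) for allele in alleles}
    # Insertion/deletion: alleles of different length.
    if len(lengths) > 1:
        return "insertion/deletion"
    if lengths == {1}:
        return "single nucleotide polymorphism"
    # Same length, length > 1, all from {A,T,C,G}.
    return "multiple nucleotide polymorphism"


for example in ["A/G", "-/AT", "AT/GC", "(CA)12/13/14"]:
    print(example, "->", classify_variant(example))
```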

Large Scale SNP Annotation at UCSF

LS-SNP is a database of functional and structural SNP annotations with links to protein structure models. Annotations are based on a variety of features extracted from protein structure, sequence, and evolution. Currently, only coding non-synonymous SNPs are included. For details, see LS-SNP at UCSF.

Data Filtering

The SNPs in this track include all known polymorphisms available in the current build of dbSnp that can be mapped against the current assembly. The version of dbSnp from which these data were obtained can be found in the SNP track entry in the Genome Browser release log.

There are two reasons that some variants may not be mapped and/or annotated in this track:

  • Submissions are completely masked as repetitive elements. These are dropped from any further computations. This set of reference SNPs is found in chromosome "rs_chMasked" on the dbSNP ftp site.
  • Submissions are defined in a cDNA context with extensive splicing. These SNPs are typically annotated on refSeq mRNAs through a separate annotation process. Effort is being made to reverse map these variations back to contig coordinates, but this has not yet been implemented. For now, you can find this set of variations in "rs_chNotOn" on the dbSNP ftp site.

The heuristics for the non-SNP variations (i.e. named elements and short tandem repeats (STRs)) are quite conservative; therefore, some of these are probably lost. This approach was chosen to avoid false annotation of variation in inappropriate locations.

Credits and Data Use Restrictions

Thanks to the SNP Consortium and NIH for providing the public data, which are available from dbSnp at NCBI.

Thanks to Affymetrix, Inc. for developing the genotyping arrays. Please see the Terms and Conditions page on the Affymetrix website for restrictions on the use of their data. For more details on the Affymetrix genotyping assay, see the supplemental information on the Affymetrix 10K SNP and Affymetrix Genotyping Array products. Additional information, including genotyping data, is available on those pages.

Karchin, R., Diekhans, M., Kelly, L., Thomas, D.J., Pieper, U., Eswar, N., Haussler, D. and Sali, A. LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources. Bioinformatics 21:2814-2820; April 12, 2005.