HAIB Genotype Track Settings
 
Genotype (CNV and SNP) by Illumina 1MDuo and CBS from ENCODE/HudsonAlpha   (All Variation tracks)

Cell Line
GM12878 (Tier 1) 
H1-hESC (Tier 1) 
K562 (Tier 1) 
A549 (Tier 2) 
HeLa-S3 (Tier 2) 
HepG2 (Tier 2) 
HUVEC (Tier 2) 
IMR90 (Tier 2) 
MCF-7 (Tier 2) 
AG04449 
AG04450 
AG09309 
AG09319 
AG10803 
AoSMC 
BJ 
Caco-2 
Chorion 
CMK 
GM06990 
GM12891 
GM12892 
GM19239 
HAEpiC 
HCF 
HCM 
HCPEpiC 
HEEpic 
HEK293 
HIPEpiC 
HL-60 
HMEC 
HNPCEpiC 
HPAEpiC 
HRCEpiC 
HRE 
HRPEpiC 
HSMM 
HSMM tube 
HTR8svn 
Jejunum BC H12817N 
Jurkat 
LNCaP 
MCF10A-Er-Src 
MCF10A-Er-Src (Tamoxifen) 
Melano 
Myometr 
NB4 
NH-A (Duke) 
NH-A (UW) 
NHBE 
NHDF-neo 
NT2-D1 
Ovcar-3 
PANC-1 
PFSK-1 
PrEC 
RPTEC 
SAEC 
SK-N-SH RA 
SKMC 
Small Intestine BC 01-11002 
T-47D 
U87 
Cell Line  Replicate  Obtained By  Track Name                                                             Restricted Until
GM12878    1          HudsonAlpha  GM12878 Copy number variants Replicate 1 from ENCODE/HAIB              2012-02-09
H1-hESC    1          HudsonAlpha  H1-hESC Copy number variants Replicate 1 (Lab Rep 2) from ENCODE/HAIB  2011-10-20
K562       1          HudsonAlpha  K562 Copy number variants Replicate 1 from ENCODE/HAIB                 2011-11-17
HeLa-S3    1          HudsonAlpha  HeLa-S3 Copy number variants Replicate 1 (Lab Rep 2) from ENCODE/HAIB  2011-10-20
HepG2      1          HudsonAlpha  HepG2 Copy number variants Replicate 1 from ENCODE/HAIB                2011-11-17

Description

This track was produced as part of the ENCODE project. It displays copy number variation (CNV) as determined by the Illumina Human 1M-Duo Infinium HD BeadChip assay and circular binary segmentation (CBS). The Human 1M-Duo contains more than 1,100,000 tagSNP markers and a set of ~60,000 additional CNV-targeted markers. The median spacing between markers is 1.5 kb and the mean spacing is 2.4 kb. The B-allele frequency and single nucleotide polymorphism (SNP) genotyping data generated by the experiment are not displayed, but are available for download from the Downloads page.

Where applicable, biological replicates of each cell line are reported separately. Possible uses of the data include correction of copy number in peak-calling for ChIP-seq, transcriptome, DNase hypersensitivity, and methylation determinations.
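As a rough illustration of the copy number correction mentioned above, the sketch below rescales a read count by the copy ratio implied by a segment's mean log R ratio. It assumes the log R ratio is a base-2 logarithm and treats 2**LRR as the copy ratio, which is an idealization. The function names and the example values are hypothetical and are not part of the ENCODE/HAIB pipeline.

```python
def copy_ratio(mean_log_r: float) -> float:
    """Idealized copy ratio relative to the expected (normal) state.

    Assumption: the log R ratio is log2(observed / expected intensity).
    """
    return 2.0 ** mean_log_r


def normalize_signal(read_count: float, mean_log_r: float) -> float:
    """Scale a read count to what it would be at normal copy number."""
    return read_count / copy_ratio(mean_log_r)


# Hypothetical example: a peak with 100 reads inside an amplified segment.
if __name__ == "__main__":
    print(normalize_signal(100, 1.0))
```

Segments with strongly negative log R ratios (candidate homozygous deletions) carry little or no signal, so rescaling them in this way is not meaningful. A real workflow would more likely mask such regions.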

Display Conventions and Configuration

Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.

The track displays regions of the genome where copy number variation has been assessed. CNV regions are colored by type:

  • blue = amplified
  • black = normal
  • orange = heterozygous deletion
  • red = homozygous deletion

The mean log R ratio for each region can be seen by clicking on each individual region. See Methods below for significance of log R ratio values. The mean log R ratio for each region is reported in the .bed file available for download.

The Illumina 1M-Duo B-allele frequency data are available from the Supplemental Materials directory on the Downloads page. The file (wgEncodeHaibGenotypeBalleleSnp.txt) was generated using the standard Illumina protocol and contains the B-allele frequency for all cell types tested. The genotype calls for all cell types tested are also available for download (wgEncodeHaibGenotypeGtypeSnp.txt). Genotype calls with a GenCall value greater than 0.6 are considered significant.
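The GenCall cutoff can be applied as a simple filter once the genotype calls have been loaded. The sketch below is only illustrative: the record layout (SNP identifier, genotype, GenCall score) and the example values are assumptions, since the column layout of wgEncodeHaibGenotypeGtypeSnp.txt is not described here.

```python
from typing import Iterable, List, NamedTuple


class GenotypeCall(NamedTuple):
    """Hypothetical in-memory record for one SNP call."""
    snp_id: str
    genotype: str   # A/B designation, e.g. "AA", "AB", "BB"
    gencall: float  # GenCall score for this call


def significant_calls(calls: Iterable[GenotypeCall],
                      threshold: float = 0.6) -> List[GenotypeCall]:
    """Keep calls whose GenCall score is strictly greater than the threshold."""
    return [call for call in calls if call.gencall > threshold]


# Hypothetical example records.
example = [
    GenotypeCall("rs0000001", "AB", 0.92),
    GenotypeCall("rs0000002", "AA", 0.60),
    GenotypeCall("rs0000003", "BB", 0.41),
]
kept = significant_calls(example)
```

The comparison is strict because the text specifies values greater than 0.6, so a score of exactly 0.6 is not retained.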

Replicate Numbering

The replicate labeling in the genome browser view is a counter indicating the total number of replicates submitted (UCSC Rep). The producing lab has replicate numbers (Lab Rep) that correspond to their internal bio-replicate numbering. Where these two numbering systems conflict, both are listed in the long label of the specific track. When comparing data across tracks, the lab replicate number should be considered. In the downloads directory both replicate numbers are listed. The files are labeled with the lab replicate number.

Methods

Isolation of genomic DNA and hybridization

Cells were grown according to the approved ENCODE cell culture protocols by the Myers lab and by other ENCODE production groups. The production group is reported in the metadata. Genomic DNA was isolated using the DNeasy Blood and Tissue Kit (Qiagen). DNA concentration and quality were determined by fluorescence (Invitrogen Quant-iT dsDNA High Sensitivity Kit and Qubit Fluorometer), and 400 nanograms of each sample were hybridized to Illumina 1M-Duo DNA Analysis BeadChips.

Processing and Analysis

The genotypes from the 1M-Duo arrays were called with BeadStudio using default settings and reported with the A/B genotype designation for each SNP. The primary QC criterion for each sample was a minimum call rate of 0.95.
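The call rate criterion can be expressed as the fraction of SNPs that received a genotype call. The following is a minimal sketch, assuming genotypes use the A/B designation and that any other value (here the hypothetical code "NC") marks a no-call. BeadStudio computes call rate itself, so this only illustrates the arithmetic.

```python
from typing import Sequence

CALLED_GENOTYPES = {"AA", "AB", "BB"}  # A/B designation described above


def call_rate(genotypes: Sequence[str]) -> float:
    """Fraction of SNPs with a genotype call; anything else counts as a no-call."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = sum(1 for g in genotypes if g in CALLED_GENOTYPES)
    return called / len(genotypes)


def passes_call_rate_qc(genotypes: Sequence[str],
                        min_call_rate: float = 0.95) -> bool:
    """Apply the 0.95 call rate cutoff used as primary QC."""
    return call_rate(genotypes) >= min_call_rate


# Hypothetical samples: 96 of 100 and 94 of 100 SNPs called.
good_sample = ["AA"] * 96 + ["NC"] * 4
poor_sample = ["AB"] * 94 + ["NC"] * 6
```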

Copy number variation (CNV) analysis was performed by circular binary segmentation (CBS; DNAcopy) of the log R ratio values at each probe (Olshen et al., 2004). The parameters used were alpha=0.001, nperm=5000, sd.undo=1. Each copy number segment is reported with the mean log R ratio of the chromosomal segment called by CBS. Mean log R ratios between approximately -0.2 and -1.5 can be considered heterozygous deletions, values below -1.5 homozygous deletions, and values above 0.2 amplifications. The primary QC criterion for each sample was an SD of < 0.6.
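The thresholds above, together with the color scheme listed under Display Conventions, can be summarized as a small classifier. This is a sketch rather than the lab's code. Two points are assumptions: segments between -0.2 and 0.2 are treated as normal, and values falling exactly on a boundary are assigned to the less extreme class, since the text gives the cutoffs only approximately. The example segments are hypothetical.

```python
def classify_segment(mean_log_r: float) -> str:
    """Map a segment's mean log R ratio to a CNV class."""
    if mean_log_r > 0.2:
        return "amplified"
    if mean_log_r < -1.5:
        return "homozygous deletion"
    if mean_log_r < -0.2:
        return "heterozygous deletion"
    # Assumption: everything in between is the normal class.
    return "normal"


# Display colors as listed under Display Conventions.
CLASS_COLORS = {
    "amplified": "blue",
    "normal": "black",
    "heterozygous deletion": "orange",
    "homozygous deletion": "red",
}

# Hypothetical segments: (chromosome, start, end, mean log R ratio).
segments = [
    ("chr1", 1000000, 1500000, 0.35),
    ("chr2", 2000000, 2400000, -0.02),
    ("chr9", 21000000, 22000000, -2.1),
]

for chrom, start, end, mean_log_r in segments:
    cnv_class = classify_segment(mean_log_r)
    print(chrom, start, end, cnv_class, CLASS_COLORS[cnv_class])
```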

Release Notes

This is release 2 of this track (Jan 2012). It is a correction release and contains no new experiments. The affected tracks are:

  • wgEncodeHaibGenotypeGm12878RegionsRep1: replaced by wgEncodeHaibGenotypeGm12878RegionsRep1V2 because the original version mapped off the end of the chromosome.
  • wgEncodeHaibGenotypeAstrocyRegionsRep1: renamed to wgEncodeHaibGenotypeNhaDukeRegionsRep1 because Astrocytes and NH-A are the same cell line.

In addition to the above changes, color values for the files have been corrected as well. This does not affect any data values.

This is the NCBI Build37 (hg19) release of this track. It includes the 3 cell types previously released on NCBI Build36 (hg18), which were lifted over to NCBI Build37 (hg19), and adds data for many more cell types. The track includes a single display for each cell type and reports the mean log R ratio in the .bed files. The B-allele frequency and SNP genotyping files are not displayed, but are available for download for the entire dataset from the Downloads page.

Credits

These data were produced by the labs of Dr. Richard Myers and Dr. Devin Absher at the HudsonAlpha Institute for Biotechnology.

Cells were grown by the Myers Lab and other ENCODE production groups.

Contact: Dr. Florencia Pauli.

References

Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics. 2004 Oct;5(4):557-72.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column on the track configuration page and the download page. The full data release policy for ENCODE is available here.