This track is produced as part of the ENCODE project. The track displays copy number variation (CNV) as determined by the Illumina Human 1M-Duo Infinium HD BeadChip assay and circular binary segmentation (CBS). The Human 1M-Duo contains more than 1,100,000 tagSNP markers and a set of ~60,000 additional CNV-targeted markers. The median spacing between markers is 1.5 kb and the mean spacing is 2.4 kb. The B-allele frequency and genotyping single nucleotide polymorphism (SNP) data generated by the experiment are not displayed, but are available for download from the Downloads page.
Where applicable, biological replicates of each cell line are reported separately. Possible uses of the data include correction of copy number in peak-calling for ChIP-seq, transcriptome, DNase hypersensitivity, and methylation determinations.
Display Conventions and Configuration
Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.
The track displays regions of the genome where copy number variation has been assessed. CNV regions are colored by type:
- blue = amplified
- black = normal
- orange = heterozygous deletion
- red = homozygous deletion
The mean log R ratio for each region can be seen by clicking on each individual region. See Methods below for significance of log R ratio values. The mean log R ratio for each region is reported in the .bed file available for download.
The Illumina 1M-Duo B-allele frequency data are available from the Supplemental Materials directory on the Downloads page. The file (wgEncodeHaibGenotypeBalleleSnp.txt) was generated using the standard Illumina protocol and contains the B-allele frequency for all cell types tested. The genotype calls for all cell types tested are also available for download (wgEncodeHaibGenotypeGtypeSnp.txt). Genotype calls with a GenCall value greater than 0.6 are considered significant.
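The GenCall cut-off described above can be sketched as a simple filter. This is a minimal illustration, not the lab's pipeline: the GenotypeCall record, its field names, and the significant_calls helper are hypothetical, and the layout of wgEncodeHaibGenotypeGtypeSnp.txt is not assumed here. Only the 0.6 threshold comes from the text.

```python
from typing import Iterable, List, NamedTuple

# From the track description: calls with a GenCall value > 0.6 are significant.
GENCALL_THRESHOLD = 0.6


class GenotypeCall(NamedTuple):
    """Hypothetical in-memory record for one SNP genotype call."""
    snp_id: str
    genotype: str   # A/B designation, e.g. "AA", "AB", "BB"
    gencall: float  # GenCall value reported for this call


def significant_calls(calls: Iterable[GenotypeCall]) -> List[GenotypeCall]:
    """Keep only calls whose GenCall value is strictly greater than 0.6."""
    return [c for c in calls if c.gencall > GENCALL_THRESHOLD]
```

A call with a GenCall value of exactly 0.6 is dropped in this sketch, because the text says "greater than 0.6".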
The replicate labeling in the genome browser view is a counter indicating the total number of replicates submitted (UCSC Rep). The producing lab has replicate numbers (Lab Rep) that correspond to their internal bio-replicate numbering. Where these two numbering systems conflict, both are listed in the long label of the specific track.
When comparing data across tracks, the lab replicate number should be considered. In the downloads directory both replicate numbers are listed. The files are labeled with the lab replicate number.
Isolation of genomic DNA and hybridization
Cells were grown according to the approved ENCODE cell culture protocols by the Myers lab and by other ENCODE production groups. The production group is reported in the metadata. Genomic DNA was isolated using the DNeasy Blood and Tissue Kit (Qiagen). DNA concentration and quality were determined by fluorescence (Invitrogen Quant-iT dsDNA High Sensitivity Kit and Qubit Fluorometer), and 400 nanograms of each sample were hybridized to Illumina 1M-Duo DNA Analysis BeadChips.
Processing and Analysis
Genotypes from the 1M-Duo arrays were called with BeadStudio using default settings and formatted with the A/B genotype designation for each SNP. The primary QC criterion for each sample was a call rate cut-off of 0.95.
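As a rough illustration of the call rate criterion, the sketch below computes the fraction of SNPs that received a genotype call and compares it with the 0.95 cut-off. The "--" no-call symbol, the function names, and the choice to treat a call rate of exactly 0.95 as passing are assumptions made for this example; BeadStudio reports the call rate itself.

```python
from typing import Iterable

# From the track description: primary QC cut-off at a call rate of 0.95.
CALL_RATE_CUTOFF = 0.95


def call_rate(genotypes: Iterable[str], no_call: str = "--") -> float:
    """Fraction of SNPs with a genotype call.

    The no_call symbol is an assumption for illustration only.
    """
    genotypes = list(genotypes)
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = sum(1 for g in genotypes if g != no_call)
    return called / len(genotypes)


def passes_call_rate_qc(genotypes: Iterable[str], no_call: str = "--") -> bool:
    """True if the sample meets the cut-off (assumed inclusive)."""
    return call_rate(genotypes, no_call) >= CALL_RATE_CUTOFF
```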
Copy number variation (CNV) analysis was performed with circular binary segmentation (DNAcopy) of the log R ratio values at each probe (Olshen et al., 2004). The parameters used were alpha=0.001, nperm=5000, sd.undo=1. The copy number segments are reported with the mean log R ratio for each chromosomal segment called by CBS. Mean log R ratios between approximately -0.2 and -1.5 can be considered heterozygous deletions, values below -1.5 homozygous deletions, and values above 0.2 amplifications. A further primary QC criterion for each sample was a standard deviation (SD) below 0.6.
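The log R ratio ranges above, together with the display colors listed under Display Conventions and Configuration, can be sketched as a small classifier. This is an illustrative sketch only: the function and dictionary names are hypothetical, the text gives approximate thresholds, and the handling of values falling exactly on a boundary is an assumption. It does not reproduce the CBS segmentation itself, which was run with DNAcopy.

```python
# Display colors as listed in the track's display conventions.
CNV_COLORS = {
    "amplified": "blue",
    "normal": "black",
    "heterozygous deletion": "orange",
    "homozygous deletion": "red",
}


def classify_segment(mean_log_r: float) -> str:
    """Classify a CBS segment by its mean log R ratio.

    Approximate thresholds from the text:
      below -1.5     homozygous deletion
      -1.5 to -0.2   heterozygous deletion
      above 0.2      amplification
    Everything else is treated as normal. Boundary values
    (-1.5, -0.2, 0.2) are assigned by assumption.
    """
    if mean_log_r < -1.5:
        return "homozygous deletion"
    if mean_log_r <= -0.2:
        return "heterozygous deletion"
    if mean_log_r > 0.2:
        return "amplified"
    return "normal"


def segment_color(mean_log_r: float) -> str:
    """Display color for a segment with the given mean log R ratio."""
    return CNV_COLORS[classify_segment(mean_log_r)]
```

For example, a segment with a mean log R ratio of -0.6 would be classified as a heterozygous deletion and shown in orange under these assumptions.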
This is release 2 of this track (Jan 2012). This is a correction release. There are no new experiments. The affected tracks are:
- wgEncodeHaibGenotypeGm12878RegionsRep1 was replaced by wgEncodeHaibGenotypeGm12878RegionsRep1V2 due to mapping off the end of the chromosome in the original version.
- wgEncodeHaibGenotypeAstrocyRegionsRep1 was renamed to wgEncodeHaibGenotypeNhaDukeRegionsRep1. Astrocytes and NH-A are the same cell line.
In addition to the above changes, color values for the files have been corrected as well. This does not affect any data values.
This is the NCBI Build37 (hg19) release of this track. This release includes the three cell types previously released on NCBI Build36 (hg18), which were lifted to NCBI Build37 (hg19), and adds data for many more cell types. The track includes a single display for each cell type and reports the log R ratio in the .bed files. The B-allele frequency and SNP genotyping files are not displayed, but are available for download for the entire dataset from the Downloads page.
These data were produced by the Dr. Richard Myers Lab and the Dr. Devin Absher Lab at the HudsonAlpha Institute for Biotechnology.
Cells were grown by the Myers Lab and other ENCODE production groups.
Contact: Dr. Florencia Pauli.
Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics. 2004 Oct;5(4):557-72.
Data Release Policy
Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column on the track configuration page and the download page. The full data release policy for ENCODE is available here.