Schema for Common Cell CNV - ENCODE Common Cell Type Copy Number Variation, by Illumina 1M and CBS
  Database: hg18    Primary Table: wgEncodeHudsonalphaCnvSignalGM12878    Row Count: 511   Data last updated: 2009-08-10
Format description: bed-like graphing data
On download server: MariaDB table dump directory
field        example     SQL type               info     description
bin          1           smallint(5) unsigned   range    Indexing field to speed chromosome range queries.
chrom        chr1        varchar(255)           values   Reference sequence chromosome or scaffold
chromStart   10004       int(10) unsigned       range    Start position in chromosome
chromEnd     12165045    int(10) unsigned       range    End position in chromosome
dataValue    -0.0601     float                  range    Data value for this range
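
As an illustration of the schema above, here is a minimal Python sketch that turns one row into a typed record. It assumes the table dump is tab-separated with the five columns in schema order; the names CnvSignalRow and parse_row are hypothetical and not part of any UCSC tool.

```python
from dataclasses import dataclass


@dataclass
class CnvSignalRow:
    """One row of the signal table (hypothetical helper type)."""
    bin: int            # indexing field, smallint unsigned
    chrom: str          # e.g. "chr1"
    chrom_start: int    # 0-based start position
    chrom_end: int      # end position
    data_value: float   # data value for this range


def parse_row(line: str) -> CnvSignalRow:
    """Parse one tab-separated line (assumed layout: schema column order)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    bin_field, chrom, start, end, value = fields
    return CnvSignalRow(int(bin_field), chrom, int(start), int(end), float(value))


# The example values from the schema table above.
row = parse_row("1\tchr1\t10004\t12165045\t-0.0601\n")
print(row)
```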

Sample Rows
 
bin     chrom   chromStart   chromEnd    dataValue
1       chr1    10004        12165045    -0.0601
677     chr1    12165380     12174072    -0.3244
10      chr1    12174395     16026497    -0.0385
707     chr1    16027328     16027531    -0.9597
1       chr1    16033976     47166943    -0.0406
944     chr1    47167690     47167690    -1.0231
1       chr1    47168473     57747699    -0.0268
1025    chr1    57747785     57752254    -0.7568
1       chr1    57759187     66971937    -0.0114
1095    chr1    66974753     66974854    -0.8516
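
The bin values in the sample rows can be reproduced from chromStart and chromEnd. The sketch below assumes this table uses the standard (non-extended) UCSC binning scheme, in which the smallest bins span 128 kb and each higher level is eight times larger; the function name bin_from_range is chosen here for illustration.

```python
# Offsets of the first bin at each level, from the finest (128 kb) level
# to the single top-level bin: 512+64+8+1, 64+8+1, 8+1, 1, 0.
BIN_OFFSETS = [585, 73, 9, 1, 0]
FIRST_SHIFT = 17   # 2**17 = 128 kb, size of the smallest bins
NEXT_SHIFT = 3     # each level up is 2**3 = 8 times larger


def bin_from_range(chrom_start: int, chrom_end: int) -> int:
    """Return the smallest bin that fully contains [chrom_start, chrom_end)."""
    start_bin = chrom_start >> FIRST_SHIFT
    end_bin = (chrom_end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError("range exceeds the 512 Mb limit of the standard scheme")


print(bin_from_range(12165380, 12174072))  # second sample row
```

Small segments fall into a fine-level bin (677, 707, and so on), while segments that span many megabases fall into coarse bins such as 1 or 10, which is why long and short segments in the sample rows have such different bin values.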

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
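
To make the coordinate note concrete: a 0-based start with an unchanged end corresponds to a 1-based, fully inclusive position by adding 1 to the start only. The helper below is a small sketch with hypothetical function names.

```python
def to_one_based_position(chrom: str, chrom_start: int, chrom_end: int) -> str:
    """Format a 0-based-start table range as a 1-based inclusive position."""
    # Only the start shifts; the end value is the same in both conventions.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def range_length(chrom_start: int, chrom_end: int) -> int:
    """Number of bases covered by a 0-based-start range."""
    return chrom_end - chrom_start


print(to_one_based_position("chr1", 10004, 12165045))
```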

Common Cell CNV (wgEncodeHudsonalphaCnv) Track Description
 

Description

This track shows copy number variation (CNV) in the ENCODE Tier 1 and Tier 2 human cell lines GM12878, HepG2, and K562 as determined by Illumina's Human 1M-Duo Infinium HD BeadChip assay and CNV analysis by circular binary segmentation (CBS).

Two biological replicates were generated for each cell line. Because the biological replicates gave very similar results, they were averaged to provide a single genotyping dataset that can be applied to other ENCODE experiments. One possible use of these data is to correct for copy number in peak calling for interactome, transcriptome, DNase hypersensitivity, and methylome determinations.

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here.

Regions
Regions of the genome where copy number variation has been assessed. CNV regions are colored by type:
  • blue = amplified
  • black = normal
  • orange = heterozygous deletion
  • red = homozygous deletion

Signal
Mean log R ratio for each region. See Methods below. Signals are colored by cell type, not by copy number variation.

To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide.

Methods

Cells were grown according to the approved ENCODE cell culture protocols.

Isolation of genomic DNA and hybridization

Genomic DNA was extracted using the QIAGEN DNeasy Blood & Tissue Kit according to the manufacturer's instructions. For each biological replicate of each cell line, DNA concentration and quality were determined by UV absorbance. Genotypes were determined from 400 nanograms of each sample at 1 million loci using Illumina Human 1M-Duo arrays and standard Illumina protocols.

Processing and Analysis

Genotypes were called from the 1M-Duo arrays with BeadStudio using default settings and are formatted with the A/B genotype designation for each SNP (see the 1M-Duo manifest file for the specific nucleotides). Copy number variation (CNV) analysis was performed using circular binary segmentation (DNAcopy) of the log R ratio values at each probe (Olshen et al., 2004). The parameters used were alpha=0.001, nperm=5000, sd.undo=1. Copy number segments are reported with the mean log R ratio for each chromosomal segment called by CBS. As a rough guide, mean log R ratios of approximately -0.2 to -1.5 can be considered heterozygous deletions, values below -1.5 homozygous deletions, and values above 0.2 amplifications. The coordinates for the genotypes and copy number calls are from Human Genome Build 36 (hg18).
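
The approximate log R ratio ranges given above can be sketched as a simple classifier. This is an illustration only, not the pipeline used to produce the Regions view: the thresholds are described as approximate, the handling of values exactly at a boundary is an assumption made here, and the function and constant names are hypothetical. The colors are those listed for the Regions view.

```python
# Display colors listed for the Regions view of this track.
REGION_COLORS = {
    "amplified": "blue",
    "normal": "black",
    "heterozygous deletion": "orange",
    "homozygous deletion": "red",
}


def classify_segment(mean_log_r: float) -> str:
    """Map a segment's mean log R ratio to an approximate CNV type.

    Boundary values (exactly -1.5, -0.2, or 0.2) are assigned by
    assumption; the track description only gives approximate ranges.
    """
    if mean_log_r < -1.5:
        return "homozygous deletion"
    if mean_log_r <= -0.2:
        return "heterozygous deletion"
    if mean_log_r > 0.2:
        return "amplified"
    return "normal"


# dataValue entries taken from the sample rows of the GM12878 signal table.
for value in (-0.0601, -0.3244, -1.0231):
    cnv_type = classify_segment(value)
    print(value, cnv_type, REGION_COLORS[cnv_type])
```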

Release Notes

Release 2 (April 2011) of this track updates the colors used in the Regions view subtracks (the data remains unchanged). The colors now adhere to the color standards determined at the first annual International Standards for Cytogenomic Arrays (ISCA) Scientific Conference.

Credits

Tim Reddy, Rebekka Sprouse, Richard Myers, Devin Absher from HudsonAlpha Institute.

Contact: Flo Pauli.

References

Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics. 2004 Oct;5(4):557-572.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column on the track configuration page and the download page. The full data release policy for ENCODE is available here.