Schema for Common Cell CNV - ENCODE Common Cell Type Copy Number Variation, by Illumina 1M and CBS


field       example   SQL type              info    description
bin                   smallint(5) unsigned  range   Indexing field to speed chromosome range queries.
chrom       chr1      varchar(255)          values  Reference sequence chromosome or scaffold
chromStart  10004     int(10) unsigned      range   Start position in chromosome
chromEnd    12165045  int(10) unsigned      range   End position in chromosome
dataValue   -0.0601   float                 range   Data value for this range

bin   chrom  chromStart  chromEnd  dataValue
      chr1   10004       12165045  -0.0601
677   chr1   12165380    12174072  -0.3244
      chr1   12174395    16026497  -0.0385
707   chr1   16027328    16027531  -0.9597
      chr1   16033976    47166943  -0.0406
944   chr1   47167690              -1.0231
      chr1   47168473    57747699  -0.0268
1025  chr1   57747785    57752254  -0.7568
      chr1   57759187    66971937  -0.0114
1095  chr1   66974753    66974854  -0.8516
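
The bin field can be recomputed from chromStart and chromEnd. The following is a minimal sketch, assuming this table uses the standard UCSC binning scheme (smallest bins of 128 kb, each level eight times larger, coordinates treated as half-open). The function name and constants are illustrative and are not taken from this page.

```python
# Constants of the standard UCSC binning scheme (assumed to apply to this table).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # smallest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level is 2**3 = 8 times larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        # The range fits in a single bin at this level.
        if start_bin == end_bin:
            return offset + start_bin
        # Otherwise move up to the next, coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the binning scheme (max 512 Mb)")


# Check against a sample row shown above (chr1:12165380-12174072).
print(bin_from_range(12165380, 12174072))  # 677
```

Under this assumption, the computed values agree with the bin values listed in the sample rows, which suggests the sketch matches how the field was populated.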

Description

This track shows copy number variation (CNV) in the ENCODE Tier 1 and Tier 2 human cell lines GM12878, HepG2, and K562 as determined by Illumina's Human 1M-Duo Infinium HD BeadChip assay and CNV analysis by circular binary segmentation (CBS).

Two biological replicates were generated for each cell line. Because the biological replicates gave very similar results, they were averaged to provide a single genotyping dataset that can be applied to other ENCODE experiments. One possible use of these data is to correct for copy number when calling peaks in interactome, transcriptome, DNase hypersensitivity, and methylome determinations.

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here.

Regions

Regions of the genome where copy number variation has been assessed. CNV regions are colored by type:

blue = amplified
black = normal
orange = heterozygous deletion
red = homozygous deletion

Signal

Mean log R ratio for each region. See Methods below. Signals are colored by cell type, not by copy number variation.

To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide.

Methods

Cells were grown according to the approved ENCODE cell culture protocols.

Isolation of genomic DNA and hybridization

Genomic DNA was extracted using the QIAGEN DNeasy Blood & Tissue Kit according to the manufacturer's instructions. For each biological replicate of each cell line, DNA concentration and quality were determined by UV absorbance. Genotypes were determined from 400 nanograms of each sample at 1 million loci using Illumina Human 1M-Duo arrays and standard Illumina protocols.

Processing and Analysis

Genotypes were ascertained from the 1M-Duo arrays with BeadStudio using default settings and formatted with the A/B genotype designation for each SNP (see the 1M-Duo manifest file for the specific nucleotide). Copy number variation (CNV) analysis was performed using circular binary segmentation (DNAcopy) of the log R ratio values at each probe (Olshen et al., 2004). The parameters used were alpha=0.001, nperm=5000, sd.undo=1. Copy number segments are reported with the mean log R ratio for each chromosomal segment called by CBS. Mean log R ratios between approximately -0.2 and -1.5 can be considered heterozygous deletions, ratios below -1.5 homozygous deletions, and ratios above 0.2 amplifications. The coordinates for the genotypes and copy number calls are from Human Genome Build 36.
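
The thresholds above can be sketched as a small classifier. This is an illustrative example only, not the code used to produce the track: the function name is hypothetical, segments between -0.2 and 0.2 are assumed to be "normal", and the handling of values exactly on a threshold is an assumption, since the text gives only approximate ranges. The colors are those listed for the Regions view.

```python
# Colors listed for the Regions view of this track.
REGION_COLORS = {
    "amplified": "blue",
    "normal": "black",
    "heterozygous deletion": "orange",
    "homozygous deletion": "red",
}


def classify_segment(mean_log_r: float) -> str:
    """Classify a CBS segment by its mean log R ratio (hypothetical helper)."""
    if mean_log_r < -1.5:
        return "homozygous deletion"
    if mean_log_r < -0.2:
        # Assumption: -1.5 itself falls in the heterozygous range.
        return "heterozygous deletion"
    if mean_log_r > 0.2:
        return "amplified"
    # Assumption: everything from -0.2 to 0.2 inclusive is treated as normal.
    return "normal"


# Example with dataValue entries from the sample rows of the schema.
for value in (-0.0601, -0.3244, -1.0231):
    call = classify_segment(value)
    print(value, call, REGION_COLORS[call])
```

With these assumptions, a segment with a mean log R ratio of -0.0601 is called normal, while -0.3244 and -1.0231 are called heterozygous deletions.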

Release Notes

Release 2 (April 2011) of this track updates the colors used in the Regions view subtracks (the data remains unchanged). The colors now adhere to the color standards determined at the first annual International Standards for Cytogenomic Arrays (ISCA) Scientific Conference.

Credits

Tim Reddy, Rebekka Sprouse, Richard Myers, Devin Absher from HudsonAlpha Institute.

Contact: Flo Pauli.

References

Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics. 2004 Oct;5(4):557-572.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column on the track configuration page and the download page. The full data release policy for ENCODE is available here.