Schema for Broad Histone - ENCODE Histone Modifications by Broad Institute ChIP-seq
  Database: hg18    Primary Table: wgEncodeBroadChipSeqSignalHmecCtcf    Row Count: 118,084   Data last updated: 2010-05-22
Format description: Wiggle track values to display as y-values (first 6 fields are bed6)
On download server: MariaDB table dump directory
field       example                         SQL type              info    description
bin         585                             smallint(5) unsigned  range   Indexing field to speed chromosome range queries.
chrom       chr1                            varchar(255)          values  Reference sequence chromosome or scaffold
chromStart  0                               int(10) unsigned      range   Start position in chromosome
chromEnd    25600                           int(10) unsigned      range   End position in chromosome
name        chr1.0                          varchar(255)          values  Name of item
span        25                              int(10) unsigned      range   each value spans this many bases
count       1024                            int(10) unsigned      range   number of values in this block
offset      0                               int(10) unsigned      range   offset in File to fetch data
file        /gbdb/hg18/wib/wgEncodeBroa...  varchar(255)          values  path name to data file, one byte per value
lowerLimit  0                               double                range   lowest data value in this block
dataRange   508                             double                range   lowerLimit + dataRange = upperLimit
validCount  1024                            int(10) unsigned      range   number of valid data values in this block
sumData     9565                            double                range   sum of the data points, for average and stddev calc
sumSquares  2021630                         double                range   sum of data points squared, for stddev calc
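
The validCount, sumData and sumSquares fields are enough to recover per-block summary statistics without opening the data file. The following is a minimal Python sketch using the example values from the schema above. The function name block_stats is hypothetical, and the use of the sample (n - 1) form of the standard deviation is an assumption; the schema only says the sums are kept "for average and stddev calc".

```python
import math

def block_stats(lower_limit, data_range, valid_count, sum_data, sum_squares):
    """Summarise one wiggle block from its precomputed sums (hypothetical helper)."""
    # Stated in the dataRange description: lowerLimit + dataRange = upperLimit.
    upper_limit = lower_limit + data_range
    if valid_count == 0:
        return upper_limit, None, None
    mean = sum_data / valid_count
    if valid_count > 1:
        # Assumption: sample variance (n - 1 denominator) from the running sums.
        variance = (sum_squares - sum_data ** 2 / valid_count) / (valid_count - 1)
    else:
        variance = 0.0
    # Guard against tiny negative values caused by floating-point rounding.
    return upper_limit, mean, math.sqrt(max(variance, 0.0))

# Example values from the schema (block chr1.0).
upper, mean, stddev = block_stats(0, 508, 1024, 9565, 2021630)
print(upper, round(mean, 2), round(stddev, 2))
```

Because the sums are additive, statistics over several adjacent blocks can be obtained by adding their validCount, sumData and sumSquares values before calling the same function.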

Sample Rows
 
bin  chrom  chromStart  chromEnd  name    span  count  offset  file                                                   lowerLimit  dataRange  validCount  sumData  sumSquares
585  chr1   0           25600     chr1.0  25    1024   0       /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           508        1024        9565     2021630
585  chr1   25600       51200     chr1.1  25    1024   1024    /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           10         1024        2064     6724
585  chr1   51200       76800     chr1.2  25    1024   2048    /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           6          1024        1636     4400
585  chr1   76800       102400    chr1.3  25    1024   3072    /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           90         1024        2858     73382
585  chr1   102400      128000    chr1.4  25    1024   4096    /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           8          1024        1373     3861
73   chr1   128000      153600    chr1.5  25    1024   5120    /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           30         1024        1638     10508
586  chr1   153600      179200    chr1.6  25    1024   6144    /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           6          1024        792      1966
586  chr1   179200      204800    chr1.7  25    1024   7168    /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           0          1024        0        0
586  chr1   204800      230400    chr1.8  25    1024   8192    /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           136        1024        2650     132104
586  chr1   230400      256000    chr1.9  25    1024   9216    /gbdb/hg18/wib/wgEncodeBroadChipSeqSignalHmecCtcf.wib  0           38         1024        2194     15334
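
The bin values in the sample rows (585, 73, 586) are consistent with the standard UCSC binning scheme, in which each item is assigned to the smallest bin that fully contains it, with the finest level covering 128 kb (2^17 bases). The sketch below is a Python rendering of that scheme, written from the published description of the scheme rather than from this page. The function name bin_from_range is illustrative, and the code assumes chromEnd is exclusive.

```python
# Standard UCSC binning constants: five levels, finest bins of 2**17 bases.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # finest level: 128 kb bins
BIN_NEXT_SHIFT = 3    # each coarser level is 8 times larger

def bin_from_range(start, end):
    """Smallest bin that fully contains the 0-based, half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("range is outside the 512 Mb limit of the standard scheme")

# (bin, chromStart, chromEnd) taken from the sample rows.
SAMPLE_ROWS = [
    (585, 0, 25600), (585, 25600, 51200), (585, 51200, 76800),
    (585, 76800, 102400), (585, 102400, 128000), (73, 128000, 153600),
    (586, 153600, 179200), (586, 179200, 204800), (586, 204800, 230400),
    (586, 230400, 256000),
]

for expected_bin, start, end in SAMPLE_ROWS:
    assert bin_from_range(start, end) == expected_bin
    # Each block covers span * count bases: 25 * 1024 = 25600.
    assert end - start == 25 * 1024
```

The row with bin 73 is the block that straddles the 131,072 boundary between two finest-level bins, so it is placed one level up. A range query therefore has to check the overlapping bins at every level, not only the finest one.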

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
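
As an illustration of the note above, the following Python sketch converts between the 0-based table coordinates and the 1-based position strings typically shown to users. It assumes that chromEnd is exclusive (half-open intervals), which matches the sample rows, where each block's chromEnd equals the next block's chromStart. Both function names are hypothetical.

```python
def to_one_based(chrom, chrom_start, chrom_end):
    """Format a 0-based, half-open table interval as a 1-based, closed position."""
    # Only the start shifts by one; the exclusive end equals the inclusive 1-based end.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"

def to_zero_based(position):
    """Parse a 1-based 'chrom:start-end' string back into table coordinates."""
    chrom, span = position.split(":")
    start, end = span.replace(",", "").split("-")
    return chrom, int(start) - 1, int(end)

print(to_one_based("chr1", 0, 25600))
print(to_zero_based("chr1:25,601-51,200"))
```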

Broad Histone (wgEncodeBroadChipSeq) Track Description
 

Description

This track displays maps of chromatin state generated by the Broad/MGH ENCODE group using ChIP-seq. Chemical modifications (methylation, acylation) to the histone proteins present in chromatin influence gene expression by changing how accessible the chromatin is to transcription.

The ChIP-seq method involves cross-linking histones and other DNA-associated proteins to genomic DNA within cells using formaldehyde. The cross-linked chromatin is subsequently extracted, mechanically sheared, and immunoprecipitated using specific antibodies. After reversal of cross-links, the immunoprecipitated DNA is sequenced and mapped to the human reference genome. The relative enrichment of each antibody target (epitope) across the genome is inferred from the density of mapped fragments.

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here. ENCODE tracks typically contain one or more of the following views:

Peaks
    Regions of signal enrichment based on processed data (usually normalized data from pooled replicates). ENCODE Peaks tables contain fields for statistical significance. Peaks for this track include a signalValue and pValue. The signalValue represents the fold enrichment of reads across the length of the interval, relative to random expectation. The pValue reflects the likelihood of observing an interval of the given length and signalValue at random. A long interval with a moderate signalValue and a short interval with a high signalValue can therefore have the same pValue.
Signal
    Density graph (wiggle) of signal enrichment based on processed data.

Additional data that were used to generate these tracks are located in the ENCODE Mappability track:

Alignability
    The Broad alignability track displays whether a region is made up of mostly unique or mostly non-unique sequence.

Methods

Cells were grown according to the approved ENCODE cell culture protocols.

Chromatin immunoprecipitation was performed with each of the histone antibodies listed above. Isolated DNA was then end-repaired, adapter-ligated and sequenced using Illumina Genome Analyzers.

Sequence reads from each IP experiment were aligned to the human reference genome (hg18) using MAQ. Discrete intervals of ChIP-seq fragment enrichment were identified using a scan statistics approach, assuming a uniform background signal.
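
To make the idea of enrichment against a uniform background concrete, the sketch below computes a fold enrichment and a single-interval Poisson tail probability. This is a deliberately simplified stand-in, not the scan statistics procedure used by the Broad group: it tests one interval in isolation and ignores the correction for scanning many windows. The function names and all numeric inputs (fragment counts, interval lengths, genome size) are hypothetical.

```python
import math

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summing the tail terms in log space."""
    if k <= 0:
        return 1.0
    total = 0.0
    i = k
    while True:
        term = math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        total += term
        # Past the mode the terms shrink; stop once they no longer matter.
        if i > lam and term <= total * 1e-15:
            break
        i += 1
    return min(total, 1.0)

def interval_enrichment(observed, length_bp, total_fragments, genome_bp):
    """Fold enrichment and tail probability under a uniform background (assumed)."""
    # Uniform background: fragments are expected in proportion to interval length.
    expected = total_fragments * length_bp / genome_bp
    fold = observed / expected
    return fold, poisson_upper_tail(observed, expected)

# Hypothetical numbers: 10 million fragments on a 3 Gb genome.
short_hit = interval_enrichment(observed=12, length_bp=200,
                                total_fragments=10_000_000, genome_bp=3_000_000_000)
long_hit = interval_enrichment(observed=40, length_bp=2000,
                               total_fragments=10_000_000, genome_bp=3_000_000_000)
print("short interval: fold=%.1f p=%.3g" % short_hit)
print("long interval:  fold=%.1f p=%.3g" % long_hit)
```

In this toy model the long interval reaches a small tail probability at a lower fold enrichment than the short one, which is the trade-off between interval length and signalValue described for the Peaks view.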

More details of the experimental protocol and analysis are available here.

Release Notes

Release 3 (Mar 2010) of this track adds the HSMM cell line and includes new experiments for H1-hESC and NHLF. No previously released data have been replaced in this release. An update to Release 3 (Jun 2010) changes only the display of the Signal subtracks, giving a better view of the data when zoomed in to a range spanning less than 16,500 base pairs.

Release 2 did contain newer versions of previously released data, however. All versioned data are marked with "submittedDataVersion=V2" in the metadata, along with the reason for the change. Previous versions of these files are available for download from the FTP site.

Please note that an antibody previously labeled "Pol2 (b)" is, in fact, Covance antibody MMS-128P with the target POLR2A.

Credits

The ChIP-seq data were generated at the Broad Institute and in the Bradley E. Bernstein lab at the Massachusetts General Hospital/Harvard Medical School. Contact: Noam Shoresh.

Data generation and analysis were supported by funds from the NHGRI, the Burroughs Wellcome Fund, Massachusetts General Hospital and the Broad Institute.

References

Bernstein BE, Kamal M, Lindblad-Toh K, Bekiranov S, Bailey DK, Huebert DJ, McMahon S, Karlsson EK, Kulbokas EJ 3rd, Gingeras TR et al. Genomic maps and comparative analysis of histone modifications in human and mouse. Cell. 2005 Jan 28;120(2):169-81.

Bernstein BE, Mikkelsen TS, Xie X, Kamal M, Huebert DJ, Cuff J, Fry B, Meissner A, Wernig M, Plath K et al. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell. 2006 Apr 21;125(2):315-26.

Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim TK, Koche RP et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature. 2007 Aug 2;448(7153):553-60.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column on the track configuration page and the download page. The full data release policy for ENCODE is available here.