Schema for Affy RNA Loc - RNA Subcellular Localization by Tiling Microarray from ENCODE Affymetrix/CSHL
  Database: hg19    Primary Table: wgEncodeAffyRnaChipFiltTransfragsK562NucleolusTotal    Row Count: 1,471,952   Data last updated: 2010-07-09
Format description: BED6+3. Peaks of signal enrichment based on pooled, normalized (interpreted) data.
On download server: MariaDB table dump directory
field        example  SQL type              description
bin          585      smallint(5) unsigned  Indexing field to speed chromosome range queries.
chrom        chr1     varchar(255)          Reference sequence chromosome or scaffold
chromStart   16230    int(10) unsigned      Start position in chromosome
chromEnd     16306    int(10) unsigned      End position in chromosome
name         .        varchar(255)          Name given to a region (preferably unique). Use . if no name is assigned.
score        0        int(10) unsigned      Indicates how dark the peak will be displayed in the browser (0-1000)
strand       .        char(2)               + or - or . for unknown
signalValue  1871     float                 Measurement of average enrichment for the region
pValue       -1       float                 Statistical significance of signal value (-log10). Set to -1 if not used.
qValue       -1       float                 Statistical significance with multiple-test correction applied (FDR -log10). Set to -1 if not used.
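The bin field can be illustrated with a short sketch of the standard UCSC binning scheme, in which the smallest bins span 128 kb and each higher level is eight times larger. The function and constant names here are illustrative, not part of the table definition, and the sketch assumes the standard (non-extended) scheme that covers coordinates up to 512 Mb.

```python
# Sketch of the standard UCSC binning scheme (assumed, non-extended variant).
# Offsets of the first bin at each level, from the finest (128 kb) to the coarsest.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # 2**17 = 128 kb, the size of the smallest bins
BIN_NEXT_SHIFT = 3    # each level up covers 8 times more sequence


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the 0-based, half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is out of bounds for the standard scheme")


# The example row in the schema (chr1:16230-16306) falls in bin 585.
print(bin_from_range(16230, 16306))
```

Range queries can then restrict the search to the bins that could overlap the region of interest before comparing chromStart and chromEnd.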

Sample Rows
 
bin  chrom  chromStart  chromEnd  name  score  strand  signalValue  pValue  qValue
585  chr1   16230       16306     .     0      .       1871         -1      -1
585  chr1   29385       29452     .     0      .       11017.3      -1      -1
585  chr1   55913       55988     .     0      .       402.75       -1      -1
585  chr1   69442       69578     .     0      .       318.958      -1      -1
585  chr1   82166       82272     .     0      .       381.224      -1      -1
585  chr1   89752       89808     .     0      .       464.5        -1      -1
585  chr1   91497       91590     .     0      .       343.15       -1      -1
586  chr1   243753      243832    .     0      .       606.25       -1      -1
589  chr1   537465      537552    .     0      .       430.447      -1      -1
589  chr1   564589      564665    .     0      .       12469.5      -1      -1
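Rows like these can be read into typed records. The sketch below assumes a tab-separated dump with the ten columns in schema order; the class name `Transfrag` and the function `parse_row` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transfrag:
    """One row of the table, with fields in schema order."""
    bin: int
    chrom: str
    chrom_start: int   # 0-based start
    chrom_end: int
    name: str
    score: int
    strand: str
    signal_value: float
    p_value: float     # -log10 scale; -1 means "not used"
    q_value: float     # -log10 scale; -1 means "not used"


def parse_row(line: str) -> Transfrag:
    """Parse one tab-separated row (assumed layout: the ten schema columns)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    return Transfrag(
        int(fields[0]), fields[1], int(fields[2]), int(fields[3]), fields[4],
        int(fields[5]), fields[6], float(fields[7]), float(fields[8]), float(fields[9]),
    )


row = parse_row("585\tchr1\t16230\t16306\t.\t0\t.\t1871\t-1\t-1")
print(row.chrom, row.chrom_start, row.chrom_end, row.signal_value)
```

In this table pValue and qValue are always -1, so only signalValue carries quantitative information.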

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
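As a small illustration of the coordinate convention, the sketch below converts a stored interval to the 1-based form usually shown in position boxes. It assumes the usual UCSC convention that intervals are 0-based and half-open (chromEnd is not included); the function names are illustrative.

```python
def to_one_based(chrom: str, chrom_start: int, chrom_end: int) -> str:
    """Format a 0-based, half-open interval as a 1-based, closed position string."""
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def interval_length(chrom_start: int, chrom_end: int) -> int:
    """Length in bases of a 0-based, half-open interval."""
    return chrom_end - chrom_start


print(to_one_based("chr1", 16230, 16306))   # chr1:16231-16306
print(interval_length(16230, 16306))        # 76
```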

Affy RNA Loc (wgEncodeAffyRnaChip) Track Description
 

Description

This track is produced as part of the ENCODE Transcriptome Project. Transcription of different RNA extracts from different sub-cellular localizations in different cell lines is compared in companion experiments using three different technologies: tiling arrays, RNA-seq using Solexa, and RNA-seq using SOLiD. The tiling array data are shown in this track. The Transfrags data are lifted over from the hg18 assembly. The Raw Transfrags are available for download only. Other views are available on the hg18 assembly.

Display Conventions and Configuration

To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide.

Transfrags
The filtered transfrags view excludes repeats and other known annotations, including tRNAs and rRNAs, mi/snoRNAs, sequences mapping to the mitochondrial or Y chromosomes, and many predicted snoRNAs and miRNAs.
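The kind of filtering described here can be sketched as an interval-overlap test. This is only an illustration under stated assumptions: the actual filtering pipeline is not described on this page, the annotation intervals below are made up, and `filter_transfrags` is a hypothetical helper.

```python
def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two 0-based, half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def filter_transfrags(transfrags, annotations, excluded_chroms=("chrM", "chrY")):
    """Drop transfrags on excluded chromosomes or overlapping any annotation.

    Both inputs are lists of (chrom, start, end) tuples.
    """
    kept = []
    for chrom, start, end in transfrags:
        if chrom in excluded_chroms:
            continue
        if any(c == chrom and overlaps(start, end, s, e) for c, s, e in annotations):
            continue
        kept.append((chrom, start, end))
    return kept


# Hypothetical inputs: two transfrags from the sample rows plus one on chrY,
# and one invented annotation interval standing in for a repeat or known RNA.
raw = [("chr1", 16230, 16306), ("chr1", 29385, 29452), ("chrY", 100, 200)]
known = [("chr1", 29400, 29500)]
filtered = filter_transfrags(raw, known)
print(filtered)
```

For genome-scale data an indexed structure (for example, sorted intervals or the bin field) would replace the linear scan used here.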

Methods

Cells were grown according to the approved ENCODE cell culture protocols. RNA molecules longer than 200 nt and present in RNA populations isolated from different subcellular compartments (such as cytosol, nucleus, polysomes and others) were fractionated into polyA+ and polyA- fractions as described in these protocols. Each RNA fraction was converted into double-stranded cDNA using random hexamers, labeled, and hybridized to a tiling 91-array set containing probes against the non-repetitive portion of the human genome, tiled on average every 5 bp (measured center-to-center between consecutive 25-mers).

All arrays were scaled to a median array intensity of 330. Within a sliding 61 bp window centered on each probe, an estimate of RNA abundance was found by calculating the median of all pairwise average PM-MM values, where PM is a perfect match and MM is a mismatch. Kapranov et al. (2002), Cheng et al. (2005), Kapranov et al. (2007), and Cawley et al. (2004) are good references for the experimental methods. Cawley et al. also describe the analytical methods.
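The scaling and windowed estimate can be sketched as follows. The median of all pairwise averages is the Hodges-Lehmann estimator (pseudomedian). This sketch assumes that self-pairs are included among the pairwise averages and that probe positions are probe centers; neither detail is stated above, and all function names are illustrative.

```python
from itertools import combinations_with_replacement
from statistics import median

TARGET_MEDIAN = 330.0  # median array intensity after scaling
WINDOW_BP = 61         # sliding window width, centered on each probe


def scale_array(intensities):
    """Scale raw intensities so that the array median equals TARGET_MEDIAN."""
    factor = TARGET_MEDIAN / median(intensities)
    return [x * factor for x in intensities]


def pseudomedian(values):
    """Median of all pairwise averages (self-pairs included: an assumption)."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(values, 2)]
    return median(walsh)


def window_signal(positions, pm_minus_mm, center):
    """Pseudomedian of PM-MM values for probes within the window around `center`."""
    half = WINDOW_BP // 2
    in_window = [d for p, d in zip(positions, pm_minus_mm) if abs(p - center) <= half]
    return pseudomedian(in_window)


# Toy data: probes every 5 bp with a constant PM-MM difference of 10.
positions = list(range(0, 200, 5))
diffs = [10.0] * len(positions)
print(window_signal(positions, diffs, center=100))
print(pseudomedian([1, 2, 3]))
```

With probes spaced about 5 bp apart, a 61 bp window holds roughly 13 probes, so the pairwise step stays cheap per window.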

Verification

The reproducibility of the labeling method was assessed separately. Three independent technical replicates were generated from the same RNA pool for each RNA preparation and hybridized to duplicate arrays (two technical replicates) that contain the ENCODE regions. Labeled RNA samples were then pooled and hybridized to the tiling 91-array set spanning the whole genome. Transcribed regions (transfrags; see the Raw Transfrags view) were generated from the Raw Signal by merging genomic positions to which probes are mapped. This merging was based on a 5% false positive rate cutoff in negative bacterial controls, a maximum gap (MaxGap) of 40 base pairs, and a minimum run (MinRun) of 40 base pairs.
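The merging step can be sketched as below. This is one plausible reading of MaxGap and MinRun, not the exact definitions used by the analysis software: the gap is taken as the distance between consecutive positive probe positions, and the run length as the distance from the first to the last positive probe. The signal threshold is assumed to have been derived separately from the negative bacterial controls, and `call_transfrags` is a hypothetical name.

```python
def call_transfrags(positions, signals, threshold, max_gap=40, min_run=40):
    """Merge positive probes into (first_pos, last_pos) runs.

    positions: sorted probe positions; signals: per-probe signal estimates.
    A probe is positive if its signal is at or above `threshold`.
    """
    runs = []
    current = None  # [first_pos, last_pos] of the run being extended
    for pos, sig in zip(positions, signals):
        if sig < threshold:
            continue
        if current is not None and pos - current[1] <= max_gap:
            current[1] = pos          # close enough: extend the current run
        else:
            if current is not None:
                runs.append(tuple(current))
            current = [pos, pos]      # start a new run
    if current is not None:
        runs.append(tuple(current))
    # Keep only runs that span at least min_run base pairs.
    return [(s, e) for s, e in runs if e - s >= min_run]


# Toy data: a block of positive probes at 0..50 and an isolated one at 150.
positions = list(range(0, 200, 5))
signals = [2.0 if (p <= 50 or p == 150) else 0.0 for p in positions]
transfrags = call_transfrags(positions, signals, threshold=1.0)
print(transfrags)
```

In the toy data the isolated positive probe at position 150 is dropped because it does not form a run of at least 40 base pairs.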

Release Notes

The track data were originally computed on the Human March 2006 assembly (hg18); the coordinates of the Transfrags were transformed to this assembly using UCSC's liftOver program.

Credits

These data were generated and analyzed by the transcriptome group at Affymetrix and Cold Spring Harbor Laboratory: P. Kapranov, I. Bell, E. Dumais, J. Drenkow, J. Dumais, N. Garg, M. Lubinsky, Carrie A. Davis, Huaien Wang, Kimberly Bell, Jorg Drenkow, Chris Zaleski, and Thomas R. Gingeras.

Contact: Tom Gingeras

References

Cawley S, Bekiranov S, Ng HH, Kapranov P, Sekinger EA, Kampa D, Piccolboni A, Sementchenko V, Cheng J, Williams AJ et al. Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell. 2004 Feb 20;116(4):499-509.

Cheng J, Kapranov P, Drenkow J, Dike S, Brubaker S, Patel S, Long J, Stern D, Tammana H, Helt G et al. Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science. 2005 May 20;308(5725):1149-54.

Kapranov P, Cawley SE, Drenkow J, Bekiranov S, Strausberg RL, Fodor SP, Gingeras TR. Large-scale transcriptional activity in chromosomes 21 and 22. Science. 2002 May 3;296(5569):916-9.

Kapranov P, Cheng J, Dike S, Nix DA, Duttagupta R, Willingham AT, Stadler PF, Hertel J, Hackermüller J, Hofacker IL et al. RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science. 2007 Jun 8;316(5830):1484-8.

Publications

Djebali S, Lagarde J, Kapranov P, Lacroix V, Borel C, Mudge JM, Howald C, Foissac S, Ucla C, Chrast J et al. Evidence for transcript networks composed of chimeric RNAs in human cells. PLoS One. 2012;7(1):e28213.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.