Schema for RIKEN CAGE Loc - ENCODE RIKEN RNA Subcellular Localization by CAGE Tags
  Database: hg18    Primary Table: wgEncodeRikenCageMinusClustersNhekCytosolLongnonpolya    Row Count: 264,205   Data last updated: 2010-01-20
Format description: bed-like graphing data
On download server: MariaDB table dump directory
field       example    SQL type              info    description
bin         585        smallint(5) unsigned  range   Indexing field to speed chromosome range queries.
chrom       chr1       varchar(255)          values  Reference sequence chromosome or scaffold
chromStart  4268       int(10) unsigned      range   Start position in chromosome
chromEnd    4305       int(10) unsigned      range   End position in chromosome
dataValue   0.143858   float                 range   data value for this range
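
The bin field can be recomputed from chromStart and chromEnd. The following is a minimal sketch, assuming this table uses the standard UCSC binning scheme (five levels, smallest bins of 128 kb, each level eight times larger); the function name ucsc_bin is illustrative and not part of the schema, and ranges beyond 512 Mb (the extended scheme) are not handled.

```python
# Standard UCSC binning scheme, assumed (not confirmed by this page) to be
# the one behind the `bin` column of this table.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]  # finest level first
BIN_FIRST_SHIFT = 17  # smallest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level is 2**3 = 8 times larger


def ucsc_bin(chrom_start: int, chrom_end: int) -> int:
    """Return the smallest bin that fully contains [chrom_start, chrom_end)."""
    start_bin = chrom_start >> BIN_FIRST_SHIFT
    end_bin = (chrom_end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # Range straddles a bin boundary at this level; try the coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("range exceeds 512 Mb; the extended scheme is not covered here")


print(ucsc_bin(4268, 4305))  # 585, matching the example row
```

A range query can then restrict the bin column to the few bins that may overlap the region of interest before comparing chromStart and chromEnd, which is why the field speeds up lookups.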

Sample Rows
 
bin  chrom  chromStart  chromEnd  dataValue
585  chr1   4268        4305      0.143858
585  chr1   6738        6765      0.071929
585  chr1   6886        6913      0.015735
585  chr1   7492        7522      0.035965
585  chr1   7575        7602      0.013986
585  chr1   7659        7686      0.017982
585  chr1   8218        8245      0.017982
585  chr1   8422        8449      0.031469
585  chr1   14162       14189     0.075526
585  chr1   14701       14728     0.020979

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
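
As an illustration of the coordinate note, the sketch below parses one row and converts it to a 1-based position string. It assumes the table dump is tab-separated and that chromEnd is exclusive (the usual half-open convention, so only the start shifts by one); Row, parse_row, to_one_based and span_length are hypothetical helper names.

```python
from typing import NamedTuple


class Row(NamedTuple):
    bin: int
    chrom: str
    chrom_start: int  # 0-based, inclusive
    chrom_end: int    # exclusive; numerically equal to the 1-based inclusive end
    data_value: float


def parse_row(line: str) -> Row:
    """Parse one tab-separated line (assumed dump format) into a Row."""
    bin_, chrom, start, end, value = line.rstrip("\n").split("\t")
    return Row(int(bin_), chrom, int(start), int(end), float(value))


def to_one_based(row: Row) -> str:
    """Format the range as a 1-based, fully inclusive position string."""
    return f"{row.chrom}:{row.chrom_start + 1}-{row.chrom_end}"


def span_length(row: Row) -> int:
    """Length in bases; with half-open coordinates this is end minus start."""
    return row.chrom_end - row.chrom_start


row = parse_row("585\tchr1\t4268\t4305\t0.143858")
print(to_one_based(row))  # chr1:4269-4305
print(span_length(row))   # 37
```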

RIKEN CAGE Loc (wgEncodeRikenCage) Track Description
 

Description

This track shows 5' cap analysis gene expression (CAGE) tags and clusters in RNA extracts from different sub-cellular localizations in multiple cell lines. A CAGE cluster is a region of overlapping tags with an assigned value that represents the expression level. The data in this track were produced as part of the ENCODE Transcriptome Project.
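
The idea of a cluster as a region of overlapping tags can be sketched as an interval merge per chromosome and strand. This is only an illustration: the exact clustering rule used to build the track (for example, any minimum overlap) is not stated here, so the sketch assumes that any overlap of at least one base joins two tags; cluster_tags and the example coordinates are made up.

```python
def cluster_tags(tags):
    """Merge overlapping tags into clusters.

    tags: iterable of (chrom, strand, start, end), 0-based half-open.
    Returns a list of (chrom, strand, start, end, tag_count) tuples.
    Assumption: tags on the same chromosome and strand that overlap by
    at least one base belong to the same cluster.
    """
    clusters = []
    # Sorting groups tags by chromosome and strand, then orders them by start.
    for chrom, strand, start, end in sorted(tags):
        last = clusters[-1] if clusters else None
        if last and last[0] == chrom and last[1] == strand and start < last[3]:
            last[3] = max(last[3], end)  # extend the open cluster
            last[4] += 1                 # count one more supporting tag
        else:
            clusters.append([chrom, strand, start, end, 1])
    return [tuple(c) for c in clusters]


# Hypothetical tags, for illustration only.
example_tags = [
    ("chr1", "-", 4268, 4295),
    ("chr1", "-", 4278, 4305),
    ("chr1", "-", 6738, 6765),
    ("chr1", "+", 4270, 4297),
]
for cluster in cluster_tags(example_tags):
    print(cluster)
```

The tag count per cluster is what an expression value would be derived from.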

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here.

To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide.

This track contains the following views:

Plus and Minus Clusters
These views display clusters of overlapping read mappings on the forward and reverse genomic strands.
Alignments
The Alignments view shows the individual tags (read mappings), with mismatches from the genomic reference highlighted.

Color differences in subtracks are used as a visual cue to distinguish between the different cell types, and between annotations on the plus and minus strands.

Methods

Cells were grown according to the approved ENCODE cell culture protocols. RNA molecules longer than 200 nt and present in the RNA population isolated from each subcellular compartment were fractionated into polyA+ and polyA- fractions as described in these protocols. The CAGE tags were sequenced from the 5' ends of cap-trapped cDNAs produced using RIKEN CAGE technology (Kodzius et al. 2006; Valen et al. 2009). To create the tag, a linker was attached to the 5' end of polyA+ or polyA- reverse-transcribed cDNAs which were selected by cap trapping (Carninci et al. 1996). The first 27 bp of the cDNA were cleaved using class II restriction enzymes. A linker was then attached to the 3' end of the cDNA.

After PCR amplification, the tags were sequenced (36 bp single reads) using ABI SOLiD technology (polyA- RNA from the cytosol and nucleus of K562 cell lines, and from whole cell in prostate cells) or Illumina/Solexa GA (all other data). Tags were mapped to the human genome (NCBI Build36, hg18) using the program nexalign (T. Lassmann, manuscript in preparation). SOLiD CAGE sequences were mapped with up to 3 mismatches; up to 2 mismatches were allowed for Solexa CAGE. Alignments of sequences mapping 10 times or fewer were retained. The expression level of a cluster was computed as the number of reads making up the cluster, divided by the total number of reads sequenced, multiplied by 1 million.
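
The filtering thresholds and the expression formula above can be written out as follows. This is a sketch under stated assumptions, not the pipeline's actual code: the names keep_alignment and tags_per_million, the platform keys, and the example read counts are hypothetical, and "total number of reads sequenced" is taken as a single number supplied by the caller.

```python
# Thresholds taken from the Methods text; the dictionary keys are illustrative.
MAX_MISMATCHES = {"SOLiD": 3, "Solexa": 2}
MAX_GENOMIC_HITS = 10  # sequences mapping more than 10 times are discarded


def keep_alignment(platform: str, mismatches: int, genomic_hits: int) -> bool:
    """Apply the mismatch and multi-mapping limits described in the text."""
    return (mismatches <= MAX_MISMATCHES[platform]
            and genomic_hits <= MAX_GENOMIC_HITS)


def tags_per_million(cluster_reads: int, total_reads: int) -> float:
    """Expression level: reads in the cluster per million reads sequenced."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    # Multiply before dividing so integer inputs keep full precision.
    return cluster_reads * 1_000_000 / total_reads


# Hypothetical counts: 5 reads in a cluster out of 10 million sequenced.
print(tags_per_million(5, 10_000_000))   # 0.5
print(keep_alignment("SOLiD", 3, 10))    # True
print(keep_alignment("Solexa", 3, 1))    # False
```

Because the value is normalized by sequencing depth, dataValue entries such as 0.143858 are fractional rather than raw read counts.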

Release Notes

This is Release 2 of this track. This release adds data for eight new cell-type/compartment combinations (GM12878 nucleus, H1-hESC whole cell, HepG2 cytosol/nucleus/nucleolus, HUVEC cytosol, and NHEK cytosol/nucleus).

Credits

These data were generated and analyzed by Timo Lassmann, Phil Kapranov, Hazuki Takahashi, Yoshihide Hayashizaki, Carrie Davis, Tom Gingeras, and Piero Carninci.

Contact: Piero Carninci at RIKEN Omics Science Center

References

Kodzius R, Kojima M, Nishiyori H, Nakamura M, Fukuda S, Tagami M, Sasaki D, Imamura K, Kai C, Harbers M, et al. CAGE: cap analysis of gene expression. Nat Methods. 2006 March 1; 3(3):211-222.

Valen E, Pascarella G, Chalk A, Maeda N, Kojima M, Kawazu C, Murata M, Nishiyori H, Lazarevic D, Motti D, et al. Genome-wide detection and analysis of hippocampus core promoters using DeepCAGE. Genome Res. 2009 February; 19(2):255-265.

Carninci P, Kvam C, Kitamura A, Ohsumi T, Okazaki Y, Itoh M, Kamiya M, Shibata K, Sasaki N, Izawa M, et al. High-efficiency full-length cDNA cloning by biotinylated CAP trapper. Genomics. 1996 November 1; 37(3):327-336.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.