This track shows 5' cap analysis gene expression (CAGE) tags and clusters in
A CAGE cluster is a region of overlapping tags with
an assigned value that represents the expression level.
The data in this track were produced as part of the ENCODE
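The notion of a cluster as a region of overlapping tags can be illustrated with a short sketch. The track documentation does not spell out the exact clustering procedure, so the following is only a minimal illustration under stated assumptions: tags are given as 0-based half-open intervals, two tags belong to the same cluster when they share at least one base on the same chromosome and strand, and the function name `cluster_tags` is hypothetical.

```python
from collections import defaultdict


def cluster_tags(tags):
    """Merge overlapping tags into clusters, separately per chromosome and strand.

    tags: iterable of (chrom, start, end, strand) with 0-based, half-open
          coordinates (an assumption made for this sketch).
    Returns a sorted list of (chrom, start, end, strand, tag_count).
    """
    by_key = defaultdict(list)
    for chrom, start, end, strand in tags:
        by_key[(chrom, strand)].append((start, end))

    clusters = []
    for (chrom, strand), intervals in sorted(by_key.items()):
        intervals.sort()
        cur_start, cur_end = intervals[0]
        count = 1
        for start, end in intervals[1:]:
            if start < cur_end:
                # Tag overlaps the current cluster by at least one base: extend it.
                cur_end = max(cur_end, end)
                count += 1
            else:
                # No overlap: close the current cluster and start a new one.
                clusters.append((chrom, cur_start, cur_end, strand, count))
                cur_start, cur_end, count = start, end, 1
        clusters.append((chrom, cur_start, cur_end, strand, count))
    return clusters


example_tags = [
    ("chr1", 100, 127, "+"),
    ("chr1", 110, 137, "+"),
    ("chr1", 200, 227, "+"),
    ("chr1", 105, 132, "-"),
]
example_clusters = cluster_tags(example_tags)
```

In this example the two overlapping plus-strand tags form one cluster, while the isolated plus-strand tag and the minus-strand tag each form their own, which mirrors the separate Plus and Minus cluster views.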
Display Conventions and Configuration
This track is a multi-view composite track that contains multiple data types
(views). For each view, there are multiple subtracks that
display individually on the browser. Instructions for configuring multi-view
To show only selected subtracks, uncheck the boxes next to the tracks that
you wish to hide.
This track contains the following views:
- Plus and Minus Clusters: these views display clusters of overlapping
read mappings on the forward and reverse genomic strands.
- Alignments: this view shows the individual tags (read mappings), with
mismatches from the genomic reference highlighted.
Color differences in subtracks are used as a visual cue to
distinguish between the different cell types, and between annotations
on the plus and minus strand.
Cells were grown according to the approved
ENCODE cell culture protocols.
RNA molecules longer than 200 nt
and present in the RNA population isolated from each subcellular compartment
were fractionated into polyA+ and polyA- fractions as described in
The CAGE tags were sequenced from the 5' ends of cap-trapped cDNAs produced
using RIKEN CAGE technology
(Kodzius et al. 2006; Valen et al. 2009).
To create the tag, a linker was attached to the 5'
end of polyA+ or polyA- reverse-transcribed cDNAs which were selected by cap
trapping (Carninci et al. 1996). The first 27 bp of
the cDNA were cleaved using class II restriction enzymes. A linker was then
attached to the 3' end of the cDNA.
After PCR amplification,
the tags were sequenced (36 bp single reads) using ABI
(polyA- RNA from the cytosol and nucleus of K562 cells, and from the whole-cell fraction of prostate cells)
or Illumina/Solexa GA (all other data).
Tags were mapped to the human genome
(NCBI Build 36, hg18) using the program nexalign
(T. Lassmann, manuscript in preparation).
SOLiD CAGE sequences were mapped with up to 3 mismatches; up to 2 mismatches were allowed for Solexa CAGE sequences.
Alignments of sequences mapping 10 times or fewer were retained.
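The mapping thresholds above can be expressed as a simple filter. This is a sketch only: it does not reproduce how nexalign applies these limits internally, and the names `MAX_MISMATCHES`, `MAX_HITS`, and `keep_alignment`, as well as the platform keys, are hypothetical choices made for illustration.

```python
# Per-platform mismatch limits and the multi-mapping limit stated in the text.
MAX_MISMATCHES = {"solid": 3, "solexa": 2}
MAX_HITS = 10


def keep_alignment(platform, mismatches, hit_count):
    """Return True if an alignment passes the stated thresholds.

    platform:   "solid" or "solexa" (hypothetical labels for this sketch)
    mismatches: number of mismatches against the reference genome
    hit_count:  number of genomic locations the sequence maps to
    """
    return mismatches <= MAX_MISMATCHES[platform] and hit_count <= MAX_HITS
```

For instance, a Solexa tag with 3 mismatches would be rejected under these rules, while a SOLiD tag with 3 mismatches that maps to 10 locations would be kept.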
The expression level was computed as the number of reads making up the cluster,
divided by the total number of reads sequenced, times 1 million.
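The expression value described above is a tags-per-million normalization. A minimal sketch of the arithmetic follows; the function name `tags_per_million` and the example counts are illustrative and are not taken from the track data.

```python
def tags_per_million(cluster_reads, total_reads):
    """Expression level of a cluster: reads in the cluster, divided by the
    total number of reads sequenced, times 1 million."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    # Multiply before dividing to limit floating-point rounding error.
    return cluster_reads * 1_000_000 / total_reads


# Illustrative numbers: a cluster of 50 reads in a library of 2,000,000 reads.
example_level = tags_per_million(50, 2_000_000)
```

Normalizing by library size in this way makes cluster values comparable between experiments that were sequenced to different depths.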
This is Release 2 of this track. This release adds data for eight new cell-type/compartment combinations (GM12878 nucleus, H1-hESC whole cell, HepG2 cytosol/nucleus/nucleolus, HUVEC cytosol, and NHEK cytosol/nucleus).
These data were generated and analyzed by Timo Lassmann, Phil Kapranov,
Hazuki Takahashi, Yoshihide Hayashizaki, Carrie Davis, Tom Gingeras, and Piero Carninci.
Piero Carninci at
RIKEN Omics Science Center
Kodzius R, Kojima M, Nishiyori H, Nakamura M, Fukuda S, Tagami M, Sasaki D,
Imamura K, Kai C, Harbers M, et al.
CAGE: cap analysis of gene expression.
Nat Methods. 2006 March 1; 3(3):211-222.
Valen E, Pascarella G, Chalk A, Maeda N, Kojima M, Kawazu C, Murata M, Nishiyori
H, Lazarevic D, Motti D, et al.
Genome-wide detection and analysis of hippocampus core promoters using
Genome Res. 2009 February; 19(2):255-265.
Carninci P, Kvam C, Kitamura A, Ohsumi T, Okazaki Y, Itoh M, Kamiya M,
Shibata K, Sasaki N, Izawa M, et al.
High-efficiency full-length cDNA cloning by biotinylated CAP trapper.
Genomics. 1996 November 1; 37(3):327-336.
Data Release Policy
Data users may freely use ENCODE data, but may not, without prior
consent, submit publications that use an unpublished ENCODE dataset until
nine months following the release of the dataset. This date is listed in
the Restricted Until column, above. The full data release policy
for ENCODE is available