Helicos RNA-seq Track Settings
ENCODE Helicos RNA-seq

Subtrack            Description                                                       Restricted Until
K562cyto pA+ Sig    ENCODE Helicos RNA-seq Raw Signal (PolyA+ RNA in K562 cytosol)    2010-01-03
K562cyto pA+ Tag    ENCODE Helicos RNA-seq Tags (PolyA+ RNA in K562 cytosol)          2010-01-03

Description

This track depicts high-throughput sequencing of long RNAs (>200 nt) from whole-cell RNA samples from tissues, or from subcellular compartments of cell lines, included in the ENCODE Transcriptome subproject. The overall goal of the ENCODE project is to identify and characterize all functional elements in the sequence of the human genome. RNA-seq was performed by reverse-transcribing an RNA sample into cDNA, followed by high-throughput DNA sequencing of the cDNA, done here on the Helicos™ Genetic Analysis System (Harris et al., 2008; http://www.helicosbio.com/).

Display Conventions and Configuration

This is a multi-view track that provides the following views of the data:
Alignments
RNA-seq tag alignments.
Raw Signal
Density graph (wiggle) of the number of reads overlapping each nucleotide in the genome.
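As an illustration of what the Raw Signal view represents, the following minimal Python sketch counts how many reads overlap each nucleotide in a region. It is not the pipeline used to build this track. The function name `per_base_coverage` and the 0-based, half-open (start, end) interval convention are assumptions made for the example.

```python
from itertools import accumulate


def per_base_coverage(reads, region_start, region_end):
    """Count reads overlapping each base of [region_start, region_end).

    `reads` is an iterable of (start, end) pairs, assumed here to be
    0-based and half-open. Returns one count per base in the region.
    """
    length = region_end - region_start
    if length < 0:
        raise ValueError("region_end must be >= region_start")

    # Difference array: +1 where a read begins, -1 just past where it ends.
    deltas = [0] * (length + 1)
    for start, end in reads:
        clipped_start = max(start, region_start)
        clipped_end = min(end, region_end)
        if clipped_start >= clipped_end:
            continue  # read lies entirely outside the region
        deltas[clipped_start - region_start] += 1
        deltas[clipped_end - region_start] -= 1

    # The running sum of the deltas is the per-base read depth.
    return list(accumulate(deltas[:-1]))


# Two hypothetical reads that overlap at positions 103 and 104.
example_reads = [(100, 105), (103, 108)]
print(per_base_coverage(example_reads, 100, 110))
# [1, 1, 1, 2, 2, 1, 1, 1, 0, 0]
```

The difference-array approach touches each read once and each base once, which keeps the computation linear even when many reads pile up on the same locus.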

To show only selected subtracks, uncheck the boxes next to the tracks that you wish to hide. Color differences among the views are arbitrary. They provide a visual cue for distinguishing between the different cell types and compartments.

Note that the strand of the RNA is not displayed in the track in the genome browser. The strand can be found in the download file.

Methods

Cells were grown according to the approved ENCODE cell culture protocols. RNA molecules longer than 200 nt, present in RNA populations isolated from different subcellular compartments (such as cytosol, nucleus, polysomes and others), were fractionated into polyA+ and polyA- fractions as described in these protocols.

RNA was converted into first strand cDNA using a high excess of random hexamers without prior fragmentation. Spurious second-strand cDNA synthesis could occur under these conditions. The first strand cDNA molecules were tailed at the 3′ ends with polyA residues using terminal transferase and used directly for sequencing.

Filtered reads were aligned to the human genome using indexDPgenomic, the in-house, freely available Helicos alignment software (http://open.helicosbio.com/mwiki/index.php/Docs/Software/Bioinformatics#Executables; free registration required), with a minimum normalized alignment score of 4.5. The normalized score was defined as follows:

Score = (number of matches * 5) - (number of errors * 4)
Normalized score = Score / length of tag sequence

For example, in the following alignment:

Length of alignment block: 33
Length of tag sequence: 32
Number of matches: 31
Number of errors: 2
Score: (31*5) - (2*4) = 155 - 8 = 147
Normalized score = 147/32 = 4.59375
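The arithmetic in this example can be reproduced with a short Python sketch. The weights (5 per match, 4 per error) and the 4.5 cutoff are taken from the text above. The function and constant names are hypothetical and are not part of indexDPgenomic.

```python
MATCH_REWARD = 5            # points per matching base, per the example above
ERROR_PENALTY = 4           # points subtracted per error, per the example above
MIN_NORMALIZED_SCORE = 4.5  # alignment cutoff stated in the text


def normalized_score(matches, errors, tag_length):
    """Return the alignment score divided by the tag (read) length."""
    if tag_length <= 0:
        raise ValueError("tag_length must be positive")
    raw_score = matches * MATCH_REWARD - errors * ERROR_PENALTY
    return raw_score / tag_length


def passes_filter(matches, errors, tag_length):
    """True if the alignment meets the minimum normalized score."""
    return normalized_score(matches, errors, tag_length) >= MIN_NORMALIZED_SCORE


# Values from the worked example: 31 matches, 2 errors, 32-nt tag.
print(normalized_score(31, 2, 32))  # 4.59375
print(passes_filter(31, 2, 32))     # True
```

Note that the score is normalized by the length of the tag sequence (32), not by the length of the alignment block (33).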

Raw data can be obtained from Helicos (free registration required).

Verification

Known exon maps as displayed on the genome browser are confirmed by the alignment of sequence reads.

Credits

Helicos BioSciences: Philipp Kapranov, Eldar Giladi, Steve Roels, Chris Hart, Stan Letovsky, Patrice Milos.

Cold Spring Harbor Laboratory: Carrie Davis, Kim Bell, Huaien Wang, Tom Gingeras.

Contacts: Philipp Kapranov; Patrice Milos

References

Harris TD, Buzby PR, Babcock H, Beer E, Bowers J, Braslavsky I, Causey M, Colonell J, Dimeo J, Efcavitch JW, Giladi E, Gill J, Healy J, Jarosz M, Lapen D, Moulton K, Quake SR, Steinmann K, Thayer E, Tyurina A, Ward R, Weiss H, Xie Z. Single-molecule DNA sequencing of a viral genome. Science. 2008 Apr 4;320(5872):106-9.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.