Description
This track displays human-centric multiple sequence alignments in the
ENCODE regions for the 28 vertebrates included in the
September 2005 ENCODE MSA freeze,
based on comparative sequence data generated for the ENCODE project
as well as whole-genome assemblies residing at UCSC, as listed:
- human (May 2004, hg17)
- armadillo (NISC and May 2005 Broad Assisted Assembly v 1.0)
- baboon (NISC)
- chicken (Feb 2004, galGal2)
- chimp (Nov 2003, panTro1)
- colobus_monkey (NISC)
- cow (BCM)
- dog (July 2004, canFam1)
- dusky_titi (NISC)
- elephant (NISC and May 2005 Broad Assisted Assembly v 1.0)
- fugu (Aug 2002, fr1)
- galago (NISC)
- hedgehog (NISC)
- macaque (Jan 2005, rheMac1)
- marmoset (NISC)
- monodelphis (Oct 2004, monDom1)
- mouse (Mar 2005, mm6)
- mouse_lemur (NISC)
- owl_monkey (NISC)
- platypus (NISC and Aug 2005 Mullikin Phusion Assembly of WUGSC Traces)
- rabbit (NISC and May 2005 Broad Assisted Assembly v 1.0)
- rat (June 2003, rn3)
- rfbat (NISC)
- shrew (NISC and Sep 2005 Mullikin Phusion Assembly of Broad Traces)
- tenrec (Apr 2005 Mullikin Phusion Assembly of Broad Traces)
- tetraodon (Feb 2004, tetNig1)
- xenopus (Oct 2004, xenTro1)
- zebrafish (June 2004, danRer2)
The alignments in this track were generated using the
Threaded Blockset Aligner (TBA).
The Genome Browser companion tracks, TBA Cons and TBA Elements, display
conservation scoring and conserved elements for these alignments based on
various conservation methods.
Display Conventions and Configuration
In full display mode, this track shows the pairwise alignment
of each species to the human genome.
In dense mode, the alignments are depicted using a gray-scale
density gradient. The checkboxes in the track configuration section allow
the exclusion of species from the pairwise display.
When zoomed in to the base-display level, the track shows the base
composition of each alignment. The numbers and symbols on the
"Gaps" line indicate the lengths of gaps in the human sequence at those
alignment positions relative to the longest non-human sequence. If there is
sufficient space in the display, the size of the gap is shown. If there is
not, a "*" is displayed when the gap size is a multiple of 3,
and a "+" otherwise.
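The "Gaps" line rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Genome Browser's own code: the function name gap_label and the cells_available parameter are hypothetical, and "sufficient space" is assumed here to mean that the digits of the gap size fit in the available character cells.

```python
def gap_label(gap_size: int, cells_available: int) -> str:
    """Return the text drawn on the "Gaps" line for one human-sequence gap.

    Assumption (not from the track documentation): "sufficient space" means
    the decimal digits of gap_size fit within cells_available character cells.
    """
    if gap_size <= 0:
        raise ValueError("gap_size must be a positive number of bases")
    digits = str(gap_size)
    if len(digits) <= cells_available:
        # Enough room: show the gap size itself.
        return digits
    # Not enough room: "*" marks a multiple of 3, "+" anything else.
    return "*" if gap_size % 3 == 0 else "+"


labels = [gap_label(6, 1), gap_label(12, 1), gap_label(10, 1), gap_label(12, 2)]
```

With one cell available, a 6-base gap is shown as its size, a 12-base gap collapses to "*", and a 10-base gap collapses to "+". Gaps whose length is a multiple of 3 are flagged separately because they leave a coding reading frame intact.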
To view detailed information about the
alignments at a specific position, zoom in until the display spans 30,000 or
fewer bases, then click on the alignment.
Methods
TBA was used to align the sequences in the September 2005 ENCODE sequence data
freeze. Multiple alignments were seeded from a series of combinatorial pairwise
blastz alignments that were not referenced to any one species. The specific
combinations were determined by the species guide tree.
The resulting multiple alignments were projected onto the human reference
sequence.
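As a rough illustration of what projecting onto a reference can mean, the toy Python sketch below keeps only the alignment blocks that contain a human row, puts the human row first, and orders the blocks by human start coordinate. This is a simplified reading under stated assumptions, not the TBA implementation: the Block class, the project function, and the sample coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    # Hypothetical model of one alignment block:
    # species name -> (start coordinate in that species, gapped aligned text)
    rows: dict


def project(blocks, ref="human"):
    """Toy projection: keep blocks containing ref, ref row first, sorted by ref start."""
    kept = [b for b in blocks if ref in b.rows]
    kept.sort(key=lambda b: b.rows[ref][0])
    projected = []
    for b in kept:
        # Rebuild the row mapping so the reference species comes first.
        ordered = {ref: b.rows[ref]}
        ordered.update((name, row) for name, row in b.rows.items() if name != ref)
        projected.append(Block(ordered))
    return projected


# Made-up blocks for illustration; the second has no human row and is dropped.
blocks = [
    Block({"mouse": (500, "ACGT"), "human": (200, "AC-T")}),
    Block({"mouse": (900, "GGCA"), "rat": (40, "GGCA")}),
    Block({"human": (100, "TTGA"), "chicken": (7, "TT-A")}),
]
result = project(blocks)
```

In this sketch, columns where the human row contains a gap are kept rather than removed, which is consistent with the "Gaps" line reporting gaps in the human sequence.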
Credits
The TBA multiple alignments were created by Elliott Margulies of NHGRI,
while at the Green Lab.
The programs Blastz and TBA, which were used to generate the alignments, were
provided by Minmei Hou, Scott Schwartz and Webb Miller of the
Penn State Bioinformatics Group.
The phylogenetic tree is based on Murphy et al. (2001).
References
Blanchette M, Kent WJ, Reimer C, Elnitski L, Smit A,
Roskin K, Baertsch R, Rosenbloom KR, Clawson H et al.
Aligning Multiple Genomic Sequences With the Threaded Blockset
Aligner.
Genome Res. 2004;14(4):708-15.
Chiaromonte F, Yap VB, Miller W.
Scoring pairwise genomic sequence alignments.
Pac Symp Biocomput. 2002;115-26.
Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E,
Ryder OA, Stanhope MJ, de Jong WW et al.
Resolution of the early placental mammal radiation using Bayesian phylogenetics.
Science. 2001;294(5550):2348-51.
Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison R, Haussler D, Miller W.
Human-Mouse Alignments with BLASTZ.
Genome Res. 2003;13(1):103-7.