The March 2006 chimpanzee (Pan troglodytes) browser displays
data from the 6X whole genome shotgun draft assembly (Build
2 Version 1, Oct. 2005) produced by the
Chimpanzee Sequencing and Analysis Consortium.
This assembly contains sequence from the initial 4X chimpanzee assembly
described and analyzed in Nature (The Chimpanzee Sequencing
and Analysis Consortium, 2005),
with additional 2X sequence generated, assembled, and assigned to
chromosomes by the
Genome Sequencing Center of Washington University School of
Medicine, St. Louis, MO, USA.
This assembly uses a new chromosomal numbering scheme that reflects
orthology between the human and chimpanzee chromosomes. For details, see
the Assembly details section below and the Genome Browser
FAQ. To read more about the
chimpanzee assembly, see the Washington University in St. Louis School of
Medicine Pan troglodytes web page and the National Institutes
of Health (NIH) News summary of the chimpanzee analysis paper.
The chimpanzee, the species most closely related to humans, is
endangered and is therefore the focus of multiple
conservation efforts.
Sample position queries
A genome position can be specified by the HUGO Gene Nomenclature Committee gene
name of a human RefSeq, the accession number of an EST or mRNA,
a chromosomal coordinate range, or keywords from
the GenBank description of an mRNA. The following list shows
examples of valid position queries for the chimpanzee genome.
See the
User's Guide for more information.
| Request: | Genome Browser Response: |
| chr22 | Displays all of chromosome 22 |
| chr2a:11,250,001-12,250,000 | Displays a million bases of chromosome 2a, beginning at base 11,250,001. Note that chromosome 2 in this assembly has been split into two parts: 2a and 2b. |
| chr2a:11,250,001+2000 | Displays a region of chromosome 2a that spans 2000 bases, starting at position 11,250,001 |
| BRCA1 | Displays a list of genomic regions where human RefSeq gene BRCA1 (or features associated with BRCA1) aligns |
| AF115459 | Displays the region of the genome containing the mRNA with GenBank accession number AF115459 |
| 348 | Displays the region of the genome with Entrez Gene identifier 348 |
| pseudogene mRNA | Lists transcribed pseudogenes, but not cDNAs |
| sialic acid | Lists mRNAs and RefSeqs with GenBank keywords sialic acid |
| huntington | Lists mRNAs associated with Huntington's disease |
| Paabo,S. | Lists mRNAs deposited by co-author S. Paabo |
Use this last format for author queries. Although GenBank
requires the search format Paabo S, internally it uses
the format Paabo,S.
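The coordinate-style requests above (whole chromosome, start-end range, and start+length) can be told apart from gene names, accessions, and keywords with a small parser. The following is only a sketch: the function name parse_position, the regular expressions, and the reading of "+2000" as an inclusive span of 2000 bases are assumptions made for illustration, not the Genome Browser's actual implementation.

```python
import re

# Hypothetical patterns for the three coordinate forms shown in the table.
_NUM = r"(\d[\d,]*)"
_RANGE = re.compile(r"^(chr\w+):" + _NUM + r"-" + _NUM + r"$")
_SPAN = re.compile(r"^(chr\w+):" + _NUM + r"\+" + _NUM + r"$")
_CHROM = re.compile(r"^chr\w+$")


def _to_int(text):
    """Convert a number such as '11,250,001' to an int."""
    return int(text.replace(",", ""))


def parse_position(query):
    """Return (chrom, start, end) for a coordinate query, else None.

    Coordinates are treated as 1-based and inclusive (an assumption).
    A bare chromosome name yields (chrom, None, None).
    """
    q = query.strip()
    m = _RANGE.match(q)
    if m:
        return m.group(1), _to_int(m.group(2)), _to_int(m.group(3))
    m = _SPAN.match(q)
    if m:
        start = _to_int(m.group(2))
        length = _to_int(m.group(3))
        # Assumed: "+N" means N bases beginning at start.
        return m.group(1), start, start + length - 1
    if _CHROM.match(q):
        return q, None, None
    # Gene names, accessions, identifiers, and keywords are not coordinates.
    return None
```

Under these assumptions, parse_position("chr2a:11,250,001-12,250,000") covers exactly one million bases, and queries such as BRCA1 or 348 return None and would be handed to a name or keyword search instead.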
Assembly details
This assembly covers about 97 percent of the genome and is based on 6X
sequence coverage. It is composed of 265,882 contigs with an N50
length of 29 kb and 44,460 supercontigs with an N50 length of 9.7
Mb. The total contig length, not including estimated gap sizes, is
2.97 Gb. Of that total, 2.82 Gb of sequence have been ordered and oriented
along specific chimpanzee chromosomes, 107 Mb have been placed in chr*_random,
and 50 Mb remain in chrUn.
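For readers unfamiliar with the N50 statistic quoted above, the sketch below uses the common definition: the length L such that sequences of length L or longer account for at least half of the total length. The function name n50 and the toy input are illustrative assumptions; the size breakdown simply restates the figures given in the text, which are rounded and therefore only approximately additive.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one length")


# Size breakdown quoted in the text, in Gb (rounded values).
placed_gb, random_gb, unplaced_gb = 2.82, 0.107, 0.050
total_gb = 2.97

# Share of the total contig length ordered and oriented on chromosomes.
placed_fraction = placed_gb / total_gb
```

For example, n50([2, 3, 4, 5, 6, 7, 8, 9, 10]) returns 8, because the three longest sequences (10, 9, 8) sum to 27, half of the total of 54. With the quoted figures, placed_fraction comes to roughly 0.95.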
The whole genome shotgun data were derived primarily from the donor
Clint, a captive-born male chimpanzee from the Yerkes Primate Research Center
in Atlanta, GA, USA. The reads were assembled with the whole-genome assembly
program PCAP (Huang, 2006), using
stringent parameters derived by eliminating detectable global
misassemblies larger than 50 kb (interchromosomal cross-overs determined by
alignment of the chimpanzee genome against the human genome).
The assembly data were aligned against the human genome at UCSC
utilizing BLASTZ (Schwartz, 2003) to align and score
non-repetitive chimpanzee regions against repeat-masked human
sequence. The alignment chains differentiated between orthologous and
paralogous alignments (Kent, 2003); only "reciprocal best"
alignments were retained in the alignment set. The chimpanzee AGP
files were generated from these alignments in a manner similar to that
described in The Chimpanzee Sequencing and Analysis Consortium (2005).
Centromeres were introduced into the chimpanzee sequence at the positions
of the centromeres in the human chromosomes. Ten documented
human inversions supported by the assembly were
introduced into the ordering, as was the separation of alignments to
human chromosome 2 into chimpanzee chromosomes 2a and 2b.
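The idea behind retaining only "reciprocal best" alignments can be illustrated on a toy set of scored pairs. This is a deliberately simplified sketch: the real procedure operates on alignment chains (Kent, 2003), whereas the function reciprocal_best, the identifiers, and the scores below are hypothetical and stand in for whole alignments.

```python
def reciprocal_best(hits):
    """Keep (chimp, human) pairs that are each other's top-scoring match.

    hits is an iterable of (chimp_id, human_id, score) tuples.
    """
    best_for_chimp = {}  # chimp_id -> (human_id, score)
    best_for_human = {}  # human_id -> (chimp_id, score)
    for chimp, human, score in hits:
        if chimp not in best_for_chimp or score > best_for_chimp[chimp][1]:
            best_for_chimp[chimp] = (human, score)
        if human not in best_for_human or score > best_for_human[human][1]:
            best_for_human[human] = (chimp, score)
    # A pair survives only if the preference holds in both directions.
    return sorted(
        (chimp, human)
        for chimp, (human, _) in best_for_chimp.items()
        if best_for_human[human][0] == chimp
    )


# Hypothetical scores: c2 prefers h1, but h1's best match is c1.
example_hits = [
    ("c1", "h1", 90),
    ("c1", "h2", 50),
    ("c2", "h1", 80),
    ("c2", "h2", 70),
    ("c3", "h2", 60),
]
pairs = reciprocal_best(example_hits)
```

In this example only the pair (c1, h1) is kept. Matches that are best in one direction only are dropped, which is the property that helps separate orthologous from paralogous alignments.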
The regions in the WGS assembly corresponding to the finished sequences for
chromosomes 21 and Y and
a 5-Mb finished region from chimpanzee chromosome 7 were replaced
with the corresponding finished AGPs/sequences. See the
Credits page for
acknowledgements for these chromosomal regions.
A major difference between this assembly and the previous Nov. 2003 version is
the chromosomal numbering scheme, which has been changed to reflect a
new standard that preserves orthology with human chromosomes.
Proposed by E.H. McConkey in 2004,
the new numbering convention was subsequently endorsed by the
International Chimpanzee Sequencing and Analysis Consortium.
This standard assigns the identifiers "2a" and "2b" to the
two chimp chromosomes that fused in the human genome to form chromosome 2. Note
that
the genome assembly shown in the Nov. 2003 panTro1 Genome Browser retains the
older numbering scheme in which these chromosomes are numbered 12 and 13. To
view a table showing the correspondence between human and chimp chromosomes,
see the FAQ.
Bulk downloads of the sequence and annotation data are available via
the Genome Browser FTP
server or the Downloads page.
The complete set of sequence reads is available at the
NCBI trace archive. These data have specific
conditions for use.
The chimpanzee browser annotation tracks were generated by UCSC and
collaborators worldwide. See the
Credits
page for a detailed list of the organizations and individuals who contributed
to this release.
References
The Chimpanzee Sequencing and Analysis Consortium.
Initial sequence of the chimpanzee genome and comparison with the
human genome. Nature 437(7055), 69-87 (2005).
Huang, X., Yang, S., Chinwalla, A.T., Hillier, L.W., Minx, P., Mardis, E.R.
and Wilson, R.K.
Application of a superword array in genome assembly.
Nucleic Acids Res. 34(1), 201-205 (2006).
Kent, W.J., Baertsch, R., Hinrichs, A., Miller, W. and Haussler, D.
Evolution's cauldron: duplication, deletion, and rearrangement in
the mouse and human genomes.
Proc. Natl. Acad. Sci. USA 100(20), 11484-11489 (2003).
McConkey, E.H.
Orthologous numbering of great ape and human chromosomes is
essential for comparative genomics.
Cytogenet. Genome Res. 105(1), 157-158 (2004).
Schwartz, S., Kent, W.J., Smit, A., Zhang, Z., Baertsch, R., Hardison, R.,
Haussler, D., and Miller, W.
Human-mouse alignments with BLASTZ.
Genome Res. 13(1), 103-107 (2003).