Chimp Deletions Track Settings
 
Deletions in Chimp (Nov. 2003/panTro1) Relative to Human

Assembly: Human July 2003 (NCBI34/hg16)
Data last updated at UCSC: 2004-02-19

Description

This track displays regions of the human genome assembly (hg16) that are deleted in the chimpanzee draft assembly (panTro1). Only regions between 80 and 12000 bases long are included. The name of each deletion consists of a unique identifier for that deletion, followed by an underscore and the deletion's length. A similar track, showing human deletions relative to the chimpanzee assembly, appears in the chimp Genome Browser.
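The naming convention described above can be unpacked with a few lines of Python. This is a minimal sketch, not part of the track's own tooling: the function names and the example name `del0001_250` are hypothetical, and the size bounds are assumed to be inclusive.

```python
MIN_LEN, MAX_LEN = 80, 12000  # size range stated for this track (assumed inclusive)


def parse_deletion_name(name):
    """Split an item name of the form '<identifier>_<length>'.

    The split is made at the last underscore, so identifiers that
    themselves contain underscores are still handled.
    """
    identifier, _, length_text = name.rpartition("_")
    if not identifier or not length_text.isdigit():
        raise ValueError(f"unexpected deletion name: {name!r}")
    return identifier, int(length_text)


def in_track_size_range(length):
    """Return True if a deletion length falls in the track's size range."""
    return MIN_LEN <= length <= MAX_LEN


# Hypothetical item name, used only for illustration.
identifier, length = parse_deletion_name("del0001_250")
print(identifier, length, in_track_size_range(length))
```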

Methods

The human/chimpanzee alignments were created at UCSC with blastz and blat, using a reciprocal-best strategy with chaining and netting. The initial alignments were generated using blastz on repeat-masked sequence with the following matrix:

       A    C    G    T
 A   100 -300 -150 -300
 C  -300  100 -300 -150
 G  -150 -300  100 -300
 T  -300 -150 -300  100

 O = 400, E = 30, K = 4500, L = 4500, M = 50

The overall score of an alignment is the sum of the scores over all aligned pairs.
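As an illustration of how such a score adds up, the sketch below scores a gapped pairwise alignment using the matrix above. It assumes, following the usual BLASTZ convention, that O and E are the gap-open and gap-extend penalties, so that a gap of k bases costs O + k*E; the other parameters (K, L, M) are not modeled. The function name is hypothetical and this is not the blastz implementation itself.

```python
BASES = "ACGT"
MATRIX = [
    [100, -300, -150, -300],   # A
    [-300, 100, -300, -150],   # C
    [-150, -300, 100, -300],   # G
    [-300, -150, -300, 100],   # T
]
SCORE = {
    (a, b): MATRIX[i][j]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
}
GAP_OPEN, GAP_EXTEND = 400, 30  # O and E (assumed meaning: open, extend)


def alignment_score(human, chimp):
    """Score two equal-length aligned strings, with '-' marking gaps."""
    if len(human) != len(chimp):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    gap_side = None  # which sequence is currently inside a gap, if any
    for h, c in zip(human.upper(), chimp.upper()):
        if h == "-" and c == "-":
            raise ValueError("column with gaps in both sequences")
        if h == "-" or c == "-":
            side = "human" if h == "-" else "chimp"
            if side != gap_side:
                total -= GAP_OPEN      # charged once per gap
                gap_side = side
            total -= GAP_EXTEND        # charged for every gapped base
        else:
            total += SCORE[(h, c)]     # substitution score for the pair
            gap_side = None
    return total


print(alignment_score("ACGT", "ACGT"))  # four matches
print(alignment_score("ACGT", "AC-T"))  # three matches and a 1-base gap
```

Under these assumptions, four matching bases score 4 x 100 = 400, while three matches plus a one-base gap score 300 - (400 + 30) = -130.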

The resulting alignments were processed by the axtChain program. To place additional chimp scaffolds that weren't initially aligned by blastz, a DNA blat of the unmasked sequence was performed. The resulting blat alignments were also chained, and then merged with the blastz-based chains produced in the previous step to produce "all chains", which were further processed by the chainNet and netSyntenic programs. Finally, a "reciprocal best" strategy was employed to minimize paralog fill-in for missing orthologous chimp sequence. Details of the alignment methods can be found in the descriptions of the Chimp Chain and Chimp Net tracks.

Chimp deletions relative to human were determined from the collection of indels implied by these alignments. An indel was included in the list of deletions only if it (i) lay within, not between, scaffolds; (ii) was a simple gap (no opposing unmatched bases and no double gaps); (iii) was 80-12000 bp long; and (iv) was not a missed overlap or an incorrect gap size in the assembly. These criteria aim to include plausible repeat insertions and exclude assembly and alignment artifacts.
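The four inclusion criteria amount to a simple filter over candidate indels. The sketch below shows one way to express them; the `Indel` record and its field names are hypothetical and do not reflect the data structures actually used to build the list, and the test for assembly artifacts is assumed to have been made elsewhere and stored as a flag.

```python
from dataclasses import dataclass


@dataclass
class Indel:
    """Hypothetical summary of one alignment gap (illustration only)."""
    within_scaffold: bool      # (i) gap lies inside a single chimp scaffold
    human_only_bases: int      # human bases with no aligned chimp bases
    chimp_only_bases: int      # opposing unmatched chimp bases, if any
    assembly_artifact: bool    # (iv) missed overlap or incorrect gap size


def is_candidate_deletion(indel, min_len=80, max_len=12000):
    """Apply criteria (i)-(iv) to a single indel."""
    return (
        indel.within_scaffold                                 # (i)
        and indel.chimp_only_bases == 0                       # (ii) simple gap
        and min_len <= indel.human_only_bases <= max_len      # (iii)
        and not indel.assembly_artifact                       # (iv)
    )


candidates = [
    Indel(True, 300, 0, False),    # passes all four criteria
    Indel(True, 300, 12, False),   # double gap: unmatched bases on both sides
    Indel(False, 300, 0, False),   # falls between scaffolds
    Indel(True, 40, 0, False),     # shorter than 80 bp
]
kept = [indel for indel in candidates if is_candidate_deletion(indel)]
print(len(kept))
```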

Credits

The chimpanzee sequence used in this track was obtained from the 13 Nov. 2003 Arachne assembly. This sequence was provided by the National Human Genome Research Institute (NHGRI), the Eli & Edythe L. Broad Institute at MIT/Harvard, and Washington University School of Medicine.

The BLASTZ program was created by Webb Miller of the Penn State Bioinformatics Group.

Jim Kent at UCSC wrote the blat program, the chaining and netting programs, and the scripts for displaying the alignments in this browser.

The list of mid-sized (80-12000 bp) chimp deletions relative to human was provided by Tarjei Mikkelsen at MIT. The UCSC alignments of complete chimpanzee scaffolds to the human genome assembly were used to generate this list.

References

ARACHNE: A Whole-Genome Shotgun Assembler. Batzoglou S, Jaffe DB, Stanley K, Butler J, Gnerre S, Mauceli E, Berger B, Mesirov JP, and Lander ES. Genome Research 2002 Jan;12:177-189.

Whole-Genome Sequence Assembly for Mammalian Genomes: ARACHNE 2. Jaffe DB, Butler J, Gnerre S, Mauceli E, Lindblad-Toh K, Mesirov JP, Zody MC, and Lander ES. Genome Research 2003 Jan;13(1):91-96.

Human-Mouse Alignments with BLASTZ. Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison R, Haussler D, and Miller W. Genome Research 2003 Jan;13(1):103-7.

Scoring pairwise genomic sequence alignments. Chiaromonte F, Yap VB, Miller W. Pac Symp Biocomput 2002;:115-26.