Schema for Human Proteins - Human Proteins Mapped by Chained tBLASTn
  Database: sacCer3    Primary Table: blastHg18KG    Row Count: 5,956   Data last updated: 2011-08-31
Format description: Summary info about a patSpace alignment
field        example               SQL type              info    description
bin          585                   smallint(5) unsigned  range   Indexing field to speed chromosome range queries.
matches      26                    int(10) unsigned      range   Number of bases that match that aren't repeats
misMatches   20                    int(10) unsigned      range   Number of bases that don't match
repMatches   0                     int(10) unsigned      range   Number of bases that match but are part of repeats
nCount       0                     int(10) unsigned      range   Number of 'N' bases
qNumInsert   1                     int(10) unsigned      range   Number of inserts in query
qBaseInsert  4                     int(10) unsigned      range   Number of bases inserted in query
tNumInsert   2                     int(10) unsigned      range   Number of inserts in target
tBaseInsert  89                    int(10) unsigned      range   Number of bases inserted in target
strand       ++                    char(2)               values  + or - for strand. First character query, second target (optional)
qName        NM_145657             varchar(255)          values  Query sequence name
qSize        264                   int(10) unsigned      range   Query sequence size
qStart       72                    int(10) unsigned      range   Alignment start position in query
qEnd         235                   int(10) unsigned      range   Alignment end position in query
tName        chrI                  varchar(255)          values  Target sequence name
tSize        230218                int(10) unsigned      range   Target sequence size
tStart       179                   int(10) unsigned      range   Alignment start position in target
tEnd         67493                 int(10) unsigned      range   Alignment end position in target
blockCount   4                     int(10) unsigned      range   Number of blocks in alignment
blockSizes   18,4,19,5,            longblob                      Size of each block
qStarts      72,90,207,230,        longblob                      Start of each block in query.
tStarts      179,248,67421,67478,  longblob                      Start of each block in target.
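
The bin field can be illustrated with a short sketch. It assumes the standard UCSC binning scheme (five levels, finest bins of 128 kb, each coarser level 8 times wider) and that ranges are half-open; the function name bin_from_range is chosen here for illustration and is not part of this table's definition.

```python
# Offset of the first bin at each level, finest level first
# (assumed standard five-level scheme).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # finest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level is 2**3 = 8 times wider


def bin_from_range(start, end):
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # The range straddles two bins at this level; try the next coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("range %d-%d is too large for this binning scheme" % (start, end))


# tStart and tEnd from the example record in the schema table.
print(bin_from_range(179, 67493))  # 585
```

Under this scheme a range query only needs to inspect the handful of bins that can overlap the requested region, which is why the field speeds up chromosome range lookups.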

Sample Rows
58593850051420734++NM_0010127102021200chrI23021826069113178226,9,8,13,1,5,3,3,7,8,14,17,13,17,16,11,3,8,5,4,3,4,1,7,16,24,37,38,46,49,55,62,70,84,101,114,131,147,165,171,181,186,190,196,26069,26135,26186,26258,26321,26330,26345,26432,26441,26468,26531,26585,26669,26720,26804,26855,113055,113064,113088,113115,1131 ...

Note: all start coordinates in our database are 0-based, not 1-based.
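
As a rough illustration of the 0-based convention, the sketch below turns the comma-terminated block lists of the example record into intervals. It assumes end coordinates are exclusive (half-open intervals), which is consistent with the example record: the last query block starts at 230 with size 5 and qEnd is 235. It also assumes block sizes are counted in amino acids while target coordinates are in nucleotides, again because the example record fits (67478 + 3 x 5 = 67493 = tEnd). The helper names are hypothetical.

```python
def parse_list(field):
    """Split a comma-terminated list such as '18,4,19,5,' into integers."""
    return [int(item) for item in field.split(",") if item]


def blocks(block_sizes, starts, scale=1):
    """Pair each 0-based start with its exclusive end (start + size * scale)."""
    sizes = parse_list(block_sizes)
    begins = parse_list(starts)
    if len(sizes) != len(begins):
        raise ValueError("blockSizes and starts have different lengths")
    return [(begin, begin + size * scale) for begin, size in zip(begins, sizes)]


def to_one_based(interval):
    """Convert a 0-based half-open interval to a 1-based closed interval."""
    start, end = interval
    return (start + 1, end)


# Values taken from the example record in the schema table.
q_blocks = blocks("18,4,19,5,", "72,90,207,230,")
t_blocks = blocks("18,4,19,5,", "179,248,67421,67478,", scale=3)  # assumed 3 nt per residue

print(q_blocks)                   # [(72, 90), (90, 94), (207, 226), (230, 235)]
print(t_blocks[-1])               # (67478, 67493)
print(to_one_based(q_blocks[0]))  # (73, 90)
```

With half-open intervals the length of a block is simply end minus start, and only the start needs adjusting when coordinates are shown in 1-based form.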

Human Proteins (blastHg18KG) Track Description


This track contains tBLASTn alignments of the peptides from the predicted and known genes identified in the hg18 UCSC Genes track.


First, the predicted proteins from the human Known Genes track were aligned with the human genome using the Blat program to discover exon boundaries. Next, the amino acid sequences that make up each exon were aligned with the S. cerevisiae sequence using the tBLASTn program. Finally, the putative S. cerevisiae exons were chained together using an organism-specific maximum gap size but no gap penalty. The single best exon chains extending over more than 60% of the query protein were included. Exon chains that extended over 60% of the query and matched at least 60% of the protein's amino acids were also included.
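
The final filtering step might be sketched as follows. This is not the code used to build the track: the Chain class, the function names, the use of (q_end - q_start) / q_size as the "extent" over the query, and ranking the "single best" chain by its match count are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chain:
    """Hypothetical summary of one exon chain for a query protein."""
    q_name: str
    q_size: int   # length of the query protein
    q_start: int  # 0-based start of the chain in the query
    q_end: int    # exclusive end of the chain in the query
    matches: int  # number of matching amino acids


def coverage(chain):
    """Fraction of the query protein spanned by the chain."""
    return (chain.q_end - chain.q_start) / chain.q_size


def matched_fraction(chain):
    """Fraction of the query protein's amino acids that match."""
    return chain.matches / chain.q_size


def select_chains(chains, threshold=0.60):
    """Keep the best chain per query if it spans more than the threshold,
    plus any other chain that spans more than the threshold and matches
    at least that fraction of the protein."""
    by_query = {}
    for chain in chains:
        by_query.setdefault(chain.q_name, []).append(chain)
    kept = []
    for group in by_query.values():
        best = max(group, key=lambda c: c.matches)  # assumed ranking criterion
        for chain in group:
            if coverage(chain) <= threshold:
                continue
            if chain is best or matched_fraction(chain) >= threshold:
                kept.append(chain)
    return kept


# The first chain uses values from the example record; the others are made up.
example = Chain("NM_145657", 264, 72, 235, 26)
short = Chain("NM_145657", 264, 0, 100, 20)
best = Chain("hypotheticalQ", 100, 0, 90, 80)
second = Chain("hypotheticalQ", 100, 5, 80, 65)
weak = Chain("hypotheticalQ", 100, 0, 70, 30)

kept = select_chains([example, short, best, second, weak])
print([(c.q_name, c.q_start, c.q_end) for c in kept])
```

In this sketch the example record spans 163 of 264 residues (about 62%), so it passes as the best chain for its query even though its matched fraction is low, while a non-best chain must clear both 60% tests.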


tBLASTn is part of the NCBI BLAST tool set. For more information on BLAST, see Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990 Oct 5;215(3):403-10. PMID: 2231712

Blat was written by Jim Kent. The remaining utilities used to produce this track were written by Jim Kent or Brian Raney.