Description
N-SCAN and N-SCAN EST gene predictions.
Methods
N-SCAN
N-SCAN combines biological-signal modeling in the target genome sequence with
information from a multiple-genome alignment to generate de novo gene
predictions. It extends the TWINSCAN target-informant genome pair to allow for
an arbitrary number of informant sequences as well as richer models of
sequence evolution. It models the phylogenetic relationships between the
aligned genome sequences, context-dependent substitution rates, insertions,
and deletions.
Human N-SCAN uses mouse (mm5) as the informant and applies iterative pseudogene masking.
N-SCAN EST
N-SCAN EST incorporates EST alignments into N-SCAN. Similar to the
conservation sequence models in TWINSCAN, separate probability models
are developed for EST alignments to genomic sequence in exons, introns,
splice sites and UTRs, reflecting the EST alignment patterns in these
regions. N-SCAN EST is more accurate than N-SCAN while retaining the
ability to discover novel genes to which no ESTs align.
Human N-SCAN EST uses mouse (mm5),
rat (rn3), and chicken (galGal2) as informants.
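The idea of region-specific probability models for EST alignments can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the N-SCAN EST implementation: the three-symbol alphabet (E, I, N), the emission probabilities, and the function names est_symbols, log_likelihood and best_state are all hypothetical. The sketch also scores a region in isolation, whereas the text above describes models that are built into the gene predictor itself.

```python
import math

# Hypothetical per-base EST-alignment symbols (illustrative alphabet only):
#   "E" = base covered by an aligned EST block
#   "I" = base in a gap between two aligned blocks of the same EST
#   "N" = no EST evidence at this base
# Made-up emission probabilities for each region type; a real system would
# estimate such values from training data.
EMISSIONS = {
    "exon":   {"E": 0.70, "I": 0.05, "N": 0.25},
    "intron": {"E": 0.05, "I": 0.60, "N": 0.35},
    "utr":    {"E": 0.50, "I": 0.05, "N": 0.45},
}


def est_symbols(length, ests):
    """Build a per-base symbol string for a region of `length` bases.

    `ests` is a list of ESTs; each EST is a list of aligned blocks given as
    0-based, half-open (start, end) intervals on the genomic region.
    """
    symbols = ["N"] * length
    # First pass: mark gaps between consecutive blocks of one EST as "I".
    for blocks in ests:
        ordered = sorted(blocks)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
            for i in range(prev_end, next_start):
                symbols[i] = "I"
    # Second pass: aligned blocks override gaps, so "E" takes precedence.
    for blocks in ests:
        for start, end in blocks:
            for i in range(start, end):
                symbols[i] = "E"
    return "".join(symbols)


def log_likelihood(symbols, region):
    """Log probability of `symbols` under one region's independent-base model."""
    model = EMISSIONS[region]
    return sum(math.log(model[s]) for s in symbols)


def best_state(symbols):
    """Region type whose model assigns `symbols` the highest log-likelihood."""
    return max(EMISSIONS, key=lambda region: log_likelihood(symbols, region))


if __name__ == "__main__":
    # One EST with two aligned blocks separated by a gap.
    track = est_symbols(12, [[(0, 3), (8, 11)]])
    print(track)
    print(best_state(track[0:3]), best_state(track[3:8]))
```

In this toy setup, a stretch covered by aligned EST blocks scores highest under the exon model and a stretch lying in a gap between blocks scores highest under the intron model. That is the sense in which separate models can reflect the EST alignment patterns of each region.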
Credits
Thanks to Michael Brent's Computational Genomics Group at Washington
University in St. Louis for providing this data.
Special thanks for this implementation of N-SCAN to Aaron Tenney in
the Brent lab, and Robert Zimmermann, currently at Max F. Perutz
Laboratories in Vienna, Austria.
References
Gross SS, Brent MR.
Using multiple alignments to improve gene prediction.
In Proc. 9th Int'l Conf. on Research in Computational Molecular Biology
(RECOMB '05):374-388 and J Comput Biol. 2006 Mar;13(2):379-93.
Korf I, Flicek P, Duan D, Brent MR.
Integrating genomic homology into gene structure prediction.
Bioinformatics. 2001 Jun 1;17 Suppl 1:S140-8.
van Baren MJ, Brent MR.
Iterative gene prediction and pseudogene removal improves
genome annotation.
Genome Res. 2006 May;16(5):678-85.