| Alignment Quality | | |
StatSigMA:
Statistical Significance of Multiple Alignments
StatSigMA computes the statistical significance of multiple
sequence alignments (of either nucleotide or amino acid
sequences), much as BLAST's E-values provide statistical
significance for pairwise alignments.
Download
If you use this software for your publications, please read and cite:
| | High Scoring Regions | | |
MSS:
Finding all Maximal Scoring Subsequences.
MSS is a practical, linear time algorithm to find, in a
sequence of numeric scores, those nonoverlapping, contiguous
subsequences having greatest total scores.
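As an illustration of what "maximal scoring subsequences" means, here is a minimal Python sketch of the list-merging procedure described in the Ruzzo and Tompa paper cited below. It is not the distributed C++ code. The function name and the `(start, end, score)` output format are assumptions made for this example, and for brevity the sketch rescans the candidate list on each merge, so it omits the bookkeeping that gives the published algorithm its linear running time.

```python
def maximal_scoring_subsequences(scores):
    """Return (start, end, total) for each maximal scoring subsequence.

    Indices are half-open: scores[start:end] sums to total.
    Simplified sketch: the right-to-left scan below is not the
    constant-amortized-time lookup used in the linear-time algorithm.
    """
    # Each candidate is (start, end, L, R), where L is the cumulative
    # total before the candidate and R the cumulative total after it.
    candidates = []
    total = 0
    for i, s in enumerate(scores):
        before = total
        total += s
        if s <= 0:
            continue  # nonpositive scores never start a candidate
        start, end, left, right = i, i + 1, before, total
        while True:
            # Find the rightmost candidate j with L_j < left.
            j = len(candidates) - 1
            while j >= 0 and candidates[j][2] >= left:
                j -= 1
            if j < 0 or candidates[j][3] >= right:
                # No merge possible: keep as a new candidate.
                candidates.append((start, end, left, right))
                break
            # Merge: extend leftward to the start of candidate j and
            # drop candidates j.. since none of them is maximal.
            start, left = candidates[j][0], candidates[j][2]
            del candidates[j:]
    return [(a, b, r - l) for a, b, l, r in candidates]


example = [4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5]
result = maximal_scoring_subsequences(example)
```

For the example list, `result` holds three subsequences: `[4]`, `[3]`, and `[1, 2, -2, 2, -2, 1, 5]`, reported as `(0, 1, 4)`, `(2, 3, 3)`, and `(4, 11, 7)`.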
Download (C++ source code; contributed by Shane Neph).
If you use this software for your publications, please read
and cite:
- Ruzzo and Tompa: A Linear Time Algorithm for Finding
All Maximal Scoring Subsequences.
Seventh International Conference on Intelligent Systems
for Molecular Biology, Heidelberg, Germany, August
1999, pp. 234-241,
PMID: 10786306.
| | Microarrays | | |
Dapple:
Image analysis software for DNA microarrays
Dapple is a program for quantitating spots on a two-color DNA
microarray image. Given a pair of images from a comparative
hybridization, Dapple finds the individual spots on the image,
evaluates their qualities, and quantifies their total fluorescent
intensities.
Dapple Web Site
If you use this software for your publications, please read and cite:
| | Motif Discovery | | |
COSMO:
Binding sites in coding regions
COSMO is a program that detects putative binding sites in coding
regions. Given a set of orthologous mRNA sequences, it identifies
regions whose conservation cannot be explained solely by the selective
pressure on the protein encoded.
COSMO
If you use this software for your publications, please read and cite:
-
Blanchette, M.
A comparative analysis method for detecting binding sites in coding
regions. In Proceedings of the Seventh Annual International
Conference on
Computational Molecular Biology (RECOMB03), Berlin, 2003.
FootPrinter:
A program for phylogenetic footprinting
Phylogenetic footprinting is a method that identifies
putative regulatory elements in DNA sequences. It identifies
regions of DNA that are unusually well conserved across a
set of orthologous sequences.
Web Server (FootPrinter2.1)
Download (FootPrinter2.1)
Sample output
If you use this software for your publications, please read and cite:
- Blanchette, M. and Tompa, M.
FootPrinter: a program designed for phylogenetic footprinting.
Nucleic Acids
Research, vol. 31, no. 13, 2003, 3840-3842.
- Blanchette, M. and Tompa, M.
Discovery of Regulatory Elements by a Computational Method for
Phylogenetic Footprinting.
Genome Research,
vol. 12, no. 5, May 2002, 739-748
and
-
Blanchette, M., Schwikowski, B., and Tompa, M.
Algorithms for Phylogenetic Footprinting.
Journal of Computational Biology, vol. 9, no. 2,
2002, 211-223.
MicroFootPrinter:
A microbial front end for FootPrinter
MicroFootPrinter is a front end to the FootPrinter phylogenetic
footprinting program, but with specific focus on prokaryotic
genomes. You supply a prokaryotic species and gene of
interest. MicroFootPrinter will then find related
prokaryotes each containing a homologous gene, and run
FootPrinter to identify motifs in the regulatory region of
your chosen gene that are well conserved across these
homologous genes.
Web Server.
If you use this software for your publications, please read and cite:
PhyME:
Motif discovery in data sets that include both intraspecies
overrepresentation and interspecies conservation
PhyME discovers motifs by integrating two important aspects of the
motif's significance, overrepresentation and interspecies
conservation, into one probabilistic score. The algorithm is based on
multiple alignment and expectation-maximization.
Download
If you use this software for your publications, please read and cite:
Projection:
A motif discovery program based on random projections
Download
If you use this software for your publications, please read and cite:
-
Buhler, J. and Tompa, M.
Finding Motifs Using Random Projections.
Journal of Computational Biology, vol. 9, no. 2, 2002,
225-242.
-
Buhler, J.
Provably sensitive indexing strategies for biosequence
similarity search. In Proceedings of the Sixth Annual International
Conference on
Computational Molecular Biology (RECOMB02), 90-99, Washington,
D.C., 2002.
YMF and FindExplanators:
An enumerative motif discovery program.
YMF identifies motifs (made of IUPAC symbols) that occur
unusually often in a given set of sequences. FindExplanators
extracts from that set of motifs a smaller set of
independent motifs.
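To make "motifs made of IUPAC symbols" concrete, the following Python sketch counts how often a degenerate motif occurs in a set of DNA sequences. It shows only the matching and counting step. It does not reproduce YMF's statistical scoring or FindExplanators, and the function name `count_occurrences` is an assumption made for this example. The symbol table is the standard IUPAC nucleotide code.

```python
# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def count_occurrences(motif, sequences):
    """Count (possibly overlapping) matches of an IUPAC motif.

    Hypothetical helper for illustration; scans one strand only.
    """
    width = len(motif)
    count = 0
    for seq in sequences:
        for i in range(len(seq) - width + 1):
            window = seq[i:i + width]
            # A window matches if every base is allowed by the
            # IUPAC symbol at the same motif position.
            if all(base in IUPAC[sym] for base, sym in zip(window, motif)):
                count += 1
    return count


n_hits = count_occurrences("CAYG", ["ACACGTCATG", "CAGG"])
```

Here `CAYG` matches `CACG` and `CATG` in the first sequence and nothing in the second. An enumerative method would repeat such a count for every candidate motif and compare each count with what a background model predicts.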
Web Server
Download
If you use this software for your publications, please read and cite:
-
Sinha, S. and Tompa, M.
YMF: a Program for Discovery of Novel Transcription Factor Binding
Sites by Statistical Overrepresentation.
Nucleic Acids
Research,
vol. 31, no. 13, July 2003, 3586-3588.
-
Sinha, S. and Tompa, M.
Discovery of Novel Transcription Factor Binding Sites by Statistical
Overrepresentation.
Nucleic Acids Research,
vol. 30, no. 24, December 2002, 5549-5560.
-
Sinha, S. and Tompa, M.
A Statistical Method for Finding Transcription Factor Binding Sites,
Eighth International Conference on Intelligent Systems for
Molecular Biology, San Diego, CA, August 2000, 344-354.
-
Blanchette, M. and Sinha, S.
Separating real motifs from their artifacts.
Bioinformatics,
vol. 17, 2001, S30-S38.
| | RNA | | |
CMfinder:
A covariance model based RNA motif finding algorithm
CMfinder is a tool to predict RNA motifs in unaligned
sequences. It is an expectation maximization algorithm using
covariance models for motif description, featuring novel
integration of multiple techniques for effective search of
motif space, and a Bayesian framework that blends mutual
information-based and folding energy-based approaches to
predict structure in a principled way.
Web site and supplementary information.
Web Server.
Download (C and Perl source code).
If you use this software for your publications, please read
and cite:
Multiperm:
Shuffling multiple sequence alignments while approximately
preserving dinucleotide frequencies.
Assessing the statistical significance of structured RNA
predicted from multiple sequence alignments relies on the
existence of a good null model. Multiperm is a random
shuffling algorithm that preserves not only the gap and local
conservation structure in alignments of arbitrarily many
sequences, but also the mono- and approximate dinucleotide
frequencies. The latter characteristics have important effects
on the predicted thermodynamic stability of RNA structures.
Download (C++ source code).
If you use this software for your publications, please read
and cite:
-
Anandam, Torarinsson and Ruzzo.
Multiperm: shuffling multiple sequence alignments while
approximately preserving dinucleotide frequencies.
Bioinformatics. 2009 Jan 9. [Epub ahead of print]
PMID: 19136551.
RaveNnA:
Faster Search for Non-coding RNA Families Without Loss of Accuracy
Non-coding RNAs (ncRNAs) are functional RNA molecules that
do not code for proteins. Covariance Models (CMs) are a
useful statistical tool to find new members of an ncRNA gene
family in a large genome database, using both sequence and,
importantly, RNA secondary structure information.
Unfortunately, CM searches are slow. The RaveNnA software
package makes CMs faster while provably sacrificing none of
their accuracy (or faster still with little loss in
sensitivity, depending on parameter settings).
Download (C++ source code).
If you use this software for your publications, please read
and cite:
-
Weinberg and Ruzzo: Faster Genome Annotation of
Non-coding RNA Families Without Loss of Accuracy.
Eighth Annual International Conference on Research in Computational
Molecular Biology (RECOMB 2004), pp. 243-251, March 2004, San Diego, CA.
-
Weinberg and Ruzzo: Exploiting Conserved Structure for
Faster Annotation of Non-coding RNAs Without Loss of
Accuracy.
Bioinformatics, 20 (suppl_1) i334-i341, 2004
PMID: 15262817.
-
Weinberg and Ruzzo: Sequence-based heuristics for faster
annotation of non-coding RNA families.
Bioinformatics, 2006, 22(1):35-39.
PMID: 16267089.