BINDIGO is a new, faster algorithm to calculate optimal binding of oligomeric RNA to an RNA target.
Many cellular processes rely on an RNA fragment binding to an RNA target; examples include small nuclear RNAs binding to splice junctions (the example in our paper), microRNAs, retrotransposons, and short interfering RNAs. (DNA microarrays also work on the same physical principle.)
To find the optimal RNA-RNA bound state of strings of lengths N1 and N2, a number of authors have modified RNA folding algorithms to compute the free energy. All of them take time O((N1+N2)³). Our BINDIGO does the same in time O(N1×N2) because it is designed for binding, not folding.
BINDIGO implements the Turner free energy rules for RNA-RNA binding in a dynamic programming algorithm. We have also introduced a new heuristic for calculating the large asymmetric loops.
Nathan O. Hodas and Daniel P. Aalberts, Efficient computation of optimal oligo-RNA binding, Nucleic Acids Research, 32, 6636-6642 (2004).
A common bioinformatics problem is how to determine whether a local primary sequence X is more like examples in real or decoy training data sets.
Many methods (Weight Matrix Method, Weight Array Method, Markov Models) have been developed to address this problem. A WMM-type approach uses the probabilities of inidividual letters [p(e)>p(t)>...>p(z)] at different positions. Correlations [e.g., qu or th] are included in more sophisticated models.
Our Primary Sequence Ranking methods take a complementary approach, more akin to referring to a dictionary to see whether the word is known. And if the real and decoy dictionaries are small (abridged), we also offer several approaches to enhance the lexicon by making substitution mutations. In the strange world of bioinformatics, local sequence X can appear in both real and decoy dictionaries. (In fact, in our paper we show that in the case of pre-mRNA donor splice signals, 96% of real 7 letter sequences also appear as decoys elsewhere where they are not spliced.)
The PSR approach is very simple. We rank order sequences by the likelihood X is true or '+':
To accomodate finite training data, we also provide ways of enhancing the data by including pseudocounts from nearest-neighbor X' and next-nearest-neighbor X'' sequences (one and two substition mutations from X) .

Daniel P. Aalberts, Eric G. Daub, and Jesse W. Dill, Quantifying optimal accuracy of local primary sequence bioinformatics methods, Bioinformatics in press.