
DNA sequences - R'MES


R’MES is a set of programs to detect words that appear in a given DNA sequence with an unexpected frequency.

Two classes of Markov chain models are used for the sequence: stationary Markov chains of order m (m >= 0), or Markov chains of order m with 3-periodic transition probabilities. The latter class is particularly well suited to coding DNA sequences because it takes the reading frame (phase) into account.
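To make the two model classes concrete, here is a minimal Python sketch that estimates order-m transition probabilities from a sequence by simple counting. It is an illustration only, not R’MES code: the function name `transition_probs`, the `periodic` flag, and the convention that the phase of a letter is its 0-based position modulo 3 (i.e. the sequence is assumed to start in frame) are all assumptions made for this example.

```python
from collections import defaultdict

def transition_probs(seq, m, periodic=False):
    """Estimate order-m transition probabilities by counting.

    Keys are (phase, context) pairs, where context is the m preceding
    letters. With periodic=False the phase is always 0 (stationary model).
    With periodic=True the phase is the position of the emitted letter
    modulo 3 (assumption: the sequence starts in frame).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(m, len(seq)):
        phase = i % 3 if periodic else 0
        context = seq[i - m:i]
        counts[(phase, context)][seq[i]] += 1

    probs = {}
    for key, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[key] = {base: c / total for base, c in next_counts.items()}
    return probs
```

In the stationary case there is one transition table per context; in the 3-periodic case there is one per context and phase, so the same context can lead to different probabilities depending on the codon position.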

A word W appears with an unexpected frequency in a sequence if its number of occurrences N(W) differs significantly from an estimate of its expected count under the chosen model. The significance of this difference is assessed using either a Gaussian approximation or a compound Poisson approximation (for rare words) of the distribution of N(W). In each case, R’MES provides a statistic indicating whether the word is under- or over-represented. R’MES can also handle degenerate words or, more generally, families of words.
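As a rough illustration of comparing an observed count with an expected count, the sketch below counts (possibly overlapping) occurrences of a word and computes the standard plug-in estimate of its expected count under a stationary order-m Markov model, built from the counts of its subwords of length m+1 and m. The function names are hypothetical and this is not the R’MES implementation; in particular, it does not compute the variance needed for the Gaussian statistic, nor the compound Poisson approximation.

```python
def count_occurrences(seq, word):
    """Count occurrences of word in seq, overlapping ones included."""
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)

def expected_count(seq, word, m):
    """Plug-in estimate of the expected count of word under an
    order-m Markov model, for 1 <= m <= len(word) - 2.

    Numerator: product of the counts of all (m+1)-letter subwords.
    Denominator: product of the counts of the interior m-letter subwords.
    """
    k = len(word)
    if not 1 <= m <= k - 2:
        raise ValueError("m must satisfy 1 <= m <= len(word) - 2")
    numerator = 1.0
    for i in range(k - m):
        numerator *= count_occurrences(seq, word[i:i + m + 1])
    denominator = 1.0
    for i in range(1, k - m):
        denominator *= count_occurrences(seq, word[i:i + m])
    return numerator / denominator if denominator else 0.0
```

For example, with the toy sequence `"AAAA"` and the word `"AAA"`, `count_occurrences` gives 2, while `expected_count` with `m=1` gives 3 * 3 / 4 = 2.25. Turning such a difference into a significance statement requires the distributional approximations described above.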

R’MES also provides a statistic that tests, using a Poisson approximation, whether the number of clumps of W in the DNA sequence is unexpected.
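The sketch below shows one way to count clumps, assuming the usual definition of a clump as a maximal group of overlapping occurrences of W. The function name `count_clumps` is hypothetical, and the code only counts clumps; it does not reproduce the Poisson-based statistic that R’MES computes.

```python
def count_clumps(seq, word):
    """Count clumps of word in seq.

    Assumption: a clump is a maximal group of occurrences in which each
    occurrence overlaps the previous one, i.e. consecutive start
    positions differ by less than len(word).
    """
    k = len(word)
    clumps = 0
    previous_start = None
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] == word:
            # A new clump starts when this occurrence does not overlap
            # the previous occurrence.
            if previous_start is None or i - previous_start >= k:
                clumps += 1
            previous_start = i
    return clumps
```

For instance, in `"ATATATGATAT"` the word `"ATAT"` occurs at positions 0, 2 and 7: the first two occurrences overlap and form one clump, and the third forms a second clump.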



last update : 31/08/2005
