Free Web Hosting by Netfirms
Web Hosting by Netfirms | Free Domain Names by Netfirms

Home Page    

 

Research with an automaton

 

Main features
  • builds the minimal deterministic automaton recognizing the language Sigma*x;
  • extra space in O(msigma) if the automaton is stored in a direct access table;
  • preprocessing phase in O(msigma) time complexity;
  • searching phase in O(n) time complexity if the automaton is stored in a direct access table, O(nlog(sigma)) otherwise.
Description

Searching a word x with an automaton consists first in building the minimal Deterministic Finite Automaton (DFA)  A(x) recognizing the language Sigma*x.

The DFA  A(x) =(Q, q0, T, E) recognizing the language Sigma*x is defined as follows:
  Q is the set of all the prefixes of x: Q={epsilon, x[0], x[0 .. 1], ... , x[0 .. m-2], x};
  q0=epsilon;
  T={x};
  for q in Q (q is a prefix of x) and a in Sigma, (q, a, qa) is in E if and only if qa is also a prefix of x, otherwise (q, a, p) is in E such that p is the longest suffix of qa which is a prefix of x.

The DFA  A(x) can be constructed in O(m+sigma) time and O(msigma) space.

Once the DFA  A(x) is build, searching for a word x in a text y consists in parsing the text y with the DFA  A(x) beginning with the initial state q0. Each time the terminal state is encountered an occurrence of x is reported.

The searching phase can be performed in O(n) time if the automaton is stored in a direct access table, in O(nlog(sigma)) otherwise.

The C code
void preAut(char *x, int m, Graph aut) {
   int i, state, target, oldTarget;

   for (state = getInitial(aut), i = 0; i < m; ++i) {
      oldTarget = getTarget(aut, state, x[i]);
      target = newVertex(aut);
      setTarget(aut, state, x[i], target);
      copyVertex(aut, target, oldTarget);
      state = target;
   }
   setTerminal(aut, state);
}


void AUT(char *x, int m, char *y, int n) {
   int j, state;
   Graph aut;

   /* Preprocessing */
   aut = newAutomaton(m + 1, (m + 1)*ASIZE);
   preAut(x, m, aut);

   /* Searching */
   for (state = getInitial(aut), j = 0; j < n; ++j) {
      state = getTarget(aut, state, y[j]);
      if (isTerminal(aut, state))
         OUTPUT(j - m + 1);
   }
}

The example

Preprocessing phase

DFA Automaton
The states are labelled by the length of the prefix they are associated with.
Missing transitions are leading to the initial state 0.

 

Back


user comments and suggestions are invited at KmailDrive@gmail.com