Friday, June 21, 2013

simple pattern matching

Pattern matching algorithms are very important in computer science. In this post we will discuss the simplest algorithm for pattern matching. Given some text and a pattern, we have to find where in the text, the pattern is found.

In this brute-force algorithm, we start matching the pattern from the beginning of the text. If a match is found we return the index. Otherwise we slide the matching window by one character and repeat the same process. For example if we are trying to find the pattern "ababb" in the text "abbabaababb"

The following depiction gives the steps involved in this algorithm

   a b b a b a a b a b b
   a b a
     a
       a
         a b a b
           a
             a b
               a b a b b -- pattern found


The following is the C++ implementation of the above algorithm. Note that this algorithm finds only the first occurrence of the pattern. This can be modified to find all the occurrences also.



#include <iostream>
#include <cstring>
#define MAXLEN 100
using namespace std;

int findPattern(char *t, char *p)
{
 int n = strlen(t); //length of the text
 int m = strlen(p); //length of the pattern
 int i,j; //loop counters
 
 //outer loop tries to match the pattern from index 0 to (n-m)
 for( i = 0 ; i <= n-m ; i++ )
 {
  //inner loop tries to match jth char in pattern
  //with (i+j)th char in text
  for( j = 0 ; j < m ; j++)
  {
   if( t[i+j] != p[j])
    break;
  }
  //pattern found; return the index
  if( j == m)
   return i;
 }
 //pattern not found
 return -1;
}

int main()
{
 char text[MAXLEN];
 char pattern[MAXLEN];

 cin>>text;
 cin>>pattern;
 cout<<"pattern found at: "<<findPattern(text,pattern);
 return 0;
}