Monday, September 1, 2014

Longest substring with unique characters

Given a string, how do we write a program to find the length of longest sub-string without any characters repeating in it.

Let us call it a unique sub string. For example consider the string "program" the maximum length of a unique sub-string is 5 for the sub string "ogram".

A method by checking all possible sub-strings for the given property is obvious, but it is not time efficient. A naive implementation takes O(n3) time.

Let us look at more efficient approach which runs in O(n) time. Let us assume that the string contains all possible 256 ASCII characters.

We take a Boolean array of size 256 to indicate if a character exists or not.

Take two indices ind1 and ind2 initially pointing to the start of the string. As long as we find a unique character at ind2 we keep on incrementing it. Whenever we see a duplicate character , we move ind1 one index beyond the first instance of that character, keeping track of the maximum length of the sub string found so far.

For example when we find 'r' for the second time, the indices will be adjusted like the following.
p  r  o  g  r  a  m
      |     |
    ind1   ind2

Here is the C++ implementation.