Monday, December 22, 2014

Number of ways of arranging 0, 1

Given 'N' zeros and 'M' ones, In how many ways can we arrange them to form a string of length (N+M)?

For example if we have 2 zeros and 2 ones, we can arrange them in 6 ways
0011, 1100, 0101, 1010, 1001, 0110

If there are N symbols to arrange, The number of ways to arrange them in a sequence is called a permutation problem. This number is simply N! (N factorial).

We have (N+M) symbols in total and N symbols are one type and M symbols are another type, So the number of permutations become (N+M)!/ N!*M!

Another way to look at the problem is using combinations. We have (N+M) places either
  • We have to choose N spaces to fill zeros, in the remaining places, we fill one
  • Or We have to choose M spaces to fill ones, in the remaining places, we fill zero
The number of ways to choose R objects from N objects is called a combination problem. This is represented by C(N,R) = N!/(N-R)!*R!

So for our problem there are (N+M) symbols and we need to either choose N zeros or M ones. So it becomes C(N+M,N) or C(N+M,M)

Both reduce to (N+M)!/N!*M!.

If we try to implement the above formula as it is in code, we have a problem of data overflow.

For example consider 21! = 51090942171709440000, we cannot represent in a long integer also. If we are calculating C(21,10), simply calculating factorials with built in data types result in data overflow even though the result can be represented in those types. C(21,10) = 352716.

In competitive programming, it is sometimes asked to reduce this value using a modulo operation. For example calculate C(n,r) modulo 10007 for arbitrarily large numbers say 1 <= n,r <= 1000. In these cases also, using the factorial formula does not serve the purpose.

We have another formula to calculate C(N,R) using recursive definition.
C(N, R) = C(N-1, R-1) + C(N-1, R)
You can easily understand this by using Pascal triangle.
If we implement the algorithm using this formula, we can find the result for a much better range of numbers .

Here is the Java implementation. For this problem I assumed the range of 1000 for N and M. Also it calculates the result modulo 109+7

Thursday, December 18, 2014

Number of ways to represent an integer as sum of two squares

Given a number, we have to find out the number ways to express it in the form of sum of two integer squares.

For example let us consider 25, it can be written as sum of the pairs (0,25) (9,16)

24 cannot be written as a sum of any two squares.

Mathematically this problem is to find the roots of the equation x2+y2-n = 0 where n is given.

The algorithm for this problem is simple. let us start i with 0 and check if (n-i*i) is a perfect square. We repeat this process until (n-i*i) is greater than i*i

Take the example of 25, Here is the trace of the algorithm

i          n-i*i      Result
0          25        - Yes
1          24        - No
2          21        - No
3          16        - Yes
4          9         - break

Here is the C++ code to do this.

Thursday, December 11, 2014

Counting triplets with given sum

Given an array of distinct numbers, how do we count the number of triplets with given sum?

For example let us consider the array {5, 2, 7, 8, 4, 3} and the sum is 14, 
There are 3 triplets with the given sum

A brute-force algorithm will take O(n3) time. But this problem can also be solved based on 2 Sum problem. Here is the approach.
  • Sort the array first.
  • Fix the first element in the triplet, and search for a pair of elements whose sum is (Sum - Fixed element)
  • Repeat the above step for all the elements
We are essentially searching for a pair of elements for each fixed element. So the time complexity of this solution would be O(n2).
One variation/follow-up of this problem could be to find out the number of triplets with sum less than or equal to the given value.
Considering the first example again, there are 8 triplets with sum less than or equal to 14.
2 3 4
2 3 5
2 3 7
2 3 8
2 4 5
2 4 7
3 4 5
3 4 7
The solution for this is similar to the first problem and uses the trick discussed in my previous post

The code for both is given in the following Java program.

Tuesday, December 9, 2014

Number of pairs with sum less than or equal to given value

Given an array of numbers, Count the number of pairs with sum less than or equal to K

For example consider the array {5, 1, 12, 3, 6}
pairs with sum <= 10 are {(1,3), (1,5), (1,6), (3,5), (3,6)}
So there are 5 pairs with the given constraint.

In one of the previous posts, we have seen how to find a pair with given sum in the given array. This problem is a variation of that.

Let us look at the algorithm.
  1. First sort the array in ascending order.
  2. Take two indices one(b) pointing to the start of the array, and the other(e) pointing to the last element of the array.
  3. If A[b]+A[e] <= K
    1. A[b] + A[i] <= K for b < i < e. Hence (e-b) elements satisfy the given property.
    2. Increment b and decrement e and repeat step 3
  4. Otherwise decrement e
  5. Repeat step 3 until b and e meet each other.

A[0] A[1] A[2]... A[N]
b-->            <--e

Considering the above example 
{1, 3, 5, 6, 12} let b = 0, e = 3 and K = 10

1  3  5  6  12
b        e


If we add 6 or 5 or 3 to 1, the sum is less than 10. At this point, we can increment the pair count in one shot.

Here is Java implementation of the above algorithm. This solution runs in O(n log n).

Friday, December 5, 2014

Maximum sum path in two sorted arrays

Given two sorted arrays of size M and N, Find a sequence of elements from both the arrays whose sum is maximum. The sequence can begin with any of the two arrays and can jump from one array to another array if there is a common element.

Let us consider the following example
Array1 = {1, 5, 9, 15}
Array2 = {7, 9, 12}

The maximum sum is 7 + 9 + 15 = 31

Similarly let us examine a little more complex example

Array1 = {1, 3, 7, 10, 15, 19}
Array2 = {2, 4, 6, 7,  9, 15, 20}

The maximum sum is 2 + 4 + 6 + 7 + 10 + 15 + 20 = 64

This problem is similar to finding the common elements between two sorted arrays and also similar to the merge procedure in merge sort. You can check out my previous post on this.

Start with two indices pointing to the beginning of two arrays. 

We maintain two variables sum1, sum2 to keep track of the partial sum. Compare the two elements, move the index with lesser element forward and add it to it's corresponding sum. 
We do this until we find an equal element; at this point we add the maximum of two sums to a global maxSum variable and repeat the same procedure.

Let us trace the algorithm for the first example

  • sum1 = 0, sum2 = 0, ind1 = 0, ind2 = 0, maxSum = 0
  • sum1 = 1, sum2 = 0, ind1 = 1, ind2 = 0, maxSum = 0
  • sum1 = 6, sum2 = 0, ind1 = 2, ind2 = 0, maxSum = 0
  • sum1 = 6, sum2 = 7, ind1 = 2, ind2 = 1, maxSum = 0
  • sum1 = 0, sum2 = 0, ind1 = 3, ind2 = 2, maxSum = 7
  • sum1 = 15, sum2 = 12, ind1 = 3, ind2 = 2, maxSum = 16
  • We reached the end; update maxSum = 16+15 = 31
Here is the C++ code for the same. It runs in O(m+n) time where m, n are the lengths of the two arrays respectively.


Wednesday, December 3, 2014

Finding the longest consecutive subset

Given an array of positive and negative numbers(distinct), How do we find the length of the longest consecutive subset?

For example consider the below array
{3, 6, 1, 2, 5, 7, 8}
It contains {5,6,7,8} as the longest consecutive subset. So the answer is 4.
Similarly for {2, 4, 6, 8}, the answer is 1 as there are no consecutive numbers.

Lets discuss two algorithms to solve this problem.

Method based on sorting: 
We sort all the elements in ascending order. This would take O(n log n) time. Then we compare the adjacent elements to check if they are consecutive. If any two elements are not consecutive, we break the sequence and update the maximum length. getMaxConsecutiveSubset1() method in the code implements this algorithm. The overall complexity of this algorithm is
O(n log n).
Method based on Hash map:
Let us take a Hash map which stores the all the elements with a flag for each which indicates if it is visited or not visited.
  • Initially we store all the elements to the map with visited flag as false. 
  • For each element do the following 
    • If it is not visited earlier 
      • Mark the element as visited 
      • Initiate a sequence of length 1 
      • Try to extend this sequence backward by marking the elements visited 
      • Try to extend this sequence forward by marking the elements visited 
      • Update the maximum sequence length

Here is the Java code which implements the above algorithm. It runs in O(n) time and O(n) space.

Monday, December 1, 2014

Finding increasing triplet in an array

Given an array of distinct numbers, we have to find if there is any triplet of type (A[i], A[j], A[k]) such that A[i] < A[j] < A[k] and i < j < k.

For example consider the array {6, 1, 5, 2, 4}

We have a triplet {1, 2, 4} which satisfies the given criteria.

Where as the array {4, 1, 5, 3, 2} does not have any such triplets.

Simple algorithm:

For each element we can try to find the lesser element to the left of it and find the greater element to the right of it. This will obviously give the required result. But this algorithms in O(n2).

printIncreasingTriplet1() method implements this approach.

Linear time algorithm:
We declare two arrays leftMin and rightMax. leftMin stores the minimum element seen so far up to the given index. Similarly rightMax stores the maximum element seen so far up to the given index but from reverse order.

We are essentially calculating the lesser elements to the left and greater elements to it's right for each element.  In the next step we need not search for the lesser and greater elements for all elements. They are simply available in leftMin and rightMax at the same index.

printIncreasigTriplet2() in the code implements this approach. This runs in O(n) time and takes O(n) space.

Tuesday, November 18, 2014

Clearing the right most set bit in a binary number

Given an integer, write a program to clear the right most set bit in it's binary representation.
For example, consider a number 54. It is represented in binary as  
1 1 0 1 1 0
The right most set bit is marked in bold. We need to convert this to 
1 1 0 1 0 0
Let us look at the approach step by step.

If you negate the number to -54, it is represented in computer's memory in 2's compliment.
0 0 1 0 1 0
Let us perform a bit-wise AND operation between these two.
1 1 0 1 1 0
0 0 1 0 1 0
0 0 0 0 1 0
Subtract this from the original number to get the result. So x - ( x & -x) gives the required answer.

Here is the simple program to test the above.

Wednesday, November 5, 2014

Number of characters appearing in all given strings

Given a set of n strings, we have to write a program to find out the number of characters appearing in all the strings.

For example consider the following strings
{"India", "China", "United states"}, the letters {a,i,n} appear in all the strings.

Let us think of the following solution.
  • We first add all the characters in the first string to a set. 
  • For each string we do the following.
    • For each letter in the set, we check if it can be found the string.
    • If it is not found, we delete it from the set
  • Finally the set contains only the elements which appear in all the given strings.
Here is the C++ code which implements the above approach. For simplicity this program assumes that all the strings contains lower case alphabets only.

Monday, October 27, 2014

Reversing a linked list in groups of given size

Given a single linked list, and an integer k, we need to reverse the linked list in groups of size k.
For example consider the following diagram of input and output linked lists for k = 3.

There are no tricks involved in the solution for this problem. It is just an implementation challenge where we have to carefully manipulate pointers to achieve the desired result.

Here is the strategy. The first step is to subdivide the list into chunks of size k. The last chunk maybe less than size k (depends on whether the length of the linked list is a multiple of k).

Keep a reference to the beginning of the sub-list, and traverse through k nodes.
Reverse the sub-list and re-arrange the begin and end pointers for the reversed sub-list.

The following diagram might give an overview of the process.
Here is the C++ implementation including recursive as well as iterative implementation.
We can consider the following test cases to test the program.
  • An empty/NULL list
  • List with just one node
  • List with multiple of k size
  • List with non-multiple of k size
  • Arbitrary list with k = 0, k= 1, k= length(list)
  • Any combination of the above.

Thursday, October 16, 2014

How to check if a given number is Fibonacci number

How to check if the given number is a Fibonacci number?

A Fibonacci series starts with 0, 1. Remaining elements are formed by adding previous two numbers.

Here are a few Fibonacci numbers 0, 1, 1, 2, 3, 5, 8, 13, 21, 34,...

If a number is given, a straight forward way to check if it is a Fibonacci number is to generate the numbers one by one until we get the required number or greater than that.

Here is the python code do that.

One more approach is based on a property of Fibonacci numbers
If 5*x2+4 or 5*x2-4 (or both) is a perfect square, we can conclude that it is a Fibonacci number. 

The C++ code is given below.

Tuesday, October 14, 2014

Finding equal sum index

Given an array of numbers, find an index such that the sum of numbers to the left is equal to the sum of elements to it's right. If there are no elements to the left or right, the sum is considered to be zero.

This problem is from Hackerrank. Click here to solve this problem on your own.

A simple example will be {1,2,1}, at index 1, the left and right sum is 1.

One obvious approach to solve this problem can be as follows.

For each index, find the left sum and right sum and check if they are equal. This approach takes O(n2) time.

Let us look at some other method which uses some extra space to reduce the time complexity.

Build a prefix sum array which keeps track of the sum all the elements up to each index. 

For the above example, the prefix sum array is {1,3,4}

The last element of this array indicates the sum of all elements in the original array.

Now, for each index i, we can find the left sum in prefix_sum[i-1] and right sum in total_sum - prefix_sum[i] which can be found in O(1) time. So this algorithm takes O(n) time.

Below Python/C++ code is provided which implements above. 

The base cases are

  • If the array contains zero elements, we cannot find such an index
  • If the array has just one element, we can always find such an index.

Wednesday, October 8, 2014

Number of deletions needed to make anagrams

Given two strings, how do we find the number of characters to be deleted from both the strings in order to make them anagrams.

Two words are said to be anagrams if one word can be formed by re-arranging the letters of another word. For example the words "TAN" and "ANT" are anagrams.

Given two words "clean" and "lion", we need to delete 5 characters from both the strings, {c,e,a} from the first and {i,o} from the second.

Let us discuss the approach to solve this problem.

If two words are anagrams, they both have same set of characters along with their counts. So to make two random words as anagrams, we have to delete all the characters that are present in one word but not in the other word.

To do this 
  • We first create the frequency table of each character present in the first word.
  • Next we iterate through each character in the second word, and try to find it in the frequency table.
    • If there is a non zero count for that character, we decrement it.
    • Otherwise we need to delete the character.
  • After processing all the characters in the second word, we need to delete if there are any extra characters present in the table.

Here is the Python program which implements the same.

Friday, September 19, 2014

Forming the nth binary sequence

A binary sequence contains 0 and 1 only

S.No  Binary sequence
1     0
2     1
3     00
4     01
5     10

Given a number k, we have to design an algorithm to print kth binary sequence.

Initial approach (not efficient):

Starting from the sequence 0, this algorithm iterates k times. 

To get the next sequence, in each iteration we do the following.
 check the last index of 0 in the number

  • If it is found change that to 1 and all the digits after that to 0
  • If it is not found, the next sequence contains all 0 which is one  digit longer than previous
Efficient approach:

Observe the following. we can have
'2' single digit numbers
'4' two digit numbers
'8' three digit numbers
'2^i' i digit numbers

We can find the number of digits in
kth sequence by calculating the power of 2 which is less than k .

For example if
k is 5, the smallest power (s) of 2 less thank k is 4. So kth sequence has 2 digits.

To form the the sequence we need to find the binary representation of 

k - ( 2^1 + 2^2 + ...+2^s)

This SPOJ problem is based on the above algorithm.

Here is the C++ program which implements this.

Wednesday, September 17, 2014

Building a binary tree using inorder and post/pre order traversals

Given the in-order, and pre-order or post-order sequences of a binary tree, how do we construct a binary tree from them?

Given only pre-order and post order traversals, we can not construct a unique binary tree from them.

For example given 
In-order : {1, 2, 3, 4, 5, 6, 7, 8}
Pre-order: {5, 3, 2, 1, 4, 8, 6, 7}

We have to create the following binary tree
We use recursive approach for constructing a binary tree from In-order and Post-order or Pre-order traversals .

Constructing from Inorder and Preorder:

In Pre order, we first traverse the root, then left sub tree and right sub tree. So the first element in the pre order is always the root. We search for this element in the in-order sequence. This index divides the in-order sequence into left subtree and right subtree.

For example consider the following binary tree
               /   \
              2     3
             / \   / \   
            4   5 6   7

Inorder:  {4, 2, 5, 1, 6, 3, 7}
Preorder: {1, 2, 4, 5, 3, 6, 7}

Create 1 as the root node, 
  • all the left elements to the left of it {4, 2, 5} form the left subtree
  • all the elements to the right of it {6, 3, 7} form the right subtree

We can use the index of 1 in Inorder (in this case 3) to calculate the size of the left and right subtrees.

Size(left subtree) = index
Size(right subtree) = size-index-1

We can also use this index to calculate the offset of Inorder and Post order sequences for the subtree.

To create the left subtree we pass Inorder:{4, 2, 5} Preorder:{2, 4, 5} as parameters to the recursive call.

To create the right subtree we pass Inorder:{6, 3, 7} Preorder:{3, 6, 7} as parameters.

Constructing from Inorder and Postorder:

In post order, we first traverse the left subtree, right subtree and visit the root. So the last element in the post order sequence is the root.
Similar to the above algorithm, this element divides the in-order sequence into left subtree and right subtree.

Considering the same example as above the Post order is {4, 5, 2, 6, 7, 3, 1}
Once we create the root with the last element i.e 1. 

We pass the following parameters to create the left and right subtrees.

                       Inorder                 Postorder
Left Subtree    {4, 2, 5}                {4, 5, 2}
Right Subtree   {6, 3, 7}                {6, 7, 3}

Here is the C++ implementation of the above two algorithms.

Monday, September 15, 2014

Gas station problem

There are some gas stations arranged in a circle. Each of them has some amount of gas. We are also given the distances between all the consecutive stations. Assume that we have a vehicle with very large fuel tank(infinite) and one unit of gas is needed to cover one unit of distance, the problem is to find if we can complete the circuit starting at any gas station.

For example assume that we have the following data

Gas : {3, 1, 2, 5, 4}
Distance: {4, 1, 1, 2, 3}

Station[0] has 3 units of gas, 
Station[1] has 1 unit of gas and so on...

Distance between the stations are interpreted as follows
Station[0] and Station[1] is 4, 
Station[1] and Station[2] is 1, 
Station[4] and Station[0] is 3 , etc...

Let us look at the algorithm for solving this.

We can start at any possible index ( [0, n-1] ), filling all the gas at each station, and move forward if it is possible to reach next station. If we are able to reach the starting position again, We can complete the circle.

What if we are not able reach next station at some point?

For example we start at station i and reached station j and we can not proceed to station (j+1). In a naive approach we start from station (i+1) and see if we can reach station (j+1). This approach works, but it takes O(n2) time.

The catch here is that we can actually start from (j+1) because of the following reason.
If we start at any index between [i+1,j] we can not reach (j+1) since at each step in the previous iteration we had zero or more gas but still we were unable to reach j+1.

For example consider the simple instance
Gas: { 1, 1, 4}
Cost:{ 1, 2, 3} 

If we start from 0 we can reach 1 but we can not reach 2( required: 2 units, present:1 unit). If we start from 1 also, we can not reach 2. So we need not check for this index. We can start from 2 and complete the circle.

So this approach takes O(n) time.

Here is the C++ implementation.

Friday, September 12, 2014

Maximum sum of a subsequence with no contiguous elements

Given an array of positive and negative numbers, find the maximum sum of any sub sequence such that no two elements are contiguous.

For example consider the array {3,-2, 1, 7}, maximum sum is 10 for {3,7}
For {3,-2, 1, 0, -4, 2} it is 6 for {3,1,2}

This problem can be efficiently solved in O(n) time using dynamic programming technique.

The approach here is simple. We take an auxiliary array with the same size as input. 

The entries in this array are calculated as follows.

Aux[i] = Aux[i] if i = 0
       = Max( Aux[0], Aux[1] ) if i = 1
       = Max( Aux[i-2]+arr[i], Aux[i-1] ) otherwise

Here is the C++ implementation of the above approach. It takes O(n) time and O(n) space.

Tuesday, September 2, 2014

Longest substring with equal number of 0s and 1s

Given a string consisting of only 0s and 1s, How do we write a program to find the longest sub string with equal number of 0s and 1s in it?

For example consider the string "01001", the longest string with equal number of 0s and 1s is "1001".

Let us look at the solution approaches.
Considering 0 as -1 and 1 as +1, at each index we maintain a cumulative "balance" value.
Here balance is the difference between the count of 1 and 0 appeared so far.

For example for the example string "01001", the balance values are as follows

Index:     0   1   2   3   4
Bit:       0   1   0   0   1
Balance:  -1   0  -1  -2  -1

In this balance array, the longest distance between any two equal values gives us the required result.
In the above example, balance value "-1" appear at index 0 and index 4. So 4-0 = 4 is the maximum length of the sub string with equal number of 0,1s.
and the sub string is 1001 ( sub string between 1 and 4 indices).

Let us look at two possible approaches which utilizes the above property.

Method 1: Simple O(n2) approach
This method iterates through all possible sub strings and finds out which one is the longest with equal number of 0, 1s.
To check if a sub string has equal number of 0,1, we check the balance value at each index. If it is zero, we will update the maximum length if required.

Method 2: Efficient O(n) approach
This method needs extra space to run in O(n) time. For a string of length n, the range of balance values are in the range (-n, n); -n when all entries are 0, and n when all entries are 1.

We will create an array of size 2n+1 which acts as a hash table for all possible balance values. This table stores the leftmost index in the array for each balance value.

Since the array is 0 indexed, table[n+b] stores the leftmost index with balance b. At each index we keep on updating the balance value.

  • If there is no entry in the table with the given balance, we store the index for b.
  • If the table contains a previous entry with the same balance value, we check if the sub string between this previous index and the current index is the longest sub string.
Below is the C++ program to which implements the two approaches discussed above.

Monday, September 1, 2014

Longest substring with unique characters

Given a string, how do we write a program to find the length of longest sub-string without any characters repeating in it.

Let us call it a unique sub string. For example consider the string "program" the maximum length of a unique sub-string is 5 for the sub string "ogram".

A method by checking all possible sub-strings for the given property is obvious, but it is not time efficient. A naive implementation takes O(n3) time.

Let us look at more efficient approach which runs in O(n) time. Let us assume that the string contains all possible 256 ASCII characters.

We take a Boolean array of size 256 to indicate if a character exists or not.

Take two indices ind1 and ind2 initially pointing to the start of the string. As long as we find a unique character at ind2 we keep on incrementing it. Whenever we see a duplicate character , we move ind1 one index beyond the first instance of that character, keeping track of the maximum length of the sub string found so far.

For example when we find 'r' for the second time, the indices will be adjusted like the following.
p  r  o  g  r  a  m
      |     |
    ind1   ind2

Here is the C++ implementation.

Wednesday, August 27, 2014

Finding all Pythagorean triples in an array

Given an array of unique elements, Write a program to print all Pythagorean triples in it.

A Pythagorean triple is a set of three numbers such that square of one number is the sum of squares of the other two.

For example (3,4,5) is Pythagorean triple because 52 = 32 + 42

Consider the array {5, 6, 3, 10, 4, 2, 1 , 8, 7} 
It contains two Pythagorean triples (3,4,5) and (6,8,10)

An obvious solution to this problem could be examine all possible triples in the array and check if it satisfies the above property.
But this solution takes O(n3) time.

We can do better than this if we first square all the elements and sort them in descending order.

Then this problem is simplified to finding three indices in the array i,j,k such that arr[i] = arr[j]+arr[k]

The algorithm is as follows.

For each array[i] 0 <= i < n-2

     index1 <- i+1
     index2 <- n-1
     While index1 < index 2
        If array[i] = array[index1] + array[inde2]
           Print Triple
        If array[i] <  array[index1] + array[inde2]

Here is the implementation of the above in C++. It takes O(n log n) for sorting and O(n2) for the actual algorithm. So the effective time complexity for this problem is

Solution 2:
Alternatively you can use a Hash set to store all the square values. Then examine all possible pairs in the array to check if sum of their squares exists in the set.
This approach also takes O(n2) time but uses O(n) extra space. 
Note: The hash_set implementation is available only in Visual studio 10 compiler and above. In C++ 11 standard it is mentioned as unordered_set. If we use normal set, the time complexity will become O(n2 log n).

Tuesday, August 26, 2014

Finding the depth of n-ary tree given parent id representation

Given an a rooted n-ary tree represented as a parent id array, Write a program to find the depth of it.

The depth of a tree is defined as the maximum length of the path from root to any leaf node.

For example the following tree has a depth of 2 (for the path 0-2-4)

It is represented in parent id array as follows
index:    0  1  2  3  4
parent:  -1  0  0  0  2

This can be solved by first finding the depth of each node and returning the maximum among them. We can find the depth of each node by traversing the parent ids until we reach the root node. But this method takes O(n2) time if we naively calculate the depth of each node.

For example we calculate the depth of node 0 every time when we calculate the depths of it's descendants (1,2,3,4).

We need to use a simple memorization (Dynamic Programming) to store the depths of nodes already calculated.

For example if we already have the depth of 2, then the depth(4) is depth(2)+1.

This method runs in O(n) time because each node is visited exactly once. space complexity is O(n).

The following is a C++ implementation of the above DP algorithm.

Wednesday, August 20, 2014

Counting the number of inversions in a list

Given an array of numbers, how do we count the number of inversions in it?

An inversion is an ordered pair if indices (i,j) such that i < j and A[i] > A[j] .

For example consider the array {3, 2, 1, 5, 4}
The inversions are {(0,1), (0,2), (1,2), (3,4)} totaling 4.

We can always use the brute force approach of comparing all possible pairs of numbers to check if each pair is an inversion or not. This takes O(n2) time.

How do we solve this efficiently?

We can modify the merge sort procedure to count the number of inversions also!

Please check out my earlier post on Merge sort to understand how it works.

Here are the steps.
  • First count the inversions in the left part
  • Count the inversions in the right part
  • Count the inversions if they are merged using modified merge procedure
  • Adding the three counts gives the actual result 
We need to change the merge procedure to count the inversions. The merge procedure works like the following.

We have two sorted arrays to merge into another array.

{1, 3, 5, 7} + {2, 4, 6, 8} = {1, 2, 3, 4, 5, 6, 7, 8}
  • Take two pointers to start of the two arrays.
  • Compare the two elements to see which element goes to the sorted list first.
  • Increment the corresponding index.
  • Repeat the above steps until we reach the end of either of them.
  • Add the remaining elements if any.
While comparing the two elements if the left element is lesser than right element, we simply proceed.
If the right element is lesser than left elements, then it is also lesser than all elements to the right of left element. So increment the inversion counter accordingly.

For example
{.... 5, 9, 12, 15, ...}

3 Appears in the right and 5 appears in left. So we can count all the inversions (5,3), (9,3), (12,3)... in this step itself.

Here is the C++ code which implements the above algorithm. The time complexity of this implementation is O(n log n).

Friday, August 8, 2014

Maximum number of overlapping intervals

Given a set of intervals, how do we find the maximum number of intervals overlapping at any point of time.

For example let us consider the following intervals.

{ (0,2), (3, 7), (4,6), (7,8), (1,5) }

The maximum number of intervals overlapped is 3 during (4,5).

This can be asked in many forms in a programming competition. This CodeChef problem is one such an example. Go and test your knowledge if you have some idea about how to solve this.

Let us look at the solution approaches.

We first consider all the start and end point as a single entity and sort them in ascending order according the following criteria.
  • If the times are different, just sort according to time.
  • If the times are equal, we first place the end of interval.
Then we take a counter and a maximum counter. 

Iterate through all the points
  • If the point is start of interval we increment the counter.
    • Keep track of maximum counter
  • Otherwise we decrement the counter
At the end of this, we have the required result in maximum counter.

Another alternative solution could be to use the merge procedure used in merge sort. The steps are similar to the merge procedure.
  • Sort all the starting points and ending points in ascending order
  • Try to merge them into a single list
  • Use two variables overlap, and max_overlap to keep track of the current overlap and maximum overlap we have seen so far.
  • As long as we take element from starting point of the interval, increment overlap and update max_overlap.
  • If we are taking en element from ending point of the interval, decrement overlap.
At the end of this procedure, we have the answer in max_overlap.

Below is the simple C++ implementation of the above algorithms.