Tuesday, October 15, 2013

Printing last K lines from a file

Given a large file containing many lines of text, How do we write an efficient program to print last k lines from that file.

One obvious approach is to read the entire file first to find the number of lines (n). In the second iteration skip first (n-k) lines, and print the remaining k lines. 

However, this approach might be prohibitively costly as the file is very big and disk operations are slow. More over, we are reading the file twice.

We can design an efficient solution if we take a buffer of size k and keep filling the buffer with the lines read from the file in a circular fashion. This is a type of page replacement algorithm often used in operating systems. If the buffer is full, we replace the oldest entry in the buffer with the new entry.

Once k lines are read from the file, we wrap around, and go back to the beginning of the buffer and overwrite the previous entries as they are no longer the last  k lines. 

Following is the java implementation of the above algorithm.