Got my google interview next week, so found these questions.
Here are my versions of what I think could be right.
------------------------------------------------------------------------------------------------------------------
Question #1) Given k sorted streams where each stream could possibly be infinite in length, describe an efficient algorithm
to merge the k streams into a new stream (also in sorted order).
ANS: Take first element in each of the 'k' streams and identify the smallest among these. Stick this one (if there are more
than 1 with the same value pick the one from the stream with the smallest id) at the next slot in the merged list. This
should be possible in O(k). Repeat this till all elements are exhausted. If a stream has been exhausted dont consider it in
any more comparisons. Total time complexity O(k.n). Generally k would be small compared to 'n', so this will scale reasonably
well. Of course if k=n then you have hit the worst case. ;-)
BTW this is a typical database question. Look up any database design book and you will see what I mean. They use this
extensively in merge-sort for 'join' operations in databases.
------------------------------------------------------------------------------------------------------------------
Question #2) Given a set of KEY->VALUE pairs such that each KEY is unique, describe a method of storing these pairs on disk,
and a method for accessing the corresponding VALUE given a KEY. Assume that RAM is fixed at 1gb and the set of pairs requires
40gb.
ANS: Ok, now it totally depends on how huge the number of 'key->value' pairs are present.
If number of 'key->value' pairs is small enough to fit completely in memory and the number of queries i.e. one is expected to
make a lot of 'key' queries then it makes sense to just use a hash. With a hash solution all that you need to do is store the
mappings in memory. If you want persistence write it to disk sequentially and next time when the queries are to be make read
all the pages into memory. Obviously this solution will work only if the number of queries to follow are so huge that it will
offset the intial cost of reading the pages one by one from disk into memory. Also important to note is that if on the disk
you write the pages in continguous sectors the reads will be fast (only transfer time and no seek & rotation latency for the
disk head). So this solution will be good if the size is small. Read in this case will be O(1) and no disk access at all.
However most likely what the interviewer is trying to test is the other case where the number of key->value pairs is so huge
that it wont fit in memory completely. Again this is a classic database problem. The idea here is to use B+ trees. So with a
B+ tree you are creating a 'n' way tree where 'n' is the minimum number of records in one node of the B+ tree. Look up any
book for how B+ trees are implemented. Since B+ trees are balanced a look up is just log N to base 'n'. In practice the look
up just takes 3 to 4 disk accesses.
------------------------------------------------------------------------------------------------------------------
Question #3) Given N computers networked together, with each computer storing N integers, describe a procedure for finding
the median of all of the numbers. Assume that a computer can only hold O(N) integers (i.e. no computer can store all N^2
integers). Also assume that there exists a computer on the network without integers, that we can use to interface with the
computers storing the integers.
Ok I really dont know if my solution for this one is good enough. Also I am going to assume that it is possible to overwrite
the values on the machines. Then all that you need to do is sort them as a whole using the free computer. This can be done in
O(N.logN). Then go to the middle computer and pick the middle element for the answer.
Obvious problems I see:
1. Assumption is too hard. May be the numbers are read only.
2. Sorting is expensive. May be theres a better solution.
------------------------------------------------------------------------------------------------------------------
Question #4) Given the sequence S1 = {a,b,c,d,...,x,y,z,aa,ab,ac.... } and given that this sequence corresponds (term for
term) to the sequence S2 = {1,2,3,4,....}
Write code to convert an element of S1 to the corresponding element of S2. Write code to convert an element of S2 to the
corresponding element of S1.
S1 to S2 is easy. This is just like the atoi function. Except that we have a radix of 26 and not 10.
unsigned int S1toS2(char *str)
{
unsigned int value = 0;
for (int i=0;str
='\0';i++)
{
value = value*26 + (str-'a'+1);
}
return value;
}
Assumptions:
1. str is a legitamate string i.e. has only characters from a-z.
2. The corresponding value will be something that can be stored in an 'int' i.e. 0 < value < 2^32
3. As from the previous line -> no negative numbers allowed.
S2 to S1 is also easy. Just he opposite though.
char* S2toS1(int value)
{
char *str = malloc(sizeof(char)*(MAXVALUE+1)); // Assume we know the maximum number of characters required
if (str == null) return null;
for (int i=0;value>0;i++)
{
int t = value%26;
value = value/26;
str = t+'a'-1;
str[i+1] = '\0'; // I am lazy, you can move this out of the loop
}
strrev(str); // Reverse the string.
return str;
}
------------------------------------------------------------------------------------------------------------------
Question #5) Given a binary tree with the following constraints:
a) A node has either both a left and right child OR no children
b) The right child of a node is either a leaf or NULL
// Call this function initially as -
// node *newRoot = invert(root,null);
// Assumption: root is non-null
node* invert(node *me, node *parent)
{
node *newRoot = null;
if (me->lchild != null)
{
newRoot = invert(me->lchild,me);
}
// Switch the position of the nodes
if (parent != null)
{
// Only for the original root the parent will be null
me->rchild = parent->rchild;
parent->lchild = null;
}
me->lchild = parent;
// For the last node newRoot will be null
if (newRoot == null)
{
newRoot = me;
}
return newRoot;
}
------------------------------------------------------------------------------------------------------------------
Question #6) Given a square with side length = 1, describe all points inside square that are closer to the center of the
square than to the edge of the square.
I am not sure about this but here is my approach:
1. Consider only the triangle formed by the center of the square and one side of the square (say right side).
The locus of points which are equidistant from the side and the center would be
sqrt(x^2 + y^2) = 1/2 - x (assume that the center of the square is at (0,0) then side is the line x=1/2)
So this gives the equation for this side of the square (so this is y^2+4x=1).
Similarly we can derive for the other sides too. I am guessing it will be
x^2+4y=1
x^2-4y=1
and y^2-4x=1.
So we have the 4 curves(parabolae) which can be joined to give the final set of points.
I dont really like this because it does not sound very simple or intuitive. I am wondering how I would have solved something
like this on the spot. ;-)
------------------------------------------------------------------------------------------------------------------
Question #7) How many 0's are at the end of N!
To do this you just need to see some examples. Say N = 100
for 0's you just need to catch the number of 5's. Since number of 2's will always be greater than number of 5's - each 5 will
result in a 0.
So if N=100
100/5 = 20
20/5 = 4
4/5 = 0
Sum = 24...
So just do this for any other N.
------------------------------------------------------------------------------------------------------------------
Question #8) Given an array A[string], an array of strings where each string represents a word in a text document. Also given
3 search terms T1, T2, and T3 and 3 corresponding sorted sequences of integers S1, S2, and S3 where each integer in Si
represents an index in A where search term Ti occured (i.e. S1, S2, and S3 contain the locations of the search terms in our
array of words). Now find a minimal subarray of A that contains all of the search terms T1, T2, and T3. Extend this algorithm
for an arbitrary number of search terms.
hint: think of the brute force algorithm first
... brute force is rather intuitive... dont feel like typing it out... 
------------------------------------------------------------------------------------------------------------------
Question #9) Design a data structure that supports push(), pop(), and min() all in O(1) time
Ok, all that you need for this is a modified version of stack (could be implemented as link list) but with an additional
pointer for every node + 1 global pointer.
So here is how a NODE will look:
struct node
{
int value;
int *next; //Next node in the stack
int *nextMin; //Next node in the stack which had the smallest value
};
and a global int *min which points to the smallest number in the stack.
So here is how the operations would look for an input like:
push 4, push 3, MIN, push 5, pop, MIN
Initially stack: NULL (min=null)
push 4: NULL,4:NULL (min=pointer to node with value 4)
push 3: NULL,4:NULL,3:1 (min= pointer to node with value 3)
MIN: just return the value of min pointer
push 5: NULL,4:NULL,3:1,5:NULL (min still points to value 3)
pop: Just remove 5 (no need to adjust min pointer because anyway the pointer from 5 in NULL)
Had the pop been before we pushed 5 - we need to pop 3 but adjust the min pointer to the node 4.