Administrative Information
Below is some administrative information about our tests.
 Interview dates are 4,5,6 Dec 2014.
 Shortlisting criteria and list of shortlisted candidates would be posted later.
 Format is: 1 hour written test, followed by interviews. Candidates selected in the first round would have to appear for a 2nd round of interviews in the afternoon (tentatively at 4 PM), so should plan to be in the dept. the whole day. They are also encouraged to look up the research interests of faculty by visiting their web pages, and also meet relevant faculty members individually in the afternoon, before the 2nd round.
 Results will be posted on the dept. web site. Please do not email us for results.
Sample Questions
Here are some sample questions we have asked students in the past. Many of these questions are quite simple, but students falter still, mainly because much of the education in our society is based on memorization rather than reasoning. We have also tried to provide sketches of expected solutions. Please do not email us for solutions. You can be sure your email will be ignored.
First Principles Questions
 Suppose 8 of us are sitting around a table in random order. What is the probability you will sit next to me?
 Each of 15 red balls and 15 green balls is marked with an integer between 1 and 100 inclusive;
no integer appears on more than one ball. The value of a pair of balls is the sum of the
numbers on the balls. Show there are at least two pairs, consisting of one red and one green
ball, with the same value. Show that this is not true if there are 13 balls of each color.
 There is a bag containing 5 white and 5 black balls. You repeat the following experiment till you see a white ball : take a ball uniformly at random out of the bag. If it is white, stop. Otherwise, put it back in the bag. What is the expected number of times you will need to draw a ball from the bag ?
Algorithms and Data Structures Questions
 Derive the running time of the binary search algorithm. If I modify binary search to break the interval size into 1/3, 2/3 rather than 1/2, 1/2, then what is the worst case running time?
 You have a mixed pile of N nuts and N bolts and need to quickly find the corresponding pairs of nuts and bolts. Each nut matches exactly one bolt, and each bolt matches exactly one nut. By fitting a nut and bolt together, you can see which is bigger. But it is not possible to directly compare two nuts or two bolts. Given an efficient method for solving the problem.
 You are given a heap containing N elements. Write a procedure which takes as input a parameter k, and outputs the k'th smallest number in the heap. The running time of the procedure must depend on k alone.
 Suppose we use a hash function h to hash n distinct keys into an array T of length m. Assuming simple uniform hashing, what is the expected number of collisions?
Database Questions
 Describe what happens when a query is submitted to the DBMS.
Expected answer: A precise and technical description of parsing of the query, conversion into a query plan (physical and logical query plans), query optimization and execution.
 What is transaction management? How can ACID properties be enforced?
Expected answer: A precise and technical description of what a transaction is, and various methods (locking, time stamps, etc.) for enforcing it.
Please note the words, "technical" and "precise".
 Rewriting relational algebra expressions, writing SQL queries and query plans.
 Analysis of index structures and algorithms for relational algebra operators.
Networks Questions
 Which layers do the following protocols belong to: CSMA, TCP, IP, HTTP, SMTP, DNS, UDP, TDMA, NRZI?
 Why is there is a minimum frame size defined in Ethernet?
 Suppose a network path runs over a series of 3 links with available bandwidths 512Kbps, 256Kbps, and 128Kbps. The RTT for this path is 300ms. What would be a congestion window that can work on this network?
 Suppose you are given a wired network with lossy links. A packet transmitted on link "i" is dropped on that link with probability "p_i". The network does not try to retransmit lost packets. Given a pair of source and destination nodes for a particular packet, how would you find the optimum path on which to transmit the packet so that the probability of the packet getting lost on that path is the least among all paths between the source and destination?
 What is the "Additive Increase Multiplicative Decrease (AIMD)" principle used in TCP Reno? What purpose does it serve?
Architecture Questions
 Consider a uniprocessor computer system A running a 1 GHz CPU. In
system B, we replace the CPU with a 2 GHz CPU, all other components
being the same. The logic design of the two CPUs is identical, with
the difference in speed caused only by faster devices. Do you expect
system B to execute programs at twice the speed of A? Why or why not?

A Finite State Machine (FSM) generates outputs 100, 200, and 300 in
a cyclical order, one output in every clock cycle: 100, 200, 300, 100,
200, 300, 100, 200,...,etc. How many flipflops are needed to
implement this FSM? Why?
Operating Systems Questions
 Consider the following service provided by a bank to transfer one
rupee from account A to account B, implemented by the following
software function:
void transfer(account *a, account *b)
{
if (a>money > 0) {
a>money;
b>money++; }
}
The transaction could be initiated by users through an ATM machine
that would invoke this function at a central server. When there was
only one ATM machine, things were working fine. However, when multiple
ATM machines were installed and users could simultaneously invoke the
transfer() function, problems started happening due to software
concurrency issues.
a. What is the problem if multiple users at multiple ATM machines
could invoke this function simultaneously?
b. How can this problem be solved? Recall that this code is running at
the central server. Your solution should involve changing this code so
that it becomes correct even in the presence of multiple simultaneous
requests.
c. Followup questions: is your solution efficient? How would you
ensure that the maximum number of users can be served simultaneously?
How would you ensure deadlock freedom?
(Possible solution: use locks; use finegrained locks for efficiency;
use global lock ordering to avoid deadlocks)
 On modern computers, how much time does it take on average for the
following operations (just provide a rough estimate, e.g.,
1nanosecond, 10nanoseconds, 100nanoseconds, 1microsecond,
10microseconds, 100microseconds, 1millisecond, 10milliseconds,
100milliseconds, 1second, ....)
a. to execute an instruction on the CPU
b. to read a disk block
c. to change one byte on the disk
d. to overwrite 10 disk blocks.
e. to handle a page fault in virtual memory to implement demand paging
f. to context switch between two processes
g. to execute "grep R hello /usr/bin" (searches for the string
"hello" in all files and subdirectories of "/usr/bin" recursively)
 A computer system is expected to run longrunning computational
jobs like multiplication of large matrices, etc. and is also used for
interactive services like web browsing, terminal input, etc. What
should be the desirable characteristics of the CPU scheduler to ensure
that all these activities can occur simultaneously with a good user
experience and high overall throughput?
 What is a filesystem? What happens when you format a disk? Why do
some OSes run a longrunning filesystem check on startup if the
computer was previously shutdown abruptly (due to power failure, for
example).
