CSL862: Assignment 2 on Programming in Cloud Environment
Implement a webpage that offers the following services:
- Calculate the Nth prime
- Calculate the number of primes not exceeding 'x'
- Generate a random prime number
For the prime number services above, you can restrict the range of prime numbers to a large-enough value (e.g., 10^16). You could use the algorithm given here, but your range has to be larger so that a single-user query takes around 1 second.
- Upload an image with its tag. This should upload the image and display its thumbnail below. The thumbnails of all the images that have been uploaded in this session must be displayed. The image should be stored permanently in the persistent database.
- Content-based image search: The user uploads an image. The output is a ranking of all the images in the database thus far, based on their similarity to the uploaded image. The uploaded image does not get added to the database. Here is a very simple algorithm to calculate the similarity metric on two images:
- Given two images A and B.
- Let Diff(A,B) be defined as sum(abs(Ai - Bi)), where Ai and Bi are the values of the i'th pixel of A and B, and the sum is taken over all the pixels.
- First find the best location of A in B. This involves finding a coordinate P = (x,y) such that when A is placed at P and B is placed at (0,0), Diff(A,B) is minimum. Use this minimum Diff value as the similarity metric between A and B.
- Rank the database images based on the similarity metric with the uploaded image.
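The similarity steps above can be sketched as follows. This is only a naive illustration, not a reference solution. It assumes grayscale images stored as lists of pixel rows, and it assumes A is no larger than B and is only placed where it lies fully inside B; the assignment text does not say how partial overlaps should be handled. The names `diff_at`, `similarity`, and `rank` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a grayscale image is a list of pixel rows.
Image = List[List[int]]


def diff_at(a: Image, b: Image, x: int, y: int) -> int:
    """Sum of abs(Ai - Bi) with A's top-left corner placed at (x, y) on B."""
    total = 0
    for row in range(len(a)):
        for col in range(len(a[0])):
            total += abs(a[row][col] - b[y + row][x + col])
    return total


def similarity(a: Image, b: Image) -> Tuple[int, Tuple[int, int]]:
    """Return (minimum Diff, best P) over all placements of A fully inside B."""
    a_h, a_w = len(a), len(a[0])
    b_h, b_w = len(b), len(b[0])
    if a_h > b_h or a_w > b_w:
        # Assumption of this sketch, not a rule from the assignment.
        raise ValueError("this sketch assumes A is no larger than B")
    best_diff = diff_at(a, b, 0, 0)
    best_pos = (0, 0)
    for y in range(b_h - a_h + 1):
        for x in range(b_w - a_w + 1):
            d = diff_at(a, b, x, y)
            if d < best_diff:
                best_diff, best_pos = d, (x, y)
    return best_diff, best_pos


def rank(query: Image, database: Dict[str, Image]) -> List[str]:
    """Return database tags ordered from most to least similar to the query."""
    scored = [(similarity(query, img)[0], tag) for tag, img in database.items()]
    return [tag for _, tag in sorted(scored)]
```

The exhaustive scan over every placement is deliberately brute force, which keeps the search compute-intensive.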
For your tests, use naive algorithms so that the problems become compute/IO-intensive. We will judge you not on the smartness of the algorithm, but on the smartness of your load-balancing and allocation scheme to make the problem scalable and efficient.
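As one possible reading of "naive algorithms" for the three prime services, here is a minimal trial-division sketch. The function names and the `upper` parameter are hypothetical, and the algorithm the assignment links to may differ.

```python
import random


def is_prime(n: int) -> bool:
    """Naive trial division by every integer d with d * d <= n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def nth_prime(n: int) -> int:
    """Return the Nth prime (1-indexed) by testing every integer in turn."""
    if n < 1:
        raise ValueError("n must be at least 1")
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if is_prime(candidate):
            count += 1
    return candidate


def count_primes(x: int) -> int:
    """Return the number of primes not exceeding x."""
    return sum(1 for k in range(2, x + 1) if is_prime(k))


def random_prime(upper: int) -> int:
    """Draw random integers in [2, upper] until one of them is prime."""
    if upper < 2:
        raise ValueError("upper must be at least 2")
    while True:
        candidate = random.randint(2, upper)
        if is_prime(candidate):
            return candidate
```

With inputs chosen large enough, each call becomes CPU-bound, so the work of scaling falls on the load-balancing and allocation scheme rather than on the algorithm.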
- To be done in groups of two or three.
- Implement on the CSC cloud, the MS Azure environment, and Google App Engine
- The last date of submission is Oct 1
- You will need to write automated clients for testing with a large number of requests
- Ask the instructor for virtual machines on the CSC cloud as needed
- Report the ratio of the number of requests to the number of instances required, along with the corresponding response times. Also mention the algorithm used. Use of smarter algorithms will not earn extra credit (as that is not the purpose of this assignment). The response times should be reasonable; e.g., for prime number computation, the response time should never exceed 3 seconds.
- Extra credit: Set up an MS Azure environment in the CSC cloud and share the public VM with us. We could even contribute this VM as a template to CSC in the future. Here are instructions on setting up MS Azure
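Separately from the extra-credit item, the automated test clients and the reporting requirement listed above could be approached along these lines. This is a rough sketch using only the Python standard library. The helper names and the endpoint URL in the usage note are hypothetical, and a real client would also need error handling for failed requests.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def http_get(url: str) -> bytes:
    """Fetch a URL and return the response body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()


def timed_call(send: Callable[[], object]) -> float:
    """Run one request and return its response time in seconds."""
    start = time.perf_counter()
    send()
    return time.perf_counter() - start


def run_load(send: Callable[[], object], n_requests: int, concurrency: int) -> List[float]:
    """Issue n_requests calls using up to `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(timed_call, send) for _ in range(n_requests)]
        return [f.result() for f in futures]


def summarize(latencies: List[float], n_instances: int) -> Dict[str, float]:
    """Compute the figures the report asks for from measured response times."""
    return {
        "requests": len(latencies),
        "requests_per_instance": len(latencies) / n_instances,
        "mean_s": sum(latencies) / len(latencies),
        "max_s": max(latencies),
    }
```

For example, `run_load(lambda: http_get("http://localhost:8080/nth_prime?n=1000"), 500, 50)` would send 500 requests with up to 50 in flight, assuming a service exists at that made-up address. Passing the result to `summarize` along with the number of running instances gives the requests-per-instance ratio and the maximum response time, which can be checked against the 3-second limit.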