Due date: 25 October 2024
This assignment deals with image transforms and compression. As in Assignment 2, you are allowed to use single-component (intensity) images for all problems unless otherwise stated.
In this problem, we will experiment with the cosine basis for representing images. Similar to Assignment 2 problem 3, choose an image \(f\) that has large smooth regions as well as regions with strong edges and fine detail. Make sure its length and width are multiples of 16, otherwise crop it to such a size.
Implement functions to compute the discrete cosine transform and inverse discrete cosine transform of a 1D sequence of arbitrary length. Use these to perform the DCT of \(f\) in a separable manner, and verify that applying the IDCT to the resulting coefficient array \(t\) recovers the original image. If you have access to a built-in DCT, also verify that your implementation gives identical results; after that, you may use the built-in DCT for subsequent parts.
Plot the histograms of \(f\) and \(t\).
Suppose we retain only the 1/16th largest (in absolute value) coefficients in \(t\) and zero out the rest. Show the resulting approximation of \(f\), and report its MSE. In your report, prove that for any orthogonal transform, such a procedure results in an MSE equal to \(1/(MN)\) times the sum of squares of the deleted coefficients, and verify this numerically for this example.
Divide the image into blocks of size \(16\times16\) and take the DCT of each block. Consider two approximation strategies: (i) in each block, keeping only the top 1/16 coefficients; (ii) keeping the top 1/16th of all coefficients across all blocks. For both, show the resulting image and the MSE.
The DCT is good for compression but not so useful for denoising. As we will see, wavelet transforms do a good job for the latter as well. You may use the same image \(f\) as in problem 1, or a different one.
Implement the 2D discrete wavelet transform for a user-specified number of levels (not necessarily going all the way to a \(1\times1\) image). You may restrict attention to Haar wavelets. Create a “zone plate” image similar to the one in lecture 6, and verify that you get detail coefficients corresponding to the horizontal, vertical, and diagonal features at different scales in the image.
Add Gaussian noise \(\eta\) of some chosen variance to the test image \(f\) and take the DWT. Verify that the noise in the transform domain, \({\rm DWT}\{f + \eta\} - {\rm DWT}\{f\}\), remains Gaussian with the same variance, by plotting its histogram and computing its statistics.
Denoising can be performed by zeroing out detail coefficients of small magnitude. For a given threshold \(t>0\), one can perform hard thresholding or soft thresholding: \[d_{\rm hard}=\begin{cases}0&-t\le d\le t\\d&\text{otherwise,}\end{cases}\qquad d_{\rm soft}=\begin{cases}d+t&d<-t\\0&-t\le d\le t\\d-t&d>t.\end{cases}\] For both cases, try different values of \(t\), plot the MSE, and show the best reconstructed image.
(Optional, no extra marks) Extend your DWT implementation to support the Daubechies db2 wavelets, recalling that all you need are the scaling function coefficients \(h_\phi\) (given as \(g_0\) in lecture 17 and in the textbook) and the wavelet function coefficients \(h_\psi\) (which can be obtained from \(h_\phi\)). Repeat task (c) with the db2 wavelets and discuss the results.
Now that we have a block DCT, we can implement a toy JPEG-like codec (encoder/decoder). We will not exactly match the JPEG file interchange format, but just get all the main stages of the compression process working. For each stage, also implement a decoder and verify that it can successfully decode the encoder output. Use an \(8\times8\) DCT here instead of \(16\times16\). Note: For each subtask, some evaluation components (i.e. test with such-and-such parameters and report various metrics) will be added later, but more implementation tasks will not be added.
After computing the block DCT coefficients, the next step is to quantize them using a quantization matrix \(Z\). Implement this stage for a user-specified \(Z\) and test it using (i) a constant matrix with all entries equal, and (ii) the JPEG quantization matrix given in the lecture slides.
Perform zigzag ordering of the quantized DCT coefficients. JPEG uses RLE only for the zeroes in the AC coefficients, i.e. the coefficients are stored as follows: value of DC coeff, length of run of zeroes (can be 0), value of next nonzero AC coeff, …, length of run of zeroes, value of next nonzero AC coeff, “end of block” symbol for last run of zeroes.
Build one Huffman code for the DC coefficients of all blocks, and similarly one for the nonzero AC coefficients, and one for the run lengths. Encode all the coefficients of the image as a single stream of 0s and 1s. Now, do it again but storing only the difference between DC coefficients of consecutive blocks.
Implement a function that puts it all together, taking an image and a quantization table as input and producing a tuple containing the Huffman codes and the stream of encoded bits. Conversely, the decoder should take the Huffman codes, the quantization table, and the bit stream as input and reconstruct the image.
Submit a zip file that contains (i) all your code for the assignment, and (ii) a PDF report that includes your results for each of the assignment problems. In the report, each output image must be accompanied with a brief description of the procedure used to create it, and the values of any relevant parameters.
All images should ideally be saved in PNG format, which is lossless and so does not cause any information loss (other than quantization if the intensities are not already in 8-bit format). JPEG is permitted if your submission PDF is becoming too big for the upload limit.
Your assignment should be submitted on Moodle before midnight on the due date. Only one person in a group needs to submit. Late days are counted with a quantization of 0.5 days: if you cannot finish the assignment by midnight, get some sleep and submit by noon the following day.
Separately, each of you will individually submit a short response in a separate form regarding how much you and your partner contributed to the work in this assignment.