CSE 5361 - Assignments - Assignment 3 (Programming 3)

Due date

Tuesday, March 25, 2008, 11:59pm.

Task

Use Hidden Markov Models (HMMs) for the task of optical character recognition.

Loading and Processing Data

In this assignment it is strongly recommended that you use Matlab. Code and instructions will be provided only for the Matlab environment.

Download and unzip the package mnist.zip. Then type the following:

data_dir = '.';
load_test;

After you execute those commands, the following variables are defined in your environment: test_digits and test_labels. Variable test_digits contains 10,000 images of handwritten digits. Each image is a 28x28 matrix, whose entries have values between 0 and 255. To define a variable equal to the i-th image, display that image, and see its class label, type

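i = 432; % example value for i
img = test_digits(:,:,i); % the i-th image (named img so it does not shadow Matlab's var function)
imshow(img, []); % display the image; [] scales the display range to the image's min and max
test_labels(i) % no trailing semicolon, so the class label is printed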

In order to use HMMs to recognize digits, you will need to represent each image of a digit as a sequence. To make life simple, you will convert every image to a sequence of 28 numbers, such that the j-th number is the sum of intensities of all pixels in the j-th row of the image.
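
The conversion described above can be sketched in a few lines. This is only an illustration: the array digits is a hypothetical random stand-in for test_digits, and the names row_sums and seq are not part of the provided package.

```matlab
% Hypothetical stand-in for test_digits: 5 random 28x28 images (values 0-255).
digits = randi([0 255], 28, 28, 5);

% Sum across columns (dimension 2), so each row of each image collapses to
% one number. The result is 28x1xN, which squeeze turns into 28xN.
row_sums = squeeze(sum(double(digits), 2));

% Column i of row_sums is the 28-number sequence for image i.
seq = row_sums(:, 3);
disp(size(row_sums));
```

Converting to double before summing is a precaution in case the images are stored in an integer type such as uint8, where sums would otherwise saturate at 255.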

Given this representation, you will build an HMM for each of the 10 digit classes. Each HMM will consist of 28 states. Each state will correspond to a row of the image. The observation density for that state has to be appropriate for images of the class that that particular HMM represents. You can use a Gaussian to represent each density, but you can use other types of distributions if you prefer. However, the actual parameters of each distribution (e.g., the mean and variance of each Gaussian) have to be learned from training data. Use the first thousand images of test_digits as training data to learn those distributions.
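
As a rough sketch of the learning step, assuming one Gaussian per state: the training sequences and labels below are hypothetical random stand-ins (not the real data), labels are assumed to be the digits 0 to 9, and the variance floor var_floor is an added assumption that the assignment does not require.

```matlab
% Hypothetical stand-ins: 28xN row-sum sequences and N labels in 0..9.
rng(0);
N = 200;
train_seqs = 1000 * rand(28, N);
train_labels = mod((0:N-1)', 10);   % 20 examples of each digit

mu = zeros(28, 10);       % mu(j, c+1): mean of row j for digit c
sigma2 = zeros(28, 10);   % sigma2(j, c+1): variance of row j for digit c
var_floor = 1;            % assumed floor, not part of the assignment

for c = 0:9
    X = train_seqs(:, train_labels == c);   % all sequences of class c
    mu(:, c+1) = mean(X, 2);
    % Squared standard deviation (normalized by n-1). The floor guards
    % against rows that are blank in every training image (zero variance).
    sigma2(:, c+1) = max(std(X, 0, 2).^2, var_floor);
end
```

Each column of mu and sigma2 then holds the 28 state parameters for one digit class.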

Then, after you construct your 10 HMMs, you will use them to classify images. Report results on the second thousand images (images 1001-2000) stored in test_digits. Report both the classification error rate and the confusion matrix (a 10x10 matrix whose (i,j) entry says how many test images of class i were classified as class j).
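
One way to compute the two reported quantities is sketched below. The label vectors are small made-up examples, and the code assumes labels are the digits 0 to 9, so class c is stored at index c+1.

```matlab
% Hypothetical true and predicted labels in 0..9 (column vectors).
true_labels = [0; 1; 1; 2; 9; 9; 9];
pred_labels = [0; 1; 7; 2; 9; 4; 9];

% Entry (i, j) counts images of class i-1 classified as class j-1;
% the +1 shifts digit labels 0..9 to Matlab's 1-based indices.
confusion = accumarray([true_labels + 1, pred_labels + 1], 1, [10 10]);

% Error rate: fraction of images whose predicted label is wrong.
error_rate = mean(pred_labels ~= true_labels);

disp(confusion);
fprintf('Error rate: %.2f%%\n', 100 * error_rate);
```

The diagonal of the confusion matrix holds the correctly classified images, so its entries should sum to the number of test images minus the number of errors.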

Hints

If you do it the right way, your state transition matrices will be very simple, and your calculation of the probability of each image given a model will be very simple (you will not need to implement the general Viterbi algorithm). This assignment is not hard. If you think otherwise, talk to the instructor before you spend a lot of time coding things that may turn out to be unnecessary.
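
One possible reading of this hint, offered as a sketch and not as the required solution: if state j always moves to state j+1, there is only one possible state path, and the log-probability of a sequence reduces to a sum of 28 log Gaussian densities. The parameters mu and sigma2 below are hypothetical random values, and the 0 to 9 labeling is an assumption.

```matlab
% Hypothetical learned parameters: mu(j, c) and sigma2(j, c) for row j
% and digit c-1.
rng(0);
mu = 1000 * rand(28, 10);
sigma2 = 500 + 100 * rand(28, 10);

% Assumed transition matrix: ones on the superdiagonal, so the only
% possible state path is 1, 2, ..., 28 (the last row is all zeros
% because the sequence ends there).
A = diag(ones(27, 1), 1);

% A hypothetical observation sequence, generated near the model for digit 3.
x = mu(:, 4) + sqrt(sigma2(:, 4)) .* randn(28, 1);

log_lik = zeros(1, 10);
for c = 1:10
    % Sum of log Gaussian densities, one per row/state.
    log_lik(c) = sum(-0.5 * log(2 * pi * sigma2(:, c)) ...
                     - (x - mu(:, c)).^2 ./ (2 * sigma2(:, c)));
end

[~, best] = max(log_lik);
predicted_digit = best - 1;   % back to 0..9 labels
```

Working with log-probabilities avoids numerical underflow when 28 small densities are multiplied together.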

Grading

The assignment will be graded out of 100 points: 30 points for estimating the observation densities, 20 points for constructing the HMMs correctly, 25 points for implementing the classification algorithm correctly, and 25 points for submitting a well-presented report of exactly how you implemented every step of the homework, how to run your code (including both the learning stage and the classification stage), and what results you obtained. Your instructions for how to run the code should be sufficient to allow anyone to obtain exactly the same results that you report.

Extra Credit

Use HMMs in a different way from the one described above (in any way you like) to get better classification accuracy. Depending on the cleverness and originality of your method, and the quality of the results, up to 20 extra points will be awarded.

Submission guidelines

You can implement your solution in any language you want, and any platform you want. Matlab is probably going to be the most convenient environment for this assignment. E-mail your submission to both the instructor and the TA.