CSE 4310 - Assignments - Tentative Assignment 5

The assignment should be submitted via Canvas. Submit a file called assignment5.zip, containing the following files:

The source files implementing your solutions to the programming tasks.
Any additional source files that are needed to run your code. If your code needs any code files available on the course website, please include those files with your submission.
A README.txt file containing the name and UTA ID number of the student. No other information is needed for README.txt.

We try to automate the grading process as much as possible. Not complying precisely with the above instructions and naming conventions causes a significant waste of time during grading, and thus points will be taken off for failure to comply, and/or you may receive a request to resubmit.

Please only include source code in your submissions. Do not include data files.

Code must run in the Matlab or Python versions that are specified in the syllabus.

The submission should be a ZIP file. Any other file format will not be accepted.

Task 1 (15 points)

Write a function distance = euclidean_distance(image1, image2) that takes as arguments two grayscale images (not filenames) and returns the Euclidean distance between them. You should vectorize the two images (flatten each one into a single vector of pixel values) before computing their Euclidean distance.
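As an illustration of the vectorize-then-compare idea, here is a minimal Python sketch. It assumes each image is a 2D sequence of rows (nested lists or a NumPy array) and that both images have the same size. The conversion to float is a precaution against integer overflow when pixels are stored as 8-bit values. This is one possible approach, not a reference solution.

```python
import math


def euclidean_distance(image1, image2):
    """Euclidean distance between two equally sized grayscale images.

    Assumption: each image is a 2D sequence of rows. Both images are
    flattened ("vectorized") into plain lists before being compared.
    """
    # Flatten each image row by row into a single vector of floats.
    v1 = [float(p) for row in image1 for p in row]
    v2 = [float(p) for row in image2 for p in row]
    if len(v1) != len(v2):
        raise ValueError("images must have the same number of pixels")
    # Square root of the sum of squared pixel differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))
```

For example, comparing an all-zero 2x2 image with one whose only non-zero pixels are 3 and 4 gives a distance of 5.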

Task 2 (20 points)

Write a function class_label = nnc_euclidean(test_image) that recognizes the digit shown in test_image using nearest neighbor classification under the Euclidean distance. Argument test_image is a grayscale image, not a filename.

You can test your code with these test images, which can be downloaded as a single ZIP file. Each of those images comes from the popular MNIST dataset of handwritten digits. The ZIP file provides 100 test images. Each image shows one of the digits, from 0 to 9. There are ten examples from each class. Each filename follows the format labelX_testY.png, where X is the class for that image, and Y is a number from 1 to 10.

To recognize the digit, you should use nearest neighbor classification (hence the initials nnc in the function name), based on the Euclidean distance. Your function should measure (using your solution to Task 1) the Euclidean distance between test_image and every one of the training examples provided in the digits_training directory. You can download the entire training directory as a single ZIP file. The training directory provides 15 examples for each class, for a total of 150 training examples. Each filename in the training directory follows the format labelX_trainingY.png, where X is the class for that image, and Y is a number from 1 to 15.

After your code measures the Euclidean distance between test_image and all training examples, it should identify the training example with the smallest distance, and return the class label of that training example. If multiple training examples are tied for the smallest distance, you can return the class label of any of those examples; we do not care how you break ties.

It is OK to hardcode in your function the following info:

The training examples are under a directory called digits_training, which is a subdirectory of the current directory.
Each filename in the training directory follows the format labelX_trainingY.png, where X is the class for that image (a digit from 0 to 9), and Y is a number from 1 to 15. This way your code will know the ground truth for each training example.
There are 15 training examples per class.
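The filename convention and the nearest neighbor rule described above can be sketched in Python as follows. The helper names parse_label and nearest_label are hypothetical, chosen only for this illustration. Image loading is deliberately left out, because it depends on which image library you use. The distance function is passed in as an argument so that the same loop works for any distance measure.

```python
import re

# Matches names such as label3_training12.png or label0_test7.png.
_FILENAME_PATTERN = re.compile(r"label(\d)_(training|test)(\d+)\.png$")


def parse_label(filename):
    """Return the class label X encoded in a labelX_trainingY.png name."""
    match = _FILENAME_PATTERN.search(filename)
    if match is None:
        raise ValueError("unexpected filename: " + filename)
    return int(match.group(1))


def nearest_label(test_image, training_set, distance):
    """Return the label of the training example closest to test_image.

    training_set is assumed to be an iterable of (label, image) pairs,
    and distance a function taking two images and returning a number.
    Ties are broken in favor of the first example encountered.
    """
    best_label, best_distance = None, float("inf")
    for label, image in training_set:
        d = distance(test_image, image)
        if d < best_distance:
            best_label, best_distance = label, d
    return best_label
```

Keeping the search loop independent of the distance function is a design choice, not a requirement of the assignment.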

Task 3 (15 points)

Write a function [accuracy, confusion_matrix] = nnc_euclidean_stats() that measures the classification accuracy of nearest neighbor classification using the Euclidean distance. Your function should evaluate this accuracy using all 100 images in the digits_test directory. Note that this function takes no arguments. All the information that it needs should be hardcoded, as described below.

It is OK for your function to hardcode the same information about training examples that you hardcode for Task 2. It is also OK to hardcode the following information about the test examples:

The test examples are under a directory called digits_test, which is a subdirectory of the current directory.
Each filename in the test directory follows the format labelX_testY.png, where X is the class for that image (a digit from 0 to 9), and Y is a number from 1 to 10. This way your code will know the ground truth for each test example.
There are 10 test examples per class.

Your function returns two values. The first one is accuracy, which is a real number between 0 and 1, equal to the percentage of test images that were classified correctly. The second one is confusion_matrix, which is a 10x10 matrix where each value confusion_matrix(i,j) is a number between 0 and 1 indicating the percentage of test images of class i that were classified as belonging to class j. For the confusion matrix, treat class label "0" as class label "10". This way, the results for that class show up in the 10th row and the 10th column of the confusion matrix.

As an example, confusion_matrix(3,7) should be the percentage of test images whose real class is "3", and for which your nnc_euclidean function returned a class label of "7". Similarly, confusion_matrix(10,2) should be the percentage of test images whose real class is "0", and for which your nnc_euclidean function returned a class label of "2".
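The bookkeeping for these two return values can be sketched in Python as shown below. The names class_index and classification_stats are hypothetical, and the input is assumed to be a list of (true_label, predicted_label) pairs collected by classifying every test image. Note that Python indexing starts at 0, so the entry written as confusion_matrix(3,7) in the text corresponds to matrix[2][6] in this sketch, and class "0" lands in row and column index 9.

```python
def class_index(label):
    """Map a digit label to a 0-based row/column index.

    Labels 1..9 map to indices 0..8, and label 0 is treated as class
    "10", so it maps to index 9 (the 10th row and column).
    """
    return 9 if label == 0 else label - 1


def classification_stats(results):
    """Compute (accuracy, confusion matrix) from (true, predicted) pairs."""
    if not results:
        raise ValueError("results must not be empty")
    counts = [[0] * 10 for _ in range(10)]
    correct = 0
    for true_label, predicted in results:
        counts[class_index(true_label)][class_index(predicted)] += 1
        if true_label == predicted:
            correct += 1
    accuracy = correct / len(results)
    # Normalize each row by the number of test images of that class.
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return accuracy, matrix
```

Each row that corresponds to a class with at least one test image sums to 1 after normalization.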

Task 4 (15 points)

Write a function distance = chamfer_distance(image1, image2) that takes as arguments two grayscale images (not filenames) and returns the chamfer distance between them. You should return the symmetric chamfer distance, which is the sum of the two directed chamfer distances (from image1 to image2, and from image2 to image1).

The chamfer distance is a distance between two sets of points. The set of points corresponding to each image should be the set of pixels in that image that have NON-ZERO values.
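The following Python sketch illustrates one way to compute a symmetric chamfer distance from the non-zero pixels of two images. It makes an assumption that the text above does not settle: each directed distance is taken to be the mean, over the points of one set, of the distance to the nearest point of the other set. Check the course material for the exact definition expected. The search is brute force, which is slow but workable for small images; a distance transform would be the usual faster alternative.

```python
import math


def nonzero_points(image):
    """Return (row, col) coordinates of all non-zero pixels."""
    return [(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value != 0]


def directed_chamfer(points_a, points_b):
    """Mean distance from each point in points_a to its nearest point
    in points_b (assumed definition of the directed distance)."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    total = sum(
        min(math.hypot(ra - rb, ca - cb) for rb, cb in points_b)
        for ra, ca in points_a
    )
    return total / len(points_a)


def chamfer_distance(image1, image2):
    """Symmetric chamfer distance: sum of the two directed distances."""
    p1 = nonzero_points(image1)
    p2 = nonzero_points(image2)
    return directed_chamfer(p1, p2) + directed_chamfer(p2, p1)
```

With this definition, two images that each contain a single non-zero pixel, two columns apart, have a symmetric chamfer distance of 4.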

Task 5 (20 points)

Write a function class_label = nnc_chamfer(test_image) that recognizes the digit shown in test_image using nearest neighbor classification under the chamfer distance. Argument test_image is a grayscale image, not a filename.

You can test your code with these test images, which can be downloaded as a single ZIP file (same as in Task 2).

To recognize the digit, you should use nearest neighbor classification (hence the initials nnc in the function name), based on the chamfer distance. Your function should measure (using your solution to Task 4) the chamfer distance between test_image and every one of the training examples provided in the digits_training directory (again, same as in Task 2).

After your code measures the chamfer distance between test_image and all training examples, it should identify the training example with the smallest distance, and return the class label of that training example. If multiple training examples are tied for the smallest distance, you can return the class label of any of those examples; we do not care how you break ties.

It is OK to hardcode in your function the same info that you also needed to hardcode for Task 2.

Task 6 (15 points)

Write a function [accuracy, confusion_matrix] = nnc_chamfer_stats() that measures the classification accuracy of nearest neighbor classification using the chamfer distance. Your function should evaluate this accuracy using all 100 images in the digits_test directory. Note that this function takes no arguments. All the information that it needs should be hardcoded, as described below.

It is OK for your function to hardcode the same information about training examples that you hardcode for Tasks 2 and 5. It is also OK to hardcode the information about the test examples that you needed to hardcode for Task 3.

Your function returns two values, accuracy and confusion_matrix. The instructions for those two return values are the same as in Task 3, except that (obviously) here the numbers should correspond to nearest neighbor classification using the chamfer distance.
