CSE 4311 - Assignments - Assignment 3


The assignment should be submitted via Canvas. Submit a file called assignment3.zip, containing the following files: These naming conventions are mandatory; non-adherence to these specifications can incur a penalty of up to 20 points.

Your name and UTA ID number should appear on the top line of all documents.


Assignment Preview

In this assignment you will implement the training process and the test process for neural networks. There are three programming tasks. The first two tasks ask you to implement simpler cases (single perceptrons in Task 1, two-layer neural networks in Task 2). The third task asks you to implement backpropagation for a fully connected neural network with any number of hidden layers.


Task 1 (40 points, programming)

File perceptron_base.py contains incomplete code that implements the training and evaluation process for a perceptron that does binary classification. When completed, the code will train a perceptron using some specified training set, and will then evaluate the accuracy of the perceptron on some specified test set.

Training uses gradient descent as described in the slides on training perceptrons. You should follow exactly the formulas and specifications on those slides.

For testing (NOT for training), you should take into account that we will only test your code with datasets where the target output is either 0 or 1. When you compute the output of the perceptron (again, this is only when testing), an output less than 0.5 corresponds to a class label equal to 0, and an output greater than or equal to 0.5 corresponds to a class label equal to 1.
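As a rough illustration of this test-time rule, here is a minimal sketch, assuming a sigmoid-activated perceptron as in the slides; the names sigmoid and predict_label are illustrative, not part of the required code.

import numpy as np

def sigmoid(a):
    # Logistic activation, as described in the perceptron slides.
    return 1.0 / (1.0 + np.exp(-a))

def predict_label(weights, bias, x):
    # Perceptron output for input vector x, followed by the test-time rule
    # above: below 0.5 maps to class 0, at or above 0.5 maps to class 1.
    z = sigmoid(np.dot(weights, x) + bias)
    return 0 if z < 0.5 else 1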

To complete the code, you must create a file called perceptron_solution.py, where you implement the following Python function:

The function arguments provide the following information: The function does not return anything. Its job is to train the perceptron on the training set, and evaluate it on the test set, as explained below.

The training and test files will follow the same format as the text files in the synthetic directory. The format is simple: each row represents a training or test example. All values in that row, except for the very last value, specify the input vector. The last value specifies the class label. Function read_uci_file in uci_load.py handles reading these files and extracting the appropriate vectors and class labels. The read_uci_file function is used in perceptron_base.py to read the data before calling your function.
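For illustration only, here is a sketch of what reading this format involves; read_uci_file already does this for you, and its actual implementation may differ.

import numpy as np

def load_examples(pathname):
    # Each row holds an input vector followed by a class label in the
    # last column. Values are whitespace-separated.
    vectors, labels = [], []
    with open(pathname) as f:
        for line in f:
            items = line.split()
            if not items:
                continue
            vectors.append([float(v) for v in items[:-1]])
            labels.append(items[-1])  # a number here; possibly a string in UCI files
    return np.array(vectors), labels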

Training

In your implementation, you should use these guidelines: There is no need to output anything for the training phase.

Evaluation

For each test object you should print a line containing the following info: To produce this output in Python in a uniform manner, use:
print('ID=%5d, predicted=%10s, true=%10s, accuracy=%4.2f\n' % 
           (object_id, str(predicted_class), str(true_class), accuracy))
After you have printed the results for all test objects, you should print the overall classification accuracy, which is defined as the average of the classification accuracies you printed out for each test object. To print the classification accuracy in a uniform manner, use:
print('classification accuracy=%6.4f\n' % (classification_accuracy))
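For example, if the per-object accuracies were collected in a list while printing the lines above, the final line could be produced as in this sketch (the list contents are placeholders):

accuracies = [1.0, 0.0, 1.0, 1.0]  # placeholder per-object accuracies
classification_accuracy = sum(accuracies) / len(accuracies)
print('classification accuracy=%6.4f\n' % (classification_accuracy))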

You can test your code with different datasets and different numbers of training rounds by modifying the directory, dataset, and training_rounds variables in perceptron_base.py. In your answers.pdf document, please provide ONLY THE LAST LINE (the line printing the classification accuracy) of the output produced by the test stage, for the following cases:

Expected Classification Accuracy

In general, you may get different classification accuracies when you run your code multiple times with the same input arguments, because the weights are initialized randomly. However, for the synth1, synth2, and synth3 datasets and 5 rounds, I always got the same result (I ran my code 10 times for each case): You may occasionally get results that differ from those values, but most of the time, running your code with the parameters specified above should produce the accuracies listed above.


Task 2 (30 points, programming)

File nn_2l_base.py contains incomplete code that implements the training and evaluation process for a 2-layer neural network (no hidden layers, just input layer and output layer) that does multiclass classification. When completed, the code will train a 2-layer neural network using some specified training set, and will then evaluate the accuracy of the network on some specified test set.

For training, you can use the gradient descent method as described in the slides on training perceptrons to train each individual perceptron (remember, each perceptron is trained using as target output a different dimension of the one-hot vector describing the class label). You can also use the backpropagation method, as described in the backpropagation slides. Both approaches are mathematically equivalent for a 2-layer neural network. Whichever method you choose, you should follow exactly the formulas and specifications given in the slides for that method.
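As an illustration of the one-hot targets mentioned above, here is a hedged sketch, assuming the class labels have already been mapped to integers 0 through num_classes - 1 (as the labels_to_ints dictionary provides); the function name is illustrative.

import numpy as np

def one_hot_targets(int_labels, num_classes):
    # Row r is the target vector for training example r: 1 at the position
    # of the class label, 0 elsewhere. Output unit j is then trained using
    # column j as its target output.
    targets = np.zeros((len(int_labels), num_classes))
    targets[np.arange(len(int_labels)), int_labels] = 1.0
    return targets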

For testing, you should make sure that you convert the output of the network (which is a vector) to the appropriate class label, by finding the output unit that has the highest output, as described in the slides.
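In code, this conversion can be as simple as the following sketch (variable names are illustrative):

import numpy as np

outputs = np.array([0.12, 0.81, 0.45])   # example network output vector
predicted_int = int(np.argmax(outputs))  # unit 1 has the highest output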

To complete the code, you must create a file called nn_2l_solution.py, where you implement the following Python function:

The function arguments provide the following information: Note that the code in nn_2l_base.py puts the appropriate values in labels_to_ints and ints_to_labels, so that is not something that you need to handle.

The function does not return anything. Its job is to train the 2-layer network on the training set, and evaluate it on the test set, as explained below.

The training and test files will follow the same format as the text files in the UCI datasets directory. This is actually the same format that we used in Task 1 for the datasets in the synthetic directory. Again, each row represents a training or test example. All values in that row, except for the very last value, specify the input vector. The last value specifies the class label. Function read_uci_file in uci_load.py handles reading these files and extracting the appropriate vectors and class labels. The only difference between datasets in the synthetic directory and UCI datasets is that the datasets in the synthetic directory only have class labels that are 0 or 1. The UCI datasets have arbitrary class labels, which can be strings or non-consecutive integers, and can represent more than two classes.
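To make the mapping concrete, here is a hedged illustration of the two dictionaries; nn_2l_base.py builds them for you, and its actual construction may differ.

raw_labels = ['yes', 'no', 'yes', 'maybe']  # placeholder class labels
labels_to_ints, ints_to_labels = {}, {}
for label in raw_labels:
    if label not in labels_to_ints:
        index = len(labels_to_ints)
        labels_to_ints[label] = index
        ints_to_labels[index] = label
# labels_to_ints == {'yes': 0, 'no': 1, 'maybe': 2}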

A description of the datasets and the file format can be found at this link. For each dataset, a training file and a test file are provided. The name of each file indicates which dataset the file belongs to, and whether the file contains training or test data. Your code should also work with ANY OTHER training and test files that use the same format as the files in the UCI datasets directory. This includes the files in the synthetic directory.

Training

In your implementation, you should use the guidelines provided in Task 1 regarding: There is no need to output anything for the training phase.

Evaluation

For each test object you should print a line containing the same information, formatted the same way, as in Task 1. However, there are two additional issues that we must consider in Task 2 that were not applicable in Task 1. First, the class labels are no longer just 0 or 1; they can be any number or string. You should use the ints_to_labels dictionary to map from the integer class labels used in the neural network to the original class labels. Second, here we must use a more complicated definition of accuracy, to handle possible ties between classes. A tie occurs when two or more output units tie for the highest output. For Task 2 and Task 3, accuracy is defined as follows: You should format this output in Python the same way as for Task 1, including the overall classification accuracy at the end.
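The authoritative tie rule is the definition referenced above. The sketch below is an assumption based on a common convention (credit 1/k when the true class is among the k units tied for the highest output), shown together with the ints_to_labels mapping:

import numpy as np

def evaluate_test_object(outputs, true_int, ints_to_labels):
    # ASSUMED tie rule, not quoted from this page: if k output units tie
    # for the highest output, accuracy is 1/k when the true class is among
    # them and 0 otherwise; with no tie this reduces to 0/1 accuracy.
    tied = np.flatnonzero(outputs == outputs.max())
    predicted_class = ints_to_labels[int(tied[0])]
    accuracy = (1.0 / len(tied)) if true_int in tied else 0.0
    return predicted_class, accuracy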

You can test your code with different datasets and different numbers of training rounds by modifying the directory, dataset, and training_rounds variables in nn_2l_base.py. In your answers.pdf document, please provide ONLY THE LAST LINE (the line printing the classification accuracy) of the output produced by the test stage, for the following cases:

Expected Classification Accuracy

In general, you may get different classification accuracies when you run your code multiple times with the same input arguments, because the weights are initialized randomly. These are the results I got for the test cases listed above (I ran my code 10 times for each case): You may occasionally get results that differ from those values, but most of the time, running your code with the parameters specified above should produce the accuracies listed above.

Task 3 (30 points, programming)

File nn_base.py contains incomplete code that implements the training and evaluation process for a neural network with fully connected layers (including zero or more hidden layers) that does multiclass classification. When completed, the code will train a neural network using some specified training set, and will then evaluate the accuracy of the network on some specified test set.

Training should be done using the backpropagation algorithm described in these slides. You should follow exactly the formulas and specifications on those slides. For testing, as in the previous task, you should make sure that you convert the output of the network (which is a vector) to the appropriate class label, by finding the output unit that has the highest output, as described in the slides.
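For orientation, here is a minimal sketch of one backpropagation update, assuming sigmoid units and squared-error loss; the formulas in the slides remain the authoritative specification.

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def backprop_update(weights, biases, x, target, rate):
    # One stochastic-gradient step for a fully connected network, where
    # weights[l] maps the activations of layer l to the units of layer l+1.
    activations = [np.asarray(x, dtype=float)]
    for W, b in zip(weights, biases):
        activations.append(sigmoid(W @ activations[-1] + b))
    # Backward pass: error terms for squared-error loss with sigmoid units.
    out = activations[-1]
    deltas = [(out - target) * out * (1.0 - out)]
    for l in range(len(weights) - 1, 0, -1):
        a = activations[l]
        deltas.append((weights[l].T @ deltas[-1]) * a * (1.0 - a))
    deltas.reverse()  # deltas[l] now corresponds to weights[l]
    # Gradient-descent updates.
    for l in range(len(weights)):
        weights[l] -= rate * np.outer(deltas[l], activations[l])
        biases[l] -= rate * deltas[l]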

To complete the code, you must create a file called nn_solution.py, where you implement the following Python function (the same name and signature referenced in Task 3c below):
nn_train_and_test(tr_data, tr_labels, test_data, test_labels, labels_to_ints, ints_to_labels, parameters)

The function arguments provide the following information:

The function does not return anything. Its job is to train the network on the training set, and evaluate it on the test set, as explained below.

The training and test files will follow the same format as in Task 2. As in Task 2, you can test your code with files from the UCI datasets directory as well as the synthetic directory.

Training

In your implementation, you should use the guidelines provided in Task 1 regarding:

The number of layers in the neural network is specified by the parameters.num_layers argument. If parameters.num_layers = L, then the network has L layers as follows:

There is no need to output anything for the training phase.
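As a hedged sketch of how parameters.num_layers might translate into weight-matrix shapes, suppose the per-layer unit counts are available as a list; the name units_per_layer below is an assumption, not necessarily what nn_base.py uses.

import numpy as np

def initialize_network(units_per_layer):
    # units_per_layer lists the unit counts of all L layers, e.g.
    # [num_attributes, 20, 15, num_classes] for the 4-layer case mentioned
    # below. The initialization range is a placeholder; the slides specify
    # the distribution to use.
    weights, biases = [], []
    for l in range(len(units_per_layer) - 1):
        shape = (units_per_layer[l + 1], units_per_layer[l])
        weights.append(np.random.uniform(-0.05, 0.05, shape))
        biases.append(np.random.uniform(-0.05, 0.05, units_per_layer[l + 1]))
    return weights, biases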

Evaluation

For each test object you should print a line containing the same information as in Task 2, using the same definition of accuracy as in Task 2.

You can test your code with different datasets and different numbers of training rounds by modifying the directory, dataset, and parameters variables in nn_base.py. In your answers.pdf document, please provide ONLY THE LAST LINE (the line printing the classification accuracy) of the output produced by the test stage, for the following cases:

Expected Classification Accuracy

As in Tasks 1 and 2, you may get different classification accuracies when you run your code multiple times with the same input arguments. These are the results I got for the test cases listed above, with 4 layers, 20 training rounds, 20 units in the first hidden layer, and 15 units in the second hidden layer (I ran my code 10 times for each case): You may occasionally get results that differ from those values, but most of the time, running your code with the parameters specified above should produce the accuracies listed above.


Task 3b (Extra Credit, maximum 10 points)

A maximum of 10 extra credit points will be given to the submission or submissions that identify the parameters achieving the best test accuracy on any of the three test datasets. These parameters, and the attained accuracy, should be reported in answers.pdf, under a clear "Task 3b" heading. These results should be achievable using the code that you submit for Task 3. In our tests, your code should achieve the reported accuracy in at least five out of ten test runs.


Task 3c (Extra Credit, maximum 10 points)

In this task, you are free to change implementation options that are fixed in Task 3. Examples of such options include the choice of activation function, the initial distribution of weights, the learning rate, or any other implementation choices. You can submit a file called nn_opt.py that implements your modifications. Your function should have the same name as in Task 3:
nn_train_and_test(tr_data, tr_labels, test_data, test_labels, labels_to_ints, ints_to_labels, parameters)
A maximum of 10 points will be given to the submission or submissions that, according to the instructor and GTA, achieve the best improvements (on any of the three datasets) compared to the specifications in Task 3. In your answers.pdf document, under a clear "Task 3c" heading, explain:

