The assignment should be submitted via Canvas. Submit a file called assignment3.zip, containing the following files:
- answers.pdf, for the output that the programming tasks ask you to include. Only PDF files will be accepted. All text should be typed, and any figures should be computer-generated. Scans of handwritten answers will NOT be accepted.
- All files containing your Python code for the programming tasks. Feel free to implement your solutions in multiple files, and to reuse code from those files in multiple tasks, as long as the naming conventions specified in each task are followed.
These naming conventions are mandatory; failure to follow them can incur a penalty of up to 20 points.
Your name and UTA ID number should appear on the top line of all documents.
Assignment Preview
In this assignment you will implement the training process and the test process for neural networks. There are three programming tasks. The first two tasks ask you to implement simpler cases (single perceptrons in Task 1, two-layer neural networks in Task 2). The third task asks you to implement backpropagation for a fully connected neural network with any number of hidden layers.
Task 1 (40 points, programming)
File perceptron_base.py contains incomplete code that implements the training and evaluation process for a perceptron that does binary classification. When completed, the code will train a perceptron using some specified training set, and will then evaluate the accuracy of the perceptron on some specified test set.
Training uses gradient descent as described in the slides on training perceptrons. You should follow exactly the formulas and specifications on those slides.
For testing (NOT for training), you should take into account that we will only test your code with datasets where the target output is either 0 or 1. When you compute the output of the perceptron (again, this is only when testing), an output less than 0.5 corresponds to a class label equal to 0. An output greater than or equal to 0.5 corresponds to a class label equal to 1. So:
- If the perceptron outputs a value less than 0.5 for some test input, and the true class label for that test input is 0, then we consider that the output is correct.
- If the perceptron outputs a value greater than or equal to 0.5 for some test input, and the true class label for that test input is 1, then we consider that the output is correct.
- Otherwise, we consider that the output of the perceptron is incorrect.
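As an illustration, here is a minimal sketch of this test-time rule (the function and variable names are hypothetical; your own code can organize this however you like):

    def predict_and_score(output, true_label):
        # Threshold the perceptron output at 0.5 (test time only).
        predicted_class = 0 if output < 0.5 else 1
        # The output counts as correct only if the predicted class matches the true label.
        return predicted_class, (1.0 if predicted_class == true_label else 0.0)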
To complete the code, you must create a file called perceptron_solution.py, where you implement the following Python function:
perceptron_train_and_test(tr_data, tr_labels, test_data, test_labels, training_rounds)
The function arguments provide the following information:
- tr_data: the training inputs. This is a 2D numpy array, where each row is a training input vector.
- tr_labels: a numpy column vector. That means it is a 2D numpy array, with a single column. tr_labels[i,0] is the class label for the vector stored at tr_data[i].
- test_data: the test inputs. This is a 2D numpy array, where each row is a test input vector.
- test_labels: a numpy column vector. That means it is a 2D numpy array, with a single column. test_labels[i,0] is the class label for the vector stored at test_data[i].
- training_rounds: An integer greater than or equal to 1, specifying the number of training rounds that you should use. Each training round consists of using the whole training set exactly once (i.e., using each training example exactly once to update the weights).
The function does not return anything. Its job is to train the perceptron on the training set, and evaluate it on the test set, as explained below.
The training and test files will follow the same format as the text files in the synthetic directory. The format is simple: each row represents a training or test example. All values in that row, except for the very last value, specify the input vector. The last value specifies the class label. Function read_uci_file in uci_load.py handles reading these files and extracting the appropriate vectors and class labels. The read_uci_file function is used in perceptron_base.py to read the data before calling your function.
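For illustration only, here is a hypothetical example of a few such rows and the arrays they would correspond to (the numbers are made up, and the exact dtypes depend on read_uci_file):

    # Hypothetical file contents (three 2-dimensional examples; the last value is the class label):
    #   4.5  2.0  1
    #   3.1  0.7  0
    #   5.2  1.9  1
    #
    # The corresponding arrays passed to your function would look something like:
    #   tr_data   = np.array([[4.5, 2.0], [3.1, 0.7], [5.2, 1.9]])
    #   tr_labels = np.array([[1], [0], [1]])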
Training
In your implementation, you should use these guidelines:
- For each dataset, for all training and test vectors in that dataset, you should normalize the values in all dimensions by dividing them by the MAXIMUM ABSOLUTE value over all dimensions over all training vectors for that dataset. This is a single MAXIMUM value over the entire training set; you should NOT use a different maximum value for each dimension. In other words, for each dataset, you need to find the single highest absolute value across all dimensions and all training vectors, and every value in every dimension of every training and test object should be divided by that highest value.
- All weights (including the bias weight) should be initialized to random values drawn from a uniform distribution between -0.05 and 0.05.
- You should initialize your learning rate η to 1 for the first training round, and then multiply it by 0.98 for each subsequent training round. So, the learning rate used for training round r should be 0.98^(r-1).
- Your stopping criterion should simply be the number of training rounds, which is specified as the fifth argument. Your training should stop exactly at the specified number of rounds.
There is no need to output anything for the training phase.
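The following is a minimal sketch of the normalization, weight initialization, and learning-rate schedule described in the guidelines above, using NumPy (the variable names, and where exactly this code goes in your training function, are up to you):

    import numpy as np

    # Normalization: a single maximum absolute value over ALL dimensions of ALL training vectors.
    max_abs = np.max(np.abs(tr_data))
    tr_data = tr_data / max_abs
    test_data = test_data / max_abs      # the test data is divided by the same value

    # Weight initialization: uniform random values in [-0.05, 0.05], bias included.
    num_dims = tr_data.shape[1]
    weights = np.random.uniform(-0.05, 0.05, size=num_dims + 1)   # +1 for the bias weight

    # Learning-rate schedule: 1 for the first round, multiplied by 0.98 each subsequent round.
    for r in range(1, training_rounds + 1):
        learning_rate = 0.98 ** (r - 1)
        # ... use each training example exactly once to update the weights ...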
Evaluation
For each test object you should print a line containing the following info:
- Object ID. This is the line number where that object occurs in the test file. Start with 0 in numbering the objects, not with 1.
- Predicted class (the result of the classification). In this task this is either 0 (if the perceptron output is less than 0.5) or 1 (if the perceptron output is greater than or equal to 0.5).
- True class (from the last column of the test file). In this task this is either 0 or 1.
- Accuracy. This is defined as follows:
- If the predicted class is correct, the accuracy is 1.
- If the predicted class is incorrect, the accuracy is 0.
To produce this output in Python in a uniform manner, use:
print('ID=%5d, predicted=%10s, true=%10s, accuracy=%4.2f\n' %
      (object_id, str(predicted_class), str(true_class), accuracy))
After you have printed the results for all test objects, you should print the overall classification accuracy, which is defined as the average of the classification accuracies you printed out for each test object. To print the classification accuracy in a uniform manner, use:
print('classification accuracy=%6.4f\n' % (classification_accuracy))
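For example, a minimal sketch of a test loop that produces this output could look as follows (perceptron_output is a hypothetical helper that computes the perceptron output for one test vector; the names are illustrative):

    accuracies = []
    for i in range(test_data.shape[0]):
        output = perceptron_output(weights, test_data[i])      # hypothetical helper
        predicted_class = 0 if output < 0.5 else 1
        true_class = int(test_labels[i, 0])
        accuracy = 1.0 if predicted_class == true_class else 0.0
        accuracies.append(accuracy)
        print('ID=%5d, predicted=%10s, true=%10s, accuracy=%4.2f\n' %
              (i, str(predicted_class), str(true_class), accuracy))

    classification_accuracy = sum(accuracies) / len(accuracies)
    print('classification accuracy=%6.4f\n' % (classification_accuracy))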
You can test your code with different datasets and different numbers of training rounds by modifying the directory, dataset, and training_rounds variables in perceptron_base.py. In your answers.pdf document, please provide ONLY THE LAST LINE (the line printing the classification accuracy) of the output of the test stage, for the following cases:
- Training and testing on the synth1 dataset, 5 training rounds.
- Training and testing on the synth2 dataset, 5 training rounds.
- Training and testing on the synth3 dataset, 5 training rounds.
Expected Classification Accuracy
In general, you may get different classification accuracies when you run your code multiple times with the same input arguments. This is due to the fact that weights are initialized randomly. However, for the synth1, synth2, and synth3 datasets and 5 rounds, I always got the same result (I ran my code 10 times for each case):
- synth1: Classification accuracy was 1.0. Each run took less than 1 second on my computer.
- synth2: Classification accuracy was 0.915. Each run took less than 1 second on my computer.
- synth3: Classification accuracy was 0.89. Each run took less than 1 second on my computer.
It is possible that you sometimes get results different from those values, but most of the time, running your code with the parameters specified above should produce the accuracies listed above.
Task 2 (30 points, programming)
File nn_2l_base.py contains incomplete code that implements the training and evaluation process for a 2-layer neural network (no hidden layers, just input layer and output layer) that does multiclass classification. When completed, the code will train a 2-layer neural network using some specified training set, and will then evaluate the accuracy of the network on some specified test set.
For training, you can use the gradient descent method as described in the slides on training perceptrons to train each individual perceptron (remember, each perceptron is trained using as target output a different dimension of the one-hot vector describing the class label). You can also use the backpropagation method, as described in the backpropagation slides. Both approaches are mathematically equivalent for a 2-layer neural network. Whichever method you choose, you should follow exactly the formulas and specifications given in the slides for that method.
For testing, you should make sure that you convert the output of the network (which is a vector) to the appropriate class label, by finding the output unit that has the highest output, as described in the slides.
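As a rough sketch of what this test-time computation could look like for the 2-layer network, assuming a sigmoid activation at the output units (check the slides for the exact activation function used in class; W, b, and the other names are illustrative):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # W: (num_classes, num_dims) weight matrix, b: (num_classes,) bias vector.
    # Each output unit sees the full input vector; there is one unit per class.
    outputs = sigmoid(W @ x + b)

    # The predicted (integer) class is the index of the output unit with the highest output.
    predicted_int = int(np.argmax(outputs))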
To complete the code, you must create a file called nn_2l_solution.py, where you implement the following Python function:
nn_2l_train_and_test(tr_data, tr_labels, test_data, test_labels,
                     labels_to_ints, ints_to_labels, training_rounds)
The function arguments provide the following information:
- tr_data, tr_labels, test_data, test_labels, training_rounds are as in Task 1.
- labels_to_ints: a Python dictionary that maps original class labels (which can be ints or strings) to consecutive ints starting at 0 (which your code then has to map to one-hot vectors).
- ints_to_labels: a Python dictionary that reverses the mapping of labels_to_ints, mapping int labels back to the original class labels (which can be ints or strings). This is useful when your code prints test results, so that it can print the original class label and not the integer that it was mapped to.
Note that the code in nn_2l_base.py puts the appropriate values in labels_to_ints and ints_to_labels, so that is not something that you need to handle.
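For example, a minimal sketch of converting the training labels into one-hot target vectors with labels_to_ints could look like this (the variable names are illustrative):

    import numpy as np

    num_classes = len(labels_to_ints)
    num_train = tr_labels.shape[0]

    # Row i is the one-hot target vector for the training example tr_data[i].
    targets = np.zeros((num_train, num_classes))
    for i in range(num_train):
        class_int = labels_to_ints[tr_labels[i, 0]]
        targets[i, class_int] = 1.0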
The function does not return anything. Its job is to train the 2-layer network on the training set, and evaluate it on the test set, as explained below.
The training and test files will follow the same format as the text files in the UCI datasets directory. This is actually the same format that we used in Task 1 for the datasets in the synthetic directory. Again, each row represents a training or test example. All values in that row, except for the very last value, specify the input vector. The last value specifies the class label. Function read_uci_file in uci_load.py handles reading these files and extracting the appropriate vectors and class labels. The only difference between datasets in the synthetic directory and UCI datasets is that the datasets in the synthetic directory only have class labels that are 0 or 1. The UCI datasets have arbitrary class labels, which can be strings or non-consecutive integers, and can represent more than two classes.
A description of the datasets and the file format can be found on this link. For each dataset, a training file and a test file are provided. The name of each file indicates what dataset the file belongs to, and whether the file contains training or test data. Your code should also work with ANY OTHER training and test files using the same format as the files in the UCI datasets directory. This includes the files in the synthetic directory.
Training
In your implementation, you should use the guidelines provided in Task 1 regarding:
- Normalizing the values of all vectors in the dataset.
- Initializing weights (including the bias weights).
- Initializing and updating the learning rate η.
- The stopping criterion (again, it is simply the number of training rounds).
There is no need to output anything for the training phase.
Evaluation
For each test object you should print a line containing the same information, formatted the same way as in Task 1. However, there are two additional issues in Task 2 that were not applicable in Task 1. First, the class labels are no longer just 0 or 1; they can be any number or string. You should use the ints_to_labels dictionary to map from the integer class labels used in the neural network back to the original class labels. Second, we must use a more complicated definition of accuracy, to handle possible ties between classes. A tie occurs when two or more output units tie for the highest output. For Task 2 and Task 3, accuracy is defined as follows:
- If there were no ties in your classification result, and the predicted class is correct, the accuracy is 1.
- If there were no ties in your classification result, and the predicted class is incorrect, the accuracy is 0.
- If there were ties in your classification result, and the correct class was one of the classes that tied for best, the accuracy is 1 divided by the number of classes that tied for best.
- If there were ties in your classification result, and the correct class was NOT one of the classes that tied for best, the accuracy is 0.
You should format this output in Python the same way as for Task 1, including the overall classification accuracy at the end.
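A minimal sketch of this tie-aware accuracy computation, assuming outputs holds the output-unit values for one test object and true_int is the integer label of its true class (both names are illustrative):

    import numpy as np

    best = np.max(outputs)
    tied = np.flatnonzero(outputs == best)    # indices of all units tied for the highest output
    predicted_int = int(tied[0])              # one of the tied classes is reported as the prediction

    if true_int in tied:
        accuracy = 1.0 / len(tied)            # 1 when there is no tie, 1/k for a k-way tie
    else:
        accuracy = 0.0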
You can test your code with different datasets and different numbers of training rounds by modifying the directory, dataset, and training_rounds variables in nn_2l_base.py. In your answers.pdf document, please provide ONLY THE LAST LINE (the line printing the classification accuracy) of the output of the test stage, for the following cases:
- Training and testing on the pendigits dataset, 10 training rounds.
- Training and testing on the satellite dataset, 10 training rounds.
- Training and testing on the yeast dataset, 10 training rounds.
Expected Classification Accuracy
In general, you may get different classification accuracies when you run your code multiple times with the same input arguments. This is due to the fact that weights are initialized randomly. These are the results I got for the test cases listed above (I ran my code 10 times for each case):
- pendigits: the classification accuracy was 0.8659 every time. Each run took about 3 seconds on my computer.
- satellite: Classification accuracy ranged from 0.5630 to 0.5640. Each run took about 3 seconds on my computer.
- yeast: Classification accuracy ranged from 0.5165 to 0.5186. Each run took about 1 second on my computer.
It is possible that you sometimes get results different from those values, but most of the time, running your code with the parameters specified above should produce the accuracies listed above.
Task 3 (30 points, programming)
File nn_base.py contains incomplete code that implements the training and evaluation process for a neural network with fully connected layers (including zero or more hidden layers) that does multiclass classification. When completed, the code will train a neural network using some specified training set, and will then evaluate the accuracy of the network on some specified test set.
Training should be done using the backpropagation algorithm described in the backpropagation slides. You should follow exactly the formulas and specifications on those slides. For testing, as in the previous task, you should convert the output of the network (which is a vector) to the appropriate class label, by finding the output unit with the highest output, as described in the slides.
To complete the code, you must create a file called nn_solution.py, where you implement the following Python function:
nn_train_and_test(tr_data, tr_labels, test_data, test_labels,
                  labels_to_ints, ints_to_labels, parameters)
The function arguments provide the following information:
- tr_data, tr_labels, test_data, test_labels are as in Task 1 and Task 2.
- labels_to_ints, ints_to_labels are as in Task 2.
- parameters is an object of class hyperparameters, defined in nn_base.py, that includes the following variables:
- num_layers: specifies the number of layers in the network. The number of layers cannot be smaller than 2, since any neural network should have at least an input layer and an output layer.
- units_per_layer: a list specifying the number of units in each hidden layer. The length of this list should be num_layers - 2. For example, units_per_layer[0] is the number of units in the first hidden layer (assuming num_layers >= 3), units_per_layer[1] is the number of units in the second hidden layer (assuming num_layers >= 4), and so on.
- training_rounds: specifies the number of training rounds, as in Task 1 and Task 2.
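For example, the test cases listed at the end of this task (4 layers, 20 training rounds, hidden layers of 20 and 15 units) correspond to parameter values like:

    parameters.num_layers = 4
    parameters.units_per_layer = [20, 15]   # 20 units in the first hidden layer, 15 in the second
    parameters.training_rounds = 20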
The function does not return anything. Its job is to train the network on the training set, and evaluate it on the test set, as explained below.
The training and test files will follow the same format as in Task 2. As in Task 2, you can test your code with files from the UCI datasets directory as well as the synthetic directory.
Training
In your implementation, you should use the guidelines provided in Task 1 regarding:
- Normalizing the values of all vectors in the dataset.
- Initializing weights (including the bias weights).
- Initializing and updating the learning rate η.
- The stopping criterion (again, it is simply the number of training rounds).
The number of layers in the neural network is specified by the parameters.num_layers argument. If parameters.num_layers = L, then the network has L layers as follows:
- The minimum legal value for parameters.num_layers is 2, since every network has an input layer and an output layer.
- Layer 1 is the input layer, which contains no perceptrons; it just specifies the inputs to the neural network.
- Layer L is the output layer, containing as many perceptrons as the number of classes.
- If L > 2, then layers 2, ..., L-1 are the hidden layers. Each of these layers has as many perceptrons as specified in the appropriate position in the parameters.units_per_layer list. If L = 2, then parameters.units_per_layer is ignored.
- Each perceptron at layers 2, ..., L receives as inputs the outputs of ALL units at the previous layer. So, overall, all layers (except for the input layer) are fully connected.
There is no need to output anything for the training phase.
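As an illustration of this layer structure, a sketch of setting up the per-layer unit counts and the fully connected weight matrices could look like the following (assuming NumPy and the same uniform initialization in [-0.05, 0.05] as in Task 1; the variable names are illustrative):

    import numpy as np

    num_dims = tr_data.shape[1]          # layer 1 (input layer): one unit per input dimension
    num_classes = len(labels_to_ints)    # layer L (output layer): one perceptron per class
    L = parameters.num_layers

    hidden = list(parameters.units_per_layer)[:L - 2]   # ignored when L == 2
    layer_sizes = [num_dims] + hidden + [num_classes]   # number of units in layers 1, ..., L

    # One weight matrix and one bias vector per non-input layer. Each perceptron in a layer
    # receives the outputs of ALL units in the previous layer (fully connected).
    weights = [np.random.uniform(-0.05, 0.05, size=(layer_sizes[l], layer_sizes[l - 1]))
               for l in range(1, L)]
    biases = [np.random.uniform(-0.05, 0.05, size=layer_sizes[l]) for l in range(1, L)]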
Evaluation
For each test object you should print a line containing the same information as in Task 2, using the same definition of accuracy as in Task 2.
You can test your code with different datasets and different numbers of training rounds by modifying the directory, dataset, and parameters variables in nn_base.py. In your answers.pdf document, please provide ONLY THE LAST LINE (the line printing the classification accuracy) of the output of the test stage, for the following cases:
- Training and testing on the pendigits dataset, with 4 layers, 20 training rounds, 20 units for the first hidden layer, 15 units for the second hidden layer.
- Training and testing on the satellite dataset, with 4 layers, 20 training rounds, 20 units for the first hidden layer, 15 units for the second hidden layer.
- Training and testing on the yeast dataset, with 4 layers, 20 training rounds, 20 units for the first hidden layer, 15 units for the second hidden layer.
Expected Classification Accuracy
As in Tasks 1 and 2, you may get different classification accuracies when you run your code multiple times with the same input arguments. These are the results I got for the test cases listed above, with 4 layers, 20 training rounds, 20 units for the first hidden layer, 15 units for the second hidden layer (I ran my code 10 times for each case):
- pendigits: the classification accuracy ranged from 0.8939 to 0.9451. Each run took about 11-12 seconds on my computer.
- satellite: the classification accuracy ranged from 0.5580 to 0.7580. Each run took about 7-8 seconds on my computer.
- yeast: the classification accuracy was 0.3017 every time. Each run took about 2 seconds on my computer.
It is possible that you sometimes get results different from those values, but most of the time, running your code with the parameters specified above should produce the accuracies listed above.
Task 3b (Extra Credit, maximum 10 points).
A maximum of 10 extra credit points will be given to the submission or submissions that identify the parameters achieving the best test accuracy for any of the three test datasets. These parameters, and the attained accuracy, should be reported in answers.pdf, under a clear "Task 3b" heading. These results should be achievable using the code that you submit for Task 3. In our tests, your code should achieve the reported accuracy in at least five out of 10 test runs.
Task 3c (Extra Credit, maximum 10 points).
In this task, you are free to change implementation options that are fixed by the specifications of Task 3.
Examples of such options include the choice of activation function, the initial distribution of weights, and the learning rate, among other implementation choices.
You can submit a file called nn_opt.py that implements your modifications. Your function should have the same name as in Task 3:
nn_train_and_test(tr_data, tr_labels, test_data, test_labels, labels_to_ints, ints_to_labels, parameters)
A maximum of 10 points will be given to the submission or submissions that, according to the instructor and GTA, achieve the best improvements (on any of the three datasets) compared to the specifications in Task 3. In your answers.pdf document, under a clear "Task 3c" heading, explain:
- What modifications you made.
- What results you achieved. We should be able to achieve the reported accuracy at least five out of 10 times in our test runs of your code.
- What parameters we should call your function with in order to obtain those results.
- How long it took your program to run with those arguments.