CSE 4311 - Assignments - Assignment 5

List of assignment due dates.

The assignment should be submitted via Canvas. Submit a file called assignment5.zip, containing the following files:

These naming conventions are mandatory; non-adherence to these specifications can incur a penalty of up to 20 points.

Your name and UTA ID number should appear on the top line of both documents.

For all programming tasks, feel free to reuse, adapt, or modify any code that is posted on the slides or the class website.


Task 1 (40 points, programming)

File cifar2mnist_base.py contains incomplete code that aims to do transfer learning from the CIFAR10 dataset to the MNIST dataset and compare it to not using transfer learning.

To complete that code, you must create a file called cifar2mnist_solution.py, where you implement the following Python functions:

Here is a description of what each function should do:

You can see how these three functions are used in cifar2mnist_base.py, to verify that you understand what arguments they take and what they are supposed to do. When grading, we reserve the right to test your functions with different code (instead of cifar2mnist_base.py), so your functions should not refer to any global variables defined in cifar2mnist_base.py.
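CIFAR10 images are 32x32 color images, while MNIST images are 28x28 grayscale, so a model pretrained on CIFAR10 needs MNIST inputs brought into the same shape. cifar2mnist_base.py may already take care of this. The sketch below shows only one possible approach (zero-padding plus channel replication), and the function name mnist_to_cifar_shape is hypothetical, not part of the assignment code.

```python
import numpy as np

def mnist_to_cifar_shape(images):
    """Convert (N, 28, 28) grayscale images to (N, 32, 32, 3).

    Hypothetical helper: the assignment does not prescribe this approach.
    """
    images = np.asarray(images, dtype=np.float32)
    # Pad 2 rows/columns of zeros on every side: 28 -> 32.
    padded = np.pad(images, ((0, 0), (2, 2), (2, 2)), mode="constant")
    # Copy the single grayscale channel into three identical channels.
    return np.repeat(padded[..., np.newaxis], 3, axis=-1)

# Toy stand-in for a batch of MNIST images (all ones).
demo = np.ones((5, 28, 28))
converted = mnist_to_cifar_shape(demo)
print(converted.shape)
```

Resizing by interpolation would be another reasonable choice; padding is used here only because it keeps the original pixels unchanged.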

Program Output

The code in cifar2mnist_base.py prints all the output that is needed. Your functions do not need to produce any additional output, but it is fine if they do.

In your answers.pdf document, please provide ONLY THE LAST LINE of the test-stage output (the line printing the classification accuracy) that you get when you run cifar2mnist_base.py with your solution.

Some Results

Here is some information to help you test your code. As a debugging tool, you can use the cifar10_e20_b128.keras model that my train_model function created (it took train_model about 18 minutes to train that model on my computer). Using that specific cifar10_e20_b128.keras model, I ran my load_and_refine and evaluate_my_model functions 10 times. The classification accuracy on the MNIST test set ranged from 93.09% to 93.57%.
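Because results vary from run to run, it can help to collect the accuracy of each run and report the range. The sketch below is a minimal illustration. The function name summarize_accuracies is hypothetical, and the values in runs are placeholders, not results produced by any model.

```python
def summarize_accuracies(accuracies):
    """Describe the range of a list of accuracies given as fractions in [0, 1]."""
    lowest, highest = min(accuracies), max(accuracies)
    return (f"{len(accuracies)} runs: accuracy ranged "
            f"from {lowest:.2%} to {highest:.2%}")

# Placeholder values for illustration only; replace them with the
# accuracy measured in each of your own runs.
runs = [0.9309, 0.9342, 0.9357]
summary = summarize_accuracies(runs)
print(summary)
```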


Task 2 (40 points, programming)

File month_base.py contains incomplete code that trains and evaluates a neural network model (a fully-connected model) that predicts the month of the year that a specific moment belongs to, based on a time series of weather observations from the week before and the week after that moment. The Jena Climate dataset is used for training and testing the models. You can download the Jena Climate dataset as a CSV file from here: jena_climate_2009_2016.csv.
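To make the input format concrete, the sketch below extracts the observations from one week before to one week after a chosen moment. It assumes one observation every 10 minutes (as in the Jena Climate CSV) and an hourly sampling rate. The function name window_around, the sampling rate of 6, and the choice to include the center moment are all assumptions; month_base.py defines the actual setup.

```python
import numpy as np

def window_around(data, center, half_width, sampling_rate):
    """Return every sampling_rate-th row from center - half_width
    to center + half_width (both ends included)."""
    start = center - half_width
    stop = center + half_width
    if start < 0 or stop >= len(data):
        raise ValueError("window does not fit inside the data")
    return data[start:stop + 1:sampling_rate]

# With 10-minute observations: 6 per hour * 24 hours * 7 days.
steps_per_week = 6 * 24 * 7

# Toy stand-in for the weather data: 3000 time steps, 2 features.
data = np.arange(3000 * 2).reshape(3000, 2)
window = window_around(data, center=1500,
                       half_width=steps_per_week, sampling_rate=6)
print(window.shape)
```

In this toy setup the window holds 337 hourly rows, and the row in the middle is the observation at the chosen moment itself.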

To complete the code in month_base.py, you must create a file called month_solution.py, where you implement the following Python functions:

Here is a description of what each function should do:

  • The confusion_matrix function computes the confusion matrix of the model saved in the file with the given filename, using the given test inputs and test targets. The confusion matrix is a 12x12 matrix, where the entry at row i and column j (with row and column indices starting at 0, Python-style) stores the number of test objects that had true class label i and were classified by the model as having class label j.

You can see how these functions are used in month_base.py, to verify that you understand what arguments they take and what they are supposed to do. When grading, we reserve the right to test your functions with different code (instead of month_base.py), so your functions should not assume the existence of any global variables defined in month_base.py.
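The counting step behind a confusion matrix can be sketched as follows. This is not the required confusion_matrix function, which takes a filename, test inputs, and test targets. The helper name confusion_from_labels is hypothetical, and it assumes the predicted class labels have already been obtained from the model.

```python
import numpy as np

def confusion_from_labels(true_labels, predicted_labels, num_classes=12):
    """Count how often true class i was classified as class j."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when an (i, j) pair repeats.
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix

# Toy labels for illustration (0 = first class, 11 = last class).
true_labels = np.array([0, 0, 1, 11, 11, 11])
predicted_labels = np.array([0, 1, 1, 11, 0, 11])
cm = confusion_from_labels(true_labels, predicted_labels)
print(cm[0, 1], cm[11, 11], cm[11, 0])
```

The diagonal entries count correct classifications, so np.trace(cm) / cm.sum() gives the classification accuracy (4 out of 6 in this toy example).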

Program Output

The code in month_base.py prints all the output that is needed. Your functions do not need to produce any additional output, but it is fine if they do.

From the output that you get when you run month_base.py with your solution, please put in your answers.pdf document ONLY THE TWO LINES printing the classification accuracy on the validation and test sets for the dense model.

Some Results

Here is some information to help you test your code.


Task 2b (Extra Credit, maximum 10 points)

In the previous task, the instructions for build_and_train_dense specify the initial layers that must be used. At the same time, the instructions leave it up to each implementation to decide what layers to add after those initial layers. 10 points will be given to the implementation of build_and_train_dense that gives the best test accuracy. Our experiments will use the same setup (amount of training data, length of time series, sampling rate) as in month_base.py.

To qualify for consideration, you need to run your solution 10 times, and report a summary of results in answers.pdf. Your result summary should follow the exact same format in which I summarize my results in the section titled "Some Results" at the end of Task 2.


Task 3 (10 points)

This question refers to the time series slides. Slide 52 shows results from the naive method. Slides 65-66 show results from a fully connected network. Slide 71 shows results from a recurrent network. As we see, for the fully connected network and the recurrent network, the best training error was significantly lower than the best validation error. On the other hand, for the naive method, the validation error was lower than the training error.

Part a: For the naive method, if we applied it on different datasets (with features similar to the dataset that we used, but collected from different locations), would you expect that in most experiments the training errors would be lower than validation errors? Describe how you would expect training and validation errors to compare in each experiment, and justify your answer.

Part b: Answer the same question as in part a, but for the fully connected network instead of the naive method.

Part c: Answer the same question as in part a, but for the recurrent network instead of the naive method.


Task 4 (10 points)

This question refers to the time series slides. Slide 60 defines a fully connected network where the output layer uses the identity activation function. What would go wrong if we used the tanh activation function for the output layer in this case? Justify your answer.