CSE 4311 - Assignments - Tentative Assignment 6

List of assignment due dates.

The assignment should be submitted via Canvas. Submit a file called assignment6.zip, containing the following files:

These naming conventions are mandatory; non-adherence to these specifications can incur a penalty of up to 20 points.

Your name and UTA ID number should appear on the top line of both documents.

For all programming tasks, feel free to reuse, adapt, or modify any code that is posted on the slides or the class website.


Task 1 (60 points, programming)

File authors_base.py contains an incomplete program that trains and evaluates a neural network model that predicts the author of a short piece of text (about 40 to 300 words). The training and test data for this program are stored in the file authors_dataset.zip, which you should download and unzip.

The dataset contains text from three authors: Charles Dickens, C. S. Lewis, and Mark Twain. The dataset folder is structured as follows:

To complete that code, you must create a file called authors_solution.py, where you implement the following Python function:

model = learn_model(train_files)
The learn_model function takes as argument train_files, which is a list of lists of filenames. Element train_files[i] is a list of filenames specifying the text files storing books to be used as training data for one specific author. The train_files variable is already defined in authors_base.py.

Your function should somehow (how exactly is up to you) use the text files stored in train_files to create an appropriate training set, and then it should use that training set to train a neural network model. The function returns the trained model.
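As an illustration of the data-preparation step only, here is a minimal sketch of one possible way to turn train_files into labeled text pieces. It is not the required or reference approach. The helper names split_into_chunks and build_training_set, the chunk_words and min_words parameters, and the UTF-8 encoding are all assumptions made for this example. The index i of train_files[i] is used as the author label.

```python
def split_into_chunks(text, chunk_words=100):
    """Split text into consecutive pieces of at most chunk_words words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_words])
            for i in range(0, len(words), chunk_words)]


def build_training_set(train_files, chunk_words=100, min_words=40):
    """Return (texts, labels) built from a list of lists of filenames.

    train_files[i] lists the files for author i, so i is used as the label.
    chunk_words and min_words are hypothetical parameters for this sketch.
    """
    texts, labels = [], []
    for author_index, filenames in enumerate(train_files):
        for filename in filenames:
            # Assumption: files are readable as UTF-8; undecodable bytes are skipped.
            with open(filename, encoding="utf-8", errors="ignore") as f:
                text = f.read()
            for chunk in split_into_chunks(text, chunk_words):
                # Drop a trailing piece that is too short to be useful.
                if len(chunk.split()) >= min_words:
                    texts.append(chunk)
                    labels.append(author_index)
    return texts, labels
```

The resulting texts and labels would still need to be converted to numerical inputs (for example with a text vectorization step) before a neural network can be trained on them; that part is left open, as in the task description.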

Here are some recommendations that you may choose to follow or not:

You can see how the learn_model function is used in authors_base.py, to verify that you understand what it is supposed to do. When grading, we reserve the right to test your solution with different code (instead of authors_base.py), so your solution should comply with the specifications given above and should not assume the existence of any global variables defined in authors_base.py.

Some Additional Information

Here is some additional information about my solution.


Task 1b (Extra Credit, maximum 10 points)

10 points will be given to each of the three solutions that give the best test accuracy.

To qualify for consideration, you need to run your solution 10 times (or more, if you want), and report a summary of results in answers.pdf. Your result summary should report the smallest, largest, mean, and median of the test accuracies that your solution achieved. You should also document in answers.pdf the design choices that you made.
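The requested summary can be computed with a few lines of standard-library Python. The sketch below is only one way to do it. The function names run_trials and summarize_accuracies are hypothetical, and run_once stands for whatever code of yours trains a model and returns its test accuracy as a number.

```python
import statistics


def run_trials(run_once, n_runs=10):
    """Call run_once() n_runs times and collect the returned test accuracies.

    run_once is a hypothetical zero-argument function that trains and
    evaluates one model and returns its test accuracy.
    """
    return [run_once() for _ in range(n_runs)]


def summarize_accuracies(accuracies):
    """Return the smallest, largest, mean, and median of a list of accuracies."""
    return {
        "smallest": min(accuracies),
        "largest": max(accuracies),
        "mean": statistics.mean(accuracies),
        "median": statistics.median(accuracies),
    }
```

For example, summarize_accuracies(run_trials(my_experiment)) would produce the four numbers to copy into answers.pdf, where my_experiment is a placeholder for your own training and evaluation function.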


Task 1c (Extra Credit, maximum 10 points)

Here, you are allowed to add more books to the training data, and/or do transfer learning from other models that you may find or build yourself. 10 points will be given to each of the three solutions that give the best test accuracy. One restriction: you cannot add books that are prequels or sequels to any of the books in our test data. The only books that I can think of that fall under this restriction are the other books from C. S. Lewis's Space Trilogy, but there may be other books I am not aware of that the restriction would apply to.

To qualify for consideration, you need to run your solution 10 times (or more, if you want), and report a summary of results in answers.pdf. Your result summary should report the smallest, largest, mean, and median of the test accuracies that your solution achieved. You should also document in answers.pdf what books you added (if you added books), and what base models you used for transfer learning (if you used transfer learning).


Task 2 (40 points, written)

import numpy as np
from tensorflow import keras

input_shape = (3,100)
model1 = keras.Sequential([keras.Input(shape=input_shape),
                          keras.layers.SimpleRNN(3),
                          keras.layers.Dense(1),])

input_shape = (10,100)
model2 = keras.Sequential([keras.Input(shape=input_shape),
                          keras.layers.SimpleRNN(3),
                          keras.layers.Dense(1),])

input_shape = (3,100)
model3 = keras.Sequential([keras.Input(shape=input_shape),
                          keras.layers.Bidirectional(keras.layers.SimpleRNN(3)),
                          keras.layers.Dense(1),])
 
The code above defines three different RNN models.

Part a: What is the length of the time series that model1 is designed to take as input? What is the number of dimensions of each element of that time series?

Part b: What is the length of the time series that model2 is designed to take as input? What is the number of dimensions of each element of that time series?

Part c: Is the number of trainable parameters (trainable weights) of model1 greater than, equal to, or less than the number of trainable parameters of model2? Justify your answer. One acceptable justification is code (that you can just provide in answers.pdf) that demonstrates that your answer is correct, as long as you explain how that code proves your answer.

Part d: Is the number of trainable parameters (trainable weights) of model1 greater than, equal to, or less than the number of trainable parameters of model3? Justify your answer. One acceptable justification is code (that you can just provide in answers.pdf) that demonstrates that your answer is correct, as long as you explain how that code proves your answer.

