CSE 4311 - Assignments - Tentative Assignment 7


The assignment should be submitted via Canvas. Submit a file called assignment7.zip, containing all source code files needed to run your solutions for the programming tasks. Your Python code should run on Google Colab, unless permission is obtained via e-mail from the instructor or the teaching assistant.

All specified naming conventions for files and function names are mandatory; non-adherence to these specifications can incur a penalty of up to 20 points.

Your name and UTA ID number should appear on the top line of answers.pdf and all source code files.


Task 1 (50 points, programming)

Files you need to download for this task: reverse_dataset.zip, reverse_base.py.

In this task, you will implement a system that solves the following problem: the input is a piece of text (a sentence) whose words are either in the correct order or in reverse order, each case occurring with probability 50%. The output should be the sentence with its words in the correct order.

The training, validation, and test data for this task are stored in the file reverse_dataset.zip, which you should download and unzip. File reverse_test.txt contains 1205 examples of inputs and target outputs for your system. Every line in that file contains an example input, the <TAB> ("\t") character, and then the target output. Here are two examples:
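Given that format, each line of the test file can be split on the first tab into an (input, target) pair. A minimal sketch (the sample line below is illustrative, not taken from the dataset):

```python
def load_pairs(lines):
    """Split each line on the first <TAB> into an (input, target) pair."""
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        source, target = line.split("\t", 1)
        pairs.append((source, target))
    return pairs

# Illustrative line in the same shape as the file (not from the dataset):
sample = ["how are you\tHow are you?"]
print(load_pairs(sample))  # [('how are you', 'How are you?')]
```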

Note that in reverse_test.txt the input sentence is in standardized form (lower case, no punctuation), whereas the target sentence is in non-standardized form. Your system can produce its output in standardized form; it is NOT expected to produce upper-case characters or punctuation. Your system's output will be compared with the standardized form of the target output.
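One plausible way to standardize a sentence as described above (lower case, no punctuation); the exact rule used for grading may differ:

```python
import string

def standardize(sentence):
    """Lower-case a sentence and strip ASCII punctuation."""
    table = str.maketrans("", "", string.punctuation)
    return sentence.translate(table).lower()

print(standardize("How are you?"))  # how are you
```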

File reverse_train.txt contains 5736 sentences, and reverse_validation.txt contains 1214 sentences. You should use these sentences to build the training and validation sets for your model; how you produce those sets from the sentences in these files is entirely up to you. Every line in these files is a single sentence, with words in the correct order.
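Since each training file contains only correctly ordered sentences, one way (among many) to build (input, target) pairs is to reverse the words of the input with probability 50%, mirroring the task description:

```python
import random

def make_pair(sentence, rng):
    """Return an (input, target) pair: with probability 0.5 the input has
    its words reversed; the target is always the correct order.
    This is one possible scheme, not a required one."""
    words = sentence.split()
    if rng.random() < 0.5:
        inp = " ".join(reversed(words))
    else:
        inp = " ".join(words)
    return inp, sentence

rng = random.Random(0)
print(make_pair("this is a test", rng))
```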

All sentences in this dataset come from the Spanish-English translation dataset from http://www.manythings.org/anki/. The dataset that you are given contains sentences that were selected from the original dataset based on these criteria:

File reverse_base.py contains an incomplete program that trains and evaluates neural network models for this task. These models can be used to convert an input sentence to the corresponding output sentence where words appear in the correct order.

To complete that code, you must create a file called reverse_solution.py, where you implement the following Python functions:

Some Additional Information

Here is some information about my solution, which you may or may not choose to use in yours:

Expected Results and Grading

40 points will be given for a correct implementation of the Encoder-Decoder model. Your model should reach or exceed 70% word accuracy on the test set, matching the word accuracy (70%-74%) that I got for my Encoder-Decoder model. We will also look at your code to ensure that you implemented an Encoder-Decoder model and not some other type of model. If your implementation does not achieve 70% word accuracy but does implement an Encoder-Decoder model, it will receive at least 2 points for every percentage point by which its word accuracy exceeds 50%. For example, if the word accuracy is 58%, you will get at least 16 points. If the word accuracy is under 50%, partial credit will be given on a case-by-case basis, based on how close your implementation was to being correct.
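The exact definition of word accuracy is not spelled out here; one common convention is to compare the predicted and target sentences position by position. A sketch under that assumption (the grader's metric may differ):

```python
def word_accuracy(predicted, target):
    """Fraction of target word positions where the predicted word matches.
    One plausible definition of word accuracy, assumed for illustration."""
    pred_words = predicted.split()
    target_words = target.split()
    if not target_words:
        return 0.0
    correct = sum(p == t for p, t in zip(pred_words, target_words))
    return correct / len(target_words)

print(word_accuracy("you are how", "how are you"))  # 1 of 3 positions match
```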

10 points (and possibly six extra credit points) will be given for your implementation of the "best" model. Obviously, here you have a lot more freedom in how to design it. For every percentage point of word accuracy over 74%, you receive half a point. If your word accuracy is worse than 74%, zero points will be given. If your word accuracy is over 94%, you get extra credit: one extra credit point per percentage point over 94%.


Task 1b (Extra Credit, maximum 10 points)

10 points will be given to each of the three solutions that give the best word accuracy on the test set for the Encoder-Decoder model.

To qualify for consideration, you need to run your solution 10 times (or more, if you want), and report a summary of results in answers.pdf. Your result summary should report smallest, largest, mean, and median of the word accuracies on the test set that your solution achieved. You should also document in answers.pdf the design choices that you made.
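The requested result summary (smallest, largest, mean, and median over repeated runs) can be computed with the standard library; the accuracy values below are placeholders, not real results:

```python
import statistics

def summarize(accuracies):
    """Summary statistics over the test-set accuracies of repeated runs."""
    return {
        "min": min(accuracies),
        "max": max(accuracies),
        "mean": statistics.mean(accuracies),
        "median": statistics.median(accuracies),
    }

runs = [0.71, 0.73, 0.72, 0.74, 0.70]  # placeholder values
print(summarize(runs))
```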


Task 1c (Extra Credit, maximum 10 points)

Here, you are allowed to use more training data, and/or do transfer learning from other models that you may find or build yourself, and/or do anything else you like to achieve the best accuracy. 10 points will be given to each of the three solutions that give the best word accuracy on the test set. Obviously, you are not allowed to use your test data as training or validation data.

To qualify for consideration, you need to run your solution 10 times (or more, if you want), and report a summary of results in answers.pdf. Your result summary should report smallest, largest, mean, and median of the word accuracies on the test set that your solution achieved. You should also document in answers.pdf if you added additional training data (and what that data was), what base models you used for transfer learning (if you used transfer learning), and any other design choices you made.


Task 2 (50 points, programming)

Files you need to download for this task: questions_dataset.zip, questions_base.py.

In this task, you will implement a system that solves the following problem: the input is a piece of text (a sentence) that is either a question (class label 1) or NOT a question (class label 0). The system should estimate whether the input is a question or not.

The training, validation, and test data for this task are stored in the file questions_dataset.zip, which you should download and unzip. The code in questions_base.py reads the dataset and appropriately defines the inputs and labels for the training, validation, and test sets. All sentences are in standardized form (lower case, no punctuation). The sentences in this dataset are actually the same as in the reverse_dataset of Task 1; however, here they are saved in a slightly modified format, to reflect the different goal of this task. As a reminder:

File questions_base.py contains an incomplete program that trains and evaluates a transformer-based model for this task. To complete that code, you must create a file called questions_solution.py, where you implement the following Python functions:

Some Additional Information

Here is some information about my solution, which you may or may not choose to use in yours:

Expected Results and Grading

Your model should reach or exceed 98% classification accuracy on the test set, to match the accuracy (98.3%-98.9%) that I got. We will also look at your code, to ensure that you implemented a transformer model with positional embeddings, and not some other type of model (other models will be penalized by up to 20 points). If your implementation does not achieve 98% accuracy, your test accuracy will be rounded down to the nearest integer, and you will lose 2 points for every percentage point below 98%. For example, if the classification accuracy on the test set is 90.5%, you will lose 16 points. If the accuracy is under 85%, partial credit will be given on a case-by-case basis, based on how close or far your implementation was from being correct.
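Regarding positional embeddings: a learned embedding layer over position indices is one common choice; the fixed sinusoidal encoding from the original transformer paper is another. The latter can be sketched in pure Python as follows (one option, not necessarily the one expected by the base code):

```python
import math

def positional_encoding(max_len, d_model):
    """Fixed sinusoidal positional encoding: even dimensions use sine,
    odd dimensions use cosine, with geometrically spaced wavelengths.
    Returns a max_len x d_model list of lists."""
    encoding = []
    for pos in range(max_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        encoding.append(row)
    return encoding

pe = positional_encoding(50, 16)
print(len(pe), len(pe[0]))  # 50 16
```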


Task 2b (Extra Credit, maximum 10 points)

10 points will be given to each of the three solutions that give the best accuracy on the test set for a transformer model with positional embeddings.

To qualify for consideration, you need to run your solution 10 times (or more, if you want), and report a summary of results in answers.pdf. Your result summary should report smallest, largest, mean, and median of the accuracies on the test set that your solution achieved. You should also document in answers.pdf the design choices that you made.


Task 2c (Extra Credit, maximum 10 points)

Here, you are allowed to use more training data, and/or do transfer learning from other models that you may find or build yourself, and/or do anything else you like to achieve the best accuracy, including using models not based on transformers and positional embeddings. 10 points will be given to each of the three solutions that give the best accuracy on the test set. Obviously, you are not allowed to use your test data as training or validation data.

To qualify for consideration, you need to run your solution 10 times (or more, if you want), and report a summary of results in answers.pdf. Your result summary should report smallest, largest, mean, and median of the accuracies on the test set that your solution achieved. You should also document in answers.pdf if you added additional training data (and what that data was), what base models you used for transfer learning (if you used transfer learning), and any other design choices you made.


Task 3: Submit Course Evaluations (Collective Extra Credit, maximum 20 points)

This is a collective extra credit opportunity for the entire class. The goal is to encourage participation in the course evaluations. There are 20 students in the class. Every student will be given the same number of extra credit points, equal to 1/5 of the percentage of students who submit course evaluations for this class. The number of extra credit points will be rounded up to the nearest integer and added to the score of this assignment.

For example, if the class receives 9 course evaluations, every student will receive 9 extra credit points (9 = 9/20 * 100 / 5, rounded up). If the class receives 16 evaluations, 16 extra credit points will be given to every student. In the maximum case, if every single student submits a course evaluation, then 20 extra credit points will be given to every student.
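The rule above, expressed as a small computation (20 enrolled students, as stated):

```python
import math

def extra_credit(submitted, enrolled=20):
    """Extra credit points: 1/5 of the submission percentage, rounded up."""
    percentage = submitted / enrolled * 100
    return math.ceil(percentage / 5)

print(extra_credit(9))   # 9
print(extra_credit(16))  # 16
print(extra_credit(20))  # 20
```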

Please note that course evaluations are anonymous; there will be no record of who has submitted and who has not. The same number of extra credit points will be given uniformly to every student in the class, regardless of whether they have submitted a course evaluation or not.

