Due dates:
Interim report: Tuesday, November 17, 2009, 11:59pm
Full assignment: Monday, November 23, 2009, 11:59pm
Summary
The goal of this assignment is to practice designing Bayesian networks, estimating the probability distributions in Bayesian networks, and implementing Bayesian networks.
Part 1: Designing a Bayesian network graph
20 points
George doesn't watch much TV in the evening, unless there is a baseball game on. When there is baseball on TV, George is very likely to watch. George has a cat that he feeds every night, although he forgets every now and then. He's much more likely to forget when he's watching TV. He's also very unlikely to feed the cat if he has run out of cat food (although sometimes he gives the cat some of his own food).
Design a Bayesian network for modeling the relations between these four events:
- baseball_game_on_TV
- George_watches_TV
- out_of_cat_food
- George_feeds_cat
Your task is to connect these nodes with arrows pointing from causes to effects. No programming is needed for this part; just include an electronic document (PDF, Word file, or OpenOffice document) showing your Bayesian network design.
Part 2: Learning Probabilities from Training Data
20 points
For the Bayesian network of Part 1, the text file at this link contains training data from every evening of an entire year. Every line in this text file corresponds to an evening, and contains four numbers. Each number is a 0 or a 1. In more detail:
- The first number is 0 if there is no baseball game on TV, and 1 if there is a baseball game on TV.
- The second number is 0 if George does not watch TV, and 1 if George watches TV.
- The third number is 0 if George is not out of cat food, and 1 if George is out of cat food.
- The fourth number is 0 if George does not feed the cat, and 1 if George feeds the cat.
Based on the data in this file, determine the probability table for each node in the Bayesian network you have designed for Part 1. You need to submit these four tables by e-mail, as well as the code/script that computes these probabilities.
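As a rough illustration of the counting involved, the sketch below estimates one probability table by relative frequency. The function name estimate_cpt, the column indices, and the six sample rows are all hypothetical. The parent sets shown are only examples and say nothing about the intended network design. Python is used for brevity only; submitted code must still follow the language rules under "How to submit".

```python
from collections import defaultdict
from itertools import product


def estimate_cpt(rows, child, parents):
    """Estimate P(child = 1 | parents) by relative frequency.

    rows    -- list of tuples of 0/1 values, one tuple per evening
    child   -- column index of the node being estimated
    parents -- tuple of column indices of that node's parents (may be empty)
    Returns a dict mapping each parent assignment to a probability,
    or to None when that assignment never occurs in the data.
    """
    totals = defaultdict(int)  # how often each parent assignment occurs
    trues = defaultdict(int)   # how often the child is 1 under that assignment
    for row in rows:
        key = tuple(row[p] for p in parents)
        totals[key] += 1
        trues[key] += row[child]
    table = {}
    for key in product((0, 1), repeat=len(parents)):
        table[key] = trues[key] / totals[key] if totals[key] else None
    return table


# Made-up sample rows (NOT the course data), in the column order of the file.
sample = [
    (1, 1, 0, 0),
    (1, 1, 0, 1),
    (0, 0, 0, 1),
    (0, 0, 1, 0),
    (0, 1, 0, 1),
    (1, 0, 0, 1),
]

# A node with no parents: P(column 0 = 1).
print(estimate_cpt(sample, child=0, parents=()))
# A node with one hypothetical parent: P(column 1 = 1 | column 0).
print(estimate_cpt(sample, child=1, parents=(0,)))
```

For the real data, the rows would be read from the training file instead of being typed in, for example by splitting each non-empty line on whitespace and converting the four fields to integers (this assumes the numbers are whitespace-separated).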

Figure 1: A Bayesian network establishing relations between events in the burglary-earthquake-alarm domain, together with complete specifications of all probability distributions.
Part 3: Implementing a Bayesian Network
60 points
For the Bayesian network of Figure 1, implement a program that computes and prints out the probability of any combination of events. For example, if the executable is called bnet, an example invocation of the executable would be:
bnet 0 1 0 1 1
In general, bnet takes exactly 5 (no more, no fewer) command line arguments, and each argument is either 0 or 1. The arguments provide to the program the following information:
- The first argument is 1 if Burglary=true, and 0 if Burglary=false.
- The second argument is 1 if Earthquake=true, and 0 if Earthquake=false.
- The third argument is 1 if Alarm=true, and 0 if Alarm=false.
- The fourth argument is 1 if JohnCalls=true, and 0 if JohnCalls=false.
- The fifth argument is 1 if MaryCalls=true, and 0 if MaryCalls=false.
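The argument rules above could be checked along the following lines. This is a minimal sketch in Python (used for illustration only, not one of the accepted submission languages), and the helper name parse_arguments is hypothetical.

```python
def parse_arguments(argv):
    """Convert five "0"/"1" strings into a list of five Booleans.

    Raises ValueError if the count is not exactly five or if any
    argument is something other than "0" or "1".
    """
    if len(argv) != 5:
        raise ValueError("expected exactly 5 arguments, got %d" % len(argv))
    values = []
    for arg in argv:
        if arg not in ("0", "1"):
            raise ValueError("each argument must be 0 or 1, got %r" % arg)
        values.append(arg == "1")
    return values


# Corresponds to the invocation: bnet 0 1 0 1 1
b, e, a, j, m = parse_arguments(["0", "1", "0", "1", "1"])
print(b, e, a, j, m)
```

In a real program the list would come from the command line (sys.argv[1:] in Python, argv in C or C++, args in Java) rather than being written out by hand.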
The correct implementation will not contain hardcoded values for all 32 combinations of arguments, but instead will use the tables shown in Figure 1 and the appropriate formulas to evaluate the probability of the specified event. More specifically, for full credit, the code should include and use:
- A Bayesian network class. The nodes of the network will be stored in a member variable (a list, vector, or array) of the Bayesian network class. The class should include a member function called computeProbability(b, e, a, j, m), where the five arguments are of type Boolean and correspond to the five command line arguments. The probability value printed out by the program should be the result of this function. This function should call (with appropriate arguments) the computeProbability function of each node in the network (see description in the next bullet), and determine the overall result based on the results of the computeProbability function of the nodes.
- A node class. Each node of the Bayesian network will be an object (or pointer) belonging to the node class. The parents of a node X will be stored as member variables of the object representing X. Class node must also implement a member function called computeProbability, which takes as an argument an array (or list, or vector) of Boolean values specifying values for all parents of the node, and returns the probability of the node being true given the values for its parents.
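To show how the two classes might fit together, here is a minimal sketch on a toy three-node network (X and Y are both parents of Z). The node names and every probability in it are made up for illustration and are NOT the values of Figure 1. Python is used for brevity only. The sketch relies on the standard Bayesian network factorization: the probability of a full assignment is the product, over all nodes, of the probability of each node's value given the values of its parents.

```python
class Node:
    """A Boolean node that stores its parents and its probability table."""

    def __init__(self, name, parents, table):
        self.name = name
        self.parents = parents  # list of parent Node objects
        # table maps a tuple of parent values to P(node = True | parents)
        self.table = table

    def computeProbability(self, parent_values):
        """Return P(node = True) given Boolean values for all parents."""
        return self.table[tuple(parent_values)]


class BayesianNetwork:
    """Toy network X -> Z <- Y with made-up numbers (not Figure 1)."""

    def __init__(self):
        x = Node("X", [], {(): 0.3})
        y = Node("Y", [], {(): 0.6})
        z = Node("Z", [x, y], {
            (True, True): 0.9,
            (True, False): 0.5,
            (False, True): 0.4,
            (False, False): 0.1,
        })
        self.nodes = [x, y, z]

    def computeProbability(self, *values):
        """Return the joint probability of one full assignment.

        values holds one Boolean per node, in the order of self.nodes.
        """
        assignment = {node.name: v for node, v in zip(self.nodes, values)}
        result = 1.0
        for node in self.nodes:
            parent_values = [assignment[p.name] for p in node.parents]
            p_true = node.computeProbability(parent_values)
            # The node returns P(true); use the complement when it is false.
            result *= p_true if assignment[node.name] else 1.0 - p_true
        return result


net = BayesianNetwork()
# P(X = true, Y = false, Z = true) = 0.3 * (1 - 0.6) * 0.5
print(net.computeProbability(True, False, True))
```

The network-level function in the assignment takes the five arguments (b, e, a, j, m) instead of three, but the loop has the same shape: look up each node's probability of being true given its parents, take the complement when the node is false, and multiply the results.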
Grading
Each part will be graded as follows:
- Part 1:
  - 8 points: Establishing a correct correspondence between nodes in the Bayesian network and events in the problem description.
  - 8 points: Establishing correct connections between nodes in the Bayesian network, according to the problem description.
  - 4 points: Using the correct direction for each arrow in the Bayesian network.
- Part 2:
  - 20 points: Estimating correctly each probability table from the training data.
- Part 3:
  - 20 points: Creating an executable that provides the correct output for each input.
  - 20 points: Correctly implementing the Bayesian network class.
  - 20 points: Correctly implementing the node class.
Late submissions incur an initial penalty of 10 points, plus an additional 10 points for each full 24 hours between the deadline and the submission. NO SUBMISSIONS WILL BE ACCEPTED AFTER 11:59pm on December 7.
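As a worked example of the late-penalty arithmetic, the sketch below applies the rule to two hypothetical submission times. It assumes the relevant deadline is the full-assignment deadline in 2009 and that any submission after the deadline counts as late. The function name late_penalty is made up, and the final cutoff date is not modeled.

```python
from datetime import datetime


def late_penalty(deadline, submitted):
    """Return the points deducted for a submission (0 if on time)."""
    if submitted <= deadline:
        return 0
    # timedelta.days counts the whole 24-hour periods that have elapsed.
    full_days = (submitted - deadline).days
    return 10 + 10 * full_days


deadline = datetime(2009, 11, 23, 23, 59)
# About 8 hours late: only the initial penalty applies.
print(late_penalty(deadline, datetime(2009, 11, 24, 8, 0)))
# Exactly 48 hours late: initial penalty plus two full days.
print(late_penalty(deadline, datetime(2009, 11, 25, 23, 59)))
```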
How to submit
Implementations in LISP, C, C++, and Java will be accepted. If you would like to use another language, please first check with the instructor via e-mail. Points will be taken off for failure to comply with this requirement.
Submit a ZIPPED directory called programming5.zip (no other forms of compression are accepted; contact the instructor or TA if you do not know how to produce .zip files) via e-mail to BOTH THE INSTRUCTOR AND THE TA, with subject "CSE 4308 - Programming Assignment 5" or "CSE 5360 - Programming Assignment 5", depending on the course you are actually registered for. THE ATTACHMENT SHOULD NOT EXCEED 800KB in size (contact the instructor if for some reason you find it hard to comply with the 800KB limit).
The directory should contain source code, the answer for part 1 in a document, the answer (and code) for part 2, and the code for part 3. Including binaries that work on omega (for Java and C++) is optional and encouraged. The submission should also contain a file called readme.txt, which should specify precisely:
- Name and UTA ID of the student.
- Where the answers are for part 1 and part 2.
- What programming language is used.
- How the code is structured.
- How to run the code, including very specific compilation instructions, if compilation is needed. Instructions such as "compile using g++" are NOT considered specific. Providing all the command lines that are needed to complete the compilation on omega is specific.
Insufficient or unclear instructions will be penalized by up to 20 points.
Code that does not run on omega machines gets AT MOST half credit (50 points).
Submission checklist
- DID YOU INCLUDE the answer for part 1, answer AND code for part 2, and the code for part 3?
- Does the code run on omega?
- Is the implementation in LISP, C, C++, or Java? If not, have you obtained written consent from the instructor?
- Is the attached zipped file called programming5.zip?
- Has the submission been e-mailed to both the instructor and the TA?
- Was the attachment under 800KB?
- Was the subject of the e-mail message "CSE 4308 - Programming Assignment 5" or "CSE 5360 - Programming Assignment 5", depending on the course you are registered for?
- Does the attachment include a readme.txt file, as specified?