Assignment 9

Programming and Written Assignment - Bayesian Networks & Decision Trees

The assignment should be submitted via Blackboard.

NOTE: This assignment is for 120 points.

Task 1 

10 points

George doesn't watch much TV in the evening, unless there is a baseball game on. When there is baseball on TV, George is very likely to watch. George has a cat that he feeds most evenings, although he forgets every now and then. He's much more likely to forget when he's watching TV. He's also very unlikely to feed the cat if he has run out of cat food (although sometimes he gives the cat some of his own food). Design a Bayesian network for modeling the relations between these four events: there is a baseball game on TV, George watches TV, George is out of cat food, and George feeds the cat.

Your task is to connect these nodes with arrows pointing from causes to effects. No programming is needed for this part, just include an electronic document (PDF, Word file, or OpenOffice document) showing your Bayesian network design.


Task 2 

10 points

For the Bayesian network of Task 1, the text file at this link contains training data from every evening of an entire year. Every line in this text file corresponds to an evening and contains four numbers, one for each of the four events in your network. Each number is a 0 or a 1, indicating whether the corresponding event occurred on that evening.

Based on the data in this file, determine the probability table for each node in the Bayesian network you have designed for Task 1. You need to include these four tables in the drawing that you produce for Task 1. You also need to submit the code/script that computes these probabilities.
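
As a rough illustration only (this is not a required structure), the counting could be done with a short Python script like the one below. It assumes the network structure suggested by the story in Task 1 (the baseball game influences whether George watches TV, and both TV watching and being out of cat food influence whether he feeds the cat), and it assumes the four columns of the data file appear in the order: game on TV, watches TV, out of cat food, feeds cat. Adjust both assumptions to match your own design and the actual file format.

    # A minimal sketch, assuming the column order: game on TV, watches TV,
    # out of cat food, feeds cat. Adjust the order and the structure to match
    # your own network design and the actual data file.
    def learn_tables(path):
        rows = []
        with open(path) as f:
            for line in f:
                values = line.split()
                if len(values) == 4:
                    rows.append([int(v) for v in values])

        n = len(rows)
        p_game = sum(r[0] for r in rows) / n          # P(baseball game on TV)
        p_out_of_food = sum(r[2] for r in rows) / n   # P(out of cat food)

        # P(watches TV | game) for each value of game.
        p_tv = {}
        for g in (0, 1):
            subset = [r for r in rows if r[0] == g]
            p_tv[g] = sum(r[1] for r in subset) / len(subset) if subset else None

        # P(feeds cat | watches TV, out of cat food).
        p_feed = {}
        for t in (0, 1):
            for o in (0, 1):
                subset = [r for r in rows if r[1] == t and r[2] == o]
                p_feed[(t, o)] = sum(r[3] for r in subset) / len(subset) if subset else None

        return p_game, p_tv, p_out_of_food, p_feed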


Task 3

15 points
 

Figure 1: Yet another Bayesian Network.


Part a: On the network shown in Figure 1, what is the Markov blanket of node L?

Part b: On the network shown in Figure 1, what is P(H, C)? How is it derived?

Part c: On the network shown in Figure 1, what is P(M, not(C) | H)? How is it derived?

Hint: For parts b and c, there are easier ways to arrive at the answer than inference by enumeration.


Task 4 

50 points

Figure 2: A Bayesian network establishing relations between events on the burglary-earthquake-alarm domain (the network discussed in class), together with complete specifications of all probability distributions.


For the Bayesian network of Figure 2, implement a program that computes and prints out the probability of any combination of events given any other combination of events. If the executable is called bnet, here are some example invocations of the program:

  1. To print out the probability P(Burglary=true and Alarm=false | MaryCalls=false).
    bnet Bt Af given Mf
  2. To print out the probability P(Alarm=false and Earthquake=true).
    bnet Af Et
  3. To print out the probability P(JohnCalls=true and Alarm=false | Burglary=true and Earthquake=false).
    bnet Jt Af given Bt Ef
  4. To print out the probability P(Burglary=true and Alarm=false and MaryCalls=false and JohnCalls=true and Earthquake=true).
    bnet Bt Af Mf Jt Et
In general, bnet takes 1 to 6 command line arguments (no more, no fewer). Each argument before the optional keyword "given" specifies a query event, and each argument after "given" specifies an evidence event; an event is written as the first letter of its variable (B, E, A, J, or M) followed by t or f to indicate true or false, as in the examples above.

The implementation should not contain hardcoded values for all combinations of arguments. Instead, your code should use the tables shown in Figure 2 and the appropriate formulas to evaluate the probability of the specified event. It is OK to hardcode values from the tables of Figure 2 in your code, but it is not OK to hardcode values for all possible command arguments, or probability values for all possible atomic events. More specifically, for full credit, the code should include and use a Bayesian network class. The class should include a member function called computeProbability(b, e, a, j, m), where each argument is a boolean specifying whether the corresponding event (burglary, earthquake, alarm, john-calls, mary-calls) is true or false. This function should return the joint probability of the five events.

Note: Sample output for some scenarios is given here.
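
As a rough sketch of one way to organize the program (not the required implementation), the class and the query logic could look like the Python below. The probability values used here are the standard textbook numbers for this network and are only an assumption; replace them with the values actually shown in Figure 2 if they differ.

    # A rough sketch, assuming the standard textbook CPT values for the
    # burglary-earthquake network; replace them with the numbers in Figure 2.
    import sys
    from itertools import product

    class BayesianNetwork:
        # Prior and conditional probability tables (assumed values).
        P_B = 0.001                     # P(Burglary = true)
        P_E = 0.002                     # P(Earthquake = true)
        P_A = {(True, True): 0.95, (True, False): 0.94,
               (False, True): 0.29, (False, False): 0.001}   # P(Alarm = true | B, E)
        P_J = {True: 0.90, False: 0.05}                       # P(JohnCalls = true | A)
        P_M = {True: 0.70, False: 0.01}                       # P(MaryCalls = true | A)

        def computeProbability(self, b, e, a, j, m):
            """Joint probability P(B=b, E=e, A=a, J=j, M=m) via the chain rule."""
            p = self.P_B if b else 1.0 - self.P_B
            p *= self.P_E if e else 1.0 - self.P_E
            p *= self.P_A[(b, e)] if a else 1.0 - self.P_A[(b, e)]
            p *= self.P_J[a] if j else 1.0 - self.P_J[a]
            p *= self.P_M[a] if m else 1.0 - self.P_M[a]
            return p

        def probability(self, fixed):
            """Probability of a partial assignment such as {'B': True, 'A': False},
            obtained by summing the joint over all unspecified variables."""
            names = ['B', 'E', 'A', 'J', 'M']
            free = [n for n in names if n not in fixed]
            total = 0.0
            for combo in product([True, False], repeat=len(free)):
                full = dict(fixed, **dict(zip(free, combo)))
                total += self.computeProbability(full['B'], full['E'], full['A'],
                                                 full['J'], full['M'])
            return total

    if __name__ == '__main__':
        # Arguments follow the pattern in the examples above, e.g. "Bt Af given Mf".
        args = sys.argv[1:]
        split = args.index('given') if 'given' in args else len(args)
        to_events = lambda tokens: {t[0].upper(): (t[1].lower() == 't') for t in tokens}
        query, evidence = to_events(args[:split]), to_events(args[split + 1:])
        net = BayesianNetwork()
        if evidence:
            # P(query | evidence) = P(query, evidence) / P(evidence)
            print(net.probability({**query, **evidence}) / net.probability(evidence))
        else:
            print(net.probability(query))

The conditional case uses P(query | evidence) = P(query, evidence) / P(evidence), with each term obtained by summing the joint distribution over the variables that do not appear in it.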


Task 5

20 points


Figure 3: A decision tree for estimating whether the patron will be willing to wait for a table at a restaurant.

Part a (5 points): Suppose that, on the entire set of training samples available for constructing the decision tree of Figure 3, 80 people decided to wait, and 20 people decided not to wait. What is the initial entropy at node A (before the test is applied)?

Part b (5 points): As mentioned in the previous part, at node A 80 people decided to wait, and 20 people decided not to wait.

What is the information gain for the weekend test at node A? 

Part c (5 points): In the decision tree of Figure 3, node E uses the exact same test (whether it is weekend or not) as node A. What is the information gain, at node E, of using the weekend test?

Part d (5 points): We have a test case of a hungry patron who came in on a rainy Sunday. Which leaf node does this test case end up in? What does the decision tree output for that case?
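
For parts a through c, the calculations follow the standard entropy and information-gain formulas. The helper functions below are a small sketch of those formulas; the per-branch counts needed for parts b and c must be read off Figure 3.

    from math import log2

    def entropy(pos, neg):
        """Entropy, in bits, of a node with `pos` positive and `neg` negative examples."""
        total = pos + neg
        h = 0.0
        for count in (pos, neg):
            if count > 0:
                p = count / total
                h -= p * log2(p)
        return h

    def information_gain(parent, branches):
        """Information gain of a test.

        `parent` is a (pos, neg) pair for the node before the test;
        `branches` is a list of (pos, neg) pairs, one per outcome of the test.
        """
        total = sum(parent)
        remainder = sum((p + n) / total * entropy(p, n) for p, n in branches)
        return entropy(*parent) - remainder

    # Part a: initial entropy at node A, with 80 "wait" and 20 "not wait" examples.
    print(entropy(80, 20))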


Task 6

15 points

Class   A   B   C
X       1   2   1
X       2   1   2
X       3   2   2
X       1   3   3
X       1   2   2
Y       2   1   1
Y       3   1   1
Y       2   2   2
Y       3   3   1
Y       2   1   1

We want to build a decision tree that determines whether a certain pattern is of type X or type Y. The decision tree can only use tests that are based on attributes A, B, and C. Each attribute has 3 possible values: 1, 2, 3 (we do not apply any thresholding). We have the 10 training examples shown in the table above (each row corresponds to one training example).

What is the information gain of each attribute at the root? Which attribute achieves the highest information gain at the root?
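
A short script like the one below (a sketch, with the table transcribed into a Python list) can be used to check hand calculations of the information gain of each attribute at the root.

    from math import log2

    # The ten training examples from the table above: (class, A, B, C).
    DATA = [
        ('X', 1, 2, 1), ('X', 2, 1, 2), ('X', 3, 2, 2), ('X', 1, 3, 3), ('X', 1, 2, 2),
        ('Y', 2, 1, 1), ('Y', 3, 1, 1), ('Y', 2, 2, 2), ('Y', 3, 3, 1), ('Y', 2, 1, 1),
    ]

    def entropy(pos, neg):
        """Entropy (bits) of a two-class node with `pos` X's and `neg` Y's."""
        return sum(-c / (pos + neg) * log2(c / (pos + neg)) for c in (pos, neg) if c)

    def info_gain(attr):
        """Information gain of splitting on the attribute at index `attr` (1=A, 2=B, 3=C)."""
        n = len(DATA)
        parent = entropy(sum(r[0] == 'X' for r in DATA), sum(r[0] == 'Y' for r in DATA))
        remainder = 0.0
        for value in (1, 2, 3):
            subset = [r for r in DATA if r[attr] == value]
            if subset:
                x = sum(r[0] == 'X' for r in subset)
                remainder += len(subset) / n * entropy(x, len(subset) - x)
        return parent - remainder

    for index, name in ((1, 'A'), (2, 'B'), (3, 'C')):
        print(name, info_gain(index))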



Other Instructions