MDPs, Dynamic Bayes Nets, and Naive Bayes Classifiers, Exams of Artificial Intelligence

This exam covers various topics in artificial intelligence, including Markov decision processes (MDPs), dynamic Bayes nets, and naive Bayes classifiers. It covers MDP solutions, dynamic Bayes net structures, and the equivalence of naive Bayes classifiers with linear classifiers for binary-valued features.

CS 188 Introduction to Artificial Intelligence
Spring 2009 Final Exam

INSTRUCTIONS

  • You have 3 hours.
  • The exam is closed book, closed notes except two crib sheets, double-sided.
  • Please use non-programmable calculators only.
  • Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. All short answer sections can be successfully answered in a few sentences at most.

Last Name
First Name
SID
Login
GSI
Section Time

All the work on this exam is my own. (please sign)

For staff use only

Q. 1: /16   Q. 2: /14   Q. 3: /21   Q. 4: /10   Q. 5: /17   Q. 6: /22   Total: /100

  1. (16 points) True/False

For the following questions, a correct answer is worth 2 points, no answer is worth 1 point, and an incorrect answer is worth 0 points. Circle true or false to indicate your answer.

a) (true or false) Inverse reinforcement learning (which makes helicopters fly by themselves) is primarily concerned with learning an expert’s transition model. Solution: False. Inverse reinforcement learning focuses on learning an expert’s reward function. Experts and lousy agents alike all use the same transition model.

b) (true or false) A* search will always expand fewer search nodes than uniform cost search. Solution: False. A heuristic can lead the search algorithm astray. Consider states S, 1, 2, 3, and G, where S leads to 1 and 2 with cost 1, 1 leads to G with cost 1, and 2 leads to 3 with cost 2. UCS will expand S, 1, 2, G. For a heuristic that is zero everywhere except h(1) = 3, A* will expand S, 2, 3, 1, G.

c) (true or false) K-means is a clustering algorithm that is guaranteed to converge. Solution: True. The convergence proof was given in class.

d) (true or false) A MIRA classifier is guaranteed to perform better on unseen data than a perceptron. Solution: False. MIRA provides no such guarantee.

e) (true or false) For a two-player zero-sum game tree of depth 4 or greater, alpha-beta pruning must prune at least one node. Solution: False. As a simple counterexample, if every player only has one legal move at each turn, then there will be no pruning.

f) (true or false) A CSP with only boolean variables can always be solved in polynomial time. Solution: False. For instance, a boolean CSP can encode a 3-SAT problem, which is NP-complete.

g) (true or false) Sampling from a Bayes net using likelihood weighting will systematically overestimate the posterior of a variable conditioned on one of its descendants. Solution: False. Likelihood weighting is an unbiased sampling procedure.

h) (true or false) Every discrete-valued dynamic Bayes net is equivalent to some hidden Markov model. Solution:
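The counterexample in (b) can be checked mechanically. Below is a small sketch (the graph encoding and function names are my own, not part of the exam) that runs best-first graph search ordered by g(n) + h(n) and records the expansion order, once with h = 0 (which is UCS) and once with the adversarial heuristic:

```python
import heapq

# Hypothetical encoding of the counterexample graph from (b): S->1 and S->2
# with cost 1, 1->G with cost 1, 2->3 with cost 2; 3 and G have no successors.
GRAPH = {'S': [('1', 1), ('2', 1)], '1': [('G', 1)], '2': [('3', 2)], '3': [], 'G': []}
H = {'S': 0, '1': 3, '2': 0, '3': 0, 'G': 0}  # zero everywhere except h(1) = 3

def best_first(h):
    """Graph search ordered by f(n) = g(n) + h(n); returns the expansion order.
    Ties are broken FIFO, matching the expansion orders quoted in the solution."""
    order, closed = [], set()
    counter = 0                              # FIFO tie-breaker
    frontier = [(h['S'], counter, 0, 'S')]   # (f, tie, g, state)
    while frontier:
        _, _, g, s = heapq.heappop(frontier)
        if s in closed:
            continue
        closed.add(s)
        order.append(s)
        if s == 'G':
            break
        for nxt, cost in GRAPH[s]:
            counter += 1
            heapq.heappush(frontier, (g + cost + h[nxt], counter, g + cost, nxt))
    return order

ucs = best_first({s: 0 for s in GRAPH})  # UCS is best-first search with h = 0
astar = best_first(H)
print(ucs)    # ['S', '1', '2', 'G']
print(astar)  # ['S', '2', '3', '1', 'G']
```

With this heuristic, A* expands five nodes where UCS expands four, confirming the counterexample.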

  2. (14 points) MDP: Walk or Jump? Consider an MDP with states {4, 3, 2, 1, 0}, where 4 is the starting state. In states k ≥ 1, you can walk (W) and T(k, W, k − 1) = 1. In states k ≥ 2, you can also jump (J) and T(k, J, k − 2) = T(k, J, k) = 1/2. State 0 is a terminal state. The reward R(s, a, s′) = (s − s′)^2 for all (s, a, s′). Use a discount of γ = 1/2.

(a) (3 pt) Compute V*(2). Solution: We compute the value function bottom-up, starting from the terminal state 0.

V*(0) = 0

V*(1) = 1 + γV*(0) = 1   (only walking is available in state 1)

V*(2) = max{1 + γV*(1), 1/2 · (4 + γV*(0)) + 1/2 · γV*(2)}
      = max{3/2, 2 + 1/4 · V*(2)}
      = max{3/2, 8/3}
      = 8/3

where 8/3 comes from supposing that 2 + 1/4 · V*(2) is the maximum, so that

V*(2) = 2 + 1/4 · V*(2)
3/4 · V*(2) = 2
V*(2) = 8/3

which indeed exceeds the walking value 3/2.

(b) (3 pt) Compute Q*(4, W). Solution: First,

V*(3) = max{1 + γV*(2), 1/2 · (4 + γV*(1)) + 1/2 · γV*(3)}
      = max{7/3, 9/4 + 1/4 · V*(3)}
      = max{7/3, 3}
      = 3

where 3 comes from solving V*(3) = 9/4 + 1/4 · V*(3). Then

Q*(4, W) = 1 + γV*(3) = 1 + 3/2 = 5/2.
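As a cross-check on parts (a) and (b), here is a short value-iteration sketch of the MDP as stated (the encoding is my own); it converges numerically to V*(2) = 8/3 and Q*(4, W) = 5/2:

```python
# Value-iteration cross-check, assuming the MDP as stated: states 0..4,
# walk in k >= 1, jump in k >= 2, R(s, a, s') = (s - s')**2, gamma = 1/2,
# state 0 terminal.
GAMMA = 0.5

def backup(V):
    new = {0: 0.0}                       # terminal state has value 0
    for k in range(1, 5):
        walk = 1 + GAMMA * V[k - 1]      # walk reward (k - (k-1))**2 = 1
        options = [walk]
        if k >= 2:                       # jump: half the time reach k-2 (reward 4), half stay
            options.append(0.5 * (4 + GAMMA * V[k - 2]) + 0.5 * GAMMA * V[k])
        new[k] = max(options)
    return new

V = {k: 0.0 for k in range(5)}
for _ in range(100):                     # contraction with modulus 1/2: converges fast
    V = backup(V)

print(round(V[2], 4))              # 2.6667, i.e. V*(2) = 8/3
print(round(1 + GAMMA * V[3], 4))  # 2.5, i.e. Q*(4, W) = 5/2
```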


(c) (4 pt) Now consider the same MDP, but with infinitely many states {4, 3, 2, 1, 0, −1, ...} and no terminal states. Like before, T(k, J, k − 2) = T(k, J, k) = 1/2 and T(k, W, k − 1) = 1, and R(s, a, s′) = (s − s′)^2. Compute V*(2). Solution: By symmetry, V*(s) is the same constant for every state. This implies:

V*(2) = max{1 + γV*(2), 1/2 · (4 + γV*(2)) + 1/2 · γV*(2)}
      = max{1 + γV*(2), 2 + γV*(2)}
      = 2 + γV*(2)

so V*(2) = 2 / (1 − γ) = 4.

(d) (4 pt) In the infinite MDP above, an agent acts randomly at each time step: with probability p it walks, and with probability 1 − p it jumps. What is its expected utility in terms of p, starting in state 4? Solution: Here, the expected utility is the expected sum of discounted rewards. We have a modified Bellman equation for this stochastic policy:

V^p(4) = p · (1 + 0.5 · V^p(3)) + (1 − p) · (0.5 · (4 + 0.5 · V^p(2)) + 0.5 · (0.5 · V^p(4)))

But again all states have the same value because of symmetry, so the equation simplifies to:

V^p(4) = p · (1 + 0.5 · V^p(4)) + (1 − p) · (2 + 0.5 · V^p(4))

so V^p(4) = 4 − 2p.
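The closed form V^p(4) = 4 − 2p can be confirmed by iterating the modified Bellman update directly (a sketch under the same symmetry assumption; the names are my own). Note that p = 0, i.e. always jumping, recovers part (c)'s value of 4:

```python
# Iterate the one-state Bellman update for the random policy and compare
# with the closed form 4 - 2p; gamma = 1/2 as in the problem.
def policy_value(p, iters=200):
    v = 0.0
    for _ in range(iters):
        v = p * (1 + 0.5 * v) + (1 - p) * (0.5 * (4 + 0.5 * v) + 0.5 * (0.5 * v))
    return v

for p in (0.0, 0.25, 0.5, 1.0):
    assert abs(policy_value(p) - (4 - 2 * p)) < 1e-9
print("matches 4 - 2p")
```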


For the remaining parts of this question, consider V4, where each X(i,j) takes values true and false.

(d) (2 pt) Given the factors below, fill in the table for P(X(1,1)|¬x(3,1)), or state “not enough information.”

X(1,1)    P(X(1,1))
x(1,1)    1/3
¬x(1,1)   2/3

X(1,1)    X(3,1)     P(X(3,1)|X(1,1))
x(1,1)    x(3,1)     1/3
x(1,1)    ¬x(3,1)    2/3
¬x(1,1)   x(3,1)     0
¬x(1,1)   ¬x(3,1)    1

X(1,1)    X(3,1)     P(X(1,1)|¬x(3,1))
x(1,1)    ¬x(3,1)
¬x(1,1)   ¬x(3,1)

Solution: Using Bayes' rule, P(x(1,1)|¬x(3,1)) = (1/3 · 2/3) / (1/3 · 2/3 + 2/3 · 1) = 1/4 and P(¬x(1,1)|¬x(3,1)) = 3/4.

(e) (3 pt) In V4, if P(X(1,h) = true) = 1/3 for all h, and the CPT for X(i,j) is defined below for all i > 1, what is the joint probability of all variables when X(i,j) is true if and only if i + j is even?

X(i,j)   X(i−1,j)   X(i−1,j+1)   P(X(i,j)|X(i−1,j), X(i−1,j+1))
true     true       true         1
true     false      true         1/2
true     true       false        1/2
true     false      false        1/2

Clarification: Compute P(X(1,1) = true, X(1,2) = false, X(1,3) = true, ..., X(4,1) = false). Solution: (1/3)^2 · (2/3)^2 · (1/2)^6 = 1/1296.
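Part (d)'s posterior can be rechecked with exact arithmetic. The sketch below (my own encoding of the tables above) applies Bayes' rule directly:

```python
from fractions import Fraction as F

# Part (d) redone exactly, using the table values: P(x(1,1)) = 1/3,
# P(not x(3,1) | x(1,1)) = 2/3, and P(not x(3,1) | not x(1,1)) = 1.
prior = {True: F(1, 3), False: F(2, 3)}      # P(X(1,1))
p_not_x31 = {True: F(2, 3), False: F(1)}     # P(not x(3,1) | X(1,1))

joint = {v: prior[v] * p_not_x31[v] for v in (True, False)}
z = sum(joint.values())                      # P(not x(3,1))
posterior = {v: joint[v] / z for v in (True, False)}
print(posterior[True], posterior[False])  # 1/4 3/4
```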

(f) (4 pt) Given the definition of V4 above, formulate a CSP over variables X(i,j) that is satisfied by only and all assignments with non-zero joint probability. Solution: Variables: X(i,j); Values: true, false; Constraint: (X(i,j), X(i−1,j), X(i−1,j+1)) ≠ (false, true, true) for all i > 1. Only the constraint is necessary for full credit, and the i > 1 condition is optional.

(g) (2 pt) Does there exist a binary CSP that is equivalent to the CSP you defined? Solution: Yes. All finite-valued CSPs are equivalent to some binary CSP.
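Part (e)'s joint probability can also be verified programmatically. The sketch below assumes the pyramid structure of V4 (row i has 5 − i variables, and X(i,j) has parents X(i−1,j) and X(i−1,j+1)) together with the CPT above; the evaluated assignment sets X(i,j) true exactly when i + j is even:

```python
from fractions import Fraction as F

# Joint probability for part (e), assuming a 4-row pyramid: roots X(1,1)..X(1,4)
# with P(true) = 1/3, and P(true | parents) = 1 if both parents are true,
# else 1/2, as in the CPT above.
def p_true(i, j):
    if i == 1:
        return F(1, 3)
    left = (i - 1 + j) % 2 == 0    # truth value of parent X(i-1,j) in this assignment
    right = (i + j) % 2 == 0       # truth value of parent X(i-1,j+1)
    return F(1) if (left and right) else F(1, 2)

joint = F(1)
for i in range(1, 5):
    for j in range(1, 6 - i):      # row i has 5 - i variables
        pt = p_true(i, j)
        joint *= pt if (i + j) % 2 == 0 else 1 - pt
print(joint)  # 1/1296
```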

  4. (10 points) Linear Naive Bayes

Recall that a naive Bayes classifier with features Fi (i = 1, ..., n) and label Y uses the classification rule:

arg max_y P(y|f1, ..., fn) = arg max_y P(y) ∏_{i=1..n} P(fi|y)

And a linear classifier (for example, a perceptron) uses the classification rule:

arg max_y ∑_{i=0..n} wy,i · fi,   where f0 is a bias feature that is always 1 for all data

(a) (8 pt) For a naive Bayes classifier with binary-valued features, i.e. fi ∈ {0, 1}, prove that it is also a linear classifier by defining weights wy,i (for i = 0, ..., n) such that both decision rules above are equivalent. The weights should be expressed in terms of the naive Bayes probabilities P(y), P(Fi = 1|y), and P(Fi = 0|y). Assume that all these probabilities are non-zero. Solution:

arg max_y P(y) ∏_{i=1..n} P(fi|y)
  = arg max_y log P(y) + ∑_{i=1..n} log P(fi|y)
  = arg max_y log P(y) + ∑_{i=1..n} [fi log P(Fi = 1|y) + (1 − fi) log P(Fi = 0|y)]
  = arg max_y log P(y) + ∑_{i=1..n} log P(Fi = 0|y) + ∑_{i=1..n} fi log [P(Fi = 1|y) / P(Fi = 0|y)]

This is clearly equivalent to a linear classifier with the weights:

wy,0 = log P(y) + ∑_{i=1..n} log P(Fi = 0|y)
wy,i = log [P(Fi = 1|y) / P(Fi = 0|y)]   for i = 1, ..., n

(b) (2 pt) For the training set below, with binary features F1 and F2 and label Y, name a smoothing method that would estimate a naive Bayes model that correctly classifies all four data points, or answer "impossible" if there is no smoothing method that would give appropriate distributions P(Y) and P(Fi|Y). Briefly justify your answer.

F1  F2  Y
0   0   0
0   1   1
1   0   1
1   1   0

Solution: Impossible. The data (the XOR function) is clearly not linearly separable. Since we know from part (a) that a naive Bayes classifier on binary-valued features can be re-written as a linear classifier, no naive Bayes classifier can classify this data correctly.
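The weight construction in part (a) is easy to sanity-check numerically: build an arbitrary binary naive Bayes model, convert it to weights with the formulas above, and confirm the two decision rules pick the same label on every input (a sketch; the random model and names are my own):

```python
import math, random

random.seed(0)
n, labels = 4, (0, 1)
prior = {y: 0.5 for y in labels}
p1 = {y: [random.uniform(0.1, 0.9) for _ in range(n)] for y in labels}  # P(Fi=1|y)

def nb_score(y, f):
    # P(y) * prod_i P(fi | y)
    s = prior[y]
    for i in range(n):
        s *= p1[y][i] if f[i] else 1 - p1[y][i]
    return s

def weights(y):
    # w_{y,0} = log P(y) + sum_i log P(Fi=0|y); w_{y,i} = log-odds of Fi given y
    w0 = math.log(prior[y]) + sum(math.log(1 - p) for p in p1[y])
    return [w0] + [math.log(p / (1 - p)) for p in p1[y]]

def linear_score(y, f):
    w = weights(y)
    return w[0] + sum(wi * fi for wi, fi in zip(w[1:], f))

for trial in range(2 ** n):               # every binary feature vector of length n
    f = [(trial >> i) & 1 for i in range(n)]
    assert max(labels, key=lambda y: nb_score(y, f)) == \
           max(labels, key=lambda y: linear_score(y, f))
print("decision rules agree on all 16 inputs")
```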

Now, assume that you have the following 10 particles in your particle filter after two rounds of moves, where the count indicates a number of duplicate particles for the same game state. Your utility is 1 if you win on the next move, and 0 if you do not. Note: even if you win eventually, your utility is 0 unless you win in one move.

[Figure: four groups of particles, with count = 3, count = 1, count = 2, and count = 4.]

(b) (2 pt) According to your particle filter, what is your belief that it is possible to win in the next move? Solution: A total of 2 + 4 = 6 particles out of 10 are winnable, so you believe that you can possibly win with probability 0.6.

(c) (3 pt) What is your maximum expected utility? Solution: The left and right columns are the only moves which have a non-zero probability of winning. Out of them, the left-most column has a greater probability of winning. MEU = .4.

(d) (4 pt) What is the value of peeking at the board before moving? Solution: The probability that you are in a winning position is (2 + 4)/10 = 0.6, and after peeking you can always take the winning move when one exists, so your expected MEU after peeking is .6. The value of peeking is thus .6 − .4 = .2.


  6. (22 points) The Last Dot

Pacman is one dot away from summer vacation. He just has to outsmart the ghost, starting in the game state to the right. Pacman fully observes every game state and will move first. The ghost only observes the starting state, the layout, its own position after each move, and whether Pacman is in a middle row (2 and 3) or an edge row (1 and 4). The ghost probabilistically tracks which row Pacman is in, and it moves around in column 2 to block Pacman.

[Figure: the starting game state, a grid with columns 1-3 and rows 1-4.]

Details of the ghost agent:

  • The ghost knows that Pacman starts in row 2. It assumes Pacman changes rows according to:
    • In rows 1 and 4, Pacman stays in the same row with probability 9/10 and moves into the adjacent center row otherwise.
    • In rows 2 and 3, Pacman moves up with probability 1/2, down with probability 3/8, and stays in the same row with probability 1/8.
  • After each of Pacman's turns (but before the ghost moves), the ghost observes whether Pacman is in a middle row (2 and 3) or an edge row (1 and 4).
  • Each turn, the ghost moves up, moves down, or stops.
  • The ghost moves toward the row that most probably contains Pacman according to its model and observations. If the ghost believes Pacman is most likely in its current row, then it will stop.

(a) (3 pt) According to the ghost’s assumed transition and emission models, what will be the ghost’s belief distribution over Pacman’s row if Pacman moves up twice, starting from the game state shown?

Row 1 Row 2 Row 3 Row 4

Solution: The ghost believes Pacman is in row 4 with probability 16/19, and in row 1 with probability 3/19.
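The belief in part (a) can be reproduced with a short forward-algorithm sketch over the ghost's assumed model, using exact fractions (the dictionary encoding is my own):

```python
from fractions import Fraction as F

# Rows are 1..4; T[r] is the ghost's assumed row-transition distribution,
# taken from the bulleted model above.
T = {1: {1: F(9, 10), 2: F(1, 10)},
     2: {3: F(1, 2), 1: F(3, 8), 2: F(1, 8)},
     3: {4: F(1, 2), 2: F(3, 8), 3: F(1, 8)},
     4: {4: F(9, 10), 3: F(1, 10)}}

def step(belief, obs):
    """One elapse-time + observe update; obs is 'middle' or 'edge'."""
    predicted = {r: F(0) for r in (1, 2, 3, 4)}
    for r, p in belief.items():
        for r2, q in T[r].items():
            predicted[r2] += p * q
    consistent = (2, 3) if obs == 'middle' else (1, 4)
    z = sum(predicted[r] for r in consistent)
    return {r: (predicted[r] / z if r in consistent else F(0)) for r in predicted}

b = {1: F(0), 2: F(1), 3: F(0), 4: F(0)}  # the ghost knows Pacman starts in row 2
b = step(b, 'middle')                      # Pacman moves up, to row 3
b = step(b, 'edge')                        # Pacman moves up again, to row 4
print(b[4], b[1])  # 16/19 3/19
```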

(b) (3 pt) If Pacman stops twice in a row (starting from the start state), where will the ghost be according to its policy? With what probability does the ghost believe it is in the same row as Pacman?

Ghost position Ghost belief probability

Solution: After one observation that Pacman is in a middle row, the ghost's belief distribution is

B(2) = (1/8) / (1/8 + 1/2) = 1/5   and   B(3) = (1/2) / (1/8 + 1/2) = 4/5.

After a second move and a second "middle" observation, we have

B(2) = (4/5 · 3/8 + 1/5 · 1/8) / (4/5 · 3/8 + 1/5 · 1/8 + 4/5 · 1/8 + 1/5 · 1/2),

which works out to 13/21. Thus, the ghost will be in position (2,2) (row 2), because 13/21 > 1/2.
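Part (b)'s 13/21 follows from the same filtering procedure with two "middle" observations; a self-contained sketch (same assumed model as above, my own encoding):

```python
from fractions import Fraction as F

# The ghost's assumed row-transition model, rows 1..4.
T = {1: {1: F(9, 10), 2: F(1, 10)},
     2: {3: F(1, 2), 1: F(3, 8), 2: F(1, 8)},
     3: {4: F(1, 2), 2: F(3, 8), 3: F(1, 8)},
     4: {4: F(9, 10), 3: F(1, 10)}}

def step(belief, consistent):
    # Elapse time, then condition on the rows consistent with the observation.
    predicted = {r: F(0) for r in (1, 2, 3, 4)}
    for r, p in belief.items():
        for r2, q in T[r].items():
            predicted[r2] += p * q
    z = sum(predicted[r] for r in consistent)
    return {r: (predicted[r] / z if r in consistent else F(0)) for r in predicted}

b = {1: F(0), 2: F(1), 3: F(0), 4: F(0)}  # Pacman starts in row 2
b = step(b, (2, 3))  # Pacman stops; the ghost observes "middle"
b = step(b, (2, 3))  # Pacman stops again; another "middle" observation
print(b[2])  # 13/21
```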