






You are designing a menu for a special event. There are several choices, each represented as a variable: (A)ppetizer, (B)everage, main (C)ourse, and (D)essert. The domains of the variables are as follows:
A: (v)eggies, (e)scargot
B: (w)ater, (s)oda, (m)ilk
C: (f)ish, (b)eef, (p)asta
D: (a)pple pie, (i)ce cream, (ch)eese
Because all of your guests get the same menu, it must obey the following dietary constraints:
(i) Vegetarian options: The appetizer must be veggies or the main course must be pasta or fish (or both).
(ii) Total budget: If you serve the escargot, you cannot afford any beverage other than water.
(iii) Calcium requirement: You must serve at least one of milk, ice cream, or cheese.
(a) (3 points) Draw the constraint graph over the variables A, B, C, and D.
Answer: The constraint graph has nodes A, B, C, and D, with an edge between A and C (constraint i), between A and B (constraint ii), and between B and D (constraint iii).
(b) (2 points) Imagine we first assign A=e. Cross out eliminated values to show the domains of the variables after forward checking.
A [ e ]   B [ w s m ]   C [ f b p ]   D [ a i ch ]
Answer: The values s, m, and b should be crossed off. “s” and “m” are eliminated due to being incompatible with “e” based on constraint (ii). “b” is eliminated due to constraint (i).
(c) (3 points) Again imagine we first assign A=e. Cross out eliminated values to show the domains of the variables after arc consistency has been enforced.
A [ e ]   B [ w s m ]   C [ f b p ]   D [ a i ch ]
Answer: The values s, m, b, and a should be eliminated. The first three are crossed off for the reasons above, and “a” is eliminated because, once B’s domain has been reduced to { w }, there is no value for B that is compatible with D=a under constraint (iii).
(d) (1 point) Give a solution for this CSP or state that none exists. Answer: Multiple solutions exist. One is A=e, B=w, C=f, and D=i.
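The constraints are small enough to verify by brute force. Below is a minimal sketch (an illustration, not part of the original exam) that enumerates every complete assignment and keeps those satisfying constraints (i), (ii), and (iii); it confirms that A=e, B=w, C=f, D=i is one of several solutions.

from itertools import product

domains = {
    "A": ["v", "e"],         # appetizer: veggies, escargot
    "B": ["w", "s", "m"],    # beverage: water, soda, milk
    "C": ["f", "b", "p"],    # main course: fish, beef, pasta
    "D": ["a", "i", "ch"],   # dessert: apple pie, ice cream, cheese
}

def satisfies(A, B, C, D):
    vegetarian = (A == "v") or (C in ("p", "f"))     # constraint (i)
    budget = (A != "e") or (B == "w")                # constraint (ii)
    calcium = (B == "m") or (D in ("i", "ch"))       # constraint (iii)
    return vegetarian and budget and calcium

solutions = [combo for combo in product(*domains.values()) if satisfies(*combo)]
print(("e", "w", "f", "i") in solutions)   # True: the solution given in (d)
print(len(solutions), "solutions in total")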
(e) (2 points) For general CSPs, will enforcing arc consistency after an assignment always prune at least as many domain values as forward checking? Briefly explain why or why not.
Answer: Two answers are possible:
Yes. The first step of enforcing arc consistency (making the assigned variable's neighbors consistent with it) is equivalent to forward checking, so arc consistency removes every value that forward checking does.
No. Although forward checking's pruning is a subset of arc consistency's overall, if we compare the two on a single assignment, arc consistency may already have eliminated, in an earlier step, some of the values that forward checking would eliminate now. Thus, enforcing arc consistency will never leave more domain values than forward checking, but on a given step forward checking might prune more values than arc consistency does, simply because arc consistency already pruned those values earlier.
(c) (3 points) In each node, write U_A(s), the utility of that state for player A, assuming that B is a balancer. Answer: Displayed above.
(d) (3 points) Write pseudocode for the functions which compute the U_A(s) values of game states in the general case of multi-turn games where B is a balancer. Assume you have access to the following functions: successors(s) gives the possible next states, isTerminal(s) checks whether a state is a terminal state, and terminalValue(s) returns A’s utility for a terminal state. Careful: As in minimax, be sure that both functions compute and return player A’s utilities for states – B’s utility can always be computed from A’s utility.
Answer: Below. Note that balanceValue(s) must still return the utility from the maximizer’s (player A’s) perspective.
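One possible sketch of the two mutually recursive functions, in Python (an illustration, not the exam's reference solution). It assumes that A and B alternate moves and, consistently with the pruning condition in part (i), that the balancer B chooses the successor whose utility for A is closest to zero; it uses only the helpers named in the question.

def maxValue(s):
    # Player A's turn: A maximizes its own utility U_A.
    if isTerminal(s):
        return terminalValue(s)
    return max(balanceValue(s2) for s2 in successors(s))

def balanceValue(s):
    # Player B's turn: B is a balancer, so it prefers the successor whose
    # utility for A is closest to zero. We still return A's utility for
    # that successor (not its absolute value), as the question requires.
    if isTerminal(s):
        return terminalValue(s)
    return min((maxValue(s2) for s2 in successors(s)), key=abs)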
(h) (2 points) Consider pruning children of a B node in this scenario. On the tree on the bottom of the previous page, cross off any nodes which can be pruned, again assuming left-to-right ordering.
Answer: Answers above.
(i) (2 points) Again consider pruning children of a B node s. Let α be the best option for an A node higher in the tree, just as in alpha-beta pruning, and let v be the U_A value of the best action B has found so far from s. Give a general condition under which balanceValue(s) can return without examining any more of its children.
Answer: |v| < α. Once B has found a child whose value satisfies |v| < α, the value it ultimately returns from s has magnitude at most |v|, so it is strictly less than α and the maximizing ancestor will never prefer this branch.
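To illustrate this condition, here is a hypothetical variant of balanceValue (the name balanceValueWithPruning and the alpha argument are illustrative additions, building on the sketch above and on the same balancer assumption) that returns early once pruning is possible.

def balanceValueWithPruning(s, alpha):
    # alpha: the best value a maximizing ancestor of s can already guarantee.
    if isTerminal(s):
        return terminalValue(s)
    v = None
    for s2 in successors(s):
        child = maxValue(s2)                 # A's utility of this successor
        if v is None or abs(child) < abs(v):
            v = child                        # most balanced option found so far
        if abs(v) < alpha:
            # Whatever B finally returns has magnitude at most |v| < alpha,
            # so the maximizing ancestor will never prefer this branch: prune.
            return v
    return v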
(a) (2 points) If for all i, r_i = 1, p_i = 1, and there is a discount γ = 0.5, what is the value V^stay(1) of being in city 1 under the policy that always chooses stay? Your answer should be a real number.
Answer: For all cities (states) i = 1, ..., N, the value of the always-stay policy satisfies
V^stay(i) = r_i + γ V^stay(i)
(remember, this is the Bellman equation for a fixed policy). Plugging in values, we get V^stay(i) = 1 + 0.5 · V^stay(i), so V^stay(i) = 2; in particular, V^stay(1) = 2.
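As a quick numerical check (a sketch, not part of the exam), iterating the fixed-policy Bellman backup converges to the same value:

# Fixed-point iteration for V^stay = r + gamma * V^stay with r = 1, gamma = 0.5.
r, gamma = 1.0, 0.5
v = 0.0
for _ in range(50):
    v = r + gamma * v
print(v)  # approximately 2.0; the exact fixed point is r / (1 - gamma) = 2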
V*(i) = max{ r_i + γ V*(i)   [stay],
             p_i γ V*(i−1) + (1 − p_i) γ V*(i)   [left],
             p_i γ V*(i+1) + (1 − p_i) γ V*(i)   [right] }
Since p_i = 1, this drastically simplifies:
V*(i) = max{ r_i + γ V*(i)   [stay],   γ V*(i−1)   [left],   γ V*(i+1)   [right] }
From this, we see that V*(i) is the same for all i, so the max is always obtained with the stay action.
(c) (2 points) If the r_i's and p_i's are known positive numbers and there is almost no discount, i.e. γ ≈ 1, describe the optimal policy. You may define it formally or in words, e.g. “always go east,” but your answer
(Footnote 1: For i = 1, omit the left action; for i = N, omit the right action.)
After (1, S, 4, 1), we update Q(1, S) ← 0.5[4 + 1 · 2] + 0.5(2) = 4.
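This is the standard Q-learning sample update, Q(s, a) ← (1 − α) Q(s, a) + α [r + γ max_a′ Q(s′, a′)]. A small sketch reproducing the numbers above (α = 0.5, γ = 1, old Q(1, S) = 2, and max_a′ Q(1, a′) = 2 are read off the worked update):

def q_update(q_old, reward, max_q_next, alpha=0.5, gamma=1.0):
    # One Q-learning sample update: blend the old estimate with the new sample.
    sample = reward + gamma * max_q_next
    return (1 - alpha) * q_old + alpha * sample

# Transition (s=1, a=S, r=4, s'=1), with old Q(1, S) = 2 and max_a' Q(1, a') = 2.
print(q_update(q_old=2.0, reward=4.0, max_q_next=2.0))  # prints 4.0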
Circle true or false; skipping here is worth 1 point per question.
(g) (2 points) (True/False) Q-learning will only learn the optimal q-values if actions are eventually selected according to the optimal policy.
Answer: False. As long as the policy used explores all the states and actions (even a random policy will work), Q-learning will find the optimal q-values.
(h) (2 points) (True/False) In a deterministic MDP (i.e. one in which each state/action pair leads to a single deterministic next state), the Q-learning update with a learning rate of α = 1 will correctly learn the optimal q-values.
Answer: True. Remember that the learning rate is only there because we are trying to approximate an expectation over next states with a single sample. In a deterministic MDP where s′ is the single state that always follows when we take action a in state s, we have Q(s, a) = R(s, a, s′) + γ max_a′ Q(s′, a′), which is exactly the update we make when α = 1.
You are playing a simplified game of Wheel of Fortune. The objective is to correctly guess a three letter word. Let X, Y, and Z represent the first, second, and third letters of the word, respectively. There are only 8 possible words: X can take on the values ‘c’ or ‘l’, Y can be ‘a’ or ‘o’, and Z can be ‘b’ or ‘t’.
Before you guess the word, two of the three letters will be revealed to you. In the first round of the game, you choose one of X, Y or Z to be revealed. In the second round, you choose one of the remaining two letters to be revealed. In the third round, you guess the word. If you guess correctly, you win. The utility of winning is 1, while the utility of losing is 0.
You watch the game a lot and determine that the eight possible words occur with the probabilities shown on the right. Your goal is to act in such a way as to maximize your chances of winning (and thereby your expected utility).
(a) (3 points) What is the distribution P(Y, Z)? Your answer should be in the form of a table. Answer:
P(X=c, Y=a) = 0.2    P(X=c, Y=o) = 0.4    P(X=l, Y=a) = 0.2    P(X=l, Y=o) = 0.2
(b) (2 points) Are the second and third letters (Y and Z) independent? Show a specific computation that supports your claim.
Answer: No, since P(X=c) = 0.6 and P(Y=a) = 0.4, but P(X=c, Y=a) = 0.2, which is not P(X=c) P(Y=a) = 0.24. (Other counterexamples exist too.)
(c) (2 points) Are the second and third letters (Y and Z) independent if you know the value of the first letter (X)? Show a specific computation that supports your claim.
Answer: Yes. P(Y=a, Z=b | X=c) = P(X=c, Y=a, Z=b) / P(X=c) = 1/6. P(Y=a | X=c) = (0.1 + 0.1)/0.6 = 1/3 and P(Z=b | X=c) = (0.1 + 0.2)/0.6 = 1/2. Thus, P(Y=a, Z=b | X=c) = 1/6 = P(Y=a | X=c) P(Z=b | X=c). To be certain, you would also have to check all value pairs (not required for full credit). Alternatively, you can show that P(Y | X, Z) = P(Y | X).
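As a sketch of the conditional-independence check in (c): the code below uses only the X=c slice of the joint, with the four entries implied by the computations in the answer (P(c,a,b)=0.1, P(c,a,t)=0.1, P(c,o,b)=0.2, P(c,o,t)=0.2); treat those numbers as an assumption for illustration.

# X=c slice of the joint P(X, Y, Z), with values implied by the answer above
# (assumed for illustration; the exam's full probability table is not shown).
joint_c = {
    ("a", "b"): 0.1,   # P(X=c, Y=a, Z=b)
    ("a", "t"): 0.1,   # P(X=c, Y=a, Z=t)
    ("o", "b"): 0.2,   # P(X=c, Y=o, Z=b)
    ("o", "t"): 0.2,   # P(X=c, Y=o, Z=t)
}

p_x_c = sum(joint_c.values())                              # P(X=c) = 0.6
p_yz = {yz: p / p_x_c for yz, p in joint_c.items()}        # P(Y, Z | X=c)
p_y = {y: sum(p for (yy, _), p in p_yz.items() if yy == y) for y in ("a", "o")}
p_z = {z: sum(p for (_, zz), p in p_yz.items() if zz == z) for z in ("b", "t")}

# Conditional independence given X=c: P(Y, Z | X=c) = P(Y | X=c) * P(Z | X=c).
for (y, z), p in p_yz.items():
    assert abs(p - p_y[y] * p_z[z]) < 1e-9
print("Y and Z are conditionally independent given X=c for these numbers.")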