Consider the tic-tac-toe board state shown in Figure 1. Draw the full
minimax search tree starting from this state and ending in terminal
nodes. Show the utility value for each terminal and non-terminal node.
Also show which move the Minimax algorithm decides to play.
Utility values are +1 if X wins, 0 for a tie, and -1 if O wins. Assume
that X makes the next move (X is the MAX player).
Problem 2
Max: [4308: 13 Points, 5360: 10 Points]
Figure 2. A game search tree.
a. (4308: 20 points, 5360: 15 points)
In the game search tree of Figure 2, indicate what nodes will be pruned
using alpha-beta search, and what the estimated utility values are for
the rest of the nodes. Assume that, when given a choice, alpha-beta
search expands nodes in a left-to-right order. Also, assume the MAX
player plays first. Finally, indicate which action the Minimax algorithm
will pick to execute.
b. (4308: 5 points, 5360: 5 points) This question is also on the game
search tree of Figure 2. Suppose we are given some additional knowledge about the
game: the maximum utility value is 10, i.e., it is not mathematically
possible for the MAX player to get an outcome greater than 10. How can
this knowledge be used to further improve the efficiency of alpha-beta
search? Indicate the nodes that will be pruned using this improvement.
Again, assume that, when given a choice, alpha-beta search expands
nodes in a left-to-right order, and that the MAX player plays first.
Problem 3
Max: [4308: 10 Points, 5360: 8 Points]
Suppose that you want to implement an algorithm that will compete on a
two-player deterministic game of perfect information. Your opponent is
a supercomputer called DeepGreen. DeepGreen does not use Minimax. You
are given a library function DeepGreenMove(S), that takes any state S
as an argument, and returns the move that DeepGreen will choose for
that state S (more precisely, DeepGreenMove(S) returns the state
resulting from the opponent's move).
Write an algorithm in pseudocode (following the style of the Minimax
pseudocode) that will always make an optimal decision given the
knowledge we have about DeepGreen. You are free to use the library
function DeepGreenMove(S) in your pseudocode. What advantage would this
algorithm have over Minimax? (If none, justify.)
Problem 4
Max: [4308: 12 Points, 5360: 10 Points]
Figure 3: An Expectiminimax tree.
Find the value of every non-terminal node in the expectiminimax tree given
above. Also indicate which action will be performed by the algorithm.
Problem 5 (Extra Credit for 4308, Required for 5360)
Max: [4308: 10 Points EC, 5360: 10 Points]
Figure 4: Yet another game search tree
Consider the MINIMAX tree above. Suppose that we are the MAX player, and we
follow the MINIMAX algorithm to play a full game against an opponent.
However, we do not know what algorithm the opponent uses.
Under these conditions, what is the best possible outcome of playing the full
game for the MAX player? What is the worst possible outcome for the MAX
player? Justify your answer.
NOTE: the question is not asking you what MINIMAX will compute for the
start node. It is asking you what the best and worst outcomes of a complete
game are under the assumptions stated above.
Problem 6
Max: [4308: 50 Points, 5360: 50 Points]
The task in this programming part is to implement an agent that
plays the Max-Connect4 game using search. Figure 5 shows the first few
moves of a game. The game is played on a 6x7 grid, with six rows and
seven columns. There are two players, player A (red) and player B
(green). The two players take turns placing pieces on the board: the
red player can only place red pieces, and the green player can only
place green pieces.
It is best to think of the board as standing upright. We will assign a
number to every row and column, as follows: columns are numbered from
left to right, with numbers 1, 2, ..., 7. Rows are numbered from bottom
to top, with numbers 1, 2, ..., 6. When a player makes a move, the move
is completely determined by specifying the COLUMN where the piece will
be placed. If all six positions in that column are occupied, then the
move is invalid, and the program should reject it and force the player
to make a valid move. In a valid move, once the column is specified,
the piece is placed on that column and "falls down", until it reaches
the lowest unoccupied position in that column.
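The move rule above can be sketched as follows. This is only an illustration, assuming a Python implementation; the board layout (a list of rows with index 0 as the bottom row) and the names new_board, is_valid_move, and drop_piece are hypothetical and are not part of the provided sample code.

```python
ROWS, COLS = 6, 7
EMPTY = 0  # assumed encoding: 0 = empty, 1 and 2 = the two players


def new_board():
    """Return an empty board; board[r][c], where r = 0 is the bottom row."""
    return [[EMPTY] * COLS for _ in range(ROWS)]


def is_valid_move(board, column):
    """A move is valid if the 1-based column exists and is not full."""
    return 1 <= column <= COLS and board[ROWS - 1][column - 1] == EMPTY


def drop_piece(board, column, player):
    """Place the player's piece in the lowest empty row of the column.

    Columns are numbered 1..7 as in the assignment text. Returns the
    1-based row where the piece landed; raises ValueError if invalid.
    """
    if not is_valid_move(board, column):
        raise ValueError(f"invalid move: column {column}")
    for r in range(ROWS):
        if board[r][column - 1] == EMPTY:
            board[r][column - 1] = player
            return r + 1
```

A caller in interactive mode could keep asking for a column until is_valid_move returns True, and only then call drop_piece.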
The game is over when all positions are occupied. Obviously, every
complete game consists of 42 moves, and each player makes 21 moves. The
score at the end of the game is determined as follows: consider each
quadruple of four consecutive positions on the board, either in the
horizontal, vertical, or each of the two diagonal directions (from
bottom left to top right and from bottom right to top left). The red
player gets a point for each such quadruple where all four positions
are occupied by red pieces. Similarly, the green player gets a point
for each such quadruple where all four positions are occupied by green
pieces. The player with the most points wins the game.
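As a rough illustration of the scoring rule above, the sketch below checks every quadruple once by starting from each cell and stepping in four fixed directions. It assumes a Python implementation and a board stored as a list of rows with 0 for empty and 1 or 2 for the players; the name count_score is hypothetical, and this is a straightforward version rather than the fastest possible one.

```python
ROWS, COLS = 6, 7

# (row step, column step): horizontal, vertical, and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def count_score(board, player):
    """Count quadruples of four consecutive cells all held by player."""
    score = 0
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in DIRECTIONS:
                end_r, end_c = r + 3 * dr, c + 3 * dc
                # Skip quadruples that would run off the board.
                if not (0 <= end_r < ROWS and 0 <= end_c < COLS):
                    continue
                if all(board[r + i * dr][c + i * dc] == player
                       for i in range(4)):
                    score += 1
    return score
```

Note that five pieces in a row contain two overlapping quadruples, so they are worth two points under this rule.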
Your program will run in two modes: an interactive mode, that is best
suited for the program playing against a human player, and a one-move
mode, where the program reads the current state of the game from an
input file, makes a single move, and writes the resulting state to an
output file. The one-move mode can be used to make programs play
against each other. Note that THE PROGRAM MAY BE EITHER THE RED OR THE
GREEN PLAYER, THAT WILL BE SPECIFIED BY THE STATE, AS SAVED IN THE
INPUT FILE.
As part of this assignment, you will also need to measure and report
the time that your program takes, as a function of the number of moves
it explores. All time measurements should report CPU time, not total
time elapsed. CPU time does not depend on other users of the system,
and thus is a meaningful measurement of the efficiency of the
implementation.
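For programs written in Python, CPU time can also be read from inside the program. The sketch below is one possible way to do this, using the standard library function time.process_time, which reports the sum of user and system CPU time of the current process (so it may differ slightly from a user-only figure). The helper name cpu_time_of is hypothetical.

```python
import time


def cpu_time_of(func, *args, **kwargs):
    """Run func and return (result, CPU seconds used by this process)."""
    start = time.process_time()
    result = func(*args, **kwargs)
    elapsed = time.process_time() - start
    return result, elapsed
```

For example, wrapping the (hypothetical) move-selection function of your agent in cpu_time_of gives the CPU seconds spent choosing one move.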
Figure 5: Sample Max-Connect Game (15 moves in)
Interactive Mode
In the interactive mode, the game should run from the command line with
the following arguments (assuming a Java implementation, with obvious
changes for C++ or other implementations):
Argument interactive specifies that the program runs in interactive mode.
Argument [input_file] specifies an input file that contains an initial
board state. This way we can start the program from a non-empty board
state. If the input file does not exist, the program should just create
an empty board state and start from there.
Argument [computer-first/human-first] specifies whether the computer
should make the next move or the human.
Argument [depth] specifies the number of moves in advance that the
computer should consider while searching for its next move. In other
words, this argument specifies the depth of the search tree.
Essentially, this argument will control the time it takes for the computer
to make a move.
After reading the input file, the program gets into the following loop:
1. If computer-next, goto 2, else goto 5.
2. Print the current board state and score. If the board is full, exit.
3. Choose and make the next move.
4. Save the current board state in a file called computer.txt (in same
format as input file).
5. Print the current board state and score. If the board is full, exit.
6. Ask the human user to make a move (make sure that the move is valid,
otherwise repeat request to the user).
7. Save the current board state in a file called human.txt (in same
format as input file).
8. Goto 2.
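The loop above can be written without gotos. The following is only a structural sketch, assuming a Python implementation; every callable passed in (show, is_full, computer_move, human_move, save) is a hypothetical placeholder that you would implement yourself.

```python
def interactive_loop(board, computer_next, *, show, is_full,
                     computer_move, human_move, save):
    """Alternate computer and human moves until the board is full."""
    while True:
        if computer_next:
            show(board)                   # step 2: print state and score
            if is_full(board):
                return
            computer_move(board)          # step 3: choose and make a move
            save(board, "computer.txt")   # step 4
        show(board)                       # step 5: print state and score
        if is_full(board):
            return
        human_move(board)                 # step 6: ask until move is valid
        save(board, "human.txt")          # step 7
        computer_next = True              # step 8: go back to step 2
```

The computer_next flag only matters on the first pass; after that the two players simply alternate.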
One-Move Mode
The purpose of the one-move mode is to make it easy for programs to
compete against each other, and communicate their moves to each other
using text files. The one-move mode is invoked as follows:
In this case, the program simply makes a single move and terminates. In
particular, the program should:
1. Read the input file and initialize the board state and current score,
as in interactive mode.
2. Print the current board state and score. If the board is full, exit.
3. Choose and make the next move.
4. Print the current board state and score.
5. Save the current board state to the output file IN EXACTLY THE SAME
FORMAT THAT IS USED FOR INPUT FILES.
6. Exit.
Sample code
The sample code needs an input file to run. Sample input files that you
can download are input1.txt and input2.txt. You are free to make other
input files to experiment with, as long as they follow the same format.
In the input files, a 0 stands for an empty spot, a 1 stands for a
piece played by the first player, and a 2 stands for a piece played by
the second player. The last number in the input file indicates which
player plays NEXT (and NOT which player played last). Sample code is
available in:
The sample code implements a system playing max-connect4 (in one-move
mode only) by making random moves. While the AI part of the sample code
leaves much to be desired (your assignment is to fix that), the code
can get you started by showing you how to represent and generate board
states, how to save/load the game state to and from files in the
desired format, and how to count the score (though faster
score-counting methods are possible).
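As an illustration of reading and writing game states, the sketch below parses the kind of input described above. The exact layout is an assumption, not something this text specifies: it assumes six lines of seven digits each, listed from the top row down, followed by one line holding the number of the player who moves next. Check the sample input files and adjust accordingly. The names parse_state and format_state are hypothetical.

```python
ROWS, COLS = 6, 7


def parse_state(text):
    """Parse an input file's text into (board, next_player).

    Assumed layout: 6 lines of 7 digits (top row first), then one line
    with the next player (1 or 2). board[0] is the bottom row.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != ROWS + 1:
        raise ValueError("expected 6 board lines and 1 player line")
    board = []
    for line in lines[:ROWS]:
        if len(line) != COLS or any(ch not in "012" for ch in line):
            raise ValueError(f"bad board line: {line!r}")
        board.append([int(ch) for ch in line])
    next_player = int(lines[ROWS])
    if next_player not in (1, 2):
        raise ValueError("next player must be 1 or 2")
    board.reverse()  # the file lists the top row first (assumed)
    return board, next_player


def format_state(board, next_player):
    """Inverse of parse_state: produce text in the same assumed layout."""
    lines = ["".join(str(cell) for cell in row) for row in reversed(board)]
    lines.append(str(next_player))
    return "\n".join(lines) + "\n"
```

Writing the output with the inverse of the parser is one simple way to keep the output file in exactly the same format as the input file.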
Measuring Execution Time
You can measure the execution time for your program on omega by
inserting the word "time" in the beginning of your command line. For
example, if you want to measure how much time it takes for your system
to make one move with the depth parameter set to 10, try this:
time java maxconnect4 one-move red_next.txt green_next.txt 10
Your output will look something like:
real 0m0.003s
user 0m0.002s
sys 0m0.001s
Out of the above three lines, the user line is what you should report.
Grading
The programming section will be graded out of 50 points.
20 points: Implementing plain minimax.
12 points: Implementing alpha-beta pruning (if correctly
implemented, you will also get the points for plain minimax; you don't
need to have a separate implementation for it).
10 points: Implementing the depth-limited version of minimax (if
correctly implemented, and it includes alpha-beta pruning, you also get
the points for plain minimax and the points for alpha-beta search; you
don't need to have separate implementations for those). For full
credit, you obviously need to come up with a reasonable evaluation
function to be used in the context of depth-limited search. A
"reasonable" evaluation function is defined to be an evaluation
function that allows your program to consistently beat a player who
just plays randomly.
3 points: Include a file, eval_explanation.txt (can also use .pdf),
that explains the evaluation function used for depth-limited search.
5 points: Include in your submission an accurate table
of depth limit vs CPU runtime (for making a single move using one-move
mode) when the board is empty. Document the number of measurements for
each entry in the table.
All measurements should be performed on omega. Your table should
include every single depth, until (and including) the first depth for
which the time exceeds one minute.
How to submit
For Written part:
The answers can be typed as a document or handwritten and scanned.
Name files as assignment2_written_<net-id>.<format>
Accepted document format: .pdf. If you are using Word, OpenOffice or
LibreOffice, make sure to save as .pdf.
Please do not submit .txt files.
If you are scanning handwritten documents, make sure to scan them at a
minimum of 600dpi and save as a .pdf or .png file. Do not
insert images into a Word document and submit that.
If there are multiple files in your submission, zip them
together as assignment2_written_<net-id>.zip.
For Programming part: Implementations in C, C++, Java, and Python will
be accepted. Points will be taken off for failure to comply
with this requirement.
Create a ZIPPED directory called assignment2_code_<net-id>.zip (no other
forms of compression accepted; contact the instructor or TA if you do
not know how to produce .zip files). The directory should contain source
code. The folder should also contain a file called readme.txt, which
should specify precisely:
Name and UTA ID of the student.
What programming language is used.
How the code is structured.
How to run the code, including very specific compilation instructions,
if compilation is needed. Instructions such as "compile using g++" are
NOT considered specific.
Insufficient or unclear instructions will be penalized by up to 10 points.
Code that does not run on omega machines gets AT MOST 30 points.
The assignment should be submitted via Blackboard.
Zip all the files for both the programming and written files together
into assignment2_<net-id>.zip and submit it.
Submission checklist
Is the code running on omega?
Does the submission include a readme.txt file, as specified?
Have you scanned all the documents for the written section, as specified?