artificial intelligence
Module 4: Adversarial search
Executive Summary
- Topics:
- Search with adversaries (e.g. games)
- Minimax search
- Alpha-beta pruning
- Handling large and more challenging games
- Length: This module will take two weeks to complete
- Assigned chapters: Chapter 5
- Project: Project 3 (will stay open one week beyond this module)
Adversarial search
Many games can be fully solved through search techniques but the problem formulation and the search approaches that we have studied so far cannot handle searching in a strategic situation with multiple agents. In this module, we will discuss adversarial search techniques, starting from simple games that can be fully searched and moving to more advanced games where we must use approximations to the full search.
Hybrid machine learning and search approaches have been in the news lately especially with the AlphaZero series of games (see link and photo to the right). Although we will discuss the foundational search approaches for these methods, we will not cover the machine learning methods until a later module.
Real World search
Photo from this DeepMind blog post about AlphaZero
reading and videos for module 4
Topic 1: Optimal search for games
Before we can dive into search methods for games, we need to look at how we formulate search for an adversarial situation as it is different than what we have been doing so far. We will then look at MiniMax search, which can give us an approach to playing games optimally, in many situations.
- (30 min) Reading
- Read Sections 5.1-5.2.2 of Chapter 5 (Game Theory through Optimal Decisions in Games but don’t read about alpha-beta pruning)
- (30 min) Adversarial search
- First watch the video on problem formulation for games.
- Copy of the slides
- Then watch the video on Minimax for 2 players
- Copy of the slides
- If minimax is confusing still, wikipedia has some additional examples that may be helpful!
- All students: Complete the exercise on minimax and problem formulation
- CS 5013 students only: Complete the graduate assignment on minimax
- First watch the video on problem formulation for games.
Topic 2: alpha-beta pruning
MiniMax will give the optimal strategy but it cannot handle large search spaces. We will discuss multiple ways to handle larger (and more difficult/interesting) games. In this topic, we start with alpha-beta pruning which is a strategy to prune a MiniMax tree while still guaranteeing optimality.
Remember, if you get stuck, please chat in slack and ask your fellow students plus the TA and Dr McGovern for help! Go to #general for help in general!
- (30 min) Reading
- Read Section 5.2.3-5.3 of Chapter 5 (alpha-beta pruning through Heuristic alpha-beta tree search)
- (60 min) Pruning
- Watch the video on Alpha-Beta pruning
- Copy of the slides
- If you are confused or need extra examples, here is an additional example of alpha-beta pruning step-by-step
- Complete the exercise on alpha-beta pruning
- Watch the video on other approaches to searching in larger spaces. This includes a discussion of evaluation functions, which can replace utility functions if the search is terminated before the game ends.
- Copy of the slides
- Watch the video on Alpha-Beta pruning
Topic 3: monte-carlo tree search
Although Alpha-Beta pruning is efficient and guaranteed to still give an optimal answer, it is not sufficient to enable search in very large games such as Go. Here, we examine the approach used by Alpha Go called Monte Carlo Tree Search. Although I draw on material from the book in this section, I also reference a paper (linked below) which gives an excellent overview of MCTS.
- (30 min) Reading
- Read Section 5.4 of Chapter 5 (Monte Carlo Tree Search)
- Optional: If you are eagerly wanting to learn more about MCTS, I highly recommend this overview paper. It is a free PDF download.
- (60 min) MCTS
- Watch the video on MCTS
- Copy of the slides
- Complete the exercise on MCTS
- What is the game solution that has been in the news lately that has excited you the most? Post on #random and share!
Topic 4: stochastic games and partial observability
In our final topic for this module, we discuss a method called rollouts that is related to MCTS (it is based on Monte Carlo sampling). It is not discussed much in the book but it is simple to implement and is very useful for large-scale games. We end with a discussion of a theoretical method that works well in tiny games with chance but is of limited practical use.
projects for module 4
suggested schedule for module 4
week 1 (Sep 18 – Sep 25)
- Complete Topic 1 by Tuesday
- Work on Project 2 throughout the week
- Complete Topic 2 by Friday
- Finish Project 2 by Sunday
- Don’t forget bug reports to the #bug-reports channel! First to find & post verified bugs in the module gets extra credit!
Week 2 (Sep 25 – Oct 2)
- Complete Topic 3 by Wednesday
- Begin Project 3
- Complete Topic 4 by Friday
Week 3 (Oct 2 – Oct 9)
- Complete Project 3