Project 5: learning project
This is the final project in spacesettlers and it will focus on machine learning. This project is estimated to be 15-20 hours of coding but will take additional time to actually produce the learning curves. Note, as with all of the projects, graduate students have additional requirements.
By the end of this project, you will have accomplished the following objectives.
- Used machine learning effectively inside your spacesettlers agent
- Reused your code from the previous projects as needed (primarily to navigate around the environment)
One quick note
Since I release the project page well in advance of the project being assigned, there is one thing I’ll have to change on the page when the project is assigned! At the end of Project 4, we will do a class vote on if project 5 will be in the capture the flag environment or the regular (projects 1-3) spacesettlers environment. I will update this page at that time!
Update: you all voted overwhelmingly to go back to regular spacesettlers so that is what we will do!
Project 5 task
This project will focus on learning, specifically methods from modules 6-9. This is why we waited until you started module 8 to assign it – I wanted you to be partway through the learning modules (and through the ones I think you are most likely to use).
- Your job is to create a spacesettlers agent that uses machine learning to control some aspect of its behavior. You can choose a task and an appropriate method from modules 6-9, or genetic algorithms/evolutionary computation (covered in module 3: even though it is learning, it is also a form of intelligent search, so we put it in that module since that’s where the book had it).
- I realize this is VERY open-ended and some of you may not be comfortable with that. Please see some suggestions below!!
Quick learning ideas to get you started
- Instead of using a reflex agent to choose your actions (e.g. BeaconCollector or HeuristicAsteroidCollector), use learning!
- You could build a decision tree to predict the success of different actions available to your agent and then choose the action with the highest predicted success.
- You could use genetic algorithms (GAs, module 3) to adjust the thresholds for each of the action choices (e.g. make each action choice a variable and base the fitness on the score over a number of time steps; over time you will learn good thresholds).
- Use genetic algorithms or reinforcement learning (module 9) for movement:
- Navigate successfully to a goal (e.g. reward for achieving the goal, penalize for bumping into things). If you choose this, be very careful to set up a state space that is generalizable and does not use x,y coordinates.
- Train an agent to successfully point and shoot at another ship
- Note: RL is easiest if you have a discrete state space. If you choose to do RL, you should propose an approach to discretizing the state space in your proposal. This approach can be very simple (gridding the variables you need for the task you chose) or it could rely on other methods such as clustering.
- Use regression to predict your fuel usage which could then be fed into A* as your heuristic (I’m not honestly sure this will do better than just using the straight line distance but it is worth a shot if you really want to use regression).
- Use genetic algorithms or reinforcement learning to choose your multi-agent strategy (e.g. give it a high-level state space and actions and it learns to choose appropriate ones for each agent)
- Build a decision tree to predict the success of firing at an enemy ship – you can track the bullet for its full lifetime and see if it succeeded in getMovementEnd by examining the ship hits or kills. Then choose your probability of firing based on success. Do NOT do this if you play in cooperative!
- The ideas are endless! You do not have to choose from this list, it is just to help you get thinking.
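To make the gridding suggestion for RL concrete, here is a minimal sketch of one way to discretize a state space without using x,y coordinates. It assumes the state is the distance and angle from your ship to its goal. The class name GridStateDiscretizer, its methods, and the bin counts are all hypothetical and are not part of spacesettlers, so treat this as a starting point for your proposal rather than the required approach.

```java
/**
 * Hypothetical helper (NOT part of spacesettlers): grids two continuous
 * features into a single discrete state index for tabular RL.
 */
final class GridStateDiscretizer {
    private final double maxDistance;
    private final int distanceBins;
    private final int angleBins;

    GridStateDiscretizer(double maxDistance, int distanceBins, int angleBins) {
        this.maxDistance = maxDistance;
        this.distanceBins = distanceBins;
        this.angleBins = angleBins;
    }

    /** Bin the distance to the goal; anything at or beyond maxDistance lands in the last bin. */
    int distanceBin(double distance) {
        double clamped = Math.max(0.0, Math.min(distance, maxDistance));
        int bin = (int) (clamped / maxDistance * distanceBins);
        return Math.min(bin, distanceBins - 1);
    }

    /** Bin the angle to the goal (radians, any range) after wrapping it into [0, 2*pi). */
    int angleBin(double angleRadians) {
        double twoPi = 2.0 * Math.PI;
        double wrapped = ((angleRadians % twoPi) + twoPi) % twoPi;
        int bin = (int) (wrapped / twoPi * angleBins);
        return Math.min(bin, angleBins - 1);
    }

    /** Combine both bins into one index in [0, numStates()). */
    int stateIndex(double distance, double angleRadians) {
        return distanceBin(distance) * angleBins + angleBin(angleRadians);
    }

    int numStates() {
        return distanceBins * angleBins;
    }
}
```

A Q-table for this state space could then be a plain array with numStates() rows and one column per action. Because the features are relative to the goal, what the agent learns in one part of the map carries over to the rest.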
Things not to do
- You cannot choose to implement clustering or KNN as your single algorithm. Both are extremely simple algorithms and if you want to use them, you will need to combine them with other methods (e.g. cluster to discretize a space then feed that space into a decision tree or something else).
- You cannot choose neural networks or deep learning. They are simply much too complicated and well beyond the scope of the class.
- To keep you from going astray on something that likely will not work, you have two due dates: the first is to propose the learning method(s) and task(s) and the second is the regular project deadline.
- First due date is Nov 6 11:59pm: you MUST propose your project idea by this date.
- Second due date is Dec 2 11:59pm (new due date; originally Nov 20): regular project deadline. This one you turn in the regular way (canvas & submit as outlined below)
- Note that CS 5013 students must propose and implement two learning methods.
How do I … FAQ and other helpful videos
I made a FAQ and filmed some videos (and have some notes!) to help make your project more successful! Please watch before you go implement!
- How do I choose a task and learning method to match the task? How do I know the task will be successful?
- Short answer: Try to pick a task where you can’t already solve it by search (e.g. A* already does optimal search so why replace that?). Pick a task where learning can find a strategy that you can’t simply hard-code.
- I talk a bit about this in last year’s video below but I’ve added a new one this year also.
- Ok, I picked a task and I need to figure out how to implement learning. How do I get started? How do I save data to an external file so I can be more successful at learning?
- In order to learn anything, you will need to collect a LOT of data. Use the initialize() function to open a file handle, save whatever data you need for learning during the agent’s lifetime, and then use the shutdown() function to close the file handle. There is a video on how to do this below!
- I shot this video last year as part of a sequence of videos but it is very applicable here and talks about how to save the files and gives you examples in the code.
- How do I run my agent enough times to collect data? It is SOOOOOO slow to run with graphics and I need thousands of examples!
- Fear not! You can use the ladder!
- Now that I saved data, how do I write my learning agent?
- You can implement the learning offline, outside of the spacesettlers system, so long as you can read your model back into your agent and use it.
- Once I have trained my agent, how do I use the learning agent inside my spacesettlers agent? This is both technical (how do I load it back in?) and philosophical (how do I use it?).
- Can I have multiple ships?
- While not required, you are allowed to have multiple ships this project if you want!
- I’m overwhelmed – can I have an example using a decision tree?
- Yes! I shot this video sequence last year but it very much still applies! Ignore the parts about “I’ve already approved your proposals” because this was for last year’s class. But if you want to think about how to do learning, this is a good start.
- I want to do trees but I have no idea how to do a real-valued tree. Help!
- This is again part of last year’s video sequence but it should help a lot with real-valued trees! Note there are two videos to watch for this sequence, first choosing the best attribute and second an example.
- Feel free to ask more FAQs in slack. I will reply there and update this document and even add extra videos if they are needed.
- The extra credit ladders remain the same as with all previous projects. You are welcome to choose a different ladder path than you chose for either of the previous projects. The class-wide ladders will start on Nov 7, 2022.
- The extra credit opportunities for being creative and finding bugs remain the same as in Project 1. Remember you have to document it in your writeup to get the extra credit!
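For the FAQ answer above about saving data to an external file, here is a minimal sketch of a logger you could open in initialize() and close in shutdown(). The class TrainingDataLogger, its column names, and the features it records are hypothetical examples, not part of spacesettlers, and the exact signatures of initialize() and shutdown() are deliberately not shown. Adapt the columns to whatever your learning method needs.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * Hypothetical helper (NOT part of spacesettlers): appends one CSV row per
 * training example so data accumulates across many games.
 */
final class TrainingDataLogger implements AutoCloseable {
    private final BufferedWriter writer;

    /** Open (or create) the file in append mode; write the header only once. */
    TrainingDataLogger(Path file) throws IOException {
        boolean isNew = !Files.exists(file) || Files.size(file) == 0;
        writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        if (isNew) {
            writer.write("distance,energy,succeeded");
            writer.newLine();
        }
    }

    /** Record one example: two illustrative features plus the observed outcome. */
    void record(double distance, double energy, boolean succeeded) throws IOException {
        writer.write(distance + "," + energy + "," + (succeeded ? 1 : 0));
        writer.newLine();
    }

    /** Flush and close the file handle. */
    @Override
    public void close() throws IOException {
        writer.close();
    }
}
```

In your agent, you would create the logger in initialize(), call record(...) whenever the outcome of an action is known, and call close() in shutdown(). Opening in append mode means repeated ladder runs keep adding rows to the same file.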
Part 1: Due Nov 6 11:59 PM
Turn in a ONE paragraph project proposal on canvas here. This proposal must fully specify your proposed method (or methods if you are a graduate student; you may have one paragraph per method). It should say what kind of data you will collect and how. For example, a supervised method needs to collect “right” answers and a semi-supervised method needs to collect fitness, so specify how you will collect the data and what your fitness function or correct answers will be.
Part 2: new due date: Dec 2 11:59 PM
- Update your code from the last project. You can update your code at the command line with “git pull”. If you did not get the code checked out for project 0, follow the instructions to check out the code in Project 0.
- Note: since the class voted to go back to regular spacesettlers, the directories for config files change BACK to the ones we used for projects 1-3 for this project.
- Write your learning code as described above
- Build and test your code using the ant compilation system within eclipse or using ant on the command line if you are not using eclipse (we highly recommend eclipse or another IDE!). Make sure you use the spacesettlers.graphics system to draw your graph on the screen as well as the path your ship chose using your search method. You can write your own graphics as well but the provided classes should enable you to draw the graph quickly.
- Submit your project on spacesettlers.cs.ou.edu using the submit script as described below. You can submit as many times as you want and we will only grade the last submission.
- Submit ONLY the writeup to the correct Project 5 on canvas: Project 5 for CS 4013 and Project 5 for CS 5013
- Copy your code from your laptop to spacesettlers.cs.ou.edu using the account that was created for you for this class (your username is your 4x4 and the password that you chose in project 0). You can copy using scp or winscp or pscp.
- ssh into spacesettlers.cs.ou.edu
- Make sure your working directory contains all the files you want to turn in. All files should live in the package 4x4. Note: The spacesettlersinit.xml file is required to run your client!
- Submit your file using one of the following commands (be sure your java files come last). You can submit to only ONE ladder. If you submit to both, small green monsters will track you down and deal with you appropriately.
/home/spacewar/bin/submit --config_file spacesettlersinit.xml \
    --project project5_coop \
    --java_files *.java

/home/spacewar/bin/submit --config_file spacesettlersinit.xml \
    --project project5_compete \
    --java_files *.java
- After the project deadline, the above command will not accept submissions. If you want to turn in your project late, use:
/home/spacewar/bin/submit --config_file spacesettlersinit.xml \
    --project project5_coop_late \
    --java_files *.java

/home/spacewar/bin/submit --config_file spacesettlersinit.xml \
    --project project5_compete_late \
    --java_files *.java
Rubric – Part 1 Due Nov 6 11:59pm
- 10 points for project proposal
- 10 points for turning in a ONE paragraph project proposal on canvas here
- 0 points for not turning in a proposal (You WANT to turn in a proposal – you want to not pick a project that will not work!)
Rubric – Part 2 new due date Dec 2 11:59pm
- 40 points for correctly implementing the learning method that you proposed and got feedback on (if you were told to choose a different method, you need to implement the method you were told to adjust to). A correct learner uses learning in a way to improve performance and learning will be demonstrated in the writeup (though the curve is graded separately) using a learning curve. Learning code should be well documented to receive full credit.
- 35 points if there is only one minor mistake.
- 30 points if there are several minor mistakes or if documentation is missing.
- 25 points if you have one major mistake.
- 10 points if you accidentally implement a learning algorithm other than what you intended and it at least moves the ships around the environment in an intelligent manner.
- 10 points for correctly drawing graphics (or using printouts) that enable you to debug your learning and that help us to grade it.
- 7 points for drawing something useful for debugging and grading but with bugs in it
- 3 points for major graphical/printing bugs
- CS 5013 students only: You must implement a second learning method and document it in the writeup
- 20 points for correctly implementing a second learning method and documenting with a learning curve and paragraph describing it in the writeup
- 10 points if you implement it but do not give a second learning curve
- 5 points for bugs
- Good coding practices: We will randomly choose one of the following good coding practices to grade for these 10 points. Note that this will be included on every project. Are your files well commented? Are your variable names descriptive (or are they all i, j, and k)? Do you make good use of classes and methods or is the entire project in one big flat file? This will be graded as follows:
- 10 points for well commented code, descriptive variable names, or making good use of classes and methods
- 5 points if you have partially commented code, semi-descriptive variable names, or partial use of classes and methods
- 0 points if you have no comments in your code, variables are obscurely named, or all your code is in a single flat method
- Writeup: 30 points total. Your writeup is limited to 2 pages maximum. Any writeup over 2 pages will be automatically given a 0. Turn your writeup in to canvas and your code into spacesettlers.
- 20 points for collecting data and demonstrating learning using a learning curve (in the writeup). For full credit, make sure you explain why it is learning or not learning (if it isn’t learning, you will not lose your points if you can explain WHY it is not learning)
- 10 points for describing your learning method in a paragraph or two and explaining why you chose to demonstrate learning in the curve that you present (e.g. I graphed decision trees by the number of leaf nodes to show overfitting or I graphed regression by RMSE over iterations to show it lowered error over time.)
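If you go the regression route mentioned above, the RMSE behind each point on the learning curve is sqrt(sum((predicted - actual)^2) / n). Here is a minimal sketch of computing it and writing the points out for graphing. The class LearningCurvePoints and its methods are hypothetical helpers, not part of spacesettlers, and other learners would need a different measure on the y-axis.

```java
/**
 * Hypothetical helper (NOT part of spacesettlers): computes RMSE and formats
 * per-iteration values as CSV rows that a spreadsheet or plotting tool can graph.
 */
final class LearningCurvePoints {

    /** Root mean squared error between predictions and true values. */
    static double rmse(double[] predicted, double[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and the same length");
        }
        double sumSquaredError = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double error = predicted[i] - actual[i];
            sumSquaredError += error * error;
        }
        return Math.sqrt(sumSquaredError / predicted.length);
    }

    /** One CSV row per iteration, numbered from 1: "iteration,rmse". */
    static String toCsv(double[] rmsePerIteration) {
        StringBuilder csv = new StringBuilder("iteration,rmse\n");
        for (int i = 0; i < rmsePerIteration.length; i++) {
            csv.append(i + 1).append(',').append(rmsePerIteration[i]).append('\n');
        }
        return csv.toString();
    }
}
```

Computing the RMSE on held-out examples after each training iteration and graphing the resulting rows gives a curve that should trend downward if the learner is improving.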