Commit 35a7301 · committed by ledmands
1 Parent(s): ebcd7c3
Updated and formatted README
README.md
CHANGED
@@ -26,8 +26,8 @@ model-index:
|
|
26 |
|
27 |
# *Agent using DQN to play ALE/Pacman-v5*
|
28 |
|
29 |
-
|
30 |
-
|
31 |
|
32 |
This is an agent that is trained using Stable Baselines3 as part of the capstone project for South Hills School in Spring 2024. The goal of this project is to gain familiarity with reinforcement learning concepts and tools, and to train an agent to score up into the 400-500 point range in Pac-Man.
|
33 |
|
@@ -50,7 +50,7 @@ After cloning the repository to your local device, run:
|
|
50 |
```bash
|
51 |
pip install -r requirements.txt
|
52 |
```
|
53 |
-
|
54 |
|
55 |
---
|
56 |
|
@@ -153,42 +153,37 @@ python <script_name> --help
|
|
153 |
|
154 |
##### *watch_agent.py*
|
155 |
|
156 |
-
|
157 |
|
158 |
##### *evaluate_agent.py*
|
159 |
|
160 |
-
|
161 |
|
162 |
##### *get_config.py*
|
163 |
|
164 |
-
|
165 |
|
166 |
##### *plot_improvement.py*
|
167 |
|
168 |
-
|
169 |
|
170 |
##### *record_video.py*
|
171 |
|
172 |
-
|
173 |
|
174 |
##### *plot_evaluations.py*
|
175 |
|
176 |
-
|
177 |
|
178 |
---
|
179 |
|
180 |
## *External References*
|
181 |
|
182 |
-
- [Foundations of Deep RL -- 6-lecture series by Pieter Abbeel](https://www.youtube.com/playlist?list=PLwRJQ4m4UJjNymuBM9RdmB3Z9N5-0IlY0)
|
183 |
-
|
184 |
-
- [
|
185 |
-
|
186 |
-
- [
|
187 |
-
- Daniel Takeshi wrote an excellent post that helped me better understand some of the terminology around frame skipping.
|
188 |
-
- [Playing Atari with Deep Reinforcement Learning](https://arxiv.org/abs/1312.5602)
|
189 |
-
- This paper on Deep Q Networks is a landmark in the field of reinforcement learning.
|
190 |
-
- [Hugging Face Deep Reinforcement Learning Course](https://huggingface.co/learn/deep-rl-course/unit0/introduction)
|
191 |
-
- Another inspiration for this project and a great place to get hands-on experience.
|
192 |
- [Stable Baselines3](https://stable-baselines3.readthedocs.io/en/master/)
|
193 |
- [RL Zoo](https://rl-baselines3-zoo.readthedocs.io/en/master/)
|
194 |
- [Gymnasium](https://gymnasium.farama.org/)
|
@@ -197,4 +192,4 @@ python <script_name> --help
|
|
197 |
|
198 |
## *Contact*
|
199 |
|
200 |
-
Please feel free to contact me on [Twitter](https://x.com/ledmands) or [LinkedIn](https://linkedin.com/in/lucasedmands) or in
|
|
|
26 |
|
27 |
# *Agent using DQN to play ALE/Pacman-v5*
|
28 |
|
29 |
+
## Update 20 May 2024: Latest DQN model is version 2.8
|
30 |
+
***NOTE:** The video preview shows the best version 2.8 model playing for 10,000 steps. Evaluation metrics are self-reported, based on 10 evaluation episodes, and can be found in `agents/dqn_v2-8/evals.txt`.*
|
31 |
|
32 |
This is an agent that is trained using Stable Baselines3 as part of the capstone project for South Hills School in Spring 2024. The goal of this project is to gain familiarity with reinforcement learning concepts and tools, and to train an agent to score up into the 400-500 point range in Pac-Man.
|
33 |
|
|
|
50 |
```bash
|
51 |
pip install -r requirements.txt
|
52 |
```
|
53 |
+
***NOTE:** The `requirements.txt` file installs all of the extra dependencies for Stable Baselines3 along with the full TensorFlow package. This is for ease of use with Stable Baselines3 and to ensure that extra data points and tools are available in TensorBoard. If you would rather install dependencies as needed, you can skip the `requirements.txt` file and install packages via `pip` as desired.*
|
54 |
|
55 |
---
|
56 |
|
|
|
153 |
|
154 |
##### *watch_agent.py*
|
155 |
|
156 |
+
- This will render the specified agent in real time. It does not save any evaluation information.
|
157 |
|
158 |
##### *evaluate_agent.py*
|
159 |
|
160 |
+
- This will evaluate a specified agent and append the results to a specified log file.
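
A minimal sketch of the "append the results to a log file" step described above. The function name, arguments, and log line layout are hypothetical and are not taken from `evaluate_agent.py`; the per-episode rewards are assumed to come from an evaluation helper such as Stable Baselines3's `evaluate_policy`.

```python
import statistics
from datetime import datetime, timezone
from pathlib import Path


def append_eval_results(log_path, agent_name, episode_rewards):
    """Append one summary line for an evaluation run to a text log.

    Hypothetical helper: the line format is an assumption, not the
    format used by the repository's evals.txt files.
    """
    mean = statistics.mean(episode_rewards)
    # Population standard deviation; a single episode gives 0.
    std = statistics.pstdev(episode_rewards)
    timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    line = (
        f"{timestamp} agent={agent_name} episodes={len(episode_rewards)} "
        f"mean={mean:.2f} std={std:.2f}\n"
    )
    # Mode "a" creates the file if needed and never overwrites earlier runs.
    with Path(log_path).open("a", encoding="utf-8") as log_file:
        log_file.write(line)
    return mean, std
```

Opening the file in append mode keeps every earlier evaluation in the log, which matches the "append" behavior the description mentions.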
|
161 |
|
162 |
##### *get_config.py*
|
163 |
|
164 |
+
- This will pull configuration information from the specified agent and save it in JSON format. The script reads the data file inside the agent's zip file and strips out the serialized data to make the output more human-readable. By default, the file is saved to the directory from which the command is run. Best practice is to save the file to the agent's directory.
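
The idea can be sketched roughly as follows. This is not the code in `get_config.py`: the function names are hypothetical, and the sketch assumes the archive holds a JSON member named `data` in which serialized blobs sit under a `:serialized:` key, which is how Stable Baselines3 is understood to store saved models.

```python
import json
import zipfile


def strip_serialized(value):
    """Recursively drop ':serialized:' entries from nested dicts and lists."""
    if isinstance(value, dict):
        return {
            key: strip_serialized(item)
            for key, item in value.items()
            if key != ":serialized:"  # assumed key name for the binary blobs
        }
    if isinstance(value, list):
        return [strip_serialized(item) for item in value]
    return value


def extract_config(agent_zip_path, output_path):
    """Read the 'data' member of an agent archive and save a cleaned JSON copy."""
    with zipfile.ZipFile(agent_zip_path) as archive:
        # Assumption: the archive contains a JSON file named "data".
        raw = json.loads(archive.read("data").decode("utf-8"))
    config = strip_serialized(raw)
    with open(output_path, "w", encoding="utf-8") as out_file:
        json.dump(config, out_file, indent=2)
    return config
```

Passing an `output_path` inside the agent's own directory follows the best practice noted above.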
|
165 |
|
166 |
##### *plot_improvement.py*
|
167 |
|
168 |
+
- This plots the average score and standard deviation of the `dqn_v2` agent over all evaluation episodes in a training run as a bar graph, with one bar per training run. The lowest and highest episode scores are removed from each evaluation.
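
The trimming step described above could look something like this. It is a sketch under assumptions: the function name is hypothetical, and using the population standard deviation is a guess, since the description does not say which variant `plot_improvement.py` computes.

```python
import statistics


def trimmed_stats(episode_scores):
    """Return (mean, std) after dropping one lowest and one highest score."""
    if len(episode_scores) < 3:
        raise ValueError("need at least three scores to trim both ends")
    # Sorting puts the minimum first and the maximum last; slice both off.
    trimmed = sorted(episode_scores)[1:-1]
    # Assumption: population standard deviation of the remaining scores.
    return statistics.mean(trimmed), statistics.pstdev(trimmed)
```

The resulting mean and standard deviation for each training run could then be drawn as one bar with an error bar, for example through the `yerr` argument of Matplotlib's `bar`.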
|
169 |
|
170 |
##### *record_video.py*
|
171 |
|
172 |
+
- This will record a video of a specified agent being evaluated. It does not save any evaluation information. *Currently under heavy development and located in the development branch.*
|
173 |
|
174 |
##### *plot_evaluations.py*
|
175 |
|
176 |
+
- This will plot the evaluation data that was gathered during the training run of the specified agent using Matplotlib. Charts can be saved to a directory of the user's choosing. *Currently under heavy development and located in the development branch.*
|
177 |
|
178 |
---
|
179 |
|
180 |
## *External References*
|
181 |
|
182 |
+
- [Foundations of Deep RL -- 6-lecture series by Pieter Abbeel](https://www.youtube.com/playlist?list=PLwRJQ4m4UJjNymuBM9RdmB3Z9N5-0IlY0). *This is an excellent introduction to some of the concepts behind Deep RL Algorithms. Pieter Abbeel is a machine learning and robotics researcher at UC Berkeley.*
|
183 |
+
- [Training AI to Play Pokemon with Reinforcement Learning](https://www.youtube.com/watch?v=DcYLT37ImBY). *Peter Whidden's video of using Proximal Policy Optimization was a major inspiration for this project and has some fantastic visualizations of the agent learning.*
|
184 |
+
- [Frame Skipping and Pre-Processing for Deep Q-Networks on Atari 2600 Games](https://danieltakeshi.github.io/2016/11/25/frame-skipping-and-preprocessing-for-deep-q-networks-on-atari-2600-games/). *Daniel Takeshi wrote an excellent post that helped me better understand some of the terminology around frame skipping.*
|
185 |
+
- [Playing Atari with Deep Reinforcement Learning](https://arxiv.org/abs/1312.5602). *This paper on Deep Q Networks is a landmark in the field of reinforcement learning.*
|
186 |
+
- [Hugging Face Deep Reinforcement Learning Course](https://huggingface.co/learn/deep-rl-course/unit0/introduction). *Another inspiration for this project and a great place to get hands-on experience.*
|
187 |
- [Stable Baselines3](https://stable-baselines3.readthedocs.io/en/master/)
|
188 |
- [RL Zoo](https://rl-baselines3-zoo.readthedocs.io/en/master/)
|
189 |
- [Gymnasium](https://gymnasium.farama.org/)
|
|
|
192 |
|
193 |
## *Contact*
|
194 |
|
195 |
+
Please feel free to contact me on [Twitter](https://x.com/ledmands) or [LinkedIn](https://linkedin.com/in/lucasedmands) or in the Discussion section on the Community tab of this repository!
|