Merged PR 365: Run SB3 training in SubprocVecEnv
## Summary
Demonstration notebook showing how to use SB3 `SubprocVecEnv` to vectorise environments to speed up training.

## Test process
Ran the notebook successfully on Windows and Linux.

## Checklist
- [X] PR is linked to a **work item**
- [X] **acceptance criteria** of linked ticket are met
- [X] performed **self-review** of the code
- [ ] written **tests** for any new functionality added with this PR
- [ ] updated the **documentation** if this PR changes or adds functionality
- [ ] written/updated **design docs** if this PR implements new functionality
- [X] updated the **change log**
- [X] ran **pre-commit** checks for code style
- [ ] attended to any **TO-DOs** left in the code

Related work items: #2442
@@ -24,6 +24,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Can be enabled via `primaite dev-mode enable`
- Activating dev-mode will change the location where the sessions will be output - by default will output where the PrimAITE repository is located
- Refactored all air-space usage so that a new instance of AirSpace is created for each instance of Network. This 1:1 relationship between network and airspace will allow parallelization.
- Added notebook to demonstrate use of SubprocVecEnv from SB3 to vectorise environments to speed up training.

## [Unreleased]
148
src/primaite/notebooks/multi-processing.ipynb
Normal file
@@ -0,0 +1,148 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Simple multi-processing demo using SubprocVecEnv from SB3\n",
    "Based on a code example provided by Rachael Proctor."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import packages and read config file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import yaml\n",
    "from stable_baselines3 import PPO\n",
    "from stable_baselines3.common.utils import set_random_seed\n",
    "from stable_baselines3.common.vec_env import SubprocVecEnv\n",
    "\n",
    "from primaite.session.environment import PrimaiteGymEnv\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from primaite.config.load import data_manipulation_config_path"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(data_manipulation_config_path(), 'r') as f:\n",
    "    cfg = yaml.safe_load(f)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Set up training data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "EPISODE_LEN = 128\n",
    "NUM_EPISODES = 10\n",
    "NO_STEPS = EPISODE_LEN * NUM_EPISODES\n",
    "BATCH_SIZE = 32\n",
    "LEARNING_RATE = 3e-4\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define an environment function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "def make_env(rank: int, seed: int = 0) -> callable:\n",
    "    \"\"\"Wrapper script for _init function.\"\"\"\n",
    "\n",
    "    def _init() -> PrimaiteGymEnv:\n",
    "        env = PrimaiteGymEnv(env_config=cfg)\n",
    "        env.reset(seed=seed + rank)\n",
    "        model = PPO(\n",
    "            \"MlpPolicy\",\n",
    "            env,\n",
    "            learning_rate=LEARNING_RATE,\n",
    "            n_steps=NO_STEPS,\n",
    "            batch_size=BATCH_SIZE,\n",
    "            verbose=0,\n",
    "            tensorboard_log=\"./PPO_UC2/\",\n",
    "        )\n",
    "        model.learn(total_timesteps=NO_STEPS)\n",
    "        return env\n",
    "\n",
    "    set_random_seed(seed)\n",
    "    return _init\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run experiment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "n_procs = 2\n",
    "train_env = SubprocVecEnv([make_env(i + n_procs) for i in range(n_procs)])\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
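
In the notebook, each worker's `_init` builds its environment and also trains its own PPO model, and `train_env` is never passed to a model. A more conventional SB3 arrangement, sketched here under assumptions, keeps `_init` limited to building and seeding the environment and trains a single model on the `SubprocVecEnv`. `CartPole-v1` is only a stand-in for `PrimaiteGymEnv(env_config=cfg)`, and `worker_seeds`, `rollout_size`, and `train_vectorised` are hypothetical helper names that do not appear in this PR. The two small helpers make the arithmetic explicit: the notebook's `make_env(i + n_procs)` call with `n_procs = 2` yields reset seeds 2 and 3, and because PPO's `n_steps` is counted per environment, one update with two environments collects `n_steps * 2` transitions.

```python
from typing import Callable, List


def worker_seeds(n_procs: int, base_seed: int = 0, rank_offset: int = 0) -> List[int]:
    """Seeds handed to env.reset() under the notebook's `seed + rank` scheme."""
    return [base_seed + rank_offset + i for i in range(n_procs)]


def rollout_size(n_steps: int, n_envs: int) -> int:
    """Transitions collected per PPO update; n_steps is per environment in SB3."""
    return n_steps * n_envs


def train_vectorised(n_procs: int = 2, n_steps: int = 128, batch_size: int = 32) -> None:
    """Train one PPO model on n_procs environments stepped in parallel."""
    # Imported here so the helpers above can be used without SB3 installed.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.vec_env import SubprocVecEnv

    def make_env(rank: int, seed: int = 0) -> Callable[[], gym.Env]:
        def _init() -> gym.Env:
            # Assumption: CartPole-v1 stands in for PrimaiteGymEnv(env_config=cfg).
            env = gym.make("CartPole-v1")
            env.reset(seed=seed + rank)
            return env  # build and seed only; no training inside the worker

        return _init

    train_env = SubprocVecEnv([make_env(i) for i in range(n_procs)])
    try:
        model = PPO("MlpPolicy", train_env, n_steps=n_steps, batch_size=batch_size, verbose=0)
        # One full rollout across all workers, then a single policy update.
        model.learn(total_timesteps=rollout_size(n_steps, n_procs))
    finally:
        train_env.close()  # shut the worker processes down
```

When this is run as a script rather than in a notebook, `train_vectorised()` should be called under an `if __name__ == "__main__":` guard, since `SubprocVecEnv` starts its workers with `forkserver` or `spawn` by default. Keeping `batch_size` a divisor of `rollout_size(n_steps, n_procs)` avoids a truncated final mini-batch.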