PrimAITE/src/primaite/notebooks/Using-Episode-Schedules.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using Episode Schedules\n",
    "\n",
    "PrimAITE supports the ability to use different variations on a scenario at different episodes. This can be used to increase \n",
    "domain randomisation to prevent overfitting, or to set up curriculum learning to train agents to perform more complicated tasks.\n",
    "\n",
    "When using a fixed scenario, a single yaml config file is used. However, to use episode schedules, PrimAITE uses a \n",
    "directory with several config files that work together."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Defining variations in the config file.\n",
    "\n",
    "### Base scenario\n",
    "The base scenario is essentially the same as a fixed yaml configuration, but it can contain placeholders that are \n",
    "populated with different things at runtime each episode. The base scenario contains any network, agent, or settings that\n",
    "remain fixed for the entire training/evaluation session.\n",
    "\n",
    "The placeholders are defined as YAML Aliases and they are denoted by an asterisk (`*placeholder`).\n",
    "\n",
    "### Variations\n",
    "For each variation that could be used in a placeholder, there is a separate yaml file that contains the data that should populate the placeholder.\n",
    "\n",
    "The data that fills the placeholder is defined as a YAML Anchor in a separate file, denoted by an ampersand (`&anchor`).\n",
    "\n",
    "[Learn more about YAML Aliases and Anchors here.](https://www.educative.io/blog/advanced-yaml-syntax-cheatsheet#:~:text=YAML%20Anchors%20and%20Alias)\n",
    "\n",
    "### Schedule\n",
    "Users must define which combination of scenario variations should be loaded in each episode. This takes the form of a\n",
    "YAML file with a relative path to the base scenario and a list of paths to be loaded in during each episode.\n",
    "\n",
    "It takes the following format:\n",
    "```yaml\n",
    "base_scenario: base.yaml\n",
    "schedule:\n",
    "  0: # list of variations to load in at episode 0 (before the first call to env.reset() happens)\n",
    "    - laydown_1.yaml\n",
    "    - attack_1.yaml\n",
    "  1: # list of variations to load in at episode 1 (after the first env.reset() call)\n",
    "    - laydown_2.yaml\n",
    "    - attack_2.yaml\n",
    "```\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Demonstration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run `primaite setup` to copy the example config files into the correct directory. Then, import and define config location."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!primaite setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import yaml\n",
    "from primaite.session.environment import PrimaiteGymEnv\n",
    "from primaite import PRIMAITE_PATHS\n",
    "from prettytable import PrettyTable\n",
    "scenario_path = PRIMAITE_PATHS.user_config_path / \"example_config/scenario_with_placeholders\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Base Scenario File\n",
    "Let's view the contents of the base scenario file:\n",
    "\n",
    "It contains all the base settings that stay fixed throughout all episodes, including the `io_settings`, `game` settings, the network layout and the blue agent definition. There are two placeholders: `*greens` and `*reds`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(scenario_path/\"scenario.yaml\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Schedule File\n",
    "Let's view the contents of the schedule file:\n",
    "\n",
    "This file references the base scenario file and defines which variations should be loaded in at each episode. In this instance, there are four episodes, during the first episode `greens_0` and `reds_0` is used, during the second episode `greens_0` and `reds_1` is used, and so on."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(scenario_path/\"schedule.yaml\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Green Agent Variation Files\n",
    "\n",
    "There are three different variants of the green agent setup. In `greens_0`, there are no green agents, in `greens_1` there is a green agent that executes the database client application 80% of the time, and in `greens_2` there is a green agent that executes the database client application 5% of the time.\n",
    "\n",
    "(the difference between `greens_1` and `greens_2` is in the agent name and action probabilities)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(scenario_path/\"greens_0.yaml\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(scenario_path/\"greens_1.yaml\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(scenario_path/\"greens_2.yaml\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Red Agent Variation Files\n",
    "\n",
    "There are three different variants of the red agent setup. In `reds_0`, there are no red agents, in `reds_1` there is a red agent that executes every 20 steps, but in `reds_2` there is a red agent that executes every 2 steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(scenario_path/\"reds_0.yaml\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(scenario_path/\"reds_1.yaml\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(scenario_path/\"reds_2.yaml\") as f:\n",
    "    print(f.read())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Running the simulation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create the environment using the variable config."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env = PrimaiteGymEnv(game_config=scenario_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Episode 0\n",
    "Let' run the episodes to verify that the agents are changing as expected. In episode 0, there should be no green or red agents, just the defender blue agent."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f\"Current episode number: {env.episode_counter}\")\n",
    "print(f\"Agents present: {list(env.game.agents.keys())}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Episode 1\n",
    "When we reset the environment, it moves onto episode 1, where it will bring in reds_1 for red agent definition.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.reset()\n",
    "print(f\"Current episode number: {env.episode_counter}\")\n",
    "print(f\"Agents present: {list(env.game.agents.keys())}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Episode 2\n",
    "When we reset the environment again, it moves onto episode 2, where it will bring in greens_1 and reds_1 for green and red agent definitions. Let's verify the agent names and that they take actions at the defined frequency.\n",
    "\n",
    "Most green actions will be `NODE_APPLICATION_EXECUTE` while red will `DONOTHING` except at steps 10 and 20."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.reset()\n",
    "print(f\"Current episode number: {env.episode_counter}\")\n",
    "print(f\"Agents present: {list(env.game.agents.keys())}\")\n",
    "for i in range(21):\n",
    "    env.step(0)\n",
    "\n",
    "table = PrettyTable()\n",
    "table.field_names = [\"step\", \"Green Action\", \"Red Action\"]\n",
    "for i in range(21):\n",
    "    green_action = env.game.agents['green_A'].action_history[i].action\n",
    "    red_action = env.game.agents['red_A'].action_history[i].action\n",
    "    table.add_row([i, green_action, red_action])\n",
    "print(table)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Episode 3\n",
    "When we reset the environment again, it moves onto episode 3, where it will bring in greens_2 and reds_2 for green and red agent definitions. Let's verify the agent names and that they take actions at the defined frequency.\n",
    "\n",
    "Now, green will perform `NODE_APPLICATION_EXECUTE` only 5% of the time, while red will perform `NODE_APPLICATION_EXECUTE` more frequently than before."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.reset()\n",
    "print(f\"Current episode number: {env.episode_counter}\")\n",
    "print(f\"Agents present: {list(env.game.agents.keys())}\")\n",
    "for i in range(21):\n",
    "    env.step(0)\n",
    "\n",
    "table = PrettyTable()\n",
    "table.field_names = [\"step\", \"Green Action\", \"Red Action\"]\n",
    "for i in range(21):\n",
    "    green_action = env.game.agents['green_B'].action_history[i].action\n",
    "    red_action = env.game.agents['red_B'].action_history[i].action\n",
    "    table.add_row([i, green_action, red_action])\n",
    "print(table)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Further Episodes\n",
    "\n",
    "Since the schedule definition only goes up to episode 3, if we reset the environment again, we run out of episodes. The environment will simply loop back to the beginning, but it produces a warning message to make users aware that the episodes are being repeated."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "env.reset(); # semicolon suppresses jupyter outputting the observation space.\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}