Merged PR 633: #3110 Final user guide comments.

## Summary
Feedback following James' comments

## Test process

## Checklist
- [x] PR is linked to a **work item**
- [x] **acceptance criteria** of linked ticket are met
- [x] performed **self-review** of the code
- [x] written **tests** for any new functionality added with this PR
- [x] updated the **documentation** if this PR changes or adds functionality
- [x] written/updated **design docs** if this PR implements new functionality
- [x] updated the **change log**
- [x] ran **pre-commit** checks for code style
- [x] attended to any **TO-DOs** left in the code

#3110 Final user guide comments.

Related work items: #3110
Fix some issues with sphinx rendering text in jupyter notebooks
2025-03-17 09:09:59 +00:00 · 2025-03-14 16:07:08 +00:00 · 2025-03-14 16:01:55 +00:00 · 2025-03-14 16:00:30 +00:00 · 2025-03-14 14:58:36 +00:00 · 2025-03-14 14:57:33 +00:00
603 changed files with 92305 additions and 11437 deletions
--- a/.azure/azure-build-deploy-docs-pipeline.yml
+++ b/.azure/azure-build-deploy-docs-pipeline.yml
@@ -30,7 +30,7 @@ jobs:
    displayName: 'Install PrimAITE for docs autosummary'
  - script: |
-      apt-get install pandoc
+      sudo apt-get install pandoc
    displayName: 'Install Pandoc'
  - script: |
--- a/.azure/azure-ci-build-pipeline.yaml
+++ b/.azure/azure-ci-build-pipeline.yaml
@@ -14,33 +14,38 @@ parameters:
  - name: matrix
    type: object
    default:
-    # - job_name: 'UbuntuPython38'
+    - job_name: 'UbuntuPython39'
-    #   py: '3.8'
+      py: 'v3.9'
-    #   img: 'ubuntu-latest'
+      img: 'ubuntu-latest'
-    #   every_time: false
+      every_time: false
-    #   publish_coverage: false
+      publish_coverage: false
-    - job_name: 'UbuntuPython311'
+    - job_name: 'UbuntuPython310'
-      py: '3.11'
+      py: 'v3.10'
      img: 'ubuntu-latest'
      every_time: true
      publish_coverage: true
-    # - job_name: 'WindowsPython38'
+    - job_name: 'UbuntuPython311'
-    #   py: '3.8'
+      py: 'v3.11'
-    #   img: 'windows-latest'
+      img: 'ubuntu-latest'
-    #   every_time: false
+      every_time: false
-    #   publish_coverage: false
+      publish_coverage: false
-    - job_name: 'WindowsPython311'
+    - job_name: 'WindowsPython39'
-      py: '3.11'
+      py: 'v3.9'
      img: 'windows-latest'
      every_time: false
      publish_coverage: false
-    # - job_name: 'MacOSPython38'
+    - job_name: 'WindowsPython311'
-    #   py: '3.8'
+      py: 'v3.11'
-    #   img: 'macOS-latest'
+      img: 'windows-latest'
-    #   every_time: false
+      every_time: false
-    #   publish_coverage: false
+      publish_coverage: false
    - job_name: 'MacOSPython39'
      py: 'v3.9'
      img: 'macOS-latest'
      every_time: false
      publish_coverage: false
    - job_name: 'MacOSPython311'
-      py: '3.11'
+      py: 'v3.11'
      img: 'macOS-latest'
      every_time: false
      publish_coverage: false
@@ -63,7 +68,7 @@ stages:
            displayName: 'Use Python ${{ item.py }}'
          - script: |
-              python -m pip install pre-commit
+              python -m pip install "pre-commit>=6.1"
              pre-commit install
              pre-commit run --all-files
            displayName: 'Run pre-commits'
@@ -71,7 +76,6 @@ stages:
          - script: |
              python -m pip install --upgrade pip==23.0.1
              pip install wheel==0.38.4 --upgrade
              pip install setuptools==66 --upgrade
              pip install build==0.10.0
              pip install pytest-azurepipelines
            displayName: 'Install build dependencies'
@@ -102,16 +106,37 @@ stages:
              version: '2.1.x'
          - script: |
-              coverage run -m --source=primaite pytest -v -o junit_family=xunit2 --junitxml=junit/test-results.xml --cov-fail-under=80
+              python run_test_and_coverage.py
              coverage xml -o coverage.xml -i
              coverage html -d htmlcov -i
            displayName: 'Run tests and code coverage'
          # Run the notebooks
          - script: |
              pytest --nbmake -n=auto src/primaite/notebooks --junit-xml=./notebook-tests/notebooks.xml
              notebooks_exit_code=$?
              # Fail step if exit code not equal to 0
              if [ $notebooks_exit_code -ne 0 ]; then
                exit 1
              fi
            displayName: 'Run notebooks on Linux and macOS'
            condition: or(eq(variables['Agent.OS'], 'Linux'), eq(variables['Agent.OS'], 'Darwin'))
          # Run notebooks
          - script: |
              pytest --nbmake -n=auto src/primaite/notebooks --junit-xml=./notebook-tests/notebooks.xml
              set notebooks_exit_code=%ERRORLEVEL%
              rem Fail step if exit code not equal to 0
              if %notebooks_exit_code% NEQ 0 exit /b 1
            displayName: 'Run notebooks on Windows'
            condition: eq(variables['Agent.OS'], 'Windows_NT')
          - task: PublishTestResults@2
            condition: succeededOrFailed()
            displayName: 'Publish Test Results'
            inputs:
              testRunner: JUnit
-              testResultsFiles: 'junit/**.xml'
+              testResultsFiles: |
                'junit/**.xml'
                'notebook-tests/**.xml'
              testRunTitle: 'Publish test results'
              failTaskOnFailedTests: true
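The two notebook steps above differ only in how the shell captures the exit code of `pytest`. A minimal Python sketch of the same logic follows, assuming the step could be driven from one cross-platform script. The function names `build_notebook_command`, `step_exit_code` and `main` are hypothetical and not part of this PR; the flags and paths are copied from the pipeline. Note that `python -m pytest` also puts the current directory on `sys.path`, which the bare `pytest` call in the pipeline does not.

```python
import subprocess
import sys
from typing import List

# Paths copied from the pipeline steps above.
NOTEBOOK_DIR = "src/primaite/notebooks"
REPORT_PATH = "./notebook-tests/notebooks.xml"


def build_notebook_command() -> List[str]:
    """Return the pytest invocation used by the notebook steps as an argv list."""
    return [
        sys.executable, "-m", "pytest",
        "--nbmake", "-n=auto", NOTEBOOK_DIR,
        f"--junit-xml={REPORT_PATH}",
    ]


def step_exit_code(pytest_exit_code: int) -> int:
    """Collapse any non-zero pytest exit code to 1, as both shell variants do."""
    return 0 if pytest_exit_code == 0 else 1


def main() -> int:
    """Run the notebooks and return the exit code the CI step should report."""
    # Needs pytest, nbmake and pytest-xdist (for -n=auto) to be installed.
    result = subprocess.run(build_notebook_command(), check=False)
    return step_exit_code(result.returncode)
```

A wrapper script would finish with `sys.exit(main())`, so a single step could serve Linux, macOS and Windows agents.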
--- a/.github/workflows/python-package.yml
+++ b/.github/workflows/python-package.yml
@@ -5,13 +5,11 @@ on:
    branches:
      - main
      - dev
      - dev-gui
      - 'release/**'
  pull_request:
    branches:
      - main
      - dev
      - dev-gui
      - 'release/**'
 jobs:
  build:
@@ -19,7 +17,7 @@ jobs:
    runs-on: ubuntu-latest
    strategy:
      matrix:
-        python-version: ["3.8", "3.9", "3.10"]
+        python-version: ["3.9", "3.10", "3.11"]
    steps:
      - uses: actions/checkout@v3
@@ -54,13 +52,6 @@ jobs:
        run: |
          primaite setup
      - name: Lint with flake8
        run: |
          # stop the build if there are Python syntax errors or undefined names
          flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
          # exit-zero treats all errors as warnings.
          flake8 . --count --exit-zero --max-complexity=10 --max-line-length=120 --statistics
      - name: Run tests
        run: |
          pytest tests/
--- a/.gitignore
+++ b/.gitignore
@@ -54,6 +54,7 @@ cover/
 tests/assets/**/*.png
 tests/assets/**/tensorboard_logs/
 tests/assets/**/checkpoints/
 notebook-tests/*.xml
 # Translations
 *.mo
@@ -148,7 +149,7 @@ cython_debug/
 # IDE
 .idea/
-docs/source/primaite-dependencies.rst
+
 .vscode/
 # outputs
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -1,9 +1,17 @@
 repos:
  - repo: local
    hooks:
      - id: ensure-copyright-clause
        name: ensure copyright clause
        entry: python copyright_clause_pre_commit_hook.py
        language: python
  - repo: http://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: check-yaml
-        exclude: scenario_with_placeholders/
+        exclude: |
          | scenario_with_placeholders/
          | mini_scenario_with_simulation_variation/
      - id: end-of-file-fixer
      - id: trailing-whitespace
      - id: check-added-large-files
@@ -23,7 +31,7 @@ repos:
      - id: isort
        args: [ "--profile", "black" ]
  - repo: http://github.com/PyCQA/flake8
-    rev: 6.0.0
+    rev: 6.1.0
    hooks:
      - id: flake8
        additional_dependencies:
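The `exclude` value of the `check-yaml` hook is a Python regular expression that pre-commit searches for in each file path. The sketch below shows how a multi-line alternation like the one above behaves, assuming it is meant to be read in verbose mode (`(?x)`), which the diff as rendered here does not show. The file paths and the `is_excluded` helper are hypothetical and used only for illustration.

```python
import re

# Verbose mode (?x): whitespace and newlines inside the pattern are ignored,
# so each alternative can sit on its own line.
EXCLUDE = re.compile(
    r"""(?x)
    scenario_with_placeholders/
    | mini_scenario_with_simulation_variation/
    """
)


def is_excluded(path: str) -> bool:
    """Return True if the path contains one of the excluded directories."""
    return EXCLUDE.search(path) is not None


# Hypothetical paths, for illustration only.
excluded = is_excluded("tests/assets/configs/scenario_with_placeholders/scenario.yaml")
kept = is_excluded("src/primaite/config/example_config.yaml")

# Pitfall: without (?x), a leading "|" introduces an empty alternative,
# which matches every path and would exclude all files.
matches_everything = re.search(r"| scenario_with_placeholders/", "any/file.yaml") is not None
```

Here `excluded` and `matches_everything` are `True` and `kept` is `False`, which is why the verbose flag and the placement of the first `|` matter.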
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,239 +2,258 @@
 All notable changes to this project will be documented in this file.
-The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
-## 3.0.0b9
+## [4.0.0] - 2025-03-XX
 - Removed deprecated `PrimaiteSession` class.
 - Added ability to set log levels via configuration.
 - Upgraded pydantic to version 2.7.0
 - Upgraded Ray to version >= 2.9
 - Added ipywidgets to the dependencies
 - Added ability to define scenarios that change depending on the episode number.
 - Standardised Environment API by renaming the config parameter of `PrimaiteGymEnv` from `game_config` to `env_config`
 - Database Connection IDs are now created/issued by DatabaseService and not DatabaseClient
 - Updated DatabaseClient so that it can now have a single native DatabaseClientConnection along with a collection of DatabaseClientConnections.
 - Implemented the uninstall functionality for DatabaseClient so that all connections are terminated at the DatabaseService.
 - Added the ability for a DatabaseService to terminate a connection.
 - Added active_connection to DatabaseClientConnection so that if the connection is terminated active_connection is set to False and the object can no longer be used.
 - Added additional show functions to enable connection inspection.
 - Updates to agent logging, to include the reward both per step and per episode.
 - Introduced Developer CLI tools to assist with developing/debugging PrimAITE
  - Can be enabled via `primaite dev-mode enable`
  - Activating dev-mode will change the location where the sessions will be output - by default will output where the PrimAITE repository is located
 - Refactored all air-space usage so that a new instance of AirSpace is created for each instance of Network. This 1:1 relationship between network and airspace will allow parallelization.
 - Added notebook to demonstrate use of SubprocVecEnv from SB3 to vectorise environments to speed up training.
 ## [Unreleased]
 - Made requests fail to reach their target if the node is off
 - Added responses to requests
 - Made environment reset completely recreate the game object.
 - Changed the red agent in the data manipulation scenario to randomly choose client 1 or client 2 to start its attack.
 - Changed the data manipulation scenario to include a second green agent on client 1.
 - Refactored actions and observations to be configurable via object name, instead of UUID.
 - Made database patch correctly take 2 timesteps instead of being immediate
 - Made database patch only possible when the software is compromised or good, it's no longer possible when the software is OFF or RESETTING
 - Added a notebook which explains Data manipulation scenario, demonstrates the attack, and shows off blue agent's action space, observation space, and reward function.
 - Made packet capture and system logging optional (off by default). To turn on, change the io_settings.save_pcap_logs and io_settings.save_sys_logs settings in the config.
 - Made observation space flattening optional (on by default). To turn off for an agent, change the `agent_settings.flatten_obs` setting in the config.
 - Added support for SQL INSERT command.
 - Added ability to log each agent's action choices in each step to a JSON file.
 - Removal of Link bandwidth hardcoding. This can now be configured via the network configuration yaml. Will default to 100 if not present.
 ### Bug Fixes
 - ACL rules were not resetting on episode reset.
 - ACLs were not showing up correctly in the observation space.
 - Blue agent's ACL actions were being applied against the wrong IP addresses
 - Deleted files and folders did not reset correctly on episode reset.
 - Service health status was using the actual health state instead of the visible health state
 - Database file health status was using the incorrect value for negative rewards
 - Preventing file actions from reaching their intended file
 - The data manipulation attack was triggered at episode start.
 - FTP STOR stored an additional copy on the client machine's filesystem
 - The red agent acted too early
 - Order of service health state
 - Starting a node didn't start the services on it
 - Fixed an issue where the services were still able to run even though the node the service is installed on is turned off
 - The use of NODE_FILE_CHECKHASH and NODE_FOLDER_CHECKHASH in the current release is marked as 'Not Implemented'.
 ### Added
- Network Hardware - Added base hardware module with NIC, SwitchPort, Node, and Link. Nodes have
+-   Log observation space data by episode and step.
-fundamental services like ARP, ICMP, and PCAP running them by default.
+-   Added ability to set the observation threshold for NMNE, file access and application executions.
- Network Transmission - Modelled OSI Model layers 1 through to 5 with various classes for creating network frames and
+-   Added `show_history` method to Agents, allowing you to view actions taken by an agent per step. By default, `do-nothing` actions are omitted.
-transmitting them from a Service/Application, down through the layers, over the wire, and back up through the layers to
+-   New ``node-send-local-command`` action implemented which grants agents the ability to execute commands locally. (Previously limited to remote only)
-a Service/Application another machine.
+-   Added ability to set the observation threshold for NMNE, file access and application executions
- Introduced `Router` and `Switch` classes to manage networking routes more effectively.
+-   UC7 Scenario model changes including Threat Actor Profile, TAP001 and TAP003 agents plus config files and example notebooks.
-  - Added `ACLRule` and `RouteTableEntry` classes as part of the `Router`.
+-   New how-to guides describing how to use the new extension system to customise actions, environments and rewards.
- New `.show()` methods in all network component classes to inspect the state in either plain text or markdown formats.
+-   Added version and plugin fields to YAML configs to ensure compatibility with future versions.
- Added `Computer` and `Server` class to better differentiate types of network nodes.
+-   Network Node Adder class provides a framework for adding nodes to a network in a standardised way.
 - Integrated a new Use Case 2 network into the system.
 - New unit tests to verify routing between different subnets using `.ping()`.
 - system - Added the core structure of Application, Services, and Components. Also added a SoftwareManager and
 SessionManager.
 - Permission System - each action can define criteria that will be used to permit or deny agent actions.
 - File System - ability to emulate a node's file system during a simulation
 - Example notebooks - There are 5 Jupyter notebooks which walk through using PrimAITE
  1. Training a Stable Baselines 3 agent
  2. Training a single agent system using Ray RLLib
  3. Training a multi-agent system Ray RLLib
  4. Data manipulation end to end demonstration
  5. Data manipulation scenario with customised red agents
 - Database:
  - `DatabaseClient` and `DatabaseService` created to allow emulation of database actions
  - Ability for `DatabaseService` to backup its data to another server via FTP and restore data from backup
 - Red Agent Services:
  - Data Manipulator Bot - A red agent service which sends a payload to a target machine. (By default this payload is a SQL query that breaks a database). The attack runs in stages with a random, configurable probability of succeeding.
  - `DataManipulationAgent` runs the Data Manipulator Bot according to a configured start step, frequency and variance.
 - DNS Services: `DNSClient` and `DNSServer`
 - FTP Services: `FTPClient` and `FTPServer`
 - HTTP Services: `WebBrowser` to simulate a web client and `WebServer`
 - NTP Services: `NTPClient` and `NTPServer`
 - **RouterNIC Class**: Introduced a new class `RouterNIC`, extending the standard `NIC` functionality. This class is specifically designed for router operations, optimizing the processing and routing of network traffic.
  - **Custom Layer-3 Processing**: The `RouterNIC` class includes custom handling for network frames, bypassing standard Node NIC's Layer 3 broadcast/unicast checks. This allows for more efficient routing behavior in network scenarios where router-specific frame processing is required.
  - **Enhanced Frame Reception**: The `receive_frame` method in `RouterNIC` is tailored to handle frames based on Layer 2 (Ethernet) checks, focusing on MAC address-based routing and broadcast frame acceptance.
 - **Subnet-Wide Broadcasting for Services and Applications**: Implemented the ability for services and applications to conduct broadcasts across an entire IPv4 subnet within the network simulation framework.
 - Introduced the `NetworkInterface` abstract class to provide a common interface for all network interfaces. Subclasses are divided into two main categories: `WiredNetworkInterface` and `WirelessNetworkInterface`, each serving as an abstract base class (ABC) for more specific interface types. Under `WiredNetworkInterface`, the subclasses `NIC` and `SwitchPort` were added. For wireless interfaces, `WirelessNIC` and `WirelessAccessPoint` are the subclasses under `WirelessNetworkInterface`.
 - Added `Layer3Interface` as an abstract base class for networking functionalities at layer 3, including IP addressing and routing capabilities. This class is inherited by `NIC`, `WirelessNIC`, and `WirelessAccessPoint` to provide them with layer 3 capabilities, facilitating their role in both wired and wireless networking contexts with IP-based communication.
 - Created the `ARP` and `ICMP` service classes to handle Address Resolution Protocol operations and Internet Control Message Protocol messages, respectively, with `RouterARP` and `RouterICMP` for router-specific implementations.
 - Created `HostNode` as a subclass of `Node`, extending its functionality with host-specific services and applications. This class is designed to represent end-user devices like computers or servers that can initiate and respond to network communications.
 - Introduced a new `IPV4Address` type in the Pydantic model for enhanced validation and auto-conversion of IPv4 addresses from strings using an `ipv4_validator`.
 - Comprehensive documentation for the Node and its network interfaces, detailing the operational workflow from frame reception to application-level processing.
 - Detailed descriptions of the Session Manager and Software Manager functionalities, including their roles in managing sessions, software services, and applications within the simulation.
 - Documentation for the Packet Capture (PCAP) service and SysLog functionality, highlighting their importance in logging network frames and system events, respectively.
 - Expanded documentation on network devices such as Routers, Switches, Computers, and Switch Nodes, explaining their specific processing logic and protocol support.
 - **Firewall Node**: Introduced the `Firewall` class extending the functionality of the existing `Router` class. The `Firewall` class incorporates advanced features to scrutinize, direct, and filter traffic between various network zones, guided by predefined security rules and policies. Key functionalities include:
    - Access Control Lists (ACLs) for traffic filtering based on IP addresses, protocols, and port numbers.
    - Network zone segmentation for managing traffic across external, internal, and DMZ (De-Militarized Zone) networks.
    - Interface configuration to establish connectivity and define network parameters for external, internal, and DMZ interfaces.
    - Protocol and service management to oversee traffic and enforce security policies.
    - Dynamic traffic processing and filtering to ensure network security and integrity.
 - `AirSpace` class to simulate wireless communications, managing wireless interfaces and facilitating the transmission of frames within specified frequencies.
 - `AirSpaceFrequency` enum for defining standard wireless frequencies, including 2.4 GHz and 5 GHz bands, to support realistic wireless network simulations.
 - `WirelessRouter` class, extending the `Router` class, to incorporate wireless networking capabilities alongside traditional wired connections. This class allows the configuration of wireless access points with specific IP settings and operating frequencies.
 - Documentation Updates:
    - Examples include how to set up PrimAITE session via config
    - Examples include how to create nodes and install software via config
    - Examples include how to set up PrimAITE session via Python
    - Examples include how to create nodes and install software via Python
    - Added missing ``DoSBot`` documentation page
    - Added diagrams where needed to make understanding some things easier
    - Templated parts of the documentation to prevent unnecessary repetition and for easier maintaining of documentation
    - Separated documentation pages of some items i.e. client and server software were on the same pages - which may make things confusing
    - Configuration section at the bottom of the software pages specifying the configuration options available (and which ones are optional)
 - Ability to add ``Firewall`` node via config
 - Ability to add ``Router`` routes via config
 - Ability to add ``Router``/``Firewall`` ``ACLRule`` via config
 - NMNE capturing capabilities to `NetworkInterface` class for detecting and logging Malicious Network Events.
 - New `nmne_config` settings in the simulation configuration to enable NMNE capturing and specify keywords such as "DELETE".
 - Router-specific SessionManager Implementation: Introduced a specialized version of the SessionManager tailored for router operations. This enhancement enables the SessionManager to determine the routing path by consulting the route table.
 ### Changed
- Integrated the RouteTable into the Routers frame processing.
+-   ACLs are no longer applied to layer-2 traffic.
- Frames are now dropped when their TTL reaches 0
+-   Random number seed values are recorded in simulation/seed.log if the seed is set in the config file
- **NIC Functionality Update**: Updated the Network Interface Card (`NIC`) functionality to support Layer 3 (L3) broadcasts.
+    or `generate_seed_value` is set to `true`.
-  - **Layer 3 Broadcast Handling**: Enhanced the existing `NIC` classes to correctly process and handle Layer 3 broadcasts. This update allows devices using standard NICs to effectively participate in network activities that involve L3 broadcasting.
+-   ARP .show() method will now include the port number associated with each entry.
-  - **Improved Frame Reception Logic**: The `receive_frame` method of the `NIC` class has been updated to include additional checks and handling for L3 broadcasts, ensuring proper frame processing in a wider range of network scenarios.
+-   The behaviour that services, applications, files and folders require scanning before their observations are updated is now optional.
- Standardised the way network interfaces are accessed across all `Node` subclasses (`HostNode`, `Router`, `Switch`) by maintaining a comprehensive `network_interface` attribute. This attribute captures all network interfaces by their port number, streamlining the management and interaction with network interfaces across different types of nodes.
+-   Updated the `Terminal` class to provide response information when sending remote command execution.
- Refactored all tests to utilise new `Node` subclasses (`Computer`, `Server`, `Router`, `Switch`) instead of creating generic `Node` instances and manually adding network interfaces. This change aligns test setups more closely with the intended use cases and hierarchies within the network simulation framework.
+-   Agents now follow a common configuration format, simplifying the configuration of agents and their extensibility.
- Updated all tests to employ the `Network()` class for managing nodes and their connections, ensuring a consistent and structured approach to setting up network topologies in testing scenarios.
+-   Actions within PrimAITE are now extensible, allowing for plugin support.
- **ACLRule Wildcard Masking**: Updated the `ACLRule` class to support IP ranges using wildcard masking. This enhancement allows for more flexible and granular control over traffic filtering, enabling the specification of broader or more specific IP address ranges in ACL rules.
+-   Added a config schema to `ObservationManager`, `ActionManager`, and `RewardFunction`.
- Updated `NetworkInterface` documentation to reflect the new NMNE capturing features and how to use them.
+-   Streamlined the way agents are created from config
- Integration of NMNE capturing functionality within the `NICObservation` class.
+-   Agent config no longer requires a dummy action space if the action space is empty, the same applies for observation space and reward function
- Changed blue action set to enable applying node scan, reset, start, and shutdown to every host in data manipulation scenario
+-   Actions now support a config schema, to allow yaml data validation and default parameter values
-
+-   Action parameters are no longer defined through IDs, instead meaningful data is provided directly in the action map
-### Removed
+-   Test and example YAMLs have been updated to match the new agent and action schemas, such as:
- Removed legacy simulation modules: `acl`, `common`, `environment`, `links`, `nodes`, `pol`
+    -   Removed empty action spaces, observation spaces, or reward spaces for agent which didn't use them
- Removed legacy training modules
+    -   Relabelled action parameters to match the new action config schemas, and updated the values to no longer rely on indices
- Removed tests for legacy code
+    -   Removed action space options which were previously used for assigning meaning to action space IDs
 -   Updated tests that don't use YAMLs to still use the new action and agent schemas
 -   Nodes now use a config schema and are extensible, allowing for plugin support.
 -   Node tests have been updated to use the new node config schemas when not using YAML files.
 -   Documentation has been updated to include details of extensibility with PrimAITE.
 -   Software is created in the GOOD health state instead of UNUSED.
 -   Standardised naming convention for YAML config files using kebab-case.
    This naming convention is used for configuring software, observations, actions and node types.
    NB: A migration guide will be available with this release.
 ### Fixed
- Addressed network transmission issues that previously allowed ARP requests to be incorrectly routed and repeated across different subnets. This fix ensures ARP requests are correctly managed and confined to their appropriate network segments.
+-   DNS client no longer fails to check its cache if a DNS server address is missing.
- Resolved problems in `Node` and its subclasses where the default gateway configuration was not properly utilized for communications across different subnets. This correction ensures that nodes effectively use their configured default gateways for outbound communications to other network segments, thereby enhancing the network's routing functionality and reliability.
+-   DNS client now correctly inherits the node's DNS address configuration setting.
- Network Interface Port name/num being set properly for sys log and PCAP output.
+-   ACL observations now include the ACL at index 0.
 -   SoftwareManager.show() correctly displays all the software associated with a port whether the software is listening or not.
 ## [3.3.0] - 2024-09-04
 ### Added
 -   Random Number Generator Seeding by specifying a random number seed in the config file.
 -   Implemented Terminal service class, providing a generic terminal simulation.
 -   Added `User`, `UserManager` and `UserSessionManager` to enable the creation of user accounts and login on Nodes.
 -   Added actions to establish SSH connections, send commands remotely and terminate SSH connections.
 -   Added actions to change users' passwords.
 -   Added a `listen_on_ports` set in the `IOSoftware` class to enable software listening on ports in addition to the
    main port they're assigned.
 -   Added two new red applications: ``C2Beacon`` and ``C2Server`` which aim to simulate malicious network infrastructure.
    Refer to the ``Command and Control Application Suite E2E Demonstration`` notebook for more information.
 -   Added reward calculation details to AgentHistoryItem.
 -   Added a new Privilege-Escalation-and Data-Loss-Example.ipynb notebook with a realistic cyber scenario focusing on
    internal privilege escalation and data loss through the manipulation of SSH access and Access Control Lists (ACLs).
 -   Added a new extensible `NetworkNodeAdder` class for convenient addition of sets of nodes based on a simplified config.
 ### Changed
 -   File and folder observations can now be configured to always show the true health status, or require scanning like before.
 -   It's now possible to disable stickiness on reward components, meaning their value returns to 0 during timesteps where agents don't issue the corresponding action. Affects `GreenAdminDatabaseUnreachablePenalty`, `WebpageUnavailablePenalty`, `WebServer404Penalty`
 -   Node observations can now be configured to show the number of active local and remote logins.
 -   Ports and IP Protocols no longer use enums. They are defined in dictionary lookups and are handled by custom validation to enable extensibility with plugins.
 -   Changed AirSpaceFrequency to a data transfer object with a registry to allow extensibility
 -   Changed the Office LAN creation convenience function to follow the new `NetworkNodeAdder` pattern. Office LANs can now also be defined in YAML config.
 ### Fixed
 -   Folder observations showing the true health state without scanning (the old behaviour can be reenabled via config)
 -   Updated `SoftwareManager` `install` and `uninstall` to handle all functionality that was being done at the `install`
    and `uninstall` methods in the `Node` class.
 -   Updated the `receive_payload_from_session_manager` method in `SoftwareManager` so that it now sends a copy of the
    payload to any software listening on the destination port of the `Frame`.
 -   Made the `show` method of `Network` show all node types, including ones registered at runtime
 ### Removed
 -   Removed the `install` and `uninstall` methods in the `Node` class.
 ## [3.2.0] - 2024-07-18
 ### Added
 -   Action penalty is a reward component that applies a negative reward for doing any action other than DONOTHING
 -   Application configuration actions for RansomwareScript, DatabaseClient, and DoSBot applications
 -   Ability to configure how long it takes to apply the service fix action
 -   Terminal service using SSH
 -   Airspaces now track the amount of data being transmitted, viewable using the `show_bandwidth_load` method
 -   Tests to verify that airspace bandwidth is applied correctly and can be configured via YAML
 -   Agent logging for agents' internal decision logic
 -   Action masking in all PrimAITE environments
 ### Changed
 -   Application registry was moved to the `Application` class and now updates automatically when Application is subclassed
 -   Databases can no longer respond to requests while performing a backup
 -   Application install no longer accepts an `ip_address` parameter
 -   Application install action can now be used on all applications
 -   Actions have additional logic for checking validity
 -   Frame `size` attribute now includes both core size and payload size in bytes
 -   The `speed` attribute of `NetworkInterface` has been changed from `int` to `float`
 -   Tidied up CHANGELOG
 -   Enhanced `AirSpace` logic to block transmissions that would exceed the available capacity.
 -   Updated `_can_transmit` function in `Link` to account for current load and total bandwidth capacity, ensuring transmissions do not exceed limits.
 ### Fixed
 -   Links and airspaces can no longer transmit data if this would exceed their bandwidth
 ## [3.1.0] - 2024-06-25
 ### Added
 -   Observations for traffic amounts on host network interfaces
 -   NMAP application network discovery, including ping scan and port scan
 -   NMAP actions
 -   Automated adding copyright notices to source files
 -   More file types
 -   `show` method to files
 -   `model_dump` methods to network enums to enable better logging
 ### Changed
 -   Updated file system actions to stop failures when creating duplicate files
 -   Improved parsing of ACL add rule actions to make some parameters optional
 ### Fixed
 -   Fixed database client uninstall failing due to persistent connections
 -   Fixed packet storm when pinging broadcast addresses
 ## [3.0.0] - 2024-06-10
 ### Added
 -   New simulation module
 -   Multi agent reinforcement learning support
 -   File system class to manage files and folders
 -   Software for nodes that can have its own behaviour
 -   Software classes to model FTP, Postgres databases, web traffic, NTP
 -   Much more detailed network simulation including packets, links, and network interfaces
 -   More node types: host, computer, server, router, switch, wireless router, and firewalls
 -   Network Hardware - NIC, SwitchPort, Node, and Link. Nodes have fundamental services like ARP, ICMP, and PCAP running on them by default.
 -   Malicious network event detection
 -   New `game` module for managing agents
 -   ACL rule wildcard masking
 -   Network broadcasting
 -   Wireless transmission
 -   More detailed documentation
 -   Example jupyter notebooks to demonstrate new functionality
 -   More reward components
 -   Packet capture logs
 -   Node system logs
 -   Per-step full simulation state log
 -   Attack randomisation with respect to timing and attack source
 -   Ability to set log level via CLI
 -   Ability to vary the YAML configuration per-episode
 -   Developer CLI tools for enhanced debugging (with `primaite dev-mode enable`)
 -   `show` function to many simulation objects to inspect their current state
 ### Changed
 -   Decoupled the environment from the simulation by adding the `game` interface layer
 -   Made agents share a common base class
 -   Added more actions
 -   Made all agents use CAOS actions, including red and green agents
 -   Reworked YAML configuration file schema
 -   Reworked the reward system to be component-based
 -   Changed agent logs to create a JSON output instead of CSV with more detailed action information
 -   Made observation space flattening optional
 -   Made all logging optional
 -   Agent actions now provide responses with a success code
 ### Removed
 -   Legacy simulation modules
 -   Legacy training modules
 -   Tests for legacy code
 -   Hardcoded IERs and PoL; traffic generation is now handled by agents and software
 -   Inbuilt agent training scripts
 ## [2.0.0] - 2023-07-26
 ### Added
- Command Line Interface (CLI) for easy access and streamlined usage of PrimAITE.
+-   Command Line Interface (CLI) for easy access and streamlined usage of PrimAITE.
- Application Directories to enable PrimAITE as a Python package with predefined directories for storage.
+-   Application Directories to enable PrimAITE as a Python package with predefined directories for storage.
- Support for Ray Rllib, allowing training of PPO and A2C agents using Stable Baselines3 and Ray RLlib.
+-   Support for Ray Rllib, allowing training of PPO and A2C agents using Stable Baselines3 and Ray RLlib.
- Random Red Agent to train the blue agent against, with options for randomised Red Agent `POL` and `IER`.
+-   Random Red Agent to train the blue agent against, with options for randomised Red Agent `POL` and `IER`.
- Repeatability of sessions through seed settings, and deterministic or stochastic evaluation options.
+-   Repeatability of sessions through seed settings, and deterministic or stochastic evaluation options.
- Session loading to revisit previously run sessions for SB3 Agents.
+-   Session loading to revisit previously run sessions for SB3 Agents.
- Agent Session Classes (`AgentSessionABC` and `HardCodedAgentSessionABC`) to standardise agent training with a common interface.
+-   Agent Session Classes (`AgentSessionABC` and `HardCodedAgentSessionABC`) to standardise agent training with a common interface.
- Standardised Session Output in a structured format in the user's app sessions directory, providing four types of outputs:
+-   Standardised Session Output in a structured format in the user's app sessions directory, providing four types of outputs: Session Metadata, Results, Diagrams, Trained agents.
-  1. Session Metadata
+-   Configurable Observation Space managed by the `ObservationHandler` class for a more flexible observation space setup.
-  2. Results
+-   Benchmarking of PrimAITE performance, showcasing session and step durations for reference.
-  3. Diagrams
+-   Documentation overhaul, including automatic API and test documentation with recursive Sphinx auto-summary, using the Furo theme for responsive light/dark theme, and enhanced navigation with `sphinx-code-tabs` and `sphinx-copybutton`.
  4. Saved agents (training checkpoints and a final trained agent).
 - Configurable Observation Space managed by the `ObservationHandler` class for a more flexible observation space setup.
 - Benchmarking of PrimAITE performance, showcasing session and step durations for reference.
 - Documentation overhaul, including automatic API and test documentation with recursive Sphinx auto-summary, using the Furo theme for responsive light/dark theme, and enhanced navigation with `sphinx-code-tabs` and `sphinx-copybutton`.
 ### Changed
- Action Space updated to discrete spaces, introducing a new `ANY` action space option for combined `NODE` and `ACL` actions.
+-   Action Space updated to discrete spaces, introducing a new `ANY` action space option for combined `NODE` and `ACL` actions.
- Improved `Node` attribute naming convention for consistency, now adhering to `Pascal Case`.
+-   Improved `Node` attribute naming convention for consistency, now adhering to `Pascal Case`.
- Package Structure has been refactored for better build, distribution, and installation, with all source code now in the `src/` directory, and the `PRIMAITE` Python package renamed to `primaite` to adhere to PEP-8 Package & Module Names.
+-   Package Structure has been refactored for better build, distribution, and installation, with all source code now in the `src/` directory, and the `PRIMAITE` Python package renamed to `primaite` to adhere to PEP-8 Package & Module Names.
- Docs and Tests now sit outside the `src/` directory.
+-   Docs and Tests now sit outside the `src/` directory.
- Non-python files (example config files, Jupyter notebooks, etc.) now sit inside a `*/_package_data/` directory in their respective sub-packages.
+-   Non-python files (example config files, Jupyter notebooks, etc.) now sit inside a `*/_package_data/` directory in their respective sub-packages.
- All dependencies are now defined in the `pyproject.toml` file.
+-   All dependencies are now defined in the `pyproject.toml` file.
- Introduced individual configuration for the number of episodes and time steps for training and evaluation sessions, with separate config values for each.
+-   Introduced individual configuration for the number of episodes and time steps for training and evaluation sessions, with separate config values for each.
- Decoupled the lay down config file from the training config, allowing more flexibility in configuration management.
+-   Decoupled the lay down config file from the training config, allowing more flexibility in configuration management.
- Updated `Transactions` to only report pre-action observation, improving the CSV header and providing more human-readable descriptions for columns relating to observations.
+-   Updated `Transactions` to only report pre-action observation, improving the CSV header and providing more human-readable descriptions for columns relating to observations.
- Changes to `AccessControlList`, where the `acl` dictionary is now a list to accommodate changes to ACL action space and positioning of `ACLRules` inside the list to signal their level of priority.
+-   Changes to `AccessControlList`, where the `acl` dictionary is now a list to accommodate changes to ACL action space and positioning of `ACLRules` inside the list to signal their level of priority.
 ### Fixed
- Various bug fixes, including Green IERs separation, correct clearing of links in the reference environment, and proper reward calculation.
+-   Various bug fixes, including Green IERs separation, correct clearing of links in the reference environment, and proper reward calculation.
- Logic to check if a node is OFF before executing actions on the node by the blue agent, preventing erroneous state changes.
+-   Logic to check if a node is OFF before executing actions on the node by the blue agent, preventing erroneous state changes.
- Improved functionality of Resetting a Node, adding "SHUTTING DOWN" and "BOOTING" operating states for more reliable reset commands.
+-   Improved functionality of Resetting a Node, adding "SHUTTING DOWN" and "BOOTING" operating states for more reliable reset commands.
- Corrected the order of actions in the `Primaite` env to ensure the blue agent uses the current state for decision-making.
+-   Corrected the order of actions in the `Primaite` env to ensure the blue agent uses the current state for decision-making.
 ## [1.1.1] - 2023-06-27
-### Bug Fixes
+### Fixed
-* Fixed bug whereby 'reference' environment links reach bandwidth capacity and are never cleared due to green & red IERs being applied to them. This bug had a knock-on effect that meant IERs were being blocked based on the full capacity of links on the reference environment which was not correct; they should only be based on the link capacity of the 'live' environment. This fix has been addressed by:
+-   Fixed bug whereby 'reference' environment links reach bandwidth capacity and are never cleared due to green & red IERs being applied to them. This bug had a knock-on effect that meant IERs were being blocked based on the full capacity of links on the reference environment which was not correct; they should only be based on the link capacity of the 'live' environment. This fix has been addressed by:
-  * Implementing a reference copy of all green IERs (`self.green_iers_reference`).
+    -   Implementing a reference copy of all green IERs (`self.green_iers_reference`).
-  * Clearing the traffic on reference IERs at the same time as the live IERs.
+    -   Clearing the traffic on reference IERs at the same time as the live IERs.
-  * Passing the `green_iers_reference` to the `apply_iers` function at the reference stage.
+    -   Passing the `green_iers_reference` to the `apply_iers` function at the reference stage.
-  * Passing the `green_iers_reference` as an additional argument to `calculate_reward_function`.
+    -   Passing the `green_iers_reference` as an additional argument to `calculate_reward_function`.
-  * Updating the green IERs section of the `calculate_reward_function` to now take into account both the green reference IERs and live IERs. The `green_ier_blocked` reward is only applied if the IER is blocked in the live environment but is running in the reference environment.
+    -   Updating the green IERs section of the `calculate_reward_function` to now take into account both the green reference IERs and live IERs. The `green_ier_blocked` reward is only applied if the IER is blocked in the live environment but is running in the reference environment.
-  * Re-ordering the actions taken as part of the step function to ensure the blue action happens first before other changes.
+    -   Re-ordering the actions taken as part of the step function to ensure the blue action happens first before other changes.
-  * Removing the unnecessary "Reapply PoL and IERs" action from the step function.
+    -   Removing the unnecessary "Reapply PoL and IERs" action from the step function.
-  * Moving the deep-copy of nodes and links to below the "Implement blue action" stage of the step function.
+    -   Moving the deep-copy of nodes and links to below the "Implement blue action" stage of the step function.
 ## [1.1.0] - 2023-03-13
 ### Added
-* The user can now initiate either a TRAINING session or an EVALUATION (test) session with the Stable Baselines 3 (SB3) agents via the config_main.yaml file. During evaluation/testing, the agent policy will be fixed (no longer learning) and subjected to the SB3 `evaluate_policy()` function.
+-   The user can now initiate either a TRAINING session or an EVALUATION (test) session with the Stable Baselines 3 (SB3) agents via the config_main.yaml file. During evaluation/testing, the agent policy will be fixed (no longer learning) and subjected to the SB3 `evaluate_policy()` function.
-* The user can choose whether a saved agent is loaded into the session (with reference to a URL) via the `config_main.yaml` file. They specify a Boolean true/false indicating whether a saved agent should be loaded, and specify the URL and file name.
+-   The user can choose whether a saved agent is loaded into the session (with reference to a URL) via the `config_main.yaml` file. They specify a Boolean true/false indicating whether a saved agent should be loaded, and specify the URL and file name.
-* Active and Service nodes now possess a new "File System State" attribute. This attribute is permitted to have the states GOOD, CORRUPT, DESTROYED, REPAIRING, and RESTORING. This new feature affects the following components:
+-   Active and Service nodes now possess a new "File System State" attribute. This attribute is permitted to have the states GOOD, CORRUPT, DESTROYED, REPAIRING, and RESTORING. This new feature affects the following components:
-  * Blue agent observation space;
+    -   Blue agent observation space;
-  * Blue agent action space;
+    -   Blue agent action space;
-  * Reward function;
+    -   Reward function;
-  * Node pattern-of-life.
+    -   Node pattern-of-life.
-* The Red Agent node pattern-of-life has been enhanced so that node PoL is triggered by an 'initiator'. The initiator is either DIRECT (state change is applied to the node without any conditions), IER (state change is applied to the node based on IER entry condition), or SERVICE (state change is applied to the node based on a service state condition on the same node or a different node within the network).
+-   The Red Agent node pattern-of-life has been enhanced so that node PoL is triggered by an 'initiator'. The initiator is either DIRECT (state change is applied to the node without any conditions), IER (state change is applied to the node based on IER entry condition), or SERVICE (state change is applied to the node based on a service state condition on the same node or a different node within the network).
-* New default config named "config_5_DATA_MANIPULATION.yaml" and associated Training Use Case Profile.
+-   New default config named "config_5_DATA_MANIPULATION.yaml" and associated Training Use Case Profile.
-* NodeStateInstruction has been split into `NodeStateInstructionGreen` and `NodeStateInstructionRed` to reflect the changes within the red agent pattern-of-life capability.
+-   NodeStateInstruction has been split into `NodeStateInstructionGreen` and `NodeStateInstructionRed` to reflect the changes within the red agent pattern-of-life capability.
-* The reward function has been enhanced so that node attribute states of resetting, patching, repairing, and restarting contribute to the overall reward value.
+-   The reward function has been enhanced so that node attribute states of resetting, patching, repairing, and restarting contribute to the overall reward value.
-* The User Guide has been updated to reflect all the above changes.
+-   The User Guide has been updated to reflect all the above changes.
 ### Changed
-* "config_1_DDOS_BASIC.yaml" modified to make it more simplistic to aid evaluation testing.
+-   "config_1_DDOS_BASIC.yaml" modified to make it more simplistic to aid evaluation testing.
-* "config_2_DDOS_BASIC.yaml" updated to reflect the addition of the File System State and the Red Agent node pattern-of-life enhancement.
+-   "config_2_DDOS_BASIC.yaml" updated to reflect the addition of the File System State and the Red Agent node pattern-of-life enhancement.
-* "config_3_DOS_VERY_BASIC.yaml" updated to reflect the addition of the File System State and the Red Agent node pattern-of-life enhancement.
+-   "config_3_DOS_VERY_BASIC.yaml" updated to reflect the addition of the File System State and the Red Agent node pattern-of-life enhancement.
-* "config_UNIT_TEST.yaml" is a copy of the new "config_5_DATA_MANIPULATION.yaml" file.
+-   "config_UNIT_TEST.yaml" is a copy of the new "config_5_DATA_MANIPULATION.yaml" file.
-* Updates to Transactions.
+-   Updates to Transactions.
 ### Fixed
-* Fixed "config_2_DDOS_BASIC.yaml" by adding another ACL rule to allow traffic to flow from Node 9 to Node 3. Previously, there was no rule, so one of the green IERs could not flow by default.
+-   Fixed "config_2_DDOS_BASIC.yaml" by adding another ACL rule to allow traffic to flow from Node 9 to Node 3. Previously, there was no rule, so one of the green IERs could not flow by default.
 [unreleased]: https://github.com/Autonomous-Resilient-Cyber-Defence/PrimAITE/compare/v2.0.0...HEAD
 [2.0.0]: https://github.com/Autonomous-Resilient-Cyber-Defence/PrimAITE/releases/tag/v2.0.0
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -13,9 +13,6 @@
 * [Fork the repository](https://github.com/Autonomous-Resilient-Cyber-Defence/PrimAITE/fork).
 * Install the pre-commit hook with `pre-commit install`.
 * Implement the bug fix.
 * Update documentation where applicable.
 * Update the **UNRELEASED** section of the [CHANGELOG.md](CHANGELOG.md) file
 * Write a suitable test/tests.
 * Commit the bug fix to the dev branch on your fork. If the bug has an open issue under [Issues](https://github.com/Autonomous-Resilient-Cyber-Defence/PrimAITE/issues), reference the issue in the commit message (e.g. #1 references issue 1).
 * Submit a pull request from your dev branch to the Autonomous-Resilient-Cyber-Defence/PrimAITE dev branch. Again, if the bug has an open issue under [Issues](https://github.com/Autonomous-Resilient-Cyber-Defence/PrimAITE/issues), reference the issue in the pull request description.
--- a/11
+++ b/11
@@ -1,21 +1,24 @@
-MIT License
+MIT License License
-Copyright (c) 2023 - 2025 Defence Science and Technology Laboratory UK (https://dstl.gov.uk)
+MIT License Conditions
-Permission is hereby granted, free of charge, to any person obtaining a copy
+These MIT License conditions confirm the provision of the following artefacts as MIT License by Defence Science and Technology
 of this software and associated documentation files (the "Software"), to deal
 in the Software without restriction, including without limitation the rights
 to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:
 The above copyright notice and this permission notice shall be included in all
 copies or substantial portions of the Software.
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -1,3 +1,2 @@
 include src/primaite/setup/_package_data/primaite_config.yaml
 include src/primaite/config/_package_data/*.yaml
 include src/primaite/simulator/_package_data/*.ipynb
--- a/README.md
+++ b/README.md
@@ -24,6 +24,8 @@ PrimAITE presents the following features:
 - Support for multiple agents, each having their own customisable observation space, action space, and reward function definition, and either deterministic or RL-directed behaviour
 Whilst PrimAITE ships with a number of example modelled scenarios (a.k.a. Use Cases), it has not been developed to mandate the solving of a single cyber challenge, and instead provides a highly flexible environment application that can be extended and reconfigured by the user to suit their specific cyber defence training and evaluation needs. PrimAITE provides default networks, red agent and green agent behaviour, reward functions, and action / observation space configuration, all of which can be utilised out of the box, but which ultimately can (and in some instances should) be built upon and / or reconfigured to meet the needs of different defensive agent developers. The PrimAITE user guide provides comprehensive instruction on all PrimAITE features, functionality and components, and can be consulted in order to help guide users in any reconfiguration or enhancements they wish to undertake; a library of example Jupyter notebooks is also provided to support such work.
 ## Getting Started with PrimAITE
 ### 💫 Installation
@@ -33,7 +35,7 @@ Currently, the PrimAITE wheel can only be installed from GitHub. This may change
 #### Windows (PowerShell)
 **Prerequisites:**
-* Manual install of Python >= 3.8 < 3.12
+* Manual install of Python >= 3.9 < 3.12
 **Install:**
@@ -43,7 +45,7 @@ cd ~\primaite
 python3 -m venv .venv
 attrib +h .venv /s /d # Hides the .venv directory
 .\.venv\Scripts\activate
-pip install primaite-3.0.0-py3-none-any.whl[rl]
+pip install primaite-{VERSION}-py3-none-any.whl[rl]
 primaite setup
 ```
@@ -66,7 +68,7 @@ mkdir ~/primaite
 cd ~/primaite
 python3 -m venv .venv
 source .venv/bin/activate
-pip install primaite-3.0.0-py3-none-any.whl[rl]
+pip install primaite-{VERSION}-py3-none-any.whl[rl]
 primaite setup
 ```
--- a/_config.yml
+++ b/_config.yml
@@ -0,0 +1,3 @@
 # Used by nbmake to change build pipeline notebook timeout
 execute:
  timeout: 600
--- a/benchmark/benchmark.py
+++ b/benchmark/benchmark.py
@@ -1,4 +1,4 @@
-# © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+# © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 from typing import Any, Dict, Optional, Tuple
 from gymnasium.core import ObsType
--- a/benchmark/primaite_benchmark.py
+++ b/benchmark/primaite_benchmark.py
@@ -1,11 +1,11 @@
-# © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+# © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 import json
 import shutil
 from datetime import datetime
 from pathlib import Path
 from typing import Any, Dict, Final, Tuple
-from report import build_benchmark_md_report
+from report import build_benchmark_md_report, md2pdf
 from stable_baselines3 import PPO
 import primaite
@@ -159,6 +159,13 @@ def run(
    learning_rate: float = 3e-4,
 ) -> None:
    """Run the PrimAITE benchmark."""
    # generate report folder
    v_str = f"v{primaite.__version__}"
    version_result_dir = _RESULTS_ROOT / v_str
    version_result_dir.mkdir(exist_ok=True, parents=True)
    output_path = version_result_dir / f"PrimAITE {v_str} Benchmark Report.md"
    benchmark_start_time = datetime.now()
    session_metadata_dict = {}
@@ -193,6 +200,12 @@ def run(
        session_metadata=session_metadata_dict,
        config_path=data_manipulation_config_path(),
        results_root_path=_RESULTS_ROOT,
        output_path=output_path,
    )
    md2pdf(
        md_path=output_path,
        pdf_path=str(output_path).replace(".md", ".pdf"),
        css_path="static/styles.css",
    )
--- a/benchmark/report.py
+++ b/benchmark/report.py
@@ -1,7 +1,8 @@
-# © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+# © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 import json
 import sys
 from datetime import datetime
 from os import PathLike
 from pathlib import Path
 from typing import Dict, Optional
@@ -14,11 +15,9 @@ from utils import _get_system_info
 import primaite
 PLOT_CONFIG = {
-    "size": {"auto_size": False, "width": 1500, "height": 900},
+    "size": {"auto_size": False, "width": 800, "height": 640},
    "template": "plotly_white",
    "range_slider": False,
    "av_s_per_100_steps_10_nodes_benchmark_threshold": 5,
    "benchmark_line_color": "grey",
 }
@@ -146,6 +145,20 @@ def _plot_benchmark_metadata(
        yaxis={"title": "Total Reward"},
        title=title,
    )
    fig.update_layout(
        legend=dict(
            yanchor="top",
            y=0.99,
            xanchor="left",
            x=0.01,
            bgcolor="rgba(255,255,255,0.3)",
        )
    )
    for trace in fig["data"]:
        if trace["name"].startswith("Session"):
            trace["showlegend"] = False
    fig["data"][0]["name"] = "Individual Sessions"
    fig["data"][0]["showlegend"] = True
    return fig
@@ -196,6 +209,7 @@ def _plot_all_benchmarks_combined_session_av(results_directory: Path) -> Figure:
        title=title,
    )
    fig["data"][0]["showlegend"] = True
    fig.update_layout(legend=dict(yanchor="top", y=-0.2, xanchor="left", x=0.01, orientation="h"))
    return fig
@@ -229,20 +243,14 @@ def _plot_av_s_per_100_steps_10_nodes(
    """
    Creates a bar chart visualising the performance of each version of PrimAITE.
-    Performance is based on the average training time per 100 steps on 10 nodes. The function also includes a benchmark
+    Performance is based on the average training time per 100 steps on 10 nodes.
    line indicating the target maximum time.
    Versions that perform under this time are marked in green, and those over are marked in red.
    :param version_times_dict: A dictionary with software versions as keys and average times as values.
    :return: A Plotly figure object representing the bar chart of the performance metrics.
    """
    major_v = primaite.__version__.split(".")[0]
    title = f"Performance of Minor and Bugfix Releases for Major Version {major_v}"
-    subtitle = (
+    subtitle = "Average Training Time per 100 Steps on 10 Nodes "
        f"Average Training Time per 100 Steps on 10 Nodes "
        f"(target: <= {PLOT_CONFIG['av_s_per_100_steps_10_nodes_benchmark_threshold']} seconds)"
    )
    title = f"{title} <br><sub>{subtitle}</sub>"
    layout = go.Layout(
@@ -255,42 +263,12 @@ def _plot_av_s_per_100_steps_10_nodes(
    versions = sorted(list(version_times_dict.keys()))
    times = [version_times_dict[version] for version in versions]
    av_s_per_100_steps_10_nodes_benchmark_threshold = PLOT_CONFIG["av_s_per_100_steps_10_nodes_benchmark_threshold"]
    benchmark_line_color = PLOT_CONFIG["benchmark_line_color"]
-    # Calculate the appropriate maximum y-axis value
+    fig.add_trace(go.Bar(x=versions, y=times, text=times, textposition="auto", texttemplate="%{y:.3f}"))
    max_y_axis_value = max(max(times), av_s_per_100_steps_10_nodes_benchmark_threshold) + 1
    fig.add_trace(
        go.Bar(
            x=versions,
            y=times,
            marker_color=[
                "green" if time < av_s_per_100_steps_10_nodes_benchmark_threshold else "red" for time in times
            ],
            text=times,
            textposition="auto",
        )
    )
    # Add a horizontal line for the benchmark
    fig.add_shape(
        type="line",
        x0=-0.5,  # start slightly before the first bar
        x1=len(versions) - 0.5,  # end slightly after the last bar
        y0=av_s_per_100_steps_10_nodes_benchmark_threshold,
        y1=av_s_per_100_steps_10_nodes_benchmark_threshold,
        line=dict(
            color=benchmark_line_color,
            width=2,
            dash="dot",
        ),
    )
    fig.update_layout(
        xaxis_title="PrimAITE Version",
        yaxis_title="Avg Time per 100 Steps on 10 Nodes (seconds)",
        yaxis=dict(range=[0, max_y_axis_value]),
        title=title,
    )
@@ -298,7 +276,11 @@ def _plot_av_s_per_100_steps_10_nodes(
 def build_benchmark_md_report(
-    benchmark_start_time: datetime, session_metadata: Dict, config_path: Path, results_root_path: Path
+    benchmark_start_time: datetime,
    session_metadata: Dict,
    config_path: Path,
    results_root_path: Path,
    output_path: PathLike,
 ) -> None:
    """
    Generates a Markdown report for a benchmarking session, documenting performance metrics and graphs.
@@ -350,7 +332,7 @@ def build_benchmark_md_report(
    data = benchmark_metadata_dict
    primaite_version = data["primaite_version"]
-    with open(version_result_dir / f"PrimAITE v{primaite_version} Benchmark Report.md", "w") as file:
+    with open(output_path, "w") as file:
        # Title
        file.write(f"# PrimAITE v{primaite_version} Learning Benchmark\n")
        file.write("## PrimAITE Dev Team\n")
@@ -424,3 +406,15 @@ def build_benchmark_md_report(
            f"![Performance of Minor and Bugfix Releases for Major Version {major_v}]"
            f"({performance_benchmark_plot_path.name})\n"
        )
 def md2pdf(md_path: PathLike, pdf_path: PathLike, css_path: PathLike) -> None:
    """Generate PDF version of Markdown report."""
    from md2pdf.core import md2pdf
    md2pdf(
        pdf_file_path=pdf_path,
        md_file_path=md_path,
        base_url=Path(md_path).parent,
        css_file_path=css_path,
    )
--- a/benchmark/results/v3/PrimAITE
+++ b/benchmark/results/v3/PrimAITE
--- a/benchmark/results/v3/v3.0.0/PrimAITE
+++ b/benchmark/results/v3/v3.0.0/PrimAITE
--- a/benchmark/results/v3/v3.0.0/PrimAITE
+++ b/benchmark/results/v3/v3.0.0/PrimAITE
--- a/benchmark/results/v3/v3.0.0/PrimAITE
+++ b/benchmark/results/v3/v3.0.0/PrimAITE
--- a/benchmark/results/v3/v3.0.0/PrimAITE
+++ b/benchmark/results/v3/v3.0.0/PrimAITE
--- a/benchmark/results/v3/v3.0.0/session_metadata/1.json
+++ b/benchmark/results/v3/v3.0.0/session_metadata/1.json
@@ -1006,4 +1006,4 @@
        "999": 112.49999999999994,
        "1000": 115.2500000000002
    }
-}
+}
--- a/benchmark/results/v3/v3.0.0/session_metadata/2.json
+++ b/benchmark/results/v3/v3.0.0/session_metadata/2.json
@@ -1006,4 +1006,4 @@
        "999": 103.49999999999994,
        "1000": 117.9500000000001
    }
-}
+}
--- a/benchmark/results/v3/v3.0.0/session_metadata/3.json
+++ b/benchmark/results/v3/v3.0.0/session_metadata/3.json
@@ -1006,4 +1006,4 @@
        "999": 112.00000000000017,
        "1000": 106.10000000000002
    }
-}
+}
--- a/benchmark/results/v3/v3.0.0/session_metadata/4.json
+++ b/benchmark/results/v3/v3.0.0/session_metadata/4.json
@@ -1006,4 +1006,4 @@
        "999": 100.75000000000009,
        "1000": 110.70000000000007
    }
-}
+}
--- a/benchmark/results/v3/v3.0.0/session_metadata/5.json
+++ b/benchmark/results/v3/v3.0.0/session_metadata/5.json
@@ -1006,4 +1006,4 @@
        "999": 110.6500000000001,
        "1000": 113.10000000000015
    }
-}
+}
--- a/benchmark/results/v3/v3.0.0/v3.0.0_benchmark_metadata.json
+++ b/benchmark/results/v3/v3.0.0/v3.0.0_benchmark_metadata.json
@@ -7433,4 +7433,4 @@
            }
        }
    }
-}
+}
--- a/benchmark/results/v3/v3.1.0/PrimAITE
+++ b/benchmark/results/v3/v3.1.0/PrimAITE
--- a/benchmark/results/v3/v3.1.0/PrimAITE
+++ b/benchmark/results/v3/v3.1.0/PrimAITE
--- a/benchmark/results/v3/v3.1.0/PrimAITE
+++ b/benchmark/results/v3/v3.1.0/PrimAITE
@@ -1,10 +1,10 @@
-# PrimAITE v3.0.0 Learning Benchmark
+# PrimAITE v3.1.0 Learning Benchmark
 ## PrimAITE Dev Team
 ### 2024-07-20
 ---
 ## 1 Introduction
-PrimAITE v3.0.0 was benchmarked automatically upon release. Learning rate metrics were captured to be referenced during system-level testing and user acceptance testing (UAT).
+PrimAITE v3.1.0 was benchmarked automatically upon release. Learning rate metrics were captured to be referenced during system-level testing and user acceptance testing (UAT).
 The benchmarking process consists of running 5 training sessions using the same config file. Each session trains an agent for 1000 episodes, with each episode consisting of 128 steps.
 The total reward per episode from each session is captured. This is then used to calculate an average total reward per episode from the 5 individual sessions for smoothing. Finally, a 25-window rolling average of the average total reward per session is calculated for further smoothing.
 ## 2 System Information
@@ -26,12 +26,12 @@ The total reward per episode from each session is captured. This is then used to
 - **Total Sessions:** 5
 - **Total Episodes:** 5005
 - **Total Steps:** 640000
-- **Av Session Duration (s):** 1452.5910
+- **Av Session Duration (s):** 1632.8888
-- **Av Step Duration (s):** 0.0454
+- **Av Step Duration (s):** 0.0510
-- **Av Duration per 100 Steps per 10 Nodes (s):** 4.5393
+- **Av Duration per 100 Steps per 10 Nodes (s):** 5.1028
 ## 4 Graphs
-### 4.1 v3.0.0 Learning Benchmark Plot
+### 4.1 v3.1.0 Learning Benchmark Plot
-![PrimAITE 3.0.0 Learning Benchmark Plot](PrimAITE v3.0.0 Learning Benchmark.png)
+![PrimAITE 3.1.0 Learning Benchmark Plot](PrimAITE v3.1.0 Learning Benchmark.png)
 ### 4.2 Learning Benchmark of Minor and Bugfix Releases for Major Version 3
 ![Learning Benchmark of Minor and Bugfix Releases for Major Version 3](PrimAITE Learning Benchmark of Minor and Bugfix Releases for Major Version 3.png)
 ### 4.3 Performance of Minor and Bugfix Releases for Major Version 3
--- a/benchmark/results/v3/v3.1.0/PrimAITE
+++ b/benchmark/results/v3/v3.1.0/PrimAITE
--- a/benchmark/results/v3/v3.1.0/session_metadata/1.json
+++ b/benchmark/results/v3/v3.1.0/session_metadata/1.json
--- a/benchmark/results/v3/v3.1.0/session_metadata/2.json
+++ b/benchmark/results/v3/v3.1.0/session_metadata/2.json
--- a/benchmark/results/v3/v3.1.0/session_metadata/3.json
+++ b/benchmark/results/v3/v3.1.0/session_metadata/3.json
--- a/benchmark/results/v3/v3.1.0/session_metadata/4.json
+++ b/benchmark/results/v3/v3.1.0/session_metadata/4.json
--- a/benchmark/results/v3/v3.1.0/session_metadata/5.json
+++ b/benchmark/results/v3/v3.1.0/session_metadata/5.json
--- a/benchmark/results/v3/v3.1.0/v3.1.0_benchmark_metadata.json
+++ b/benchmark/results/v3/v3.1.0/v3.1.0_benchmark_metadata.json
--- a/benchmark/results/v3/v3.2.0/PrimAITE
+++ b/benchmark/results/v3/v3.2.0/PrimAITE
--- a/benchmark/results/v3/v3.2.0/PrimAITE
+++ b/benchmark/results/v3/v3.2.0/PrimAITE
--- a/benchmark/results/v3/v3.2.0/PrimAITE
+++ b/benchmark/results/v3/v3.2.0/PrimAITE
@@ -0,0 +1,38 @@
 # PrimAITE v3.2.0 Learning Benchmark
 ## PrimAITE Dev Team
 ### 2024-07-21
 ---
 ## 1 Introduction
 PrimAITE v3.2.0 was benchmarked automatically upon release. Learning rate metrics were captured to be referenced during system-level testing and user acceptance testing (UAT).
 The benchmarking process consists of running 5 training sessions using the same config file. Each session trains an agent for 1000 episodes, with each episode consisting of 128 steps.
 The total reward per episode from each session is captured. This is then used to calculate an average total reward per episode across the 5 individual sessions for smoothing. Finally, a 25-window rolling average of the average total reward per episode is calculated for further smoothing.
 ## 2 System Information
 ### 2.1 Python
 **Version:** 3.10.14 (main, Apr  6 2024, 18:45:05) [GCC 9.4.0]
 ### 2.2 System
 - **OS:** Linux
 - **OS Version:** #76~20.04.1-Ubuntu SMP Thu Jun 13 18:00:23 UTC 2024
 - **Machine:** x86_64
 - **Processor:** x86_64
 ### 2.3 CPU
 - **Physical Cores:** 2
 - **Total Cores:** 4
 - **Max Frequency:** 0.00Mhz
 ### 2.4 Memory
 - **Total:** 15.62GB
 - **Swap Total:** 0.00B
 ## 3 Stats
 - **Total Sessions:** 5
 - **Total Episodes:** 5005
 - **Total Steps:** 640000
 - **Av Session Duration (s):** 1691.5034
 - **Av Step Duration (s):** 0.0529
 - **Av Duration per 100 Steps per 10 Nodes (s):** 5.2859
 ## 4 Graphs
 ### 4.1 v3.2.0 Learning Benchmark Plot
 ![PrimAITE 3.2.0 Learning Benchmark Plot](PrimAITE v3.2.0 Learning Benchmark.png)
 ### 4.2 Learning Benchmark of Minor and Bugfix Releases for Major Version 3
 ![Learning Benchmark of Minor and Bugfix Releases for Major Version 3](PrimAITE Learning Benchmark of Minor and Bugfix Releases for Major Version 3.png)
 ### 4.3 Performance of Minor and Bugfix Releases for Major Version 3
 ![Performance of Minor and Bugfix Releases for Major Version 3](PrimAITE Performance of Minor and Bugfix Releases for Major Version 3.png)
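The two smoothing stages described in the benchmark introduction (averaging each episode across sessions, then taking a rolling average) can be sketched as below. This is a minimal illustration, not the benchmark's own code: the function names are hypothetical, the toy data stands in for the per-session `"episode": reward` JSON maps, and the handling of the first few points (a partial window) is an assumption.

```python
from statistics import mean
from typing import Dict, List


def average_across_sessions(sessions: List[Dict[str, float]]) -> Dict[str, float]:
    """Average the total reward of each episode over all sessions."""
    episodes = sorted(sessions[0], key=int)  # keys are episode numbers stored as strings
    return {ep: mean(s[ep] for s in sessions) for ep in episodes}


def rolling_average(values: List[float], window: int = 25) -> List[float]:
    """Trailing rolling mean; early points use however many values exist so far (assumption)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Toy data: 3 sessions of 4 episodes (the benchmark uses 5 sessions of 1000 episodes).
sessions = [
    {"1": 10.0, "2": 20.0, "3": 30.0, "4": 40.0},
    {"1": 20.0, "2": 30.0, "3": 40.0, "4": 50.0},
    {"1": 30.0, "2": 40.0, "3": 50.0, "4": 60.0},
]
per_episode = average_across_sessions(sessions)
smoothed = rolling_average([per_episode[ep] for ep in sorted(per_episode, key=int)], window=2)
print(smoothed)
```

With the benchmark's settings the second call would use `window=25` over the 1000 averaged episode rewards.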
--- a/benchmark/results/v3/v3.2.0/PrimAITE
+++ b/benchmark/results/v3/v3.2.0/PrimAITE
--- a/benchmark/results/v3/v3.2.0/session_metadata/1.json
+++ b/benchmark/results/v3/v3.2.0/session_metadata/1.json
--- a/benchmark/results/v3/v3.2.0/session_metadata/2.json
+++ b/benchmark/results/v3/v3.2.0/session_metadata/2.json
--- a/benchmark/results/v3/v3.2.0/session_metadata/3.json
+++ b/benchmark/results/v3/v3.2.0/session_metadata/3.json
--- a/benchmark/results/v3/v3.2.0/session_metadata/4.json
+++ b/benchmark/results/v3/v3.2.0/session_metadata/4.json
--- a/benchmark/results/v3/v3.2.0/session_metadata/5.json
+++ b/benchmark/results/v3/v3.2.0/session_metadata/5.json
--- a/benchmark/results/v3/v3.2.0/v3.2.0_benchmark_metadata.json
+++ b/benchmark/results/v3/v3.2.0/v3.2.0_benchmark_metadata.json
--- a/benchmark/results/v3/v3.3.0/PrimAITE
+++ b/benchmark/results/v3/v3.3.0/PrimAITE
--- a/benchmark/results/v3/v3.3.0/PrimAITE
+++ b/benchmark/results/v3/v3.3.0/PrimAITE
--- a/benchmark/results/v3/v3.3.0/PrimAITE
+++ b/benchmark/results/v3/v3.3.0/PrimAITE
@@ -0,0 +1,38 @@
 # PrimAITE v3.3.0 Learning Benchmark
 ## PrimAITE Dev Team
 ### 2024-09-02
 ---
 ## 1 Introduction
 PrimAITE v3.3.0 was benchmarked automatically upon release. Learning rate metrics were captured to be referenced during system-level testing and user acceptance testing (UAT).
 The benchmarking process consists of running 5 training sessions using the same config file. Each session trains an agent for 1000 episodes, with each episode consisting of 128 steps.
 The total reward per episode from each session is captured. This is then used to calculate an average total reward per episode across the 5 individual sessions for smoothing. Finally, a 25-window rolling average of the average total reward per episode is calculated for further smoothing.
 ## 2 System Information
 ### 2.1 Python
 **Version:** 3.10.14 (main, Apr  6 2024, 18:45:05) [GCC 9.4.0]
 ### 2.2 System
 - **OS:** Linux
 - **OS Version:** #76~20.04.1-Ubuntu SMP Thu Jun 13 18:00:23 UTC 2024
 - **Machine:** x86_64
 - **Processor:** x86_64
 ### 2.3 CPU
 - **Physical Cores:** 2
 - **Total Cores:** 4
 - **Max Frequency:** 0.00Mhz
 ### 2.4 Memory
 - **Total:** 15.62GB
 - **Swap Total:** 0.00B
 ## 3 Stats
 - **Total Sessions:** 5
 - **Total Episodes:** 5005
 - **Total Steps:** 640000
 - **Av Session Duration (s):** 1458.2831
 - **Av Step Duration (s):** 0.0456
 - **Av Duration per 100 Steps per 10 Nodes (s):** 4.5571
 ## 4 Graphs
 ### 4.1 v3.3.0 Learning Benchmark Plot
 ![PrimAITE 3.3.0 Learning Benchmark Plot](PrimAITE v3.3.0 Learning Benchmark.png)
 ### 4.2 Learning Benchmark of Minor and Bugfix Releases for Major Version 3
 ![Learning Benchmark of Minor and Bugfix Releases for Major Version 3](PrimAITE Learning Benchmark of Minor and Bugfix Releases for Major Version 3.png)
 ### 4.3 Performance of Minor and Bugfix Releases for Major Version 3
 ![Performance of Minor and Bugfix Releases for Major Version 3](PrimAITE Performance of Minor and Bugfix Releases for Major Version 3.png)
--- a/benchmark/results/v3/v3.3.0/PrimAITE
+++ b/benchmark/results/v3/v3.3.0/PrimAITE
--- a/benchmark/results/v3/v3.3.0/PrimAITE
+++ b/benchmark/results/v3/v3.3.0/PrimAITE
--- a/benchmark/results/v3/v3.3.0/session_metadata/1.json
+++ b/benchmark/results/v3/v3.3.0/session_metadata/1.json
--- a/benchmark/results/v3/v3.3.0/session_metadata/2.json
+++ b/benchmark/results/v3/v3.3.0/session_metadata/2.json
--- a/benchmark/results/v3/v3.3.0/session_metadata/3.json
+++ b/benchmark/results/v3/v3.3.0/session_metadata/3.json
--- a/benchmark/results/v3/v3.3.0/session_metadata/4.json
+++ b/benchmark/results/v3/v3.3.0/session_metadata/4.json
--- a/benchmark/results/v3/v3.3.0/session_metadata/5.json
+++ b/benchmark/results/v3/v3.3.0/session_metadata/5.json
--- a/benchmark/results/v3/v3.3.0/v3.3.0_benchmark_metadata.json
+++ b/benchmark/results/v3/v3.3.0/v3.3.0_benchmark_metadata.json
--- a/benchmark/static/styles.css
+++ b/benchmark/static/styles.css
@@ -0,0 +1,34 @@
 body {
    font-family: 'Arial', sans-serif;
    line-height: 1.6;
    /* margin: 1cm; */
 }
 h1, h2, h3, h4, h5, h6 {
    font-weight: bold;
    /* margin: 1em 0; */
 }
 p {
    /* margin: 0.5em 0; */
 }
 ul, ol {
    margin: 1em 0;
    padding-left: 1.5em;
 }
 pre {
    background: #f4f4f4;
    padding: 0.5em;
    overflow-x: auto;
 }
 img {
    max-width: 100%;
    height: auto;
 }
 table {
    width: 100%;
    border-collapse: collapse;
    margin: 1em 0;
 }
 th, td {
    padding: 0.5em;
    border: 1px solid #ddd;
 }
--- a/benchmark/utils.py
+++ b/benchmark/utils.py
@@ -1,4 +1,4 @@
-# © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+# © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 import platform
 from typing import Dict
--- a/copyright_clause_pre_commit_hook.py
+++ b/copyright_clause_pre_commit_hook.py
@@ -0,0 +1,154 @@
 # -*- coding: utf-8 -*-
 import datetime
 import sys
 from pathlib import Path
 # Constants
 CURRENT_YEAR = datetime.date.today().year
 COPYRIGHT_PY_STR = f"# © Crown-owned copyright {CURRENT_YEAR}, Defence Science and Technology Laboratory UK"
 COPYRIGHT_RST_LINES = [
    ".. only:: comment",
    "",
    f"    © Crown-owned copyright {CURRENT_YEAR}, Defence Science and Technology Laboratory UK",
 ]
 PATHS = {Path("./src"), Path("./tests"), Path("./docs"), Path("./benchmark")}
 EXTENSIONS = {".py", ".rst"}
 def _is_copyright_line(line: str) -> bool:
    """
    Check if a line is a copyright line.
    :param line: The line to check.
    :return: True if the line is a copyright line, False otherwise.
    """
    return line.startswith("#") and "copyright" in line.lower()
 def _is_rst_copyright_lines(lines: list) -> bool:
    """
    Check if the lines match the RST copyright format.
    :param lines: The lines to check.
    :return: True if the lines match the RST copyright format, False otherwise.
    """
    return len(lines) >= 3 and lines[0] == ".. only:: comment" and "copyright" in lines[2].lower()
 def process_py_file(file_path: Path) -> bool:
    """
    Process a Python file to check and add/update the copyright clause.
    :param file_path: The path to the file to check and update.
    :return: True if the file was modified, False otherwise.
    """
    modified = False
    try:
        content = file_path.read_text(encoding="utf-8")
        lines = content.splitlines(keepends=True)  # Keep line endings
        if lines and _is_copyright_line(lines[0]):
            if lines[0].strip() != COPYRIGHT_PY_STR:
                lines[0] = COPYRIGHT_PY_STR + "\n"
                modified = True
                print(f"Updated copyright clause in {file_path}")
        else:
            lines.insert(0, COPYRIGHT_PY_STR + "\n")
            modified = True
            print(f"Added copyright clause to {file_path}")
        if modified:
            file_path.write_text("".join(lines), encoding="utf-8")
    except Exception as e:
        print(f"Failed to process {file_path}: {e}")
        return False
    return modified
 def process_rst_file(file_path: Path) -> bool:
    """
    Process an RST file to check and add/update the copyright clause.
    :param file_path: The path to the file to check and update.
    :return: True if the file was modified, False otherwise.
    """
    modified = False
    try:
        content = file_path.read_text(encoding="utf-8")
        lines = content.splitlines(keepends=True)  # Keep line endings
        existing_block = any(".. only:: comment" in line for line in lines)
        if existing_block:
            # Check if the block is correct
            for i, line in enumerate(lines):
                if line.strip() == ".. only:: comment":
                    if lines[i : i + 3] != [
                        COPYRIGHT_RST_LINES[0] + "\n",
                        COPYRIGHT_RST_LINES[1] + "\n",
                        COPYRIGHT_RST_LINES[2] + "\n",
                    ]:
                        # Update the incorrect block
                        lines[i : i + 3] = [
                            COPYRIGHT_RST_LINES[0] + "\n",
                            COPYRIGHT_RST_LINES[1] + "\n",
                            COPYRIGHT_RST_LINES[2] + "\n",
                        ]
                        modified = True
                        print(f"Updated copyright clause in {file_path}")
                    break
        else:
            # Insert new copyright block
            lines = [line + "\n" for line in COPYRIGHT_RST_LINES] + ["\n"] + lines
            modified = True
            print(f"Added copyright clause to {file_path}")
        if modified:
            file_path.write_text("".join(lines), encoding="utf-8")
    except Exception as e:
        print(f"Failed to process {file_path}: {e}")
        return False
    return modified
 def process_file(file_path: Path) -> bool:
    """
    Check if a file has the correct copyright clause and add or update it if necessary.
    :param file_path: The path to the file to check and update.
    :return: True if the file was modified, False otherwise.
    """
    if file_path.suffix == ".py":
        return process_py_file(file_path)
    elif file_path.suffix == ".rst":
        return process_rst_file(file_path)
    return False
 def main() -> int:
    """
    Main function to walk through the root directories, check files, and update the copyright clause.
    :return: 1 if any file was modified, 0 otherwise.
    """
    files_checked = 0
    files_modified = 0
    any_file_modified = False
    for path in PATHS:
        for file_path in path.rglob("*"):
            if file_path.suffix in EXTENSIONS:
                files_checked += 1
                if process_file(file_path):
                    files_modified += 1
                    any_file_modified = True
    if any_file_modified:
        print(f"Files Checked: {files_checked}. Files Modified: {files_modified}")
        return 1
    return 0
 if __name__ == "__main__":
    sys.exit(main())
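To make the first-line rule in `process_py_file` easier to reason about, here is a small in-memory sketch of the same decision logic. It is an illustration only: `ensure_py_header` is a hypothetical helper that works on strings rather than files, and the year is pinned to 2025 so the result is reproducible (the hook itself uses the current year).

```python
from typing import Tuple

YEAR = 2025  # pinned for the example; the hook derives this from today's date
HEADER = f"# © Crown-owned copyright {YEAR}, Defence Science and Technology Laboratory UK"


def ensure_py_header(text: str) -> Tuple[str, bool]:
    """Return (new_text, modified) after applying the first-line header rule."""
    lines = text.splitlines(keepends=True)
    first_is_copyright = bool(lines) and lines[0].startswith("#") and "copyright" in lines[0].lower()
    if first_is_copyright:
        if lines[0].strip() == HEADER:
            return text, False  # header already correct, nothing to do
        lines[0] = HEADER + "\n"  # stale year or wording: replace in place
    else:
        lines.insert(0, HEADER + "\n")  # no header at all: prepend one
    return "".join(lines), True


stale = "# © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK\nimport os\n"
missing = "import os\n"
with_shebang = "#!/usr/bin/env python3\nimport os\n"

print(ensure_py_header(stale))
print(ensure_py_header(missing))
print(ensure_py_header(with_shebang))
```

One consequence worth noting: a first line that starts with `#` but does not contain "copyright", such as a shebang, is treated as "no header", so the clause is inserted above it. A shebang only takes effect on the first line of a file, so directly executable scripts may need special handling.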
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -1,3 +1,4 @@
 # © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 # Minimal makefile for Sphinx documentation
 # You can set these variables from the command line, and also
 # from the environment for the first two.
@@ -7,16 +8,18 @@ SOURCEDIR     = .
 BUILDDIR      = _build
 AUTOSUMMARY="source/_autosummary"
 NOTEBOOKS="source/notebooks/notebooks"
 # Remove command is different depending on OS
 ifdef OS
-	RM = IF exist $(AUTOSUMMARY) (  RMDIR $(AUTOSUMMARY) /s /q )
+	RM = IF exist $(AUTOSUMMARY) (RMDIR $(AUTOSUMMARY) /s /q) & IF exist $(NOTEBOOKS) (RMDIR $(NOTEBOOKS) /s /q)
 else
   ifeq ($(shell uname), Linux)
-      RM = rm -rf $(AUTOSUMMARY)
+      RM = rm -rf $(AUTOSUMMARY) $(NOTEBOOKS)
   endif
 endif
 # Put it first so that "make" without argument is like "make help".
 help:
 	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
--- a/docs/_static/c2_sequence.png
+++ b/docs/_static/c2_sequence.png
--- a/docs/_templates/custom-class-template.rst
+++ b/docs/_templates/custom-class-template.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 ..
    Credit to https://github.com/JamesALeedham/Sphinx-Autosummary-Recursion for the custom templates.
@@ -12,7 +12,8 @@
 .. autoclass:: {{ objname }}
   :members:
   :show-inheritance:
-   :inherited-members:
+   :inherited-members: BaseModel
   :exclude-members: model_computed_fields, model_config, model_fields
   :special-members: __init__, __call__, __add__, __mul__
   {% block methods %}
@@ -22,7 +23,14 @@
   .. autosummary::
      :nosignatures:
   {% for item in methods %}
-      {%- if not item.startswith('_') %}
+      {%- if not item.startswith('_') and item not in [
         'construct', 'copy', 'dict', 'from_orm', 'json', 'model_construct',
         'model_copy', 'model_dump', 'model_dump_json', 'model_json_schema',
         'model_parametrized_name', 'model_post_init', 'model_rebuild', '',
         'model_validate', 'model_validate_json', 'model_validate_strings',
         'parse_file', 'parse_obj', 'parse_raw', 'schema', 'schema_json',
         'update_forward_refs', 'validate',
      ] %}
      ~{{ name }}.{{ item }}
      {%- endif -%}
   {%- endfor %}
@@ -35,7 +43,12 @@
   .. autosummary::
   {% for item in attributes %}
      {%- if not item.startswith('_') and item not in [
         'model_computed_fields', 'model_config', 'model_extra', 'model_fields',
         'model_fields_set',
      ] %}
      ~{{ name }}.{{ item }}
      {%- endif -%}
   {%- endfor %}
   {% endif %}
   {% endblock %}
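The template condition above hides private members and the methods inherited from pydantic's `BaseModel`. The same filter, written as plain Python, may be easier to test in isolation. This is a sketch: `visible_methods` and the sample method names are hypothetical, and the exclusion set simply copies the names listed in the template.

```python
from typing import Iterable, List

# Names copied from the exclusion list in the class template.
PYDANTIC_METHODS = {
    "construct", "copy", "dict", "from_orm", "json", "model_construct",
    "model_copy", "model_dump", "model_dump_json", "model_json_schema",
    "model_parametrized_name", "model_post_init", "model_rebuild",
    "model_validate", "model_validate_json", "model_validate_strings",
    "parse_file", "parse_obj", "parse_raw", "schema", "schema_json",
    "update_forward_refs", "validate",
}


def visible_methods(methods: Iterable[str]) -> List[str]:
    """Keep only public methods that are not inherited pydantic helpers."""
    return [m for m in methods if not m.startswith("_") and m not in PYDANTIC_METHODS]


# Hypothetical method list for a pydantic-based class.
sample = ["__init__", "describe_state", "model_dump", "apply_request", "_reset", "json"]
print(visible_methods(sample))
```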
--- a/docs/_templates/custom-module-template.rst
+++ b/docs/_templates/custom-module-template.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 ..
    Credit to https://github.com/JamesALeedham/Sphinx-Autosummary-Recursion for the custom templates.
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -2,7 +2,7 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 ..
   DO NOT DELETE THIS FILE! It contains the all-important `.. autosummary::` directive with `:recursive:` option, without
--- a/docs/build-sphinx-docs-to-github-pages.sh
+++ b/docs/build-sphinx-docs-to-github-pages.sh
@@ -1,3 +1,4 @@
 # © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 #!/bin/bash
 set -x
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -1,4 +1,4 @@
-# © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+# © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 # Configuration file for the Sphinx documentation builder.
 #
 # For the full list of built-in configuration values, see the documentation:
@@ -147,7 +147,7 @@ def copy_notebooks_to_docs() -> Any:
    This allows developers to create new notebooks without having to worry about updating documentation when
    a new notebook is included within PrimAITE.
    """
-    notebook_asset_types = [".ipynb", ".png"]
+    notebook_asset_types = [".ipynb", ".png", ".svg"]
    notebook_directories = []
    # find paths where notebooks are contained
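The body of `copy_notebooks_to_docs` is not shown in this hunk, so the following is only a sketch of how a suffix list like `notebook_asset_types` is typically used to select files. The helper name `find_notebook_assets` and the temporary directory layout are assumptions made for the example.

```python
import tempfile
from pathlib import Path
from typing import List, Sequence

NOTEBOOK_ASSET_TYPES = [".ipynb", ".png", ".svg"]


def find_notebook_assets(root: Path, suffixes: Sequence[str] = NOTEBOOK_ASSET_TYPES) -> List[Path]:
    """Recursively collect files under root whose suffix is a notebook asset type."""
    return sorted(p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes)


# Build a throwaway directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("a.ipynb", "diagram.svg", "notes.txt"):
        (root / name).write_text("", encoding="utf-8")
    found_names = [p.name for p in find_notebook_assets(root)]

print(found_names)
```

Adding `.svg` to the list means vector diagrams referenced by a notebook are picked up alongside the notebook itself, while unrelated files such as `notes.txt` are still skipped.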
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 Welcome to PrimAITE's documentation
 ====================================
@@ -8,101 +8,6 @@ Welcome to PrimAITE's documentation
 What is PrimAITE?
 -----------------
 Overview
 ^^^^^^^^
 The ARCD Primary-level AI Training Environment (**PrimAITE**) provides an effective simulation capability for training and evaluating AI in a cyber-defensive role. It incorporates the functionality required of a primary-level  ARCD environment:
 - The ability to model a relevant system context;
 - Modelling an adversarial agent that the defensive agent can be trained and evaluated against;
 - The ability to model key characteristics of a system by representing hosts, servers, network devices, IP addresses, ports, operating systems, folders / files, applications, services and links;
 - Modelling background (green) pattern-of-life;
 - Operates at machine-speed to enable fast training cycles via Reinforcement Learning (RL).
 Features
 ^^^^^^^^
 PrimAITE incorporates the following features:
 - Architected with a separate Simulation layer and Game layer. This separation of concerns defines a clear path towards transfer learning with environments of differing fidelity;
 - Ability to reconfigure an RL reward function based on (a) the ability to counter the modelled adversarial cyber-attack, and (b) the ability to ensure success for green agents;
 - Access Control List (ACL) functions for network devices (routers and firewalls), following standard ACL rule format (e.g., DENY / ALLOW, source / destination IP addresses, protocol and port);
 - Application of traffic to the links of the system laydown adheres to the ACL rulesets and routing tables contained within each network device;
 - Provides RL environments adherent to the Farama Foundation Gymnasium (Previously OpenAI Gym) API, allowing integration with any compliant RL Agent frameworks;
 - Provides RL environments adherent to Ray RLlib environment specifications for single-agent and multi-agent scenarios;
 - Assessed for compatibility with Stable-Baselines3 (SB3), Ray RLlib, and bespoke agents;
 - Persona-based adversarial (Red) agent behaviour; several out-the-box personas are provided, and more can be developed to suit the needs of the task. Stochastic variations in Red agent behaviour are also included as required;
 - A robust system logging tool, automatically enabled at the node level and featuring various log levels and terminal output options, enables PrimAITE users to conduct in-depth network simulations;
 - A PCAP service is seamlessly integrated within the simulation, automatically capturing and logging frames for both
  inbound and outbound traffic at the network interface level. This automatic functionality, combined with the ability
  to separate traffic directions, significantly enhances network analysis and troubleshooting capabilities;
 - Agent action logs provide a description of every action taken by each agent during the episode. This includes timestep, action, parameters, request and response, for all Blue agent activity, which is aligned with the Track 2 Common Action / Observation Space (CAOS) format. Action logs also detail all scripted / stochastic red / green agent actions;
 - Environment ground truth is provided at every timestep, providing a full description of the environment’s true state;
 - Alignment with CAOS provides the ability to transfer agents between CAOS compliant environments.
 Architecture
 ^^^^^^^^^^^^
 PrimAITE is a Python application and will operate on multiple Operating Systems (Windows, Linux and Mac);
 a comprehensive installation and user guide is provided with each release to support its usage.
 Configuration of PrimAITE is achieved via included YAML files which support full control over the network / system laydown being modelled, background pattern of life, adversarial (red agent) behaviour, and step and episode count.
 A Simulation Controller layer manages the overall running of the simulation, keeping track of all low-level objects.
 It is agnostic to the number of agents, their action / observation spaces, and the RL library being used.
 It presents a public API providing a method for describing the current state of the simulation, a method that accepts action requests and provides responses, and a method that triggers a timestep advancement.
 The Game Layer converts the simulation into a playable game for the agent(s).
 It translates between simulation state and Gymnasium.Spaces to pass action / observation data between the agent(s) and the simulation. It is responsible for calculating rewards, managing Multi-Agent RL (MARL) action turns, and via a single agent interface can interact with Blue, Red and Green agents.
 Agents can either generate their own scripted behaviour or accept input behaviour from an RL agent.
 Finally, a Gymnasium / Ray RLlib Environment Layer forwards requests to the Game Layer as the agent sends them. This layer also manages most of the I/O, such as reading in the configuration files and saving agent logs.
 .. image:: ../../_static/primAITE_architecture.png
    :width: 500
    :align: center
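The Simulation Controller's public API is described as three methods: one that describes state, one that accepts action requests and returns responses, and one that advances the timestep. A toy sketch of that shape follows. Every name here (`ToySimulation`, `describe_state`, `apply_request`, `apply_timestep`, the request format and the response dictionary) is an assumption made for illustration and is not taken from the PrimAITE codebase.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToySimulation:
    """Stand-in for a simulation layer that knows nothing about agents or RL libraries."""

    timestep: int = 0
    nodes: Dict[str, str] = field(default_factory=lambda: {"pc_1": "ON"})

    def describe_state(self) -> Dict[str, Any]:
        """Return a snapshot of the current simulation state."""
        return {"timestep": self.timestep, "nodes": dict(self.nodes)}

    def apply_request(self, request: List[str]) -> Dict[str, str]:
        """Accept a request of the form ["node", <name>, <verb>] and return a response."""
        if len(request) == 3 and request[0] == "node" and request[1] in self.nodes:
            if request[2] == "shutdown":
                self.nodes[request[1]] = "OFF"
                return {"status": "success"}
        return {"status": "unreachable"}

    def apply_timestep(self) -> None:
        """Advance the simulation by one step."""
        self.timestep += 1


sim = ToySimulation()
response = sim.apply_request(["node", "pc_1", "shutdown"])
sim.apply_timestep()
state = sim.describe_state()
print(response, state)
```

In this arrangement a game layer would call `describe_state` to build observations, translate agent actions into requests, and call `apply_timestep` once per step, which keeps the simulation independent of the agents driving it.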
 Training & Evaluation Capability
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 PrimAITE provides a training and evaluation capability to AI agents in the context of cyber-attack, via its Gymnasium / Ray RLlib compliant interface.
 Scenarios can be constructed to reflect network / system laydowns consisting of any configuration of nodes (e.g., PCs, servers etc.) and the networking equipment and links between them.
 All nodes can be configured to contain applications, services, folders and files (and their status).
 Traffic flows between services and applications as directed by an ‘execution definition,’ with the traffic flow on the network governed by the network equipment (switches, routers and firewalls) and the ACL rules and routing tables they employ.
 Highlights of PrimAITE’s training and evaluation capability are:
 - The scenario is not bound to a representation of any platform, system, or technology;
 - Fully configurable (network / system laydown, green pattern-of-life, red personas, reward function, ACL rules for each device, number of episodes / steps, action / observation space) and repeatable to suit the requirements of AI agents;
 - Can integrate with any Gymnasium / Ray RLlib compliant AI agent.
 PrimAITE provides a number of use cases (network and red/green action configurations) by default which the user is able to extend and modify as required.
 What is PrimAITE built with
 ---------------------------
 * `Gymnasium <https://gymnasium.farama.org/>`_ is used as the basis for AI blue agent interaction with the PrimAITE environment
 * `Networkx <https://github.com/networkx/networkx>`_ provides the underlying data structure for the PrimAITE environment
 * `Stable Baselines 3 <https://github.com/DLR-RM/stable-baselines3>`_ is used as a default source of RL algorithms (although PrimAITE is not limited to SB3 agents)
 * `Ray RLlib <https://github.com/ray-project/ray>`_ is used as an additional source of RL algorithms
 * `Typer <https://github.com/tiangolo/typer>`_ is used for building CLIs (Command Line Interface applications)
 * `Jupyterlab <https://github.com/jupyterlab/jupyterlab>`_ is used as an extensible environment for interactive and reproducible computing, based on the Jupyter Notebook Architecture
 * `Platformdirs <https://github.com/platformdirs/platformdirs>`_ is used for finding the right location to store user data and configuration, which varies per platform
 * `Plotly <https://github.com/plotly/plotly.py>`_ is used for building high level charts
 Getting Started with PrimAITE
 -----------------------------
 Head over to the :ref:`getting-started` page to install and set up PrimAITE!
 .. toctree::
   :maxdepth: 8
   :caption: About PrimAITE:
@@ -112,17 +17,36 @@ Head over to the :ref:`getting-started` page to install and setup PrimAITE!
   source/dependencies
   source/glossary
 .. toctree::
   :maxdepth: 8
   :caption: How To
   :hidden:
   source/how_to
   source/how_to_guides/custom_actions
   source/how_to_guides/custom_environments
   source/how_to_guides/custom_rewards
   source/how_to_guides/custom_software
   source/how_to_guides/using_dev_cli
   source/how_to_guides/extensible_actions
   source/how_to_guides/extensible_agents
   source/how_to_guides/extensible_nodes
   source/how_to_guides/extensible_rewards
   source/how_to_guides/primaite_yaml_migration_guide
 .. toctree::
   :caption: Usage:
   :hidden:
   source/getting_started
   source/simulation
   source/game_layer
   source/simulation
   source/config
-   source/environment
+   source/rewards
   source/customising_scenarios
   source/varying_config_files
   source/environment
   source/action_masking
   source/node_sets
 .. toctree::
   :caption: Notebooks:
@@ -140,3 +64,38 @@ Head over to the :ref:`getting-started` page to install and setup PrimAITE!
   source/request_system
   PrimAITE API <source/_autosummary/primaite>
   PrimAITE Tests <source/_autosummary/tests>
 Overview
 ^^^^^^^^
 The ARCD Primary-level AI Training Environment (**PrimAITE**) provides an effective simulation capability for training and evaluating AI in a cyber-defensive role. It incorporates the functionality required of a primary-level  ARCD environment:
 - The ability to model a relevant system context;
 - Modelling an adversarial agent that the defensive agent can be trained and evaluated against;
 - The ability to model key characteristics of a system by representing hosts, servers, network devices, IP addresses, ports, operating systems, folders / files, applications, services and links;
 - Modelling background (green) pattern-of-life;
 - Operates at machine-speed to enable fast training cycles via Reinforcement Learning (RL).
 PrimAITE has been designed as an extensible environment and toolkit to support the development, test, training and evaluation of AI-based cyber defensive agents. Whilst PrimAITE ships with a number of example modelled scenarios (a.k.a. Use Cases), it has not been developed to mandate the solving of a single cyber challenge, and instead provides a highly flexible environment application that can be extended and reconfigured by the user to suit their specific cyber defence training and evaluation needs. PrimAITE provides default networks, red agent and green agent behaviour, reward functions, and action / observation space configuration, all of which can be utilised out of the box, but which ultimately can (and in some instances should) be built upon and / or reconfigured to meet the needs of different defensive agent developers. The PrimAITE user guide provides comprehensive instruction on all PrimAITE features, functionality and components, and can be consulted in order to help guide users in any reconfiguration or enhancements they wish to undertake; a library of example Jupyter notebooks is also provided to support such work.
 Features
 ^^^^^^^^
 PrimAITE incorporates the following features:
 - Architected with a separate Simulation layer and Game layer. This separation of concerns defines a clear path towards transfer learning with environments of differing fidelity;
 - Ability to reconfigure an RL reward function based on (a) the ability to counter the modelled adversarial cyber-attack, and (b) the ability to ensure success for green agents;
 - Access Control List (ACL) functions for network devices (routers and firewalls), following standard ACL rule format (e.g., DENY / PERMIT, source / destination IP addresses, protocol and port);
 - Application of traffic to the links of the system laydown adheres to the ACL rulesets and routing tables contained within each network device;
 - Provides RL environments adherent to the Farama Foundation Gymnasium (Previously OpenAI Gym) API, allowing integration with any compliant RL Agent frameworks;
 - Provides RL environments adherent to Ray RLlib environment specifications for single-agent and multi-agent scenarios;
 - Assessed for compatibility with Stable-Baselines3 (SB3), Ray RLlib, and bespoke agents;
 - Persona-based adversarial (Red) agent behaviour; several out-the-box personas are provided, and more can be developed to suit the needs of the task. Stochastic variations in Red agent behaviour are also included as required;
 - A robust system logging tool, automatically enabled at the node level and featuring various log levels and terminal output options, enables PrimAITE users to conduct in-depth network simulations;
 - A PCAP service is seamlessly integrated within the simulation, automatically capturing and logging frames for both
  inbound and outbound traffic at the network interface level. This automatic functionality, combined with the ability
  to separate traffic directions, significantly enhances network analysis and troubleshooting capabilities;
 - Agent action logs provide a description of every action taken by each agent during the episode. This includes timestep, action, parameters, request and response, for all Blue agent activity, which is aligned with the Track 2 Common Action / Observation Space (CAOS) format. Action logs also detail all scripted / stochastic red / green agent actions;
 - Environment ground truth is provided at every timestep, providing a full description of the environment’s true state;
 - Alignment with CAOS provides the ability to transfer agents between CAOS compliant environments.
--- a/docs/make.bat
+++ b/docs/make.bat
@@ -1,4 +1,5 @@
@ECHO OFF
 REM  © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
 setlocal EnableDelayedExpansion
--- a/docs/source/about.rst
+++ b/docs/source/about.rst
@@ -7,311 +7,66 @@
 About PrimAITE
 ==============
-PrimAITE is a simulation environment for training agents to protect a computer network from cyber attacks.
+Architecture
 ^^^^^^^^^^^^
-Features
+PrimAITE is a Python application and will operate on multiple Operating Systems (Windows, Linux and Mac);
-********
+a comprehensive installation and user guide is provided with each release to support its usage.
-PrimAITE provides the following features:
+Configuration of PrimAITE is achieved via included YAML files which support full control over the network / system laydown being modelled, background pattern of life, adversarial (red agent) behaviour, and step and episode count.
 A Simulation Controller layer manages the overall running of the simulation, keeping track of all low-level objects.
-* A flexible system for defining network layouts and host configurations
+It is agnostic to the number of agents, their action / observation spaces, and the RL library being used.
-* Highly configurable network hosts, including definition of software, file system, and network interfaces,
+
-* Realistic network traffic simulation, including address and sending packets via internet protocols like TCP, UDP, ICMP, etc.
+It presents a public API providing a method for describing the current state of the simulation, a method that accepts action requests and provides responses, and a method that triggers a timestep advancement.
-* Routers with traffic routing and firewall capabilities
+The Game Layer converts the simulation into a playable game for the agent(s).
-* Simulation of customisable deterministic agents
+
-* Support for multiple agents, each having their own customisable observation space, action space, and reward function definition.
+It translates between simulation state and Gymnasium.Spaces to pass action / observation data between the agent(s) and the simulation. It is responsible for calculating rewards, managing Multi-Agent RL (MARL) action turns, and via a single agent interface can interact with Blue, Red and Green agents.
 Agents can either generate their own scripted behaviour or accept input behaviour from an RL agent.
 Finally, a Gymnasium / Ray RLlib Environment Layer forwards requests to the Game Layer as the agent sends them. This layer also manages most of the I/O, such as reading in the configuration files and saving agent logs.
 .. image:: ../../_static/primAITE_architecture.png
    :width: 500
    :align: center
-Structure
+Training & Evaluation Capability
-*********
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-PrimAITE consists of a simulator and a 'game' layer that allows agents to interact with the simulator. The simulator is built in a modular way where each component such as network hosts, links, networking devices, softwares, etc. are implemented as instances of a base class, meaning they all support the same interface. This allows for standardised configuration using either the Python API or YAML files.
+PrimAITE provides a training and evaluation capability for AI agents in the context of cyber-attack, via its Gymnasium / Ray RLlib compliant interface.
-The game layer is built on top of the simulator and it consumes the simulation action/state interface to allow agents to interact with the simulator. The game layer is also responsible for defining the reward function and observation space for the agents.
+
 Scenarios can be constructed to reflect network / system laydowns consisting of any configuration of nodes (e.g., PCs, servers etc.) and the networking equipment and links between them.
 All nodes can be configured to contain applications, services, folders, and files (and their status), including a powerful terminal simulation for SSH tunnelling and remote command execution.
 Realistic network traffic is generated by software or by users. Packets move through the network devices (switches, routers, firewalls, network interfaces) in accordance with control rules such as internet protocols, Access Control Lists (ACLs), and routing tables.
 Highlights of PrimAITE's training and evaluation capability are:
 - Fully configurable (network / system laydown, green pattern-of-life, red personas, reward function, ACL rules for each device, number of episodes / steps, action / observation space) and repeatable to suit the requirements of AI agents;
 - Domain randomisation through stochastic agent behaviour and the ability to switch between scenario variants between environment episodes.
 - Extensible through plugins to model any network behaviour.
 - Can integrate with any Gymnasium / Ray RLlib compliant AI agent.
-..
+PrimAITE provides a number of use cases (network and red/green action configurations) by default which the user is able to extend and modify as required.
-  Architecture - Nodes and Links
+
-  ******************************
+What is PrimAITE built with
-  **Nodes**
+---------------------------
-  An inheritance model has been adopted in order to model nodes. All nodes have the following base attributes (Class: Node):
+
-  * ID
+* `Gymnasium <https://gymnasium.farama.org/>`_ is used as the basis for AI blue agent interaction with the PrimAITE environment
-  * Name
+* `Pydantic <https://docs.pydantic.dev/latest/>`_ is used for data validation
-  * Type (e.g. computer, switch, RTU - enumeration)
+* `Platformdirs <https://github.com/platformdirs/platformdirs>`_ is used for storing user data and configuration correctly between platforms
-  * Priority (P1, P2, P3, P4 or P5 - enumeration)
+* `Typer <https://github.com/tiangolo/typer>`_ is used for the Command Line Interface
-  * Hardware State (ON, OFF, RESETTING, SHUTTING_DOWN, BOOTING - enumeration)
+* `Jupyterlab <https://github.com/jupyterlab/jupyterlab>`_ is used as an extensible environment for interactive and reproducible computing, based on the Jupyter Notebook Architecture
-  Active Nodes also have the following attributes (Class: Active Node):
+* `Plotly <https://github.com/plotly/plotly.py>`_ is used for building high level charts
-  * IP Address
+* `Stable Baselines 3 <https://github.com/DLR-RM/stable-baselines3>`_ is used for ensuring compatibility with RL libraries
-  * Software State (GOOD, FIXING, COMPROMISED - enumeration)
+* `Ray RLlib <https://github.com/ray-project/ray>`_ is also used for ensuring compatibility with RL libraries
-  * File System State (GOOD, CORRUPT, DESTROYED, REPAIRING, RESTORING - enumeration)
+
-  Service Nodes also have the following attributes (Class: Service Node):
+
-  * List of Services (where service is composed of service name and port). There is no theoretical limit on the number of services that can be modelled. Services and protocols are currently intrinsically linked (i.e. a service is an application on a node transmitting traffic of this protocol type)
+Getting Started with PrimAITE
-  * Service state (GOOD, FIXING, COMPROMISED, OVERWHELMED - enumeration)
+-----------------------------
-  Passive Nodes are currently not used (but may be employed for non IP-based components such as machinery actuators in future releases).
+
-  **Links**
+Head over to the :ref:`getting-started` page to install and set up PrimAITE!
  Links are modelled both as network edges (networkx) and as Python classes, in order to extend their functionality. Links include the following attributes:
  * ID
  * Name
  * Bandwidth (bits/s)
  * Source node ID
  * Destination node ID
  * Protocol list (containing the loading of protocols currently running on the link)
  When the simulation runs, IERs are applied to the links in order to model traffic loading, individually assigned to each protocol. This allows green (background) and red agent behaviour to be modelled, and defensive agents to identify suspicious traffic patterns at a protocol / traffic loading level of fidelity.
  Information Exchange Requirements (IERs)
  ****************************************
  PrimAITE adopts the concept of Information Exchange Requirements (IERs) to model both green agent (background) and red agent (adversary) behaviour. IERs are used to initiate modelling of traffic loading on the network, and have the following attributes:
  * ID
  * Start step (i.e. which step in the training episode should the IER start)
  * End step (i.e. which step in the training episode should the IER end)
  * Source node ID
  * Destination node ID
  * Load (bits/s)
  * Protocol
  * Port
  * Running status (i.e. on / off)
  The application of green agent IERs between a source and destination follows a number of rules. Specifically:
  1. Does the current simulation time step fall between IER start and end step
  2. Is the source node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not FIXING)
  3. Is the destination node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not FIXING)
  4. Are there any Access Control List rules in place that prevent the application of this IER
  5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level)
  For red agent IERs, the application of IERs between a source and destination follows a number of subtly different rules. Specifically:
  1. Does the current simulation time step fall between IER start and end step
  2. Is the source node operational, and is the service (protocol / port) associated with the IER (a) present on that node and (b) already in a compromised state
  3. Is the destination node operational, and is the service (protocol / port) associated with the IER present on that node
  4. Are there any Access Control List rules in place that prevent the application of this IER
  5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level)
  Assuming the rules pass, the IER is applied to all relevant links (based on use of OSPF) between source and destination.
  Node Pattern-of-Life
  ********************
  Every node can be impacted (i.e. have a status change applied to it) by either green agent pattern-of-life or red agent pattern-of-life. This is distinct from IERs, and allows for attacks (and defence) to be modelled purely within the confines of a node.
  The status changes that can be made to a node are as follows:
  * All Nodes:
    * Hardware State:
        * ON
        * OFF
        * RESETTING - when a status of resetting is entered, the node will automatically exit this state after a number of steps (as defined by the nodeResetDuration configuration item) after which it returns to an ON state
        * BOOTING
        * SHUTTING_DOWN
  * Active Nodes and Service Nodes:
    * Software State:
        * GOOD
        * FIXING - when a status of FIXING is entered, the node will automatically exit this state after a number of steps (as defined by the osFIXINGDuration configuration item) after which it returns to a GOOD state
        * COMPROMISED
    * File System State:
        * GOOD
        * CORRUPT (can be resolved by repair or restore)
        * DESTROYED (can be resolved by restore only)
        * REPAIRING - when a status of repairing is entered, the node will automatically exit this state after a number of steps (as defined by the fileSystemRepairingLimit configuration item) after which it returns to a GOOD state
        * RESTORING - when a status of restoring is entered, the node will automatically exit this state after a number of steps (as defined by the fileSystemRestoringLimit configuration item) after which it returns to a GOOD state
  * Service Nodes only:
    * Service State (for any associated service):
        * GOOD
        * FIXING - when a status of FIXING is entered, the service will automatically exit this state after a number of steps (as defined by the serviceFIXINGDuration configuration item) after which it returns to a GOOD state
        * COMPROMISED
        * OVERWHELMED
  Red agent pattern-of-life has an additional feature not found in the green pattern-of-life. This is the ability to influence the state of the attributes of a node via a number of different conditions:
    * DIRECT:
    The pattern-of-life described by the configuration file item will be applied regardless of any other conditions in the network. This is particularly useful for direct red agent entry into the network.
    * IER:
    The pattern-of-life described by the configuration file item will be applied to the service on the node, only if there is an IER of the same protocol / service type incoming at the specified timestep.
    * SERVICE:
    The pattern-of-life described by the configuration file item will be applied to the node based on the state of a service. The service can either be on the same node, or a different node within the network.
  Access Control List modelling
  *****************************
  An Access Control List (ACL) is modelled to provide the means to manage traffic flows in the system. This will allow defensive agents the means to turn on / off rules, or potentially create new rules, to counter an attack.
  The ACL follows a standard network firewall format. For example:
  .. list-table:: ACL example
    :widths: 25 25 25 25 25
    :header-rows: 1
    * - Permission
      - Source IP
      - Dest IP
      - Protocol
      - Port
    * - DENY
      - 192.168.1.2
      - 192.168.1.3
      - HTTPS
      - 443
    * - ALLOW
      - 192.168.1.4
      - ANY
      - SMTP
      - 25
    * - DENY
      - ANY
      - 192.168.1.5
      - ANY
      - ANY
  All ACL rules are considered when applying an IER. Logic follows the order of rules, so a DENY or ALLOW for the same parameters will override an earlier entry.
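  The ordering rule above can be sketched in Python. This is a minimal illustration only and not PrimAITE's implementation: the ``AclRule`` class and ``is_permitted`` function are hypothetical names, and the default of permitting traffic when no rule matches is an assumption. Every rule is checked in order and the last matching rule decides the outcome, so a later entry overrides an earlier one.

```python
from dataclasses import dataclass

ANY = "ANY"  # wildcard value, as used in the example table


@dataclass(frozen=True)
class AclRule:
    """Hypothetical ACL rule with the five columns of the example table."""

    permission: str  # "ALLOW" or "DENY"
    source_ip: str
    dest_ip: str
    protocol: str
    port: str

    def matches(self, source_ip: str, dest_ip: str, protocol: str, port: str) -> bool:
        # A field matches when the rule holds the wildcard or the exact value.
        pairs = (
            (self.source_ip, source_ip),
            (self.dest_ip, dest_ip),
            (self.protocol, protocol),
            (self.port, port),
        )
        return all(rule_value in (ANY, value) for rule_value, value in pairs)


def is_permitted(rules, source_ip, dest_ip, protocol, port, default=True):
    """Return True if traffic is allowed; later matching rules override earlier ones.

    ``default`` (assumed to be permit) applies when no rule matches.
    """
    decision = default
    for rule in rules:
        if rule.matches(source_ip, dest_ip, protocol, port):
            decision = rule.permission == "ALLOW"
    return decision


# The three rules from the example table, in order.
EXAMPLE_RULES = [
    AclRule("DENY", "192.168.1.2", "192.168.1.3", "HTTPS", "443"),
    AclRule("ALLOW", "192.168.1.4", ANY, "SMTP", "25"),
    AclRule("DENY", ANY, "192.168.1.5", ANY, ANY),
]
```

  With these rules, SMTP traffic from 192.168.1.4 to 192.168.1.5 matches both the second (ALLOW) and third (DENY) entries, and the later DENY wins.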
  Observation Spaces
  ******************
  The observation space provides the blue agent with information about the current status of nodes and links.
  PrimAITE builds on top of Gymnasium Spaces to create an observation space that is easily configurable for users. It's made up of components which are managed by the :py:class:`primaite.environment.observations.ObservationsHandler`. Each training scenario can define its own observation space, and the user can choose which information to include, and how it should be formatted.
  NodeLinkTable component
  -----------------------
  For example, the :py:class:`primaite.environment.observations.NodeLinkTable` component represents the status of nodes and links as a ``gym.spaces.Box``. An example observation space is provided below:
  .. list-table:: Observation Space example
    :widths: 25 25 25 25 25 25 25
    :header-rows: 1
    * -
      - ID
      - Hardware State
      - Software State
      - File System State
      - Service / Protocol A
      - Service / Protocol B
    * - Node A
      - 1
      - 1
      - 1
      - 1
      - 1
      - 1
    * - Node B
      - 2
      - 1
      - 3
      - 1
      - 1
      - 1
    * - Node C
      - 3
      - 2
      - 1
      - 1
      - 3
      - 2
    * - Link 1
      - 5
      - 0
      - 0
      - 0
      - 0
      - 10000
    * - Link 2
      - 6
      - 0
      - 0
      - 0
      - 0
      - 10000
    * - Link 3
      - 7
      - 0
      - 0
      - 0
      - 5000
      - 0
  For the nodes, the following values are represented:
  .. code-block::
    [
      ID
      Hardware State            (1=ON,   2=OFF,  3=RESETTING,  4=SHUTTING_DOWN, 5=BOOTING)
      Operating System State    (0=none, 1=GOOD, 2=PATCHING,   3=COMPROMISED)
      File System State         (0=none, 1=GOOD, 2=CORRUPT,    3=DESTROYED,  4=REPAIRING, 5=RESTORING)
      Service1/Protocol1 state  (0=none, 1=GOOD, 2=FIXING,   3=COMPROMISED)
      Service2/Protocol2 state  (0=none, 1=GOOD, 2=FIXING,   3=COMPROMISED)
    ]
  (Note that each service available in the network is provided as a column, although not all nodes may utilise all services)
  For the links, the following statuses are represented:
  .. code-block::
    [
      ID
      Hardware State            (0=not applicable)
      Operating System State    (0=not applicable)
      File System State         (0=not applicable)
      Service1/Protocol1 state  (Traffic load from this protocol on this link)
      Service2/Protocol2 state  (Traffic load from this protocol on this link)
    ]
  NodeStatus component
  ----------------------
  This is a MultiDiscrete observation space that can be thought of as a one-dimensional vector of discrete states.
  The example above would have the following structure:
  .. code-block::
    [
      node1_info
      node2_info
      node3_info
    ]
  Each ``node_info`` contains the following:
  .. code-block::
    [
      hardware_state    (0=none, 1=ON,   2=OFF,      3=RESETTING, 4=SHUTTING_DOWN, 5=BOOTING)
      software_state    (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED)
      file_system_state (0=none, 1=GOOD, 2=CORRUPT,  3=DESTROYED, 4=REPAIRING, 5=RESTORING)
      service1_state    (0=none, 1=GOOD, 2=FIXING, 3=COMPROMISED)
      service2_state    (0=none, 1=GOOD, 2=FIXING, 3=COMPROMISED)
    ]
  In a network with three nodes and two services, the full observation space would have 15 elements. It can be written with ``gym`` notation to indicate the number of discrete options for each of the elements of the observation space. For example:
  .. code-block::
    gym.spaces.MultiDiscrete([6,4,6,4,4,6,4,6,4,4,6,4,6,4,4])
  .. note::
    NodeStatus observation component provides information only about nodes. Links are not considered.
  LinkTrafficLevels
  -----------------
  This component is a MultiDiscrete space showing the traffic flow levels on the links in the network, after applying a threshold to convert it from a continuous to a discrete value.
  There are two configurable parameters:
  * ``quantisation_levels`` determines how many discrete bins to use for converting the continuous traffic value to discrete (default is 5).
  * ``combine_service_traffic`` determines whether to separately output traffic use for each network protocol or whether to combine them into an overall value for the link. (default is ``True``)
  For example, with default parameters and a network with three links, the structure of this component would be:
  .. code-block::
    [
      link1_status
      link2_status
      link3_status
    ]
  Each ``link_status`` is a number from 0-4 representing the network load in relation to bandwidth.
  .. code-block::
    0 = No traffic (0%)
    1 = low traffic (1%-33%)
    2 = medium traffic (33%-66%)
    3 = high traffic (66%-99%)
    4 = max traffic/ overwhelmed (100%)
  Using ``gym`` notation, the shape of the obs space is: ``gym.spaces.MultiDiscrete([5,5,5])``.
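  The conversion from a continuous traffic load to a discrete level can be sketched as follows. This is an illustrative binning only, assuming level 0 is reserved for no traffic, the top level for a fully loaded link, and the levels in between split the bandwidth into equal bands; the function name ``quantise_traffic`` is hypothetical and the exact boundary handling in PrimAITE may differ.

```python
import math


def quantise_traffic(load: float, bandwidth: float, quantisation_levels: int = 5) -> int:
    """Map a traffic load (bits/s) onto a discrete level in [0, quantisation_levels - 1].

    Assumed binning: 0 = no traffic, top level = load at or above bandwidth,
    intermediate levels = equal-width bands of the bandwidth.
    """
    if quantisation_levels < 3:
        raise ValueError("quantisation_levels must be at least 3")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if load <= 0:
        return 0
    if load >= bandwidth:
        return quantisation_levels - 1
    fraction = load / bandwidth  # strictly between 0 and 1 here
    inner_bands = quantisation_levels - 2  # bands between "none" and "max"
    return 1 + math.floor(fraction * inner_bands)


# One level per link, matching the three-link example above.
link_levels = [quantise_traffic(load, 10000) for load in (0, 5000, 10000)]
```

  For the default of five levels this gives the low, medium and high bands one third of the bandwidth each, in line with the percentages listed above.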
  Action Spaces
  **************
  The action space available to the blue agent comes in three types:
  1. Node-based
  2. Access Control List
  3. Any (Agent can take both node-based and ACL-based actions)
  The choice of action space used during a training session is determined in the config_[name].yaml file.
  **Node-Based**
  The agent is able to influence the status of nodes by switching them off, resetting, or FIXING operating systems and services. In this instance, the action space is a Gymnasium spaces.Discrete type, as follows:
  * Dictionary item {... ,1: [x1, x2, x3,x4] ...}
    The placeholders inside the list under the key '1' mean the following:
      * [0, num nodes] - Node ID (0 = nothing, node ID)
      * [0, 4] - What property it's acting on (0 = nothing, 1 = state, 2 = SoftwareState, 3 = service state, 4 = file system state)
      * [0, 3] - Action on property (0 = nothing, 1 = on / scan, 2 = off / repair, 3 = reset / patch / restore)
      * [0, num services] - Resolves to service ID (0 = nothing, resolves to service)
  **Access Control List**
  The blue agent is able to influence the configuration of the Access Control List rule set (which implements a system-wide firewall). In this instance, the action space is a Gymnasium spaces.Discrete type, as follows:
    * Dictionary item {... ,1: [x1, x2, x3, x4, x5, x6] ...}
    The placeholders inside the list under the key '1' mean the following:
      * [0, 2] - Action (0 = do nothing, 1 = create rule, 2 = delete rule)
      * [0, 1] - Permission (0 = DENY, 1 = ALLOW)
      * [0, num nodes] - Source IP (0 = any, then 1 -> x resolving to IP addresses)
      * [0, num nodes] - Dest IP (0 = any, then 1 -> x resolving to IP addresses)
      * [0, num services] - Protocol (0 = any, then 1 -> x resolving to protocol)
      * [0, num ports] - Port (0 = any, then 1 -> x resolving to port)
  **ANY**
  The agent is able to carry out both **Node-Based** and **Access Control List** operations.
  This means the dictionary will contain key-value pairs in the format of BOTH Node-Based and Access Control List as seen above.
  Rewards
  *******
  A reward value is presented back to the blue agent on the conclusion of every step. The reward value is calculated via two methods which combine to give the total value:
  1. Node and service status
  2. IER status
  **Node and service status**
  On every step, the status of each node is compared against both a reference environment (simulating the situation if the red and blue agents had not impacted the environment)
  and the before and after state of the environment. If the comparison against the reference environment shows no difference, then the score provided is "AllOK". If there is a
  difference with respect to the reference environment, the before and after states are compared, and a score determined. See :ref:`config` for details of reward values.
  **IER status**
  On every step, the full IER set is examined to determine whether green and red agent IERs are being permitted to run. Any red agent IERs running incur a penalty; any green agent
  IERs not permitted to run also incur a penalty. See :ref:`config` for details of reward values.
  Future Enhancements
  *******************
  The PrimAITE project has an ambition to include the following enhancements in future releases:
  * Integration with a suitable standardised framework to allow multi-agent integration
  * Integration with external threat emulation tools, either using off-line data, or integrating at runtime
--- a/docs/source/action_masking.rst
+++ b/docs/source/action_masking.rst
@@ -0,0 +1,148 @@
 .. only:: comment
    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _action_masking:
 Action Masking
 **************
 The PrimAITE simulation is able to provide action masks in the environment output. These action masks tell agents
 which actions are invalid given the current environment state. For instance, it's not possible to install
 software on a node that is turned off. Therefore, if an agent has a ``node-software-install`` in its action map for that node,
 the action mask will show ``0`` in the corresponding entry.
 *Note: just because an action is available in the action mask does not mean it will be successful when executed. It just means it's possible to try to execute the action at this time.*
 Configuration
 =============
 Action masking is supported for agents that use the ``ProxyAgent`` class (the class used for connecting to RL algorithms).
 In order to use action masking, set the ``agent_settings.action_masking`` parameter to ``True`` in the config file.
 Masking Logic
 =============
 The following logic is applied:
 +------------------------------------------+------------------------------------------------+
 | Action                                   | Action Mask Logic                              |
 +==========================================+================================================+
 | **do-nothing**                           | Always Possible.                               |
 +------------------------------------------+------------------------------------------------+
 | **node-service-scan**                    | Node is on. Service is running.                |
 +------------------------------------------+------------------------------------------------+
 | **node-service-stop**                    | Node is on. Service is running.                |
 +------------------------------------------+------------------------------------------------+
 | **node-service-start**                   | Node is on. Service is stopped.                |
 +------------------------------------------+------------------------------------------------+
 | **node-service-pause**                   | Node is on. Service is running.                |
 +------------------------------------------+------------------------------------------------+
 | **node-service-resume**                  | Node is on. Service is paused.                 |
 +------------------------------------------+------------------------------------------------+
 | **node-service-restart**                 | Node is on. Service is running.                |
 +------------------------------------------+------------------------------------------------+
 | **node-service-disable**                 | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-service-enable**                  | Node is on. Service is disabled.               |
 +------------------------------------------+------------------------------------------------+
 | **node-service-fix**                     | Node is on. Service is running.                |
 +------------------------------------------+------------------------------------------------+
 | **node-application-execute**             | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-application-scan**                | Node is on. Application is running.            |
 +------------------------------------------+------------------------------------------------+
 | **node-application-close**               | Node is on. Application is running.            |
 +------------------------------------------+------------------------------------------------+
 | **node-application-fix**                 | Node is on. Application is running.            |
 +------------------------------------------+------------------------------------------------+
 | **node-application-install**             | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-application-remove**              | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-file-scan**                       | Node is on. File exists. File not deleted.     |
 +------------------------------------------+------------------------------------------------+
 | **node-file-create**                     | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-file-checkhash**                  | Node is on. File exists. File not deleted.     |
 +------------------------------------------+------------------------------------------------+
 | **node-file-delete**                     | Node is on. File exists.                       |
 +------------------------------------------+------------------------------------------------+
 | **node-file-repair**                     | Node is on. File exists. File not deleted.     |
 +------------------------------------------+------------------------------------------------+
 | **node-file-restore**                    | Node is on. File exists. File is deleted.      |
 +------------------------------------------+------------------------------------------------+
 | **node-file-corrupt**                    | Node is on. File exists. File not deleted.     |
 +------------------------------------------+------------------------------------------------+
 | **node-file-access**                     | Node is on. File exists. File not deleted.     |
 +------------------------------------------+------------------------------------------------+
 | **node-folder-create**                   | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-folder-scan**                     | Node is on. Folder exists. Folder not deleted. |
 +------------------------------------------+------------------------------------------------+
 | **node-folder-checkhash**                | Node is on. Folder exists. Folder not deleted. |
 +------------------------------------------+------------------------------------------------+
 | **node-folder-repair**                   | Node is on. Folder exists. Folder not deleted. |
 +------------------------------------------+------------------------------------------------+
 | **node-folder-restore**                  | Node is on. Folder exists. Folder is deleted.  |
 +------------------------------------------+------------------------------------------------+
 | **node-os-scan**                         | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **host-nic-enable**                      | NIC is disabled. Node is on.                   |
 +------------------------------------------+------------------------------------------------+
 | **host-nic-disable**                     | NIC is enabled. Node is on.                    |
 +------------------------------------------+------------------------------------------------+
 | **node-shutdown**                        | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-startup**                         | Node is off.                                   |
 +------------------------------------------+------------------------------------------------+
 | **node-reset**                           | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-nmap-ping-scan**                  | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-nmap-port-scan**                  | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-network-service-recon**           | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **network-port-enable**                  | Node is on. Router is on.                      |
 +------------------------------------------+------------------------------------------------+
 | **network-port-disable**                 | Router is on.                                  |
 +------------------------------------------+------------------------------------------------+
 | **router-acl-add-rule**                  | Router is on.                                  |
 +------------------------------------------+------------------------------------------------+
 | **router-acl-remove-rule**               | Router is on.                                  |
 +------------------------------------------+------------------------------------------------+
 | **firewall-acl-add-rule**                | Firewall is on.                                |
 +------------------------------------------+------------------------------------------------+
 | **firewall-acl-remove-rule**             | Firewall is on.                                |
 +------------------------------------------+------------------------------------------------+
 | **configure-database-client**            | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **configure-ransomware-script**          | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **c2-server-ransomware-configure**       | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **configure-dos-bot**                    | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **configure-c2-beacon**                  | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **c2-server-ransomware-launch**          | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **c2-server-terminal-command**           | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **c2-server-data-exfiltrate**            | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-account-change-password**         | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-session-remote-login**            | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-session-remote-logoff**           | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 | **node-send-remote-command**             | Node is on.                                    |
 +------------------------------------------+------------------------------------------------+
 Mechanism
 =========
 The environment iterates over the RL agent's ``action_map`` and generates the corresponding simulator :ref:`request <request_system>` string. It uses the :py:meth:`RequestManager.check_valid()<primaite.simulator.core.RequestManager.check_valid>` method to invoke the relevant :py:class:`RequestPermissionValidator <primaite.simulator.core.RequestPermissionValidator>` without actually running the request on the simulation.
 Current Limitations
 ===================
 Currently, action masking only considers whether the action as a whole is possible; it does not verify that the exact parameter combination passed to the action makes sense in the current context. For instance, if ACL rule 3 on router_1 is already populated, the action for adding another rule at position 3 will still be available as long as that router is turned on. This never blocks valid actions; it only occasionally allows invalid ones.
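The mechanism and its limitation can be illustrated with a minimal sketch. It is not PrimAITE's implementation: ``ACTION_MAP``, ``node_is_on`` and ``action_mask`` are hypothetical names, and the simulation state is reduced to a dictionary recording which nodes are on. Each action is checked against a validator without being executed, and the ACL action is allowed whenever the router is on, regardless of which rule position it targets.

```python
from typing import Callable, Dict, List

# Hypothetical, simplified simulation state: node name -> powered on?
SimState = Dict[str, bool]


def node_is_on(node: str) -> Callable[[SimState], bool]:
    """Build a validator that passes only while `node` is on."""
    return lambda state: state.get(node, False)


# Hypothetical action map: index -> (action name, validator).
ACTION_MAP = {
    0: ("do-nothing", lambda state: True),
    1: ("node-shutdown", node_is_on("client_1")),
    # Only the router's power state is checked, not the rule position.
    2: ("router-acl-add-rule", node_is_on("router_1")),
}


def action_mask(state: SimState) -> List[bool]:
    """Evaluate every validator without running any action."""
    return [ACTION_MAP[index][1](state) for index in sorted(ACTION_MAP)]


mask = action_mask({"client_1": False, "router_1": True})
print(mask)
```

Here the shutdown action is masked out because ``client_1`` is off, while the ACL action stays available even if the targeted rule slot were already occupied.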
--- a/docs/source/config.rst
+++ b/docs/source/config.rst
@@ -1,6 +1,8 @@
 .. only:: comment
-    © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _Configurable_Items:
 PrimAITE |VERSION| Configuration
 ********************************
--- a/docs/source/configuration/agents.rst
+++ b/docs/source/configuration/agents.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 ``agents``
@@ -13,39 +13,19 @@ Agents can be scripted (deterministic and stochastic), or controlled by a reinfo
 .. code-block:: yaml
    agents:
-        - ref: red_agent_example
+    - ref: red_agent_example
-            ...
+        ...
-        - ref: blue_agent_example
+    - ref: blue_agent_example
-            ...
+        ...
-        - ref: green_agent_example
+    - ref: green_agent_example
-            team: GREEN
+    team: GREEN
-            type: ProbabilisticAgent
+    type: probabilistic-agent
            observation_space:
                type: UC2GreenObservation
            action_space:
                action_list:
                - type: DONOTHING
                - type: NODE_APPLICATION_EXECUTE
            options:
                nodes:
                - node_name: client_2
                  applications:
                    - application_name: WebBrowser
                max_folders_per_node: 1
                max_files_per_folder: 1
                max_services_per_node: 1
                max_applications_per_node: 1
-            reward_function:
+    agent_settings:
-                reward_components:
+      start_step: 5
-                - type: DUMMY
+      frequency: 4
-
+      variance: 3
-            agent_settings:
+      flatten_obs: False
                start_settings:
                    start_step: 5
                    frequency: 4
            variance: 3
            flatten_obs: False
 ``ref``
 -------
@@ -57,13 +37,13 @@ Specifies if the agent is malicious (``RED``), benign (``GREEN``), or defensive
 ``type``
 --------
-Specifies which class should be used for the agent. ``ProxyAgent`` is used for agents that receive instructions from an RL algorithm. Scripted agents like ``RedDatabaseCorruptingAgent`` and ``ProbabilisticAgent`` generate their own behaviour.
+Specifies which class should be used for the agent. ``proxy-agent`` is used for agents that receive instructions from an RL algorithm. Scripted agents like ``red-database-corrupting-agent`` and ``probabilistic-agent`` generate their own behaviour.
 Available agent types:
- ``ProbabilisticAgent``
+- ``probabilistic-agent``
- ``ProxyAgent``
+- ``proxy-agent``
- ``RedDatabaseCorruptingAgent``
+- ``red-database-corrupting-agent``
 ``observation_space``
 ---------------------
@@ -79,10 +59,10 @@ selects which python class from the :py:mod:`primaite.game.agent.observation` mo
 Allows configuration of the chosen observation type. These are optional.
-    * ``num_services_per_node``, ``num_folders_per_node``, ``num_files_per_folder``, ``num_nics_per_node`` all define the shape of the observation space. The size and shape of the obs space must remain constant, but the number of files, folders, ACL rules, and other components can change within an episode. Therefore padding is performed and these options set the size of the obs space.
+    * ``num_services_per_node``, ``num_folders_per_node``, ``num_files_per_folder``, ``num_nics_per_node`` all define the shape of the observation space. The size and shape of the obs space must remain constant, but the number of files, folders, acl rules, and other components can change within an episode. Therefore padding is performed and these options set the size of the obs space.
    * ``nodes``: list of nodes that will be present in this agent's observation space. The ``node_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config. Each node can also be configured with services, and files that should be monitored.
    * ``links``: list of links that will be present in this agent's observation space. The ``link_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config.
-    * ``acl``: configure how the agent reads the access control list on the router in the simulation. ``router_node_ref`` is for selecting which router's ACL table should be used. ``ip_list`` sets the encoding of ip addresses as integers within the observation space.
+    * ``acl``: configure how the agent reads the access control list on the router in the simulation. ``router_node_ref`` is for selecting which router's acl table should be used. ``ip_list`` sets the encoding of ip addresses as integers within the observation space.
 For more information see :py:mod:`primaite.game.agent.observations`
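As an illustration of the padding described above, the following sketch keeps a per-node list of observed values at a fixed length. The function name ``pad_observation``, the pad value of ``0`` and the truncation of surplus entries are assumptions made for this example and are not taken from the PrimAITE source.

```python
from typing import List


def pad_observation(values: List[int], size: int, pad_value: int = 0) -> List[int]:
    """Return exactly `size` entries: truncate extras, pad any shortfall."""
    clipped = values[:size]
    return clipped + [pad_value] * (size - len(clipped))


# Two monitored files on a node configured with num_files_per_folder = 4.
print(pad_observation([1, 2], size=4))
```

The observation keeps the same shape even when files or folders are created or deleted during an episode.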
@@ -91,10 +71,6 @@ For more information see :py:mod:`primaite.game.agent.observations`
 The action space is configured to be made up of individual action types. Once configured, the agent can select an action type and some optional action parameters at every step. For example: The ``NODE_SERVICE_SCAN`` action takes the parameters ``node_id`` and ``service_id``.
 ``action_list``
 ^^^^^^^^^^^^^^^
 A list of action modules. The options are listed in the :py:mod:`primaite.game.agent.actions.ActionManager.act_class_identifiers` module.
 ``action_map``
 ^^^^^^^^^^^^^^
@@ -120,16 +96,21 @@ Similar to action space, this is defined as a list of components from the :py:mo
 ``reward_components``
 ^^^^^^^^^^^^^^^^^^^^^
-
+A list of available reward types from :py:mod:`primaite.game.agent.rewards.RewardFunction.rew_class_identifiers`
 A list of reward types from :py:mod:`primaite.game.agent.rewards.RewardFunction.rew_class_identifiers`
 e.g.
 .. code-block:: yaml
    reward_components:
-        - type: DUMMY
+        - type: dummy
-        - type: DATABASE_FILE_INTEGRITY
+          weight: 1.0
        - type: database-file-integrity
          weight: 0.40
          options:
            node_hostname: database_server
            folder_name: database
            file_name: database.db
 ``agent_settings``
@@ -142,10 +123,9 @@ e.g.
 .. code-block:: yaml
    agent_settings:
-        start_settings:
+        start_step: 25
-            start_step: 25
+        frequency: 20
-            frequency: 20
+        variance: 5
            variance: 5
 ``start_step``
 ^^^^^^^^^^^^^^
@@ -172,3 +152,9 @@ The amount of timesteps that the frequency can randomly change.
 ---------------
 If ``True``, gymnasium flattening will be performed on the observation space before sending to the agent. Set this to ``True`` if your agent does not support nested observation spaces.
 ``Agent History``
 -----------------
 Agents will record their action log for each step. This is a summary of what the agent did, along with response information from requests within the simulation.
 A summary of the actions taken by the agent can be viewed using the ``show_history()`` function. By default, this will display all actions taken apart from ``do-nothing``.
--- a/docs/source/configuration/game.rst
+++ b/docs/source/configuration/game.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 ``game``
@@ -28,6 +28,7 @@ This section defines high-level settings that apply across the game, currently i
                high: 10
                medium: 5
                low: 0
        seed: 1
 ``max_episode_length``
 ----------------------
@@ -41,16 +42,21 @@ The maximum number of episodes a Reinforcement Learning agent(s) can be trained
 A list of ports that the Reinforcement Learning agent(s) are able to see in the observation space.
-See :ref:`List of Ports <List of Ports>` for a list of ports.
+See :py:const:`primaite.utils.validation.port.PORT_LOOKUP` for a list of ports.
 ``protocols``
 -------------
 A list of protocols that the Reinforcement Learning agent(s) are able to see in the observation space.
-See :ref:`List of IPProtocols <List of IPProtocols>` for a list of protocols.
+See :py:const:`primaite.utils.validation.ip_protocol.PROTOCOL_LOOKUP` for a list of protocols.
 ``thresholds``
 --------------
 These are used to determine the thresholds of high, medium and low categories for counted observation occurrences.
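A rough sketch of how a count could be mapped onto these categories is shown below. The function ``categorise`` is hypothetical, the default values mirror the example config above, and treating each threshold as an inclusive lower bound is an assumption rather than documented behaviour.

```python
def categorise(count: int, low: int = 0, medium: int = 5, high: int = 10) -> str:
    """Map a counted occurrence onto a category (inclusive lower bounds assumed)."""
    if count >= high:
        return "high"
    if count >= medium:
        return "medium"
    if count >= low:
        return "low"
    return "none"


print(categorise(7))
```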
 ``seed``
 --------
 Used to configure the random seeds used within PrimAITE, ensuring determinism within episode/session runs. If empty or set to -1, no seed is set. The given seed value is logged (by default) in ``primaite/<VERSION>/sessions/<DATE>/<TIME>/simulation_output``.
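The rule for when a seed is applied can be sketched as follows. This is an illustration only: ``apply_seed`` is a hypothetical helper, and it seeds just Python's standard ``random`` module, whereas PrimAITE may seed other generators as well.

```python
import random
from typing import Optional


def apply_seed(seed: Optional[int]) -> bool:
    """Seed the RNG unless the seed is empty (None) or -1; report whether it was set."""
    if seed is None or seed == -1:
        return False
    random.seed(seed)
    return True


apply_seed(1)
first = random.random()
apply_seed(1)
second = random.random()
print(first == second)
```

Re-applying the same seed reproduces the same sequence of random draws, which is what makes episode runs repeatable.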
--- a/docs/source/configuration/io_settings.rst
+++ b/docs/source/configuration/io_settings.rst
@@ -1,7 +1,8 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _io_settings:
 ``io_settings``
 ===============
@@ -13,20 +14,17 @@ This section configures how PrimAITE saves data during simulation and training.
 .. code-block:: yaml
    io_settings:
        # save_logs: True
        save_agent_actions: True
        save_step_metadata: False
        save_pcap_logs: False
        save_sys_logs: False
        save_agent_logs: False
        write_sys_log_to_terminal: False
        write_agent_log_to_terminal: False
        sys_log_level: WARNING
        agent_log_level: INFO
 ``save_logs``
 -------------
 *currently unused*.
 ``save_agent_actions``
 ----------------------
@@ -57,6 +55,12 @@ Optional. Default value is ``False``.
 If ``True``, then the log files which contain all node actions during the simulation will be saved.
 ``save_agent_logs``
 -------------------
 Optional. Default value is ``False``.
 If ``True``, then the log files which contain all human readable agent behaviour during the simulation will be saved.
 ``write_sys_log_to_terminal``
 -----------------------------
@@ -65,16 +69,25 @@ Optional. Default value is ``False``.
 If ``True``, PrimAITE will print sys log to the terminal.
 ``write_agent_log_to_terminal``
 -------------------------------
-``sys_log_level``
+Optional. Default value is ``False``.
-------------
+
 If ``True``, PrimAITE will print all human readable agent behaviour logs to the terminal.
 ``sys_log_level & agent_log_level``
 -----------------------------------
 Optional. Default value is ``WARNING``.
-The level of logging that should be visible in the sys logs or the logs output to the terminal.
+The level of logging that should be visible in the syslog, agent logs or the logs output to the terminal.
 ``save_sys_logs`` or ``write_sys_log_to_terminal`` has to be set to ``True`` for this setting to be used.
 This is also true for agent behaviour logging.
 Available options are:
 - ``DEBUG``: Debug level items and the items below
--- a/docs/source/configuration/simulation.rst
+++ b/docs/source/configuration/simulation.rst
@@ -1,13 +1,12 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 ``simulation``
 ==============
-In this section the network layout is defined. This part of the config follows a hierarchical structure. Almost every component defines a ``ref`` field which acts as a human-readable unique identifier, used by other parts of the config, such as agents.
+In this section the network layout is defined. This part of the config follows a hierarchical structure.
-
+At the top level of the network are ``nodes``, ``links`` and ``airspace``.
 At the top level of the network are ``nodes`` and ``links``.
 e.g.
@@ -19,11 +18,14 @@ e.g.
            ...
            links:
            ...
            airspace:
            ...
 ``nodes``
 ---------
-This is where the list of nodes are defined. Some items will differ according to the node type, however, there will be common items such as a node's reference (which is used by the agent), the node's ``type`` and ``hostname``
+This is where the list of nodes is defined. Some items differ according to the node type; however, there are common items such as a node's hostname (which is used by the agent) and the node's ``type``.
 To see the configuration for these nodes, refer to the following:
@@ -71,10 +73,6 @@ this results in:
        endpoint_b_port: 2 # port 2 on switch
        bandwidth: 100
 ``ref``
 ^^^^^^^
 The human readable name for the link. Not used in code, however is useful for a human to understand what the link is for.
 ``endpoint_a_hostname``
 ^^^^^^^^^^^^^^^^^^^^^^^
@@ -101,3 +99,27 @@ This accepts an integer value e.g. if port 1 is to be connected, the configurati
 ``bandwidth``
 This is an integer value specifying the allowed bandwidth across the connection. Units are in Mbps.
 ``airspace``
 ------------
 This section configures settings specific to the wireless network's virtual airspace.
 ``frequency_max_capacity_mbps``
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 This setting allows the user to override the default maximum bandwidth capacity set for each frequency. The key should
 be the AirSpaceFrequency name and the value should be the desired maximum bandwidth capacity in Mbps (megabits per second) for
 a single timestep.
 The example below would permit 123.45 megabits to be transmitted across the WiFi 2.4 GHz frequency in a single timestep.
 Setting a frequency's maximum capacity to 0.0 blocks that frequency on the airspace.
 .. code-block:: yaml
    simulation:
      network:
        airspace:
          frequency_max_capacity_mbps:
            WIFI_2_4: 123.45
            WIFI_5: 0.0
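The effect of these overrides can be sketched with a simple per-timestep capacity check. Everything in this example is hypothetical: ``DEFAULT_CAPACITY_MBPS`` holds made-up default values, and ``can_transmit`` is an illustrative helper, not part of the PrimAITE airspace API.

```python
from typing import Dict

# Made-up defaults for illustration only; the real defaults are not stated here.
DEFAULT_CAPACITY_MBPS: Dict[str, float] = {"WIFI_2_4": 100.0, "WIFI_5": 500.0}


def can_transmit(
    frequency: str,
    frame_mbits: float,
    used_mbits: float,
    overrides: Dict[str, float],
) -> bool:
    """Allow a frame only if it fits in the capacity left for this timestep."""
    capacity = {**DEFAULT_CAPACITY_MBPS, **overrides}[frequency]
    # A capacity of 0.0 blocks the frequency entirely.
    return capacity > 0.0 and used_mbits + frame_mbits <= capacity


overrides = {"WIFI_2_4": 123.45, "WIFI_5": 0.0}
print(can_transmit("WIFI_2_4", 20.0, 100.0, overrides))
print(can_transmit("WIFI_5", 1.0, 0.0, overrides))
```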
--- a/docs/source/configuration/simulation/nodes/common/common.rst
+++ b/docs/source/configuration/simulation/nodes/common/common.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _Node Attributes:
--- a/docs/source/configuration/simulation/nodes/common/common_host_node_attributes.rst
+++ b/docs/source/configuration/simulation/nodes/common/common_host_node_attributes.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _common_host_node_attributes:
--- a/docs/source/configuration/simulation/nodes/common/common_network_node_attributes.rst
+++ b/docs/source/configuration/simulation/nodes/common/common_network_node_attributes.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _common_network_node_attributes:
--- a/docs/source/configuration/simulation/nodes/common/common_node_attributes.rst
+++ b/docs/source/configuration/simulation/nodes/common/common_node_attributes.rst
@@ -1,14 +1,9 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _common_node_attributes:
 ``ref``
 -------
 Human readable name used as reference for the |NODE|. Not used in code.
 ``hostname``
 ------------
@@ -53,3 +48,60 @@ The number of time steps required to occur in order for the node to cycle from `
 Optional. Default value is ``3``.
 The number of time steps required to occur in order for the node to cycle from ``ON`` to ``SHUTTING_DOWN`` and then finally ``OFF``.
 ``file_system``
 ---------------
 Optional.
 The file system of the node. This configuration allows nodes to be initialised with files and/or folders.
 The file system takes a list of folders and files.
 Example:
 .. code-block:: yaml
    simulation:
      network:
        nodes:
        - hostname: client_1
          type: computer
          ip_address: 192.168.10.11
          subnet_mask: 255.255.255.0
          default_gateway: 192.168.10.1
          file_system:
            - empty_folder  # example of an empty folder
            - downloads:
              - "test_1.txt"  # files in the downloads folder
              - "test_2.txt"
            - root:
              - passwords:  # example of file with size and type
                  size: 69  # size in bytes
                  type: TXT  # See FileType for list of available file types
 List of file types: :py:mod:`primaite.simulator.file_system.file_type.FileType`
 ``users``
 ---------
 The list of pre-existing users that are additional to the default admin user (``username=admin``, ``password=admin``).
 Additional users are configured as an array and must contain a ``username``, ``password``, and can contain an optional
 boolean ``is_admin``.
 Example of adding two additional users to a node:
 .. code-block:: yaml
    simulation:
      network:
        nodes:
        - hostname: [hostname]
          type: [Node Type]
          users:
            - username: jane.doe
              password: '1234'
              is_admin: true
            - username: john.doe
              password: password_1
              is_admin: false
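The constraints on user entries can be expressed as a small validation sketch. ``validate_users`` is a hypothetical helper written for this example, and defaulting ``is_admin`` to ``False`` when it is omitted is an assumption, not documented behaviour.

```python
from typing import Any, Dict, List


def validate_users(users: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Check each entry has username and password, plus an optional boolean is_admin."""
    validated = []
    for user in users:
        missing = {"username", "password"} - user.keys()
        if missing:
            raise ValueError(f"user entry is missing: {sorted(missing)}")
        is_admin = user.get("is_admin", False)  # assumed default
        if not isinstance(is_admin, bool):
            raise ValueError("is_admin must be a boolean")
        validated.append(
            {
                "username": user["username"],
                "password": str(user["password"]),
                "is_admin": is_admin,
            }
        )
    return validated


users = validate_users(
    [
        {"username": "jane.doe", "password": "1234", "is_admin": True},
        {"username": "john.doe", "password": "password_1"},
    ]
)
print(users)
```

Quoting numeric-looking passwords in the YAML, as in ``'1234'`` above, keeps them as strings when the config is loaded.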
--- a/docs/source/configuration/simulation/nodes/common/node_type_list.rst
+++ b/docs/source/configuration/simulation/nodes/common/node_type_list.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 ``type``
 --------
@@ -12,6 +12,7 @@ Available options are:
 - ``computer``
 - ``firewall``
 - ``router``
 - ``wireless_router``
 - ``server``
 - ``switch``
--- a/docs/source/configuration/simulation/nodes/computer.rst
+++ b/docs/source/configuration/simulation/nodes/computer.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _computer_configuration:
@@ -17,19 +17,18 @@ example computer
 .. code-block:: yaml
    simulation:
-        network:
+      network:
-            nodes:
+        nodes:
-                - ref: client_1
+        - hostname: client_1
-                hostname: client_1
+          type: computer
-                type: computer
+          ip_address: 192.168.0.10
-                ip_address: 192.168.0.10
+          subnet_mask: 255.255.255.0
-                subnet_mask: 255.255.255.0
+          default_gateway: 192.168.0.1
-                default_gateway: 192.168.0.1
+          dns_server: 192.168.1.10
-                dns_server: 192.168.1.10
+          applications:
-                applications:
+            ...
-                    ...
+          services:
-                services:
+            ...
                    ...
 .. include:: common/common_node_attributes.rst
--- a/docs/source/configuration/simulation/nodes/firewall.rst
+++ b/docs/source/configuration/simulation/nodes/firewall.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _firewall_configuration:
@@ -19,38 +19,35 @@ example firewall
 .. code-block:: yaml
    simulation:
-        network:
+      network:
-            nodes:
+        nodes:
-                - ref: firewall
+          - hostname: firewall
-                    hostname: firewall
+            type: firewall
-                    type: firewall
+            ports:
-                    start_up_duration: 0
+              external_port: # port 1
-                    shut_down_duration: 0
+                ip_address: 192.168.20.1
-                    ports:
+                subnet_mask: 255.255.255.0
-                        external_port: # port 1
+              internal_port: # port 2
-                            ip_address: 192.168.20.1
+                ip_address: 192.168.1.2
-                            subnet_mask: 255.255.255.0
+                subnet_mask: 255.255.255.0
-                        internal_port: # port 2
+              dmz_port: # port 3
-                            ip_address: 192.168.1.2
+                ip_address: 192.168.10.1
-                            subnet_mask: 255.255.255.0
+                subnet_mask: 255.255.255.0
-                        dmz_port: # port 3
+            acl:
-                            ip_address: 192.168.10.1
+              internal_inbound_acl:
-                            subnet_mask: 255.255.255.0
+                ...
-                    acl:
+              internal_outbound_acl:
-                        internal_inbound_acl:
+                ...
-                            ...
+              dmz_inbound_acl:
-                        internal_outbound_acl:
+                ...
-                            ...
+              dmz_outbound_acl:
-                        dmz_inbound_acl:
+                ...
-                            ...
+              external_inbound_acl:
-                        dmz_outbound_acl:
+                ...
-                            ...
+              external_outbound_acl:
-                        external_inbound_acl:
+                ...
-                            ...
+            routes:
-                        external_outbound_acl:
+                ...
                            ...
                    routes:
                        ...
 .. include:: common/common_node_attributes.rst
@@ -70,18 +67,18 @@ The ports should be defined with an ip address and subnet mask e.g.
 .. code-block:: yaml
    nodes:
-        - ref: firewall
+      - hostname: firewall
-        ...
+      ...
        ports:
-            external_port: # port 1
+          external_port: # port 1
-                ip_address: 192.168.20.1
+            ip_address: 192.168.20.1
-                subnet_mask: 255.255.255.0
+            subnet_mask: 255.255.255.0
-            internal_port: # port 2
+          internal_port: # port 2
-                ip_address: 192.168.1.2
+            ip_address: 192.168.1.2
-                subnet_mask: 255.255.255.0
+            subnet_mask: 255.255.255.0
-            dmz_port: # port 3
+          dmz_port: # port 3
-                ip_address: 192.168.10.1
+            ip_address: 192.168.10.1
-                subnet_mask: 255.255.255.0
+            subnet_mask: 255.255.255.0
 ``ip_address``
 """"""""""""""
@@ -129,21 +126,21 @@ example:
 .. code-block:: yaml
    nodes:
-        - ref: firewall
+      - hostname: firewall
        ...
        acl:
-            internal_inbound_acl:
+          internal_inbound_acl:
-                21: # position 21 on ACL list
+            21: # position 21 on ACL list
-                    action: PERMIT  # allow packets that
+              action: PERMIT  # allow packets that
-                    src_port: POSTGRES_SERVER   # are emitted from the POSTGRES_SERVER port
+              src_port: POSTGRES_SERVER   # are emitted from the POSTGRES_SERVER port
-                    dst_port: POSTGRES_SERVER   # are going towards an POSTGRES_SERVER port
+              dst_port: POSTGRES_SERVER   # are going towards a POSTGRES_SERVER port
-                22: # position 22 on ACL list
+            22: # position 22 on ACL list
-                    action: PERMIT  # allow packets that
+              action: PERMIT  # allow packets that
-                    src_port: ARP   # are emitted from the ARP port
+              src_port: ARP   # are emitted from the ARP port
-                    dst_port: ARP   # are going towards an ARP port
+              dst_port: ARP   # are going towards an ARP port
-                23: # position 23 on ACL list
+            23: # position 23 on ACL list
-                    action: PERMIT  # allow packets that
+              action: PERMIT  # allow packets that
-                    protocol: ICMP  # are ICMP
+              protocol: ICMP  # are ICMP
 ``internal_outbound_acl``
 """""""""""""""""""""""""
@@ -155,21 +152,21 @@ example:
 .. code-block:: yaml
    nodes:
-        - ref: firewall
+      - hostname: firewall
-        ...
+      ...
        acl:
-            internal_outbound_acl:
+          internal_outbound_acl:
-                21: # position 21 on ACL list
+            21: # position 21 on ACL list
-                    action: PERMIT  # allow packets that
+              action: PERMIT  # allow packets that
-                    src_port: POSTGRES_SERVER   # are emitted from the POSTGRES_SERVER port
+              src_port: POSTGRES_SERVER   # are emitted from the POSTGRES_SERVER port
-                    dst_port: POSTGRES_SERVER   # are going towards an POSTGRES_SERVER port
+              dst_port: POSTGRES_SERVER   # are going towards a POSTGRES_SERVER port
-                22: # position 22 on ACL list
+            22: # position 22 on ACL list
-                    action: PERMIT  # allow packets that
+              action: PERMIT  # allow packets that
-                    src_port: ARP   # are emitted from the ARP port
+              src_port: ARP   # are emitted from the ARP port
-                    dst_port: ARP   # are going towards an ARP port
+              dst_port: ARP   # are going towards an ARP port
-                23: # position 23 on ACL list
+            23: # position 23 on ACL list
-                    action: PERMIT  # allow packets that
+              action: PERMIT  # allow packets that
-                    protocol: ICMP  # are ICMP
+              protocol: ICMP  # are ICMP
 ``dmz_inbound_acl``
@@ -216,29 +213,29 @@ example:
 .. code-block:: yaml
    nodes:
-        - ref: firewall
+      - hostname: firewall
-        ...
+      ...
-        acl:
+      acl:
-            dmz_outbound_acl:
+        dmz_outbound_acl:
-                19: # position 19 on ACL list
+          19: # position 19 on ACL list
-                    action: PERMIT  # allow packets that
+            action: PERMIT  # allow packets that
-                    src_port: POSTGRES_SERVER   # are emitted from the POSTGRES_SERVER port
+            src_port: POSTGRES_SERVER   # are emitted from the POSTGRES_SERVER port
-                    dst_port: POSTGRES_SERVER   # are going towards an POSTGRES_SERVER port
+            dst_port: POSTGRES_SERVER   # are going towards a POSTGRES_SERVER port
-                20: # position 20 on ACL list
+          20: # position 20 on ACL list
-                    action: PERMIT  # allow packets that
+            action: PERMIT  # allow packets that
-                    src_port: HTTP   # are emitted from the HTTP port
+            src_port: HTTP   # are emitted from the HTTP port
-                    dst_port: HTTP   # are going towards an HTTP port
+            dst_port: HTTP   # are going towards an HTTP port
-                21: # position 21 on ACL list
+          21: # position 21 on ACL list
-                    action: PERMIT  # allow packets that
+            action: PERMIT  # allow packets that
-                    src_port: HTTPS   # are emitted from the HTTPS port
+            src_port: HTTPS   # are emitted from the HTTPS port
-                    dst_port: HTTPS   # are going towards an HTTPS port
+            dst_port: HTTPS   # are going towards an HTTPS port
-                22: # position 22 on ACL list
+          22: # position 22 on ACL list
-                    action: PERMIT  # allow packets that
+            action: PERMIT  # allow packets that
-                    src_port: ARP   # are emitted from the ARP port
+            src_port: ARP   # are emitted from the ARP port
-                    dst_port: ARP   # are going towards an ARP port
+            dst_port: ARP   # are going towards an ARP port
-                23: # position 23 on ACL list
+          23: # position 23 on ACL list
-                    action: PERMIT  # allow packets that
+            action: PERMIT  # allow packets that
-                    protocol: ICMP  # are ICMP
+            protocol: ICMP  # are ICMP
@@ -254,21 +251,21 @@ example:
 .. code-block:: yaml
    nodes:
-        - ref: firewall
+      - hostname: firewall
-        ...
+      ...
-        acl:
+      acl:
-            external_inbound_acl:
+        external_inbound_acl:
-                21: # position 19 on ACL list
+          21: # position 21 on ACL list
-                    action: DENY  # deny packets that
+            action: DENY  # deny packets that
-                    src_port: POSTGRES_SERVER   # are emitted from the POSTGRES_SERVER port
+            src_port: POSTGRES_SERVER   # are emitted from the POSTGRES_SERVER port
-                    dst_port: POSTGRES_SERVER   # are going towards an POSTGRES_SERVER port
+            dst_port: POSTGRES_SERVER   # are going towards a POSTGRES_SERVER port
-                22: # position 22 on ACL list
+          22: # position 22 on ACL list
-                    action: PERMIT  # allow packets that
+            action: PERMIT  # allow packets that
-                    src_port: ARP   # are emitted from the ARP port
+            src_port: ARP   # are emitted from the ARP port
-                    dst_port: ARP   # are going towards an ARP port
+            dst_port: ARP   # are going towards an ARP port
-                23: # position 23 on ACL list
+          23: # position 23 on ACL list
-                    action: PERMIT  # allow packets that
+            action: PERMIT  # allow packets that
-                    protocol: ICMP  # are ICMP
+            protocol: ICMP  # are ICMP
 ``external_outbound_acl``
 """""""""""""""""""""""""
@@ -282,17 +279,17 @@ example:
 .. code-block:: yaml
    nodes:
-        - ref: firewall
+      - hostname: firewall
-        ...
+      ...
        acl:
-            external_outbound_acl:
+          external_outbound_acl:
-                22: # position 22 on ACL list
+            22: # position 22 on ACL list
-                    action: PERMIT  # allow packets that
+              action: PERMIT  # allow packets that
-                    src_port: ARP   # are emitted from the ARP port
+              src_port: ARP   # are emitted from the ARP port
-                    dst_port: ARP   # are going towards an ARP port
+              dst_port: ARP   # are going towards an ARP port
-                23: # position 23 on ACL list
+            23: # position 23 on ACL list
-                    action: PERMIT  # allow packets that
+              action: PERMIT  # allow packets that
-                    protocol: ICMP  # are ICMP
+              protocol: ICMP  # are ICMP
 .. include:: common/common_network_node_attributes.rst
--- a/docs/source/configuration/simulation/nodes/network_examples.rst
+++ b/docs/source/configuration/simulation/nodes/network_examples.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _network_examples:
@@ -389,7 +389,7 @@ connections, but the ACL that allows the nodes in the LAN to communicate with th
    pc_1 = network.get_node_by_hostname("pc_1")
    pc_1.ping(pc_1.default_gateway)
-pc_1.sys_log.show()
+    pc_1.sys_log.show()
 If SysLog capture is toggled on and the simulation log level is set to INFO, the result of the ping should be
 captured in the `pc_1` SysLog:
@@ -443,7 +443,8 @@ SomeTech. This extended network includes detailed sub-networks with specialised
 complex routing capabilities, and robust security protocols implemented through Access Control Lists (ACLs). Designed
 to mimic the intricacies of actual network environments, this network provides a detailed look at how various network
 components interact and function together to support both internal corporate activities and external communications.
-
+NB: the network described here is not the same as the UC7 network used by notebooks such as ``UC7-Training.ipynb`` or
 the network in ``Privilege-Escalation-and-Data-Loss-Example.ipynb``.
 .. image:: images/primaite_example_multi_lan_with_internet_network_dark.png
    :align: center
@@ -617,10 +618,10 @@ Each node is configured to ensure it meets the specific security and operational
            default_gateway: 192.168.1.1
            dns_server: 8.8.8.2
            applications:
-              - type: DatabaseClient
+              - type: database-client
                options:
                  db_server_ip: 10.10.1.11
-              - type: WebBrowser
+              - type: web-browser
                options:
                  target_url: http://sometech.ai
@@ -631,10 +632,10 @@ Each node is configured to ensure it meets the specific security and operational
            default_gateway: 192.168.1.1
            dns_server: 8.8.8.2
            applications:
-              - type: DatabaseClient
+              - type: database-client
                options:
                  db_server_ip: 10.10.1.11
-              - type: WebBrowser
+              - type: web-browser
                options:
                  target_url: http://sometech.ai
@@ -700,7 +701,7 @@ Each node is configured to ensure it meets the specific security and operational
            default_gateway: 8.8.8.1
            services:
              - ref: dns_server
-                type: DNSServer
+                type: dns-server
                options:
                  domain_mapping:
                    sometech.ai: 94.10.180.6
@@ -794,9 +795,9 @@ Each node is configured to ensure it meets the specific security and operational
            dns_server: 8.8.8.2
            services:
              - ref: web_server
-                type: WebServer
+                type: web-server
            applications:
-              - type: DatabaseClient
+              - type: database-client
                options:
                  db_server_ip: 10.10.1.11
@@ -903,10 +904,10 @@ Each node is configured to ensure it meets the specific security and operational
            default_gateway: 10.10.1.1
            dns_server: 8.8.8.2
            services:
-              - type: DatabaseService
+              - type: database-service
                options:
                  backup_server_ip: 10.10.1.12 # The some_tech_storage_srv server
-              - type: FTPClient
+              - type: ftp-client
          - hostname: some_tech_storage_srv
            type: server
@@ -915,7 +916,7 @@ Each node is configured to ensure it meets the specific security and operational
            default_gateway: 10.10.1.1
            dns_server: 8.8.8.2
            services:
-              - type: FTPServer
+              - type: ftp-server
          - hostname: some_tech_hr_1
            type: computer
@@ -924,10 +925,10 @@ Each node is configured to ensure it meets the specific security and operational
            default_gateway: 10.10.3.1
            dns_server: 8.8.8.2
            applications:
-              - type: DatabaseClient
+              - type: database-client
                options:
                  db_server_ip: 10.10.1.11
-              - type: WebBrowser
+              - type: web-browser
                options:
                  target_url: http://sometech.ai
@@ -938,10 +939,10 @@ Each node is configured to ensure it meets the specific security and operational
            default_gateway: 10.10.2.1
            dns_server: 8.8.8.2
            applications:
-              - type: DatabaseClient
+              - type: database-client
                options:
                  db_server_ip: 10.10.1.11
-              - type: WebBrowser
+              - type: web-browser
                options:
                  target_url: http://sometech.ai
@@ -952,10 +953,10 @@ Each node is configured to ensure it meets the specific security and operational
            default_gateway: 10.10.2.1
            dns_server: 8.8.8.2
            applications:
-              - type: DatabaseClient
+              - type: database-client
                options:
                  db_server_ip: 10.10.1.11
-              - type: WebBrowser
+              - type: web-browser
                options:
                  target_url: http://sometech.ai
@@ -1177,8 +1178,8 @@ ACLs permitting or denying traffic as per our configured ACL rules.
    some_tech_storage_srv = network.get_node_by_hostname("some_tech_storage_srv")
    some_tech_storage_srv.file_system.create_file(file_name="test.png")
-    pc_1_ftp_client: FTPClient = network.get_node_by_hostname("pc_1").software_manager.software["FTPClient"]
+    pc_1_ftp_client: FTPClient = network.get_node_by_hostname("pc_1").software_manager.software["ftp-client"]
-    pc_2_ftp_client: FTPClient = network.get_node_by_hostname("pc_2").software_manager.software["FTPClient"]
+    pc_2_ftp_client: FTPClient = network.get_node_by_hostname("pc_2").software_manager.software["ftp-client"]
    assert not pc_1_ftp_client.request_file(
        dest_ip_address=some_tech_storage_srv.network_interface[1].ip_address,
@@ -1224,7 +1225,7 @@ ACLs permitting or denying traffic as per our configured ACL rules.
    web_server: Server = network.get_node_by_hostname("some_tech_web_srv")
-    web_ftp_client: FTPClient = web_server.software_manager.software["FTPClient"]
+    web_ftp_client: FTPClient = web_server.software_manager.software["ftp-client"]
    assert not web_ftp_client.request_file(
        dest_ip_address=some_tech_storage_srv.network_interface[1].ip_address,
@@ -1269,7 +1270,7 @@ ACLs permitting or denying traffic as per our configured ACL rules.
    some_tech_storage_srv.file_system.create_file(file_name="test.png")
    some_tech_snr_dev_pc: Computer = network.get_node_by_hostname("some_tech_snr_dev_pc")
-    snr_dev_ftp_client: FTPClient = some_tech_snr_dev_pc.software_manager.software["FTPClient"]
+    snr_dev_ftp_client: FTPClient = some_tech_snr_dev_pc.software_manager.software["ftp-client"]
    assert snr_dev_ftp_client.request_file(
        dest_ip_address=some_tech_storage_srv.network_interface[1].ip_address,
@@ -1294,7 +1295,7 @@ ACLs permitting or denying traffic as per our configured ACL rules.
    some_tech_storage_srv.file_system.create_file(file_name="test.png")
    some_tech_jnr_dev_pc: Computer = network.get_node_by_hostname("some_tech_jnr_dev_pc")
-    jnr_dev_ftp_client: FTPClient = some_tech_jnr_dev_pc.software_manager.software["FTPClient"]
+    jnr_dev_ftp_client: FTPClient = some_tech_jnr_dev_pc.software_manager.software["ftp-client"]
    assert not jnr_dev_ftp_client.request_file(
        dest_ip_address=some_tech_storage_srv.network_interface[1].ip_address,
@@ -1337,7 +1338,7 @@ ACLs permitting or denying traffic as per our configured ACL rules.
    some_tech_storage_srv.file_system.create_file(file_name="test.png")
    some_tech_hr_pc: Computer = network.get_node_by_hostname("some_tech_hr_1")
-    hr_ftp_client: FTPClient = some_tech_hr_pc.software_manager.software["FTPClient"]
+    hr_ftp_client: FTPClient = some_tech_hr_pc.software_manager.software["ftp-client"]
    assert not hr_ftp_client.request_file(
        dest_ip_address=some_tech_storage_srv.network_interface[1].ip_address,
--- a/docs/source/configuration/simulation/nodes/router.rst
+++ b/docs/source/configuration/simulation/nodes/router.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _router_configuration:
@@ -17,16 +17,15 @@ example router
 .. code-block:: yaml
    simulation:
-        network:
+      network:
-            nodes:
+        nodes:
-                - ref: router_1
+          - hostname: router_1
-                hostname: router_1
+            type: router
-                type: router
+            num_ports: 5
-                num_ports: 5
+            ports:
-                ports:
+                ...
-                    ...
+            acl:
-                acl:
+                ...
                    ...
 .. include:: common/common_node_attributes.rst
@@ -49,15 +48,15 @@ Example of setting ports for a router with 2 ports:
 .. code-block:: yaml
    nodes:
-        - ref: router_1
+        - hostname: router_1
        ...
        ports:
-            1:
+          1:
-                ip_address: 192.168.1.1
+            ip_address: 192.168.1.1
-                subnet_mask: 255.255.255.0
+            subnet_mask: 255.255.255.0
-            2:
+          2:
-                ip_address: 192.168.10.1
+            ip_address: 192.168.10.1
-                subnet_mask: 255.255.255.0
+            subnet_mask: 255.255.255.0
 ``ip_address``
 """"""""""""""
@@ -74,23 +73,19 @@ The subnet mask setting for the port.
 ``acl``
 -------
-Sets up the ACL rules for the router.
+Sets up the ACL rules for the router to apply to layer-3 traffic. These are not applied to layer-2 traffic such as ARP.
 e.g.
 .. code-block:: yaml
    nodes:
-        - ref: router_1
+        - hostname: router_1
        ...
        acl:
-            1:
+          1:
-                action: PERMIT
+            action: PERMIT
-                src_port: ARP
+            protocol: ICMP
                dst_port: ARP
            2:
                action: PERMIT
                protocol: ICMP
 See :py:mod:`primaite.simulator.network.hardware.nodes.network.router.AccessControlList`
--- a/docs/source/configuration/simulation/nodes/server.rst
+++ b/docs/source/configuration/simulation/nodes/server.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _server_configuration:
@@ -19,16 +19,15 @@ example server
    simulation:
        network:
            nodes:
-                - ref: server_1
+                - hostname: server_1
-                hostname: server_1
+                  type: server
-                type: server
+                  ip_address: 192.168.10.10
-                ip_address: 192.168.10.10
+                  subnet_mask: 255.255.255.0
-                subnet_mask: 255.255.255.0
+                  default_gateway: 192.168.10.1
-                default_gateway: 192.168.10.1
+                  dns_server: 192.168.1.10
-                dns_server: 192.168.1.10
+                  applications:
                applications:
                    ...
-                services:
+                  services:
                    ...
 .. include:: common/common_node_attributes.rst
--- a/docs/source/configuration/simulation/nodes/switch.rst
+++ b/docs/source/configuration/simulation/nodes/switch.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _switch_configuration:
@@ -17,12 +17,11 @@ example switch
 .. code-block:: yaml
    simulation:
-        network:
+      network:
-            nodes:
+        nodes:
-                - ref: switch_1
+          - hostname: switch_1
-                hostname: switch_1
+            type: switch
-                type: switch
+            num_ports: 8
                num_ports: 8
 .. include:: common/common_node_attributes.rst
--- a/docs/source/configuration/simulation/software/applications.rst
+++ b/docs/source/configuration/simulation/software/applications.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 ``applications``
 ----------------
@@ -14,12 +14,10 @@ Applications takes a list of applications as shown in the example below.
 .. code-block:: yaml
-    - ref: client_1
+    - hostname: client_1
-    hostname: client_1
+      type: computer
    type: computer
    ...
    applications:
        - ref: example_application
        type: example_application_type
        options:
            # this section is different for each application
--- a/docs/source/configuration/simulation/software/services.rst
+++ b/docs/source/configuration/simulation/software/services.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 ``services``
 ------------
@@ -14,12 +14,10 @@ Services takes a list of services as shown in the example below.
 .. code-block:: yaml
    - ref: client_1
    hostname: client_1
    type: computer
    ...
    services:
        - ref: example_service
        type: example_service_type
        options:
            # this section is different for each service
--- a/docs/source/customising_scenarios.rst
+++ b/docs/source/customising_scenarios.rst
@@ -1,4 +0,0 @@
 Customising Agents
 ******************
 For an example of how to customise red agent behaviour in the Data Manipulation scenario, please refer to the notebook ``Data-Manipulation-Customising-Red-Agent.ipynb``.
--- a/docs/source/dependencies.rst
+++ b/docs/source/dependencies.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. role::  raw-html(raw)
    :format: html
--- a/docs/source/developer_tools.rst
+++ b/docs/source/developer_tools.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _Developer Tools:
@@ -49,36 +49,68 @@ dev-mode configuration
 The following configures some specific items that the dev-mode overrides, if enabled.
-`--sys-log-level` or `-level`
+`--sys-log-level` or `-slevel`
----------------------------
+-----------------------------
 The level of system logs can be overridden by dev-mode.
 By default, this is set to DEBUG
-The available options are [DEBUG|INFO|WARNING|ERROR|CRITICAL]
+The available options for both system and agent logs are:
-.. code-block::
+-------------------+
-
+| Log Level         |
-    primaite dev-mode config -level INFO
+===================+
-
+| DEBUG             |
-or
+-------------------+
 | INFO              |
 +-------------------+
 | WARNING           |
 +-------------------+
 | ERROR             |
 +-------------------+
 | CRITICAL          |
 +-------------------+
 .. code-block::
    primaite dev-mode config --sys-log-level INFO
 or
 .. code-block::
    primaite dev-mode config -slevel INFO
 `--agent-log-level` or `-alevel`
 --------------------------------
 The level of agent logs can be overridden by dev-mode.
 By default, this is set to DEBUG.
 .. code-block::
    primaite dev-mode config --agent-log-level INFO
 or
 .. code-block::
    primaite dev-mode config -alevel INFO
 `--output-sys-logs` or `-sys`
 -----------------------------
-The outputting of system logs can be overridden by dev-mode.
+The output of system logs can be overridden by dev-mode.
 By default, this is set to False
 Enabling system logs
 """"""""""""""""""""
-To enable outputting of system logs
+To enable output of system logs
 .. code-block::
@@ -93,7 +125,7 @@ or
 Disabling system logs
 """""""""""""""""""""
-To disable outputting of system logs
+To disable output of system logs
 .. code-block::
@@ -105,17 +137,47 @@ or
    primaite dev-mode config -nsys
 Enabling agent logs
 """"""""""""""""""""
 To enable output of agent logs
 .. code-block::
    primaite dev-mode config --output-agent-logs
 or
 .. code-block::
    primaite dev-mode config -agent
 Disabling agent logs
 """""""""""""""""""""
 To disable output of agent logs
 .. code-block::
    primaite dev-mode config --no-agent-logs
 or
 .. code-block::
    primaite dev-mode config -nagent
 `--output-pcap-logs` or `-pcap`
 -------------------------------
-The outputting of packet capture logs can be overridden by dev-mode.
+The output of packet capture logs can be overridden by dev-mode.
 By default, this is set to False
 Enabling PCAP logs
 """"""""""""""""""
-To enable outputting of packet capture logs
+To enable output of packet capture logs
 .. code-block::
@@ -130,7 +192,7 @@ or
 Disabling PCAP logs
 """""""""""""""""""
-To disable outputting of packet capture logs
+To disable output of packet capture logs
 .. code-block::
@@ -145,14 +207,14 @@ or
 `--output-to-terminal` or `-t`
 ------------------------------
-The outputting of system logs to the terminal can be overridden by dev-mode.
+The output of system logs to the terminal can be overridden by dev-mode.
 By default, this is set to False
 Enabling system log output to terminal
 """"""""""""""""""""""""""""""""""""""
-To enable outputting of system logs to terminal
+To enable output of system logs to terminal
 .. code-block::
@@ -167,7 +229,7 @@ or
 Disabling system log output to terminal
 """""""""""""""""""""""""""""""""""""""
-To disable outputting of system logs to terminal
+To disable output of system logs to terminal
 .. code-block::
--- a/docs/source/environment.rst
+++ b/docs/source/environment.rst
@@ -1,3 +1,7 @@
 .. only:: comment
    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 RL Environments
 ***************
--- a/docs/source/example_notebooks.rst
+++ b/docs/source/example_notebooks.rst
@@ -1,6 +1,8 @@
 .. only:: comment
-    © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _example jupyter notebooks:
 Example Jupyter Notebooks
 =========================
@@ -18,6 +20,7 @@ Running Jupyter Notebooks
 -------------------------
 1. Navigate to the PrimAITE directory
 """""""""""""""""""""""""""""""""""""
 .. code-block:: bash
    :caption: Unix
@@ -29,7 +32,10 @@ Running Jupyter Notebooks
    cd ~\primaite\{VERSION}
-2. Run jupyter notebook (the python environment to which you installed PrimAITE must be active)
+2. Run jupyter notebook
 """""""""""""""""""""""
 **Please note that the python environment to which you installed PrimAITE must be active.**
 .. code-block:: bash
    :caption: Unix
@@ -42,11 +48,13 @@ Running Jupyter Notebooks
    jupyter notebook
 3. Opening the jupyter webpage (optional)
 """""""""""""""""""""""""""""""""""""""""
 The default web browser may automatically open the webpage. However, if that is not the case, click the link shown in your command prompt output. It should look like this: ``http://localhost:8888/?token=0123456798abc0123456789abc``
 4. Navigate to the list of notebooks
 """""""""""""""""""""""""""""""""""""""""
 The example notebooks are located in ``notebooks/example_notebooks/``. The file system shown in the jupyter webpage is relative to the location in which the ``jupyter notebook`` command was used.
--- a/docs/source/game_layer.rst
+++ b/docs/source/game_layer.rst
@@ -1,3 +1,7 @@
 .. only:: comment
    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 PrimAITE Game layer
 *******************
@@ -38,50 +42,50 @@ An agent's reward function is managed by the ``RewardManager``. It calculates re
 Reward Components
 -----------------
-Currently implemented are reward components tailored to the data manipulation scenario. View the full API and description of how they work here: :py:module:`primaite.game.agent.reward`.
+Currently implemented are reward components tailored to the data manipulation scenario. View the full API and description of how they work here: :py:mod:`primaite.game.agent.rewards`.
 Reward Sharing
 --------------
 An agent's reward can be based on rewards of other agents. This is particularly useful for modelling a situation where the blue agent's job is to protect the ability of green agents to perform their pattern-of-life. This can be configured in the YAML file this way:
-```yaml
+.. code-block:: yaml
 green_agent_1: # this agent sometimes tries to access the webpage, and sometimes the database
    # actions, observations, and agent settings go here
    reward_function:
      reward_components:
-        # When the webpage loads, the reward goes up by 0.25 when it fails to load, it goes down to -0.25
+  green_agent_1: # this agent sometimes tries to access the webpage, and sometimes the database
-        - type: WEBPAGE_UNAVAILABLE_PENALTY
+      # actions, observations, and agent settings go here
-          weight: 0.25
+      reward_function:
-          options:
+        reward_components:
            node_hostname: client_2
-        # When the database is reachable, the reward goes up by 0.05, when it is unreachable it goes down to -0.05
+          # When the webpage loads, the reward goes up by 0.25; when it fails to load, it goes down to -0.25
-        - type: GREEN_ADMIN_DATABASE_UNREACHABLE_PENALTY
+          - type: webpage-unavailable-penalty
-          weight: 0.05
+            weight: 0.25
-          options:
+            options:
-            node_hostname: client_2
+              node_hostname: client_2
-blue_agent:
+          # When the database is reachable, the reward goes up by 0.05; when it is unreachable, it goes down to -0.05
-    # actions, observations, and agent settings go here
+          - type: green-admin-database-unreachable-penalty
-    reward_function:
+            weight: 0.05
-      reward_components:
+            options:
              node_hostname: client_2
-        # When the database file is in a good state, blue's reward is 0.4, when it's in a corrupted state the reward is -0.4
+  blue_agent:
-        - type: DATABASE_FILE_INTEGRITY
+      # actions, observations, and agent settings go here
-          weight: 0.40
+      reward_function:
-          options:
+        reward_components:
            node_hostname: database_server
            folder_name: database
            file_name: database.db
-        # The green's reward is added onto the blue's reward.
+          # When the database file is in a good state, blue's reward is 0.4; when it's in a corrupted state, the reward is -0.4
-        - type: SHARED_REWARD
+          - type: database-file-integrity
-          weight: 1.0
+            weight: 0.40
-          options:
+            options:
-            agent_name: client_2_green_user
+              node_hostname: database_server
              folder_name: database
              file_name: database.db
          # The green's reward is added onto the blue's reward.
          - type: shared-reward
            weight: 1.0
            options:
              agent_name: client_2_green_user
 ```
 When defining agent reward sharing, users must be careful to avoid circular references, as that would lead to an infinite calculation loop. PrimAITE will prevent circular dependencies and provide a helpful error message if they are detected in the yaml.
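The circular-reference check described above can be pictured with a small standalone sketch. This is not PrimAITE's actual implementation: the ``find_reward_cycle`` helper and its plain-dictionary input are assumptions made purely for illustration. It takes a mapping from each agent name to the agents whose rewards it shares, and returns one circular chain if any exists.

```python
from typing import Dict, List, Optional


def find_reward_cycle(shared: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one circular chain of shared rewards, or None if there is none.

    `shared` maps an agent name to the agents whose reward it adds to its own.
    Hypothetical helper for illustration only; not part of the PrimAITE API.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour: Dict[str, int] = {}
    path: List[str] = []

    def visit(agent: str) -> Optional[List[str]]:
        colour[agent] = GREY
        path.append(agent)
        for dep in shared.get(agent, []):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path, so we have looped back
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        colour[agent] = BLACK
        return None

    for agent in shared:
        if colour.get(agent, WHITE) == WHITE:
            found = visit(agent)
            if found:
                return found
    return None
```

With the configuration shown earlier, ``blue_agent`` depends on ``client_2_green_user`` and nothing depends back on ``blue_agent``, so a check of this kind would report no cycle. Returning the offending chain, rather than a bare boolean, makes it easier to produce an error message that names the agents involved.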
--- a/docs/source/getting_started.rst
+++ b/docs/source/getting_started.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _getting-started:
--- a/docs/source/glossary.rst
+++ b/docs/source/glossary.rst
@@ -1,6 +1,6 @@
 .. only:: comment
-    © Crown-owned copyright 2024, Defence Science and Technology Laboratory UK
+    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 Glossary
 =============
@@ -9,10 +9,10 @@ Glossary
    :sorted:
    Network
-        The network in primaite is a logical representation of a computer network containing :term:`Nodes<Node>` and :term:`Links<Link>`.
+        The network in primaite is a logical representation of a computer network containing :term:`Nodes<Node>` and :term:`Links<Link>`. See :ref:`network`.
    Node
-        A Node represents a network endpoint. For example a computer, server, switch, or an actuator.
+        A Node represents a network endpoint. For example a computer, server, switch, or an actuator. See :ref:`node_description`
    Link
        A Link represents the connection between two Nodes. For example, a physical wire between a computer and a switch or a wireless connection.
@@ -21,7 +21,7 @@ Glossary
        Protocols are used by links to separate different types of network traffic. Common examples would be HTTP, TCP, and UDP.
    Service
-        A service represents a piece of software that is installed on a node, such as a web server or a database.
+        A service represents a piece of software that is installed on a node, such as a web server or a database. See :ref:`software`
    Access Control List
        PrimAITE blocks or allows certain traffic on the network by simulating firewall rules, which are defined in the Access Control List.
@@ -42,7 +42,7 @@ Glossary
         PoLs allow agents to change the current hardware, OS, file system, or service statuses of nodes during the course of an episode. For example, a green agent may restart a server node to represent scheduled maintenance. A red agent's Pattern-of-Life can be used to attack nodes by changing their states to CORRUPTED or COMPROMISED.
    Reward
-        The reward is a single number used by the blue agent to understand whether it's performing well or poorly. RL agents change their behaviour in an attempt to increase the expected reward each episode. The reward is generated based on the current states of the environment and is impacted positively by things like green PoL running successfully and negatively by things like nodes being compromised.
+        The reward is a single number used by the blue agent to understand whether it's performing well or poorly. RL agents change their behaviour in an attempt to increase the expected reward each episode. The reward is generated based on the current states of the environment and is impacted positively by things like green PoL running successfully and negatively by things like nodes being compromised. See :ref:`Rewards`
    Observation
        An observation is a representation of the current state of the environment that is given to the learning agent so it can decide on which action to perform. If the environment is 'fully observable', the observation contains information about every possible aspect of the environment. More commonly, the environment is 'partially observable' which means the learning agent has to make decisions without knowing every detail of the current environment state.
@@ -50,6 +50,9 @@ Glossary
    Action
        The learning agent decides on an action to take on every step in the simulation. The action has the chance to positively or negatively impact the environment state. Over time, the agent aims to learn which actions to take when to maximise the expected reward.
    Action mask
        An input to RL algorithms that contains information about which of the actions in the action space are currently valid. See :ref:`action_masking`
    Training
        During training, an RL agent is placed in the simulated network and it learns which actions to take in which scenarios to obtain maximum reward.
@@ -69,4 +72,13 @@ Glossary
        PrimAITE uses the Gymnasium reinforcement learning framework API to create a training environment and interface with RL agents. Gymnasium defines a common way of creating observations, actions, and rewards.
    User app home
-        PrimAITE supports upgrading software version while retaining user data. The user data directory is where configs, notebooks, and results are stored, this location is `~/primaite<version>/` on linux/darwin and `C:\\Users\\<username>\\primaite<version>` on Windows.
+        PrimAITE supports upgrading the software version while retaining user data. The user data directory is where configs, notebooks, and results are stored; this location is ``~/primaite/<version>/`` on linux/darwin and ``C:\\Users\\<username>\\primaite\\<version>`` on Windows.
    Episode schedule
        The strategy for selecting different variants around the same scenario when advancing from one episode to another in the environment.
    Discriminator
        A unique string given to extensible components in PrimAITE that allow them to be mapped from a YAML config definition to a simulation class.
    Plugin
        A python package that extends base PrimAITE classes.
--- a/docs/source/how_to.rst
+++ b/docs/source/how_to.rst
@@ -0,0 +1,10 @@
 .. only:: comment
    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 How-To Guides
 =============
 These how-to guides aim to provide a starting point for development within PrimAITE, creating your own custom components and environments for use when training agents. More detailed information for each section can be found within the documentation.
 There are also some additional notebooks which provide a walkthrough of established content. It's encouraged to reference these when developing for PrimAITE.
--- a/docs/source/how_to_guides/custom_actions.rst
+++ b/docs/source/how_to_guides/custom_actions.rst
@@ -0,0 +1,58 @@
 .. only:: comment
    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _custom_actions:
 Creating Custom Actions in PrimAITE
 ***********************************
 PrimAITE contains a selection of possible actions that can be exercised within a training environment. Actions provided as part of PrimAITE can be seen within `src/primaite/game/agent/actions`. `Note`: Agents are only able to perform the actions listed within their action_map, defined within their configuration YAML. See :ref:`custom_environments` for more information.
 Developing Custom Actions
 =========================
 Actions within PrimAITE follow a default format, as seen below and in ``actions.py``. Each action must be given a unique identifier (its ``discriminator``) when declared, as this is used to look up the action when creating the training environment.
 An example of a custom action is seen below, with key information about what is required for new actions in :ref:`extensible_actions`.
 .. code:: Python
    class ExampleActionClass(AbstractAction, discriminator="ExampleActions"):
        """Example Action Class"""
        config: "ExampleActionClass.ConfigSchema"
        class ConfigSchema(AbstractAction.ConfigSchema):
            node_name: str
        @classmethod
        def form_request(cls, config: ConfigSchema) -> RequestFormat:
            return [config.node_name, "example_action"]
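The role of the ``discriminator`` keyword can be illustrated with a minimal, self-contained sketch. The classes below are hypothetical stand-ins and not PrimAITE's own ``AbstractAction``. The sketch assumes that registration happens through Python's ``__init_subclass__`` hook, which is one common way to map a string from a config file onto a class.

```python
from typing import ClassVar, Dict, List, Type


class SketchAction:
    """Hypothetical stand-in for a base action class (not PrimAITE's AbstractAction)."""

    # Maps each discriminator string to the subclass that declared it.
    registry: ClassVar[Dict[str, Type["SketchAction"]]] = {}

    def __init_subclass__(cls, discriminator: str = "", **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        if not discriminator:
            return  # intermediate base classes may opt out of registration
        if discriminator in SketchAction.registry:
            raise ValueError(f"Duplicate discriminator: {discriminator}")
        SketchAction.registry[discriminator] = cls

    @classmethod
    def form_request(cls, config: dict) -> List[str]:
        raise NotImplementedError


class SketchExampleAction(SketchAction, discriminator="example_action"):
    """Registered automatically under the name 'example_action'."""

    @classmethod
    def form_request(cls, config: dict) -> List[str]:
        return [config["node_name"], "example_action"]


def request_from_config(action_type: str, config: dict) -> List[str]:
    """Look up the class by its discriminator and build the request."""
    return SketchAction.registry[action_type].form_request(config)


print(request_from_config("example_action", {"node_name": "client_1"}))
```

Because the lookup is keyed on the discriminator string, two classes that declare the same discriminator would collide, which is why the sketch rejects duplicates at class-definition time.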
 Integration with PrimAITE ActionManager
 =======================================
 Any custom actions should then be added to the `ActionManager` class, and the `act_class_identifiers` dictionary. This will map the action class to the corresponding action type string that would be passed through the PrimAITE `request_system`.
 Interaction with the PrimAITE Request Manager
 =============================================
 Where an action would cause a request to be sent through the PrimAITE RequestManager, a `form_request` method is expected to be defined within the Action Class. This method should convert the action into a format that can be ingested by the `RequestManager`. One example is the `NodeFolderCreateAction`, which sends a formed request to create a folder on a given node (seen below):
 .. code:: Python
    def form_request(self, node_id: int, folder_name: str) -> RequestFormat:
        """Return the action formatted as a request which can be ingested by the PrimAITE simulation."""
        node_name = self.manager.get_node_name_by_idx(node_id)
        if node_name is None or folder_name is None:
            return ["do_nothing"]
        return ["network", "node", node_name, "file_system", "create", "folder", folder_name]
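To show why the request is an ordered list of strings, here is a small sketch of how such a list could be consumed. It assumes each layer reads the first element and passes the remainder to the next handler. ``SketchRouter`` and the handler names are hypothetical and are not PrimAITE's ``RequestManager`` API.

```python
from typing import Callable, Dict, List

Request = List[str]


class SketchRouter:
    """Hypothetical router: consumes the first element and delegates the rest."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[Request], str]] = {}

    def add(self, name: str, handler: Callable[[Request], str]) -> None:
        self.handlers[name] = handler

    def __call__(self, request: Request) -> str:
        if not request or request[0] not in self.handlers:
            return "unreachable"
        return self.handlers[request[0]](request[1:])


created_folders: List[str] = []


def create(request: Request) -> str:
    """Leaf handler: expects the remaining request to be ['folder', <name>]."""
    if len(request) != 2 or request[0] != "folder":
        return "failure"
    created_folders.append(request[1])
    return "success"


# Build the chain network -> node -> <node name> -> file_system -> create.
file_system = SketchRouter()
file_system.add("create", create)
node = SketchRouter()
node.add("file_system", file_system)
nodes = SketchRouter()
nodes.add("client_1", node)
network = SketchRouter()
network.add("node", nodes)
root = SketchRouter()
root.add("network", network)
root.add("do_nothing", lambda rest: "success")

result = root(["network", "node", "client_1", "file_system", "create", "folder", "downloads"])
print(result, created_folders)
```

In this sketch a request that names an unknown node never reaches the leaf handler, so a malformed action has no side effects.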
 Action Masking
 ==============
 Agents which use the `ProxyAgent` class within PrimAITE are able to use Action Masking. This tells the agent which actions are valid or invalid in the current state of the environment.
 Information on how to ensure this can be applied to your custom action can be found in :ref:`action_masking`.
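As a rough illustration of the idea, the sketch below treats an action mask as a sequence of booleans with one entry per action index, and samples only from the valid entries. The function name and the mask layout are assumptions made for this example; they do not describe how PrimAITE or any particular RL library applies masks internally.

```python
import random
from typing import Sequence


def sample_valid_action(mask: Sequence[bool], rng: random.Random) -> int:
    """Pick a random action index among those the mask marks as valid."""
    valid = [index for index, allowed in enumerate(mask) if allowed]
    if not valid:
        raise ValueError("The mask excludes every action")
    return rng.choice(valid)


# Actions 1 and 2 are invalid in this hypothetical state.
mask = [True, False, False, True]
rng = random.Random(0)
picks = {sample_valid_action(mask, rng) for _ in range(50)}
print(picks)
```

A learning algorithm would typically apply the mask to its policy output instead of sampling uniformly, but the effect is the same: invalid actions are never selected.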
--- a/docs/source/how_to_guides/custom_environments.rst
+++ b/docs/source/how_to_guides/custom_environments.rst
@@ -0,0 +1,45 @@
 .. only:: comment
    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _custom_environments:
 Creating Custom Environments for PrimAITE
 *****************************************
 PrimAITE generates its training configuration and environments by ingesting YAML files. A detailed walkthrough of how to create your own environment can be found within the ``Creating-Custom-Environments`` Jupyter notebook.
 Your configuration file should follow the hierarchy seen below:
 .. code:: yaml
    metadata:
        version: 4.0
    required_plugins:
        - name: Example_Plugin
          version: 1.0
    io_settings:
    ...
    game:
    ...
    agents:
    ...
    simulation:
    ...
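A quick way to sanity-check a configuration against this hierarchy is sketched below. It works on an already-parsed dictionary, so no YAML library is needed. The helper names are hypothetical, and treating ``required_plugins`` as optional is an assumption based on the description in this guide, not a statement about PrimAITE's own validation.

```python
from typing import Any, Dict, List

# Top-level sections taken from the hierarchy shown in this guide.
EXPECTED_SECTIONS = ("metadata", "io_settings", "game", "agents", "simulation")


def missing_sections(config: Dict[str, Any]) -> List[str]:
    """Return the expected top-level sections that the config lacks."""
    return [key for key in EXPECTED_SECTIONS if key not in config]


def plugin_requirements(config: Dict[str, Any]) -> List[str]:
    """List required plugins as 'name==version' strings (empty if none given)."""
    plugins = config.get("required_plugins") or []
    return [f"{plugin['name']}=={plugin['version']}" for plugin in plugins]


# A parsed config mirroring the YAML above, with the elided sections left empty.
config = {
    "metadata": {"version": 4.0},
    "required_plugins": [{"name": "Example_Plugin", "version": 1.0}],
    "io_settings": {},
    "game": {},
    "agents": [],
}

print(missing_sections(config))
print(plugin_requirements(config))
```

Running a check like this before starting a training session surfaces a forgotten section early, instead of part-way through environment construction.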
 MetaData
 ========
 It's important to include the metadata tag within your YAML file, as this is used to ensure PrimAITE can interpret the configuration correctly. This should also include any plugins that are required for the defined environment, along with their respective version.
 Required Plugins
 ================
 Should your custom environment need any additional PrimAITE plugins, each must be specified under the `required_plugins` key, as seen in the above example.
 Configuration Items
 ===================
 For detailed information about the remaining configuration items found within the configuration file, see :ref:`Configurable_Items`.
--- a/docs/source/how_to_guides/custom_rewards.rst
+++ b/docs/source/how_to_guides/custom_rewards.rst
@@ -0,0 +1,48 @@
 .. only:: comment
    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK
 .. _custom_rewards:
 Creating Custom Rewards in PrimAITE
 ***********************************
 Rewards within PrimAITE are contained within ``rewards.py``, which details the rewards available for all agents within training sessions, how they are calculated and any other specific information where necessary.
 Rewards within PrimAITE have been updated to facilitate extensibility and the creation of plugins with the release of PrimAITE version 4.0. Additional information about this is covered within :ref:`extensible_rewards`.
 Custom Rewards within PrimAITE should inherit from the ``AbstractReward`` class, found in ``rewards.py``. It's important to include an identifier for any class created within PrimAITE.
 .. code:: Python
    class ExampleAward(AbstractReward, identifier="ExampleAward"):
        """Example Reward Class."""
        def calculate(self, state: Dict, last_action_response: "AgentHistoryItem") -> float:
            """Calculate the reward for the current state."""
            return 1.0
        @classmethod
        def from_config(cls, config: dict) -> "AbstractReward":
            """Create a reward function component from a config dictionary."""
            return cls()
 Custom rewards that have been created should be added to the ``rew_class_identifiers`` dictionary within the ``RewardFunction`` class in ``rewards.py``.
 Including Custom Rewards within PrimAITE configuration
 ======================================================
 Custom rewards can then be included in an agent's configuration by adding them to the training session configuration YAML.
 .. code:: yaml
    agents:
      - ref: agent_name
        reward_function:
          reward_components:
            - type: DUMMY
              weight: 1.0
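The ``weight`` field suggests that each component's value is scaled before the components are combined. The sketch below assumes a simple weighted sum, which is a common convention but is stated here as an assumption and not as PrimAITE's confirmed behaviour. The component functions and the ``web_server_up`` state key are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Component = Callable[[State], float]


def total_reward(components: List[Tuple[Component, float]], state: State) -> float:
    """Combine component rewards as a weighted sum."""
    return sum(weight * component(state) for component, weight in components)


def constant_reward(state: State) -> float:
    """Always returns 1.0, like the calculate() example in this guide."""
    return 1.0


def uptime_reward(state: State) -> float:
    """Hypothetical component: positive while a service is up, negative otherwise."""
    return 1.0 if state.get("web_server_up") else -1.0


components = [(constant_reward, 1.0), (uptime_reward, 0.5)]
print(total_reward(components, {"web_server_up": True}))
print(total_reward(components, {"web_server_up": False}))
```

Under this assumption, lowering a component's weight reduces its influence on the agent without removing it from the reward signal.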
 More detailed information about rewards within PrimAITE can be found within :ref:`Rewards`.