Merged PR 221: Version 3 beta 2 doc changes

## Summary
*Replace this text with an explanation of what the changes are and how you implemented them. Can this impact any other parts of the codebase that we should keep in mind?*

## Test process
*How have you tested this (if applicable)?*

## Checklist
- [Y] PR is linked to a **work item**
- [Y] **acceptance criteria** of linked ticket are met
- [Y] performed **self-review** of the code
- [N] written **tests** for any new functionality added with this PR
- [Y] updated the **documentation** if this PR changes or adds functionality
- [N] written/updated **design docs** if this PR implements new functionality
- [N] updated the **change log**
- [Y] ran **pre-commit** checks for code style
- [N] attended to any **TO-DOs** left in the code

Related work items: #2068
2023-11-27 21:35:37 +00:00
parent 58a84da7af 6fd37a609a
commit 95f6cf6691
15 changed files with 29 additions and 81 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -39,7 +39,7 @@ SessionManager.

 ### Removed
 - Removed legacy simulation modules: `acl`, `common`, `environment`, `links`, `nodes`, `pol`
- Removed legacy training modules, they are replaced by the new ARCD GATE dependency
+- Removed legacy training modules
 - Removed tests for legacy code


--- a/README.md
+++ b/README.md
@@ -22,8 +22,6 @@ PrimAITE presents the following features:

 - Routers with traffic routing and firewall capabilities

- Integration with ARCD GATE for agent training
-
 - Support for multiple agents, each having their own customisable observation space, action space, and reward function definition, and either deterministic or RL-directed behaviour

 ## Getting Started with PrimAITE
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -30,7 +30,7 @@ PrimAITE incorporates the following features:
 - Uses the concept of Information Exchange Requirements (IERs) to model background pattern of life and adversarial behaviour;
 - An Access Control List (ACL) function, mimicking the behaviour of a network firewall, is applied across the model, following standard ACL rule format (e.g. DENY/ALLOW, source IP address, destination IP address, protocol and port);
 - Application of traffic to the links of the platform / system laydown adheres to the ACL ruleset;
- Presents both an OpenAI gym and Ray RLLib interface to the environment, allowing integration with any compliant defensive agents;
+- Presents both a Gymnasium and a Ray RLlib interface to the environment, allowing integration with any compliant defensive agents;
 - Allows for the saving and loading of trained defensive agents;
 - Stochastic adversarial agent behaviour;
 - Full capture of discrete logs relating to agent training or evaluation (system state, agent actions taken, instantaneous and average reward for every step of every episode);
@@ -40,18 +40,18 @@ PrimAITE incorporates the following features:
 Architecture
 ^^^^^^^^^^^^

-PrimAITE is a Python application and is therefore Operating System agnostic. The OpenAI gym and Ray RLLib frameworks are employed to provide an interface and source for AI agents. Configuration of PrimAITE is achieved via included YAML files which support full control over the platform / system laydown being modelled, background pattern of life, adversarial (red agent) behaviour, and step and episode count. NetworkX based nodes and links host Python classes to present attributes and methods, and hence a more representative platform / system can be modelled within the simulation.
+PrimAITE is a Python application and is therefore Operating System agnostic. The Gymnasium and Ray RLlib frameworks are employed to provide an interface and source for AI agents. Configuration of PrimAITE is achieved via included YAML files which support full control over the platform / system laydown being modelled, background pattern of life, adversarial (red agent) behaviour, and step and episode count. NetworkX based nodes and links host Python classes to present attributes and methods, and hence a more representative platform / system can be modelled within the simulation.



 Training & Evaluation Capability
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-PrimAITE provides a training and evaluation capability to AI agents in the context of cyber-attack, via its OpenAI Gym and RLLib compliant interface. Scenarios can be constructed to reflect platform / system laydowns consisting of any configuration of nodes (e.g. PCs, servers, switches etc.) and network links between them. All nodes can be configured to model services (and their status) and the traffic loading between them over the network links. Traffic loading is broken down into a per service granularity, relating directly to a protocol (e.g. Service A would be configured as a TCP service, and TCP traffic then flows between instances of Service A under the direction of a tailored IER). Highlights of PrimAITE’s training and evaluation capability are:
+PrimAITE provides a training and evaluation capability to AI agents in the context of cyber-attack, via its Gymnasium- and RLlib-compliant interface. Scenarios can be constructed to reflect platform / system laydowns consisting of any configuration of nodes (e.g. PCs, servers, switches etc.) and network links between them. All nodes can be configured to model services (and their status) and the traffic loading between them over the network links. Traffic loading is broken down into a per service granularity, relating directly to a protocol (e.g. Service A would be configured as a TCP service, and TCP traffic then flows between instances of Service A under the direction of a tailored IER). Highlights of PrimAITE’s training and evaluation capability are:

 - The scenario is not bound to a representation of any platform, system, or technology;
 - Fully configurable (network / system laydown, IERs, node pattern-of-life, ACL, number of episodes, steps per episode) and repeatable to suit the requirements of AI agents;
- Can integrate with any OpenAI Gym or RLLib compliant AI agent.
+- Can integrate with any Gymnasium- or RLlib-compliant AI agent.

 Use of PrimAITE default scenarios within ARCD is supported by a “Use Case Profile” tailored to the scenario.

@@ -75,7 +75,7 @@ Logs are available in CSV format and provide coverage of the above data for ever
 What is PrimAITE built with
 ---------------------------

-* `OpenAI's Gym <https://gym.openai.com/>`_ is used as the basis for AI blue agent interaction with the PrimAITE environment
+* `Gymnasium <https://gymnasium.farama.org/>`_ is used as the basis for AI blue agent interaction with the PrimAITE environment
 * `Networkx <https://github.com/networkx/networkx>`_ is used as the underlying data structure used for the PrimAITE environment
 * `Stable Baselines 3 <https://github.com/DLR-RM/stable-baselines3>`_ is used as a default source of RL algorithms (although PrimAITE is not limited to SB3 agents)
 * `Ray RLlib <https://github.com/ray-project/ray>`_ is used as an additional source of RL algorithms
@@ -107,7 +107,6 @@ Head over to the :ref:`getting-started` page to install and setup PrimAITE!
   source/primaite_session
   source/simulation
   source/game_layer
-   source/custom_agent
   source/config

 .. toctree::
--- a/docs/source/about.rst
+++ b/docs/source/about.rst
@@ -18,7 +18,6 @@ PrimAITE provides the following features:
 * Highly configurable network hosts, including definition of software, file system, and network interfaces,
 * Realistic network traffic simulation, including address and sending packets via internet protocols like TCP, UDP, ICMP, etc.
 * Routers with traffic routing and firewall capabilities
-* Interfaces with ARCD GATE to allow training of agents
 * Simulation of customisable deterministic agents
 * Support for multiple agents, each having their own customisable observation space, action space, and reward function definition.

@@ -148,7 +147,7 @@ The game layer is built on top of the simulator and it consumes the simulation a
  Observation Spaces
  ******************
  The observation space provides the blue agent with information about the current status of nodes and links.
-  PrimAITE builds on top of Gym Spaces to create an observation space that is easily configurable for users. It's made up of components which are managed by the :py:class:`primaite.environment.observations.ObservationsHandler`. Each training scenario can define its own observation space, and the user can choose which information to inlude, and how it should be formatted.
+  PrimAITE builds on top of Gymnasium Spaces to create an observation space that is easily configurable for users. It's made up of components which are managed by the :py:class:`primaite.environment.observations.ObservationsHandler`. Each training scenario can define its own observation space, and the user can choose which information to include, and how it should be formatted.
  NodeLinkTable component
  -----------------------
  For example, the :py:class:`primaite.environment.observations.NodeLinkTable` component represents the status of nodes and links as a ``gym.spaces.Box`` with an example format shown below:
@@ -279,7 +278,7 @@ The game layer is built on top of the simulator and it consumes the simulation a
  3. Any (Agent can take both node-based and ACL-based actions)
  The choice of action space used during a training session is determined in the config_[name].yaml file.
  **Node-Based**
-  The agent is able to influence the status of nodes by switching them off, resetting, or patching operating systems and services. In this instance, the action space is an OpenAI Gym spaces.Discrete type, as follows:
+  The agent is able to influence the status of nodes by switching them off, resetting, or patching operating systems and services. In this instance, the action space is a Gymnasium spaces.Discrete type, as follows:
  * Dictionary item {... ,1: [x1, x2, x3,x4] ...}
    The placeholders inside the list under the key '1' mean the following:
      * [0, num nodes] - Node ID (0 = nothing, node ID)
@@ -287,7 +286,7 @@ The game layer is built on top of the simulator and it consumes the simulation a
      * [0, 3] - Action on property (0 = nothing, 1 = on / scan, 2 = off / repair, 3 = reset / patch / restore)
      * [0, num services] - Resolves to service ID (0 = nothing, resolves to service)
  **Access Control List**
-  The blue agent is able to influence the configuration of the Access Control List rule set (which implements a system-wide firewall). In this instance, the action space is an OpenAI spaces.Discrete type, as follows:
+  The blue agent is able to influence the configuration of the Access Control List rule set (which implements a system-wide firewall). In this instance, the action space is a Gymnasium spaces.Discrete type, as follows:
    * Dictionary item {... ,1: [x1, x2, x3, x4, x5, x6] ...}
    The placeholders inside the list under the key '1' mean the following:
      * [0, 2] - Action (0 = do nothing, 1 = create rule, 2 = delete rule)
--- a/docs/source/custom_agent.rst
+++ b/docs/source/custom_agent.rst
@@ -1,14 +0,0 @@
-.. only:: comment
-
-    © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
-
-Custom Agents
-=============
-
-
-Integrating a user defined blue agent
-*************************************
-
-.. note::
-
-    PrimAITE uses ARCD GATE for agent integration. In order to use a custom agent with PrimAITE, you must integrate it with ARCD GATE. Please look at the ARCD GATE documentation for more information.
--- a/docs/source/game_layer.rst
+++ b/docs/source/game_layer.rst
@@ -4,9 +4,9 @@ PrimAITE Game layer
 The Primaite codebase consists of two main modules:

 * ``simulator``: The simulation logic including the network topology, the network state, and behaviour of various hardware and software classes.
-* ``game``: The agent-training infrastructure which helps reinforcement learning agents interface with the simulation. This includes the observation, action, and rewards, for RL agents, but also scripted deterministic agents. The game layer orchestrates all the interactions between modules, including ARCD GATE.
+* ``game``: The agent-training infrastructure which helps reinforcement learning agents interface with the simulation. This includes the observation, action, and rewards, for RL agents, but also scripted deterministic agents. The game layer orchestrates all the interactions between modules.

-These two components have been decoupled to allow the agent training code in ARCD GATE to be reused with other simulators. The simulator and game layer communicate using the PrimAITE State API and the PrimAITE Request API. The game layer communicates with ARCD gate using the `Farama Gymnasium Spaces API <https://gymnasium.farama.org/api/spaces/>`_.
+The simulator and game layer communicate using the PrimAITE State API and the PrimAITE Request API.

 ..
    TODO: write up these APIs and link them here.
@@ -20,13 +20,14 @@ The game layer is responsible for managing agents and getting them to interface
 PrimAITE Session
 ^^^^^^^^^^^^^^^

-``PrimaiteSession`` is the main entry point into Primaite and it allows the simultaneous coordination of a simulation and agents that interact with it. It also sends messages to ARCD GATE to perform reinforcement learning. ``PrimaiteSession`` keeps track of multiple agents of different types.
+``PrimaiteSession`` is the main entry point into Primaite and it allows the simultaneous coordination of a simulation and agents that interact with it. ``PrimaiteSession`` keeps track of multiple agents of different types.

 Agents
 ^^^^^^

 All agents inherit from the :py:class:`primaite.game.agent.interface.AbstractAgent` class, which mandates that they have an ObservationManager, ActionManager, and RewardManager. The agent behaviour depends on the type of agent, but there are two main types:
-* RL agents action during each step is decided by an RL algorithm which lives inside of ARCD GATE. The agent within PrimAITE just acts to format and forward actions decided by an RL policy.
+
+* RL agents: the action taken during each step is decided by an appropriate RL algorithm. The agent within PrimAITE only formats and forwards the actions decided by the RL policy.
 * Deterministic agents perform all of their decision making within the PrimAITE game layer. They typically have a scripted policy which always performs the same action or a rule-based policy which performs actions based on the current state of the simulation. They can have a stochastic element, and their seed will be settable.

 ..
--- a/docs/source/getting_started.rst
+++ b/docs/source/getting_started.rst
@@ -87,22 +87,7 @@ Install PrimAITE

    pip install path\to\your\primaite.whl

-
-5. Install ARCD GATE from wheel file
-
-
-.. code-block:: bash
-    :caption: Unix
-
-    pip install path/to/your/arcd_gate-0.1.0-py3-none-any.whl
-
-.. code-block:: powershell
-    :caption: Windows (Powershell)
-
-    pip install path\to\your\arcd_gate-0.1.0-py3-none-any.whl
-
-
-6. Perform the PrimAITE setup
+5. Perform the PrimAITE setup

 .. code-block:: bash
    :caption: Unix
@@ -153,17 +138,4 @@ of your choice:

    pip install -e .[dev]

-
-4. Install ARCD GATE from wheel file
-
-.. code-block:: bash
-    :caption: Unix
-
-    pip install GATE/arcd_gate-0.1.0-py3-none-any.whl
-
-.. code-block:: powershell
-    :caption: Windows (Powershell)
-
-    pip install GATE\arcd_gate-0.1.0-py3-none-any.whl
-
 To view the complete list of packages installed during PrimAITE installation, go to the dependencies page (:ref:`Dependencies`).
--- a/docs/source/glossary.rst
+++ b/docs/source/glossary.rst
@@ -74,8 +74,8 @@ Glossary
    Laydown
        The laydown is a file which defines the training scenario. It contains the network topology, firewall rules, services, protocols, and details about green and red agent behaviours.

-    Gym
-        PrimAITE uses the Gym reinforcement learning framework API to create a training environment and interface with RL agents. Gym defines a common way of creating observations, actions, and rewards.
+    Gymnasium
+        PrimAITE uses the Gymnasium reinforcement learning framework API to create a training environment and interface with RL agents. Gymnasium defines a common way of creating observations, actions, and rewards.

    User app home
-        PrimAITE supports upgrading software version while retaining user data. The user data directory is where configs, notebooks, and results are stored, this location is `~/primaite<version>` on linux/darwin and `C:\Users\<username>\primaite\<version>` on Windows.
+        PrimAITE supports upgrading software version while retaining user data. The user data directory is where configs, notebooks, and results are stored, this location is `~/primaite<version>` on linux/darwin and `C:\\Users\\<username>\\primaite\\<version>` on Windows.
--- a/docs/source/request_system.rst
+++ b/docs/source/request_system.rst
@@ -5,12 +5,12 @@
 Request System
 ==============

-``SimComponent`` in the simulation are decoupled from the agent training logic. However, they still need a managed means of accepting requests to perform actions. For this, they use ``RequestManager`` and ``RequestType``.
+``SimComponent`` objects in the simulation are decoupled from the agent training logic. However, they still need a managed means of accepting requests to perform actions. For this, they use ``RequestManager`` and ``RequestType``.

-Just like other aspects of SimComponent, the request typess are not managed centrally for the whole simulation, but instead they are dynamically created and updated based on the nodes, links, and other components that currently exist. This was achieved in the following way:
+Just like other aspects of SimComponent, the request types are not managed centrally for the whole simulation, but instead they are dynamically created and updated based on the nodes, links, and other components that currently exist. This was achieved in the following way:

 - API
-    An ``RequestType`` contains two elements:
+    A ``RequestType`` contains two elements:

    1. ``request`` - selects which action you want to take on this ``SimComponent``. This is formatted as a list of strings such as `['network', 'node', '<node-uuid>', 'service', '<service-uuid>', 'restart']`.
    2. ``context`` - optional extra information that can be used to decide how to process the request. This is formatted as a dictionary. For example, if the request requires authentication, the context can include information about the user that initiated the request to decide if their permissions are sufficient.
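
As an illustration of the ``request`` / ``context`` pair described in this file, the following is a minimal sketch in plain Python, not PrimAITE's actual implementation. ``SketchRequestManager``, ``build_chain``, ``restart_service``, and the ids ``node-1`` and ``svc-1`` are hypothetical names chosen for the example. The assumption is that each string in the request selects a handler registered on the current component, so handlers can be added or removed as components come and go.

```python
from typing import Any, Callable, Dict, List, Optional

# A handler receives the remaining path segments and the context dictionary.
Handler = Callable[[List[str], Dict[str, Any]], Any]


class SketchRequestManager:
    """Hypothetical stand-in for a per-component request manager."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def add(self, name: str, handler: Handler) -> None:
        """Register a handler (or a nested manager) under a path segment."""
        self._handlers[name] = handler

    def remove(self, name: str) -> None:
        """Unregister a segment, e.g. when a component is deleted."""
        self._handlers.pop(name, None)

    def __call__(self, request: List[str], context: Optional[Dict[str, Any]] = None) -> Any:
        if not request:
            raise ValueError("request path is empty")
        head, *tail = request
        handler = self._handlers.get(head)
        if handler is None:
            raise KeyError(f"no handler registered for {head!r}")
        # Pass the rest of the path down to the selected handler.
        return handler(tail, context or {})


def restart_service(remaining: List[str], context: Dict[str, Any]) -> str:
    """Leaf handler: the context could be used for permission checks."""
    return f"restarted by {context.get('user', 'unknown')}"


def build_chain(path: List[str], leaf: Handler) -> SketchRequestManager:
    """Nest managers so that each segment of `path` leads to the next one."""
    root = SketchRequestManager()
    current = root
    for segment in path[:-1]:
        child = SketchRequestManager()
        current.add(segment, child)
        current = child
    current.add(path[-1], leaf)
    return root


path = ["network", "node", "node-1", "service", "svc-1", "restart"]
root = build_chain(path, restart_service)
result = root(path, {"user": "blue-agent"})
print(result)
```

Because a nested manager is itself callable with the remaining path and the context, it can be registered in the same way as a leaf handler, which keeps the dispatch logic in one place.
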
--- a/docs/source/simulation.rst
+++ b/docs/source/simulation.rst
@@ -23,8 +23,3 @@ Contents
   simulation_components/network/network
   simulation_components/system/internal_frame_processing
   simulation_components/system/software
-   simulation_components/system/data_manipulation_bot
-   simulation_components/system/database_client_server
-   simulation_components/system/dns_client_server
-   simulation_components/system/ftp_client_server
-   simulation_components/system/web_browser_and_web_server_service
--- a/docs/source/simulation_components/system/data_manipulation_bot.rst
+++ b/docs/source/simulation_components/system/data_manipulation_bot.rst
@@ -16,15 +16,17 @@ The bot is intended to simulate a malicious actor carrying out attacks like:
 - Dropping tables
 - Deleting records
 - Modifying data
-On a database server by abusing an application's trusted database connectivity.
+on a database server by abusing an application's trusted database connectivity.

 Usage
 -----

 - Create an instance and call ``configure`` to set:
+
  - Target database server IP
  - Database password (if needed)
  - SQL statement payload
+
 - Call ``run`` to connect and execute the statement.

 The bot handles connecting, executing the statement, and disconnecting.
--- a/docs/source/simulation_components/system/database_client_server.rst
+++ b/docs/source/simulation_components/system/database_client_server.rst
@@ -45,17 +45,14 @@ Key features
 ^^^^^^^^^^^^

 - Connects to the ``DatabaseService`` via the ``SoftwareManager``.
+- Handles connecting and disconnecting.
 - Executes SQL queries and retrieves result sets.
- Handles connecting, querying, and disconnecting.
- Provides a simple ``query`` method for running SQL.
-

 Usage
 ^^^^^

 - Initialise with server IP address and optional password.
 - Connect to the ``DatabaseService`` with ``connect``.
- Execute SQL queries via ``query``.
 - Retrieve results in a dictionary.
 - Disconnect when finished.

@@ -70,6 +67,5 @@ Implementation

 - Leverages ``SoftwareManager`` for sending payloads over the network.
 - Connect and disconnect methods manage sessions.
- Provides easy interface for applications to query database.
 - Payloads serialised as dictionaries for transmission.
 - Extends base Application class.
--- a/docs/source/state_system.rst
+++ b/docs/source/state_system.rst
@@ -5,9 +5,9 @@
 Simulation State
 ==============

-``SimComponent`` in the simulation have a method called ``describe_state`` which returns a dictionary of the state of the component. This is used to report pertinent data that could impact agent's actions or rewards. For instance, the name and health status of a node is reported, which can be used by a reward function to punish corrupted or compromised nodes and reward healthy nodes. Each ``SimComponent`` reports not only it's own attributes in the state but also that of its child components. I.e. a computer node will report the state of its ``FileSystem``, and the ``FileSystem`` will report the state of its files and folders. This happens by recursively calling childrens' own ``describe_state`` methods.
+``SimComponent`` objects in the simulation have a method called ``describe_state`` which returns a dictionary of the state of the component. This is used to report pertinent data that could impact an agent's actions or rewards. For instance, the name and health status of a node are reported, which can be used by a reward function to punish corrupted or compromised nodes and reward healthy nodes. Each ``SimComponent`` object reports not only its own attributes in the state but also those of its child components. For example, a computer node will report the state of its ``FileSystem``, and the ``FileSystem`` will report the state of its files and folders. This happens by recursively calling the children's own ``describe_state`` methods.

-The game layer calls ``describe_state`` on the trunk ``SimComponent`` (the top-level parent) and then pass the state to the agents once per simulation step. For this reason, all ``SimComponent`` must have a ``describe_state`` method, and they must all be linked to the trunk ``SimComponent``.
+The game layer calls ``describe_state`` on the trunk ``SimComponent`` (the top-level parent) and then passes the state to the agents once per simulation step. For this reason, all ``SimComponent`` objects must have a ``describe_state`` method, and they must all be linked to the trunk ``SimComponent``.

 This code snippet demonstrates how the state information is defined within the ``SimComponent`` class:

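The project's own snippet lies outside this hunk. Separately from it, the recursive reporting described in state_system.rst can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the ``SimComponent`` implementation: ``SketchComponent``, its ``children`` list, and the ``"children"`` key in the returned dictionary are hypothetical.

```python
from typing import Any, Dict, List


class SketchComponent:
    """Hypothetical stand-in for a simulation component."""

    def __init__(self, name: str, **attributes: Any) -> None:
        self.name = name
        self.attributes = attributes
        self.children: List["SketchComponent"] = []

    def add_child(self, child: "SketchComponent") -> "SketchComponent":
        """Link a child so that it is reachable from the trunk component."""
        self.children.append(child)
        return child

    def describe_state(self) -> Dict[str, Any]:
        """Report this component's attributes, then recurse into its children."""
        state: Dict[str, Any] = {"name": self.name, **self.attributes}
        if self.children:
            state["children"] = {
                child.name: child.describe_state() for child in self.children
            }
        return state


# A computer node reports its file system, which in turn reports its files.
computer = SketchComponent("pc-1", health="good")
file_system = computer.add_child(SketchComponent("file_system"))
file_system.add_child(SketchComponent("report.txt", corrupted=False))

state = computer.describe_state()
print(state)
```

Calling ``describe_state`` once on the top-level object is enough to collect the whole tree, which is why every component has to be linked to the trunk.
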
--- a/src/primaite/VERSION
+++ b/src/primaite/VERSION
@@ -1 +1 @@
-3.0.0a1
+3.0.0b2dev
--- a/src/primaite/exceptions.py
+++ b/src/primaite/exceptions.py
@@ -1,6 +1,6 @@
 # © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
 class PrimaiteError(Exception):
-    """The root PrimAITe Error."""
+    """The root PrimAITE Error."""

    pass
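
A root exception such as ``PrimaiteError`` is typically useful because callers can catch every package-specific error with one ``except`` clause. The sketch below shows that pattern; ``SketchConfigError`` and ``load_config`` are hypothetical names for illustration and are not part of PrimAITE.

```python
class PrimaiteError(Exception):
    """The root PrimAITE Error."""


class SketchConfigError(PrimaiteError):
    """Hypothetical subclass used only for this example."""


def load_config(path: str) -> None:
    """Hypothetical loader that always fails, to show the error path."""
    raise SketchConfigError(f"could not load config: {path}")


try:
    load_config("example.yaml")
except PrimaiteError as error:
    # Catching the root class also catches every subclass.
    message = f"{type(error).__name__}: {error}"

print(message)
```
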