#2257: update sphinx version + cleaning up some errors + splitting configuration page into multiple pages

Czar Echavez
2024-02-16 16:14:36 +00:00
parent e390d8385c
commit 2e2d83c3e9
15 changed files with 220 additions and 103 deletions

@@ -1,3 +1,5 @@
:orphan:
.. only:: comment
© Crown-owned copyright 2023, Defence Science and Technology Laboratory UK

@@ -15,7 +15,6 @@ import furo # noqa
sys.path.insert(0, os.path.abspath("../"))
# -- Project information -----------------------------------------------------
year = datetime.datetime.now().year
project = "PrimAITE"
@@ -45,13 +44,17 @@ extensions = [
"sphinx_copybutton", # Adds a copy button to code blocks
]
templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
exclude_patterns = [
"_build",
"Thumbs.db",
".DS_Store",
]
# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
html_theme = "furo"
html_static_path = ["_static"]
html_theme_options = {"globaltoc_collapse": True, "globaltoc_maxdepth": 2}
html_copy_source = False

@@ -1,3 +1,7 @@
.. only:: comment
© Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
Primaite v3 config
******************
@@ -5,98 +9,13 @@ PrimAITE uses a single configuration file to define everything needed to train a
The entire config is used by the ``PrimaiteSession`` object for users who wish to let PrimAITE handle the agent definition and training. If you wish to define custom agents and control the training loop yourself, you can use the config with the ``PrimaiteGame``, and ``PrimaiteGymEnv`` objects instead. That way, only the network configuration and agent setup parts of the config are used, and the training section is ignored.
Configurable items
==================
##################
``training_config``
-------------------
This section allows selecting which training framework and algorithm to use, and set some training hyperparameters.
.. toctree::
:maxdepth: 1
``io_settings``
---------------
This section configures how PrimAITE saves data during simulation and training.
**save_final_model**: Only used if training with PrimaiteSession, if true, the policy will be saved after the final training iteration.
**save_checkpoints**: Only used if training with PrimaiteSession, if true, the policy will be saved periodically during training.
**checkpoint_interval**: Only used if training with PrimaiteSession and if ``save_checkpoints`` is true. Defines how often to save the policy during training.
**save_logs**: *currently unused*.
**save_transactions**: *currently unused*.
**save_tensorboard_logs**: *currently unused*.
**save_step_metadata**: Whether to save the RL agents' action, environment state, and other data at every single step.
**save_pcap_logs**: Whether to save pcap files of all network traffic during the simulation.
**save_sys_logs**: Whether to save system logs from all nodes during the simulation.
``game``
--------
This section defines high-level settings that apply across the game, currently it's used to help shape the action and observation spaces by restricting which ports and internet protocols should be considered. Here, users can also set the maximum number of steps in an episode.
``agents``
----------
Agents can be scripted (deterministic and stochastic), or controlled by a reinforcement learning algorithm. Not to be confused with an RL agent, the term agent here is used to refer to an entity that sends requests to the simulated network. In this part of the config, each agent's action space, observation space, and reward function can be defined. All three are defined in a modular way.
**type**: Specifies which class should be used for the agent. ``ProxyAgent`` is used for agents that receive instructions from an RL algorithm. Scripted agents like ``RedDatabaseCorruptingAgent`` and ``GreenWebBrowsingAgent`` generate their own behaviour.
**team**: Specifies if the agent is malicious (RED), benign (GREEN), or defensive (BLUE). Currently this value is not used for anything.
**observation space:**
* ``type``: selects which python class from the ``primaite.game.agent.observation`` module is used for the overall observation structure.
* ``options``: allows configuring the chosen observation type. The ``UC2BlueObservation`` should be used for RL Agents.
* ``num_services_per_node``, ``num_folders_per_node``, ``num_files_per_folder``, ``num_nics_per_node`` all define the shape of the observation space. The size and shape of the obs space must remain constant, but the number of files, folders, ACL rules, and other components can change within an episode. Therefore padding is performed and these options set the size of the obs space.
* ``nodes``: list of nodes that will be present in this agent's observation space. The ``node_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config. Each node can also be configured with services, and files that should be monitored.
* ``links``: list of links that will be present in this agent's observation space. The ``link_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config.
* ``acl``: configure how the agent reads the access control list on the router in the simulation. ``router_node_ref`` is for selecting which router's ACL table should be used. ``ip_address_order`` sets the encoding of ip addresses as integers within the observation space.
**action space:**
The action space is configured to be made up of individual action types. Once configured, the agent can select an action type and some optional action parameters at every step. For example: The ``NODE_SERVICE_SCAN`` action takes the parameters ``node_id`` and ``service_id``.
Description of configurable items:
* ``action_list``: a list of action modules. The options are listed in the ``primaite.game.agent.actions`` module.
* ``action_map``: (optional). Restricts the possible combinations of action type / action parameter values to reduce the overall size of the action space. By default, every possible combination of actions and parameters will be assigned an integer for the agent's ``MultiDiscrete`` action space. Instead, the ``action_map`` allows you to list the actions corresponding to each integer in the ``MultiDiscrete`` space.
* ``options``: Options that apply to all action components.
* ``nodes``: list the nodes that the agent can act on, the order of this list defines the mapping between nodes and ``node_id`` integers.
* ``max_folders_per_node``, ``max_files_per_folder``, ``max_services_per_node``, ``max_nics_per_node``, ``max_acl_rules`` all are used to define the size of the action space.
**reward function:**
Similar to action space, this is defined as a list of components.
Description of configurable items:
* ``reward_components`` a list of reward components from the ``primaite.game.agent.reward`` module.
* ``weight``: relative importance of this reward component. The total reward for a step is a weighted sum of all reward components.
* ``options``: list of options passed to the reward component during initialisation, the exact options required depend on the reward component.
**agent_settings**:
Settings passed to the agent during initialisation. These depend on the agent class.
Reinforcement learning agents use the ``ProxyAgent`` class, they accept these agent settings:
**flatten_obs**: If true, gymnasium flattening will be performed on the observation space before sending to the agent. Set this to true if your agent does not support nested observation spaces.
``simulation``
--------------
In this section the network layout is defined. This part of the config follows a hierarchical structure. Almost every component defines a ``ref`` field which acts as a human-readable unique identifier, used by other parts of the config, such as agents.
At the top level of the network are ``nodes`` and ``links``.
**nodes:**
* ``type``: one of ``router``, ``switch``, ``computer``, or ``server``, this affects what other sub-options should be defined.
* ``hostname`` - a non-unique name used for logging and outputs.
* ``num_ports`` (optional, routers and switches only): number of network interfaces present on the device.
* ``ports`` (optional, routers and switches only): configuration for each network interface, including IP address and subnet mask.
* ``acl`` (Router only): Define the ACL rules at each index of the ACL on the router. the possible options are: ``action`` (PERMIT or DENY), ``src_port``, ``dst_port``, ``protocol``, ``src_ip``, ``dst_ip``. Any options left blank default to none which usually means that it will apply across all options. For example leaving ``src_ip`` blank will apply the rule to all IP addresses.
* ``services`` (computers and servers only): a list of services to install on the node. They must define a ``ref``, ``type``, and ``options`` that depend on which ``type`` was selected.
* ``applications`` (computers and servers only): Similar to services. A list of applications to install on the node.
* ``network_interfaces`` (computers and servers only): If the node has multiple networking devices, the second, third, fourth, etc... must be defined here with an ``ip_address`` and ``subnet_mask``.
**links:**
* ``ref``: unique identifier for this link
* ``endpoint_a_ref``: Reference to the node at the first end of the link
* ``endpoint_a_port``: The ethernet port or switch port index on the first node
* ``endpoint_b_ref``: Reference to the node at the second end of the link
* ``endpoint_b_port``: The ethernet port or switch port index on the second node
configuration/training_config.rst
configuration/io_settings.rst
configuration/game.rst
configuration/agents.rst
configuration/simulation.rst

@@ -0,0 +1,45 @@
.. only:: comment
© Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
``agents``
==========
Agents can be scripted (deterministic and stochastic), or controlled by a reinforcement learning algorithm. Not to be confused with an RL agent, the term agent here is used to refer to an entity that sends requests to the simulated network. In this part of the config, each agent's action space, observation space, and reward function can be defined. All three are defined in a modular way.
**type**: Specifies which class should be used for the agent. ``ProxyAgent`` is used for agents that receive instructions from an RL algorithm. Scripted agents like ``RedDatabaseCorruptingAgent`` and ``GreenWebBrowsingAgent`` generate their own behaviour.
**team**: Specifies if the agent is malicious (RED), benign (GREEN), or defensive (BLUE). Currently this value is not used for anything.
**observation space:**
* ``type``: selects which Python class from the ``primaite.game.agent.observation`` module is used for the overall observation structure.
* ``options``: allows configuring the chosen observation type. The ``UC2BlueObservation`` should be used for RL Agents.
* ``num_services_per_node``, ``num_folders_per_node``, ``num_files_per_folder``, ``num_nics_per_node`` all define the shape of the observation space. The size and shape of the observation space must remain constant, but the number of files, folders, ACL rules, and other components can change within an episode. The observation is therefore padded, and these options set its fixed size.
* ``nodes``: list of nodes that will be present in this agent's observation space. The ``node_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config. Each node can also be configured with services and files that should be monitored.
* ``links``: list of links that will be present in this agent's observation space. The ``link_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config.
* ``acl``: configures how the agent reads the access control list on the router in the simulation. ``router_node_ref`` selects which router's ACL table should be used. ``ip_address_order`` sets the encoding of IP addresses as integers within the observation space.
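The fixed-size requirement described above can be illustrated with a small sketch. This is not PrimAITE's implementation: the helper ``pad_to_fixed_size``, the fill value ``0``, and the choice to truncate surplus entries are all assumptions made for illustration.

```python
from typing import List


def pad_to_fixed_size(values: List[int], size: int, fill: int = 0) -> List[int]:
    """Return exactly `size` entries: drop extras, pad missing slots with `fill`."""
    trimmed = values[:size]
    return trimmed + [fill] * (size - len(trimmed))


# Hypothetical per-file status codes for one folder, observed at two steps.
num_files_per_folder = 3
step_1 = pad_to_fixed_size([1, 2], num_files_per_folder)        # fewer files than slots
step_2 = pad_to_fixed_size([1, 2, 1, 3], num_files_per_folder)  # more files than slots
```

Both results have the same length, so the shape seen by the agent does not change when files are created or deleted during an episode.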
**action space:**
The action space is configured to be made up of individual action types. Once configured, the agent can select an action type and some optional action parameters at every step. For example: The ``NODE_SERVICE_SCAN`` action takes the parameters ``node_id`` and ``service_id``.
Description of configurable items:
* ``action_list``: a list of action modules. The options are listed in the ``primaite.game.agent.actions`` module.
* ``action_map``: (optional). Restricts the possible combinations of action type / action parameter values to reduce the overall size of the action space. By default, every possible combination of actions and parameters will be assigned an integer for the agent's ``MultiDiscrete`` action space. Instead, the ``action_map`` allows you to list the actions corresponding to each integer in the ``MultiDiscrete`` space.
* ``options``: Options that apply to all action components.
* ``nodes``: lists the nodes that the agent can act on. The order of this list defines the mapping between nodes and ``node_id`` integers.
* ``max_folders_per_node``, ``max_files_per_folder``, ``max_services_per_node``, ``max_nics_per_node``, ``max_acl_rules`` all are used to define the size of the action space.
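As a rough illustration of the two mappings described above (list order to ``node_id``, and integer to action), consider the following sketch. The node names, the ``DONOTHING`` entry, and the ``resolve_action`` helper are hypothetical, and the data layout is an assumption rather than PrimAITE's internal representation.

```python
from typing import Any, Dict, List, Tuple

# Order of the list defines the node_id of each node (assumed node names).
nodes: List[str] = ["client_1", "server_1"]
node_ids: Dict[str, int] = {name: index for index, name in enumerate(nodes)}

# Hypothetical action map: each integer the agent can emit corresponds to
# one action type together with the parameters for that action.
action_map: Dict[int, Tuple[str, Dict[str, Any]]] = {
    0: ("DONOTHING", {}),
    1: ("NODE_SERVICE_SCAN", {"node_id": node_ids["client_1"], "service_id": 0}),
    2: ("NODE_SERVICE_SCAN", {"node_id": node_ids["server_1"], "service_id": 0}),
}


def resolve_action(choice: int) -> Tuple[str, Dict[str, Any]]:
    """Translate the agent's integer choice into an action type and parameters."""
    return action_map[choice]
```

Listing only the combinations of interest keeps the action space at three choices here, instead of one entry for every possible action and parameter combination.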
**reward function:**
Similar to action space, this is defined as a list of components.
Description of configurable items:
* ``reward_components``: a list of reward components from the ``primaite.game.agent.reward`` module.
* ``weight``: relative importance of this reward component. The total reward for a step is a weighted sum of all reward components.
* ``options``: list of options passed to the reward component during initialisation. The exact options required depend on the reward component.
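The weighted sum can be sketched as follows. The two component functions and the state keys are invented for illustration and do not correspond to classes in ``primaite.game.agent.reward``.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, bool]
RewardComponent = Callable[[State], float]


def database_intact(state: State) -> float:
    """Hypothetical component: +1 if the database is healthy, -1 otherwise."""
    return 1.0 if state["database_ok"] else -1.0


def web_server_reachable(state: State) -> float:
    """Hypothetical component: +1 if the web server responds, -1 otherwise."""
    return 1.0 if state["web_ok"] else -1.0


def total_reward(components: List[Tuple[RewardComponent, float]], state: State) -> float:
    """Sum each component's value multiplied by its configured weight."""
    return sum(weight * component(state) for component, weight in components)


components = [(database_intact, 0.5), (web_server_reachable, 0.25)]
reward = total_reward(components, {"database_ok": True, "web_ok": False})
```

With these weights, a healthy database contributes ``0.5`` and an unreachable web server contributes ``-0.25``.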
**agent_settings**:
Settings passed to the agent during initialisation. These depend on the agent class.
Reinforcement learning agents use the ``ProxyAgent`` class, which accepts these agent settings:
**flatten_obs**: If true, gymnasium flattening will be performed on the observation space before sending to the agent. Set this to true if your agent does not support nested observation spaces.

@@ -0,0 +1,8 @@
.. only:: comment
© Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
``game``
========
This section defines high-level settings that apply across the game. Currently, it helps shape the action and observation spaces by restricting which ports and internet protocols are considered. Users can also set the maximum number of steps in an episode here.

@@ -0,0 +1,26 @@
.. only:: comment
© Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
``io_settings``
===============
This section configures how PrimAITE saves data during simulation and training.
**save_final_model**: Only used if training with ``PrimaiteSession``. If true, the policy is saved after the final training iteration.
**save_checkpoints**: Only used if training with ``PrimaiteSession``. If true, the policy is saved periodically during training.
**checkpoint_interval**: Only used if training with ``PrimaiteSession`` and ``save_checkpoints`` is true. Defines how often the policy is saved during training.
**save_logs**: *currently unused*.
**save_transactions**: *currently unused*.
**save_tensorboard_logs**: *currently unused*.
**save_step_metadata**: Whether to save the RL agents' action, environment state, and other data at every single step.
**save_pcap_logs**: Whether to save pcap files of all network traffic during the simulation.
**save_sys_logs**: Whether to save system logs from all nodes during the simulation.
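The interaction between ``save_checkpoints`` and ``checkpoint_interval`` can be sketched as below. This assumes the interval is counted in training iterations starting from 1; the source does not state the unit, and ``should_save_checkpoint`` is a hypothetical helper, not a PrimAITE function.

```python
def should_save_checkpoint(iteration: int, save_checkpoints: bool, checkpoint_interval: int) -> bool:
    """Decide whether to save the policy after the given 1-based iteration."""
    if not save_checkpoints or checkpoint_interval <= 0:
        return False
    return iteration % checkpoint_interval == 0


# Iterations (out of 10) at which a checkpoint would be written with an interval of 5.
saved_at = [i for i in range(1, 11) if should_save_checkpoint(i, True, 5)]
```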

@@ -0,0 +1,27 @@
.. only:: comment
© Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
``simulation``
==============
In this section the network layout is defined. This part of the config follows a hierarchical structure. Almost every component defines a ``ref`` field which acts as a human-readable unique identifier, used by other parts of the config, such as agents.
At the top level of the network are ``nodes`` and ``links``.
**nodes:**
* ``type``: one of ``router``, ``switch``, ``computer``, or ``server``. This affects which other sub-options should be defined.
* ``hostname``: a non-unique name used for logging and outputs.
* ``num_ports`` (optional, routers and switches only): number of network interfaces present on the device.
* ``ports`` (optional, routers and switches only): configuration for each network interface, including IP address and subnet mask.
* ``acl`` (routers only): defines the ACL rules at each index of the ACL on the router. The possible options are: ``action`` (PERMIT or DENY), ``src_port``, ``dst_port``, ``protocol``, ``src_ip``, ``dst_ip``. Any option left blank defaults to none, which usually means the rule applies to all values of that option. For example, leaving ``src_ip`` blank applies the rule to all IP addresses.
* ``services`` (computers and servers only): a list of services to install on the node. Each must define a ``ref``, a ``type``, and ``options`` that depend on which ``type`` was selected.
* ``applications`` (computers and servers only): similar to services. A list of applications to install on the node.
* ``network_interfaces`` (computers and servers only): if the node has multiple networking devices, the second, third, fourth, and any further devices must be defined here with an ``ip_address`` and ``subnet_mask``.
**links:**
* ``ref``: unique identifier for this link
* ``endpoint_a_ref``: Reference to the node at the first end of the link
* ``endpoint_a_port``: The ethernet port or switch port index on the first node
* ``endpoint_b_ref``: Reference to the node at the second end of the link
* ``endpoint_b_port``: The ethernet port or switch port index on the second node
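Because links refer to nodes through their ``ref`` values, a config can be checked for dangling references before it is loaded. The sketch below mirrors the fields listed above in a plain Python dictionary; the example values and the ``find_dangling_links`` helper are hypothetical and not part of PrimAITE.

```python
from typing import Any, Dict, List

# Assumed minimal network layout using the field names described above.
network: Dict[str, List[Dict[str, Any]]] = {
    "nodes": [
        {"ref": "router_1", "type": "router", "hostname": "router_1", "num_ports": 2},
        {"ref": "client_1", "type": "computer", "hostname": "client_1"},
    ],
    "links": [
        {
            "ref": "router_1___client_1",
            "endpoint_a_ref": "router_1",
            "endpoint_a_port": 1,
            "endpoint_b_ref": "client_1",
            "endpoint_b_port": 1,
        },
    ],
}


def find_dangling_links(net: Dict[str, List[Dict[str, Any]]]) -> List[str]:
    """Return the refs of links whose endpoints name a node that is not defined."""
    known = {node["ref"] for node in net["nodes"]}
    return [
        link["ref"]
        for link in net["links"]
        if link["endpoint_a_ref"] not in known or link["endpoint_b_ref"] not in known
    ]


dangling = find_dangling_links(network)
```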

@@ -0,0 +1,25 @@
.. only:: comment
© Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
``training_config``
===================
``rl_framework``
----------------
The RL (Reinforcement Learning) framework to use in the training session.
Options available are:
- ``SB3`` (Stable Baselines 3)
- ``RLLIB_single_agent`` (Single Agent Ray RLLib)
- ``RLLIB_multi_agent`` (Multi Agent Ray RLLib)
``rl_algorithm``
----------------
The reinforcement learning algorithm to use in the training session.
Options available are:
- ``PPO`` (Proximal Policy Optimisation)
- ``A2C`` (Advantage Actor Critic)
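A config loader could reject unsupported values early. The following is a minimal sketch using the option names listed above; ``validate_training_config`` is a hypothetical helper and does not describe how PrimAITE itself validates the file.

```python
from typing import Dict, List

VALID_FRAMEWORKS = {"SB3", "RLLIB_single_agent", "RLLIB_multi_agent"}
VALID_ALGORITHMS = {"PPO", "A2C"}


def validate_training_config(training_config: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the values are accepted."""
    problems = []
    framework = training_config.get("rl_framework")
    algorithm = training_config.get("rl_algorithm")
    if framework not in VALID_FRAMEWORKS:
        problems.append(f"unsupported rl_framework: {framework!r}")
    if algorithm not in VALID_ALGORITHMS:
        problems.append(f"unsupported rl_algorithm: {algorithm!r}")
    return problems


problems = validate_training_config({"rl_framework": "SB3", "rl_algorithm": "PPO"})
```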

@@ -5,6 +5,8 @@
.. role:: raw-html(raw)
:format: html
.. _Dependencies:
Dependencies
============

@@ -36,7 +36,7 @@ Technical Detail
This system was achieved by implementing two classes, :py:class:`primaite.simulator.core.RequestType`, and :py:class:`primaite.simulator.core.RequestManager`.
``RequestType``
------
---------------
The ``RequestType`` object stores a reference to a method that executes the request. For example, a node could have a request type that stores a reference to ``self.turn_on()``. Technically, this can be any callable that accepts ``request, context`` as its parameters. In practice, this is often defined using ``lambda`` functions within a component's ``self._init_request_manager()`` method. Optionally, the ``RequestType`` object can also hold a validator that will permit or deny the request depending on context.
@@ -60,7 +60,7 @@ A simple example without chaining can be seen in the :py:class:`primaite.simulat
*Ellipses (``...``) are used to omit code that is not relevant to this explanation.*
Chaining RequestManagers
-----------------------
------------------------
A request function needs to be a callable that accepts ``request, context`` as parameters. Since the request manager resolves requests by invoking the function with ``request, context`` as parameters, it is possible to use a ``RequestManager`` as a ``RequestType``.
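The chaining idea can be shown with a self-contained sketch. ``MiniRequestManager`` is a hypothetical stand-in written for this example; it is not the ``primaite.simulator.core.RequestManager`` class and omits validators entirely. It assumes a request is a list of strings whose first element selects the handler.

```python
from typing import Callable, Dict, List

RequestFn = Callable[[List[str], dict], str]


class MiniRequestManager:
    """Dispatch on the first element of the request and pass the rest on."""

    def __init__(self) -> None:
        self._handlers: Dict[str, RequestFn] = {}

    def add(self, name: str, handler: RequestFn) -> None:
        self._handlers[name] = handler

    def __call__(self, request: List[str], context: dict) -> str:
        # The manager itself accepts (request, context), so it can be
        # registered as a handler inside another manager.
        head, *rest = request
        return self._handlers[head](rest, context)


node_manager = MiniRequestManager()
node_manager.add("turn_on", lambda request, context: "node on")

network_manager = MiniRequestManager()
network_manager.add("pc_a", node_manager)  # a manager used as a request handler

result = network_manager(["pc_a", "turn_on"], {})
```

The outer manager consumes ``"pc_a"`` and hands ``["turn_on"]`` to the inner manager, which then calls the final handler.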

@@ -22,9 +22,9 @@ Contents
simulation_components/network/nodes/host_node
simulation_components/network/nodes/network_node
simulation_components/network/nodes/router
simulation_components/network/nodes/switch
simulation_components/network/nodes/wireless_router
simulation_components/network/nodes/firewall
simulation_components/network/switch
simulation_components/network/network
simulation_components/system/internal_frame_processing
simulation_components/system/sys_log

@@ -41,6 +41,65 @@ Node Attributes
- **session_manager**: Manages user sessions within the node.
- **software_manager**: Controls the installation and management of software and services on the node.
.. _Node Start up and Shut down:
Node Start up and Shut down
---------------------------
Nodes are powered on and off over multiple timesteps. By default, the node ``start_up_duration`` and ``shut_down_duration`` are each 3 timesteps.
Example code where a node is turned on:
.. code-block:: python

    from primaite.simulator.network.hardware.base import Node
    from primaite.simulator.network.hardware.node_operating_state import NodeOperatingState

    node = Node(hostname="pc_a")

    assert node.operating_state is NodeOperatingState.OFF  # By default, node is instantiated in an OFF state

    node.power_on()  # power on the node

    assert node.operating_state is NodeOperatingState.BOOTING  # node is booting up

    for i in range(node.start_up_duration + 1):
        # apply timesteps until the node start up duration has elapsed
        node.apply_timestep(timestep=i)

    assert node.operating_state is NodeOperatingState.ON  # node is in ON state

If the node needs to be instantiated in an on state:
.. code-block:: python

    from primaite.simulator.network.hardware.base import Node
    from primaite.simulator.network.hardware.node_operating_state import NodeOperatingState

    node = Node(hostname="pc_a", operating_state=NodeOperatingState.ON)

    assert node.operating_state is NodeOperatingState.ON  # node is in ON state

Setting ``start_up_duration`` and/or ``shut_down_duration`` to ``0`` will allow for the ``power_on`` and ``power_off`` methods to be completed instantly without applying timesteps:
.. code-block:: python

    from primaite.simulator.network.hardware.base import Node
    from primaite.simulator.network.hardware.node_operating_state import NodeOperatingState

    node = Node(hostname="pc_a", start_up_duration=0, shut_down_duration=0)

    assert node.operating_state is NodeOperatingState.OFF  # node is in OFF state

    node.power_on()

    assert node.operating_state is NodeOperatingState.ON  # node is in ON state

    node.power_off()

    assert node.operating_state is NodeOperatingState.OFF  # node is in OFF state

Node Behaviours/Functions
-------------------------

@@ -50,4 +50,5 @@ Services, Processes and Applications:
data_manipulation_bot
dns_client_server
ftp_client_server
ntp_client_server
web_browser_and_web_server_service

@@ -3,7 +3,7 @@
© Crown-owned copyright 2023, Defence Science and Technology Laboratory UK
Simulation State
==============
================
``SimComponent`` objects in the simulation have a method called ``describe_state`` which returns a dictionary of the state of the component. This is used to report pertinent data that could impact an agent's actions or rewards. For instance, the name and health status of a node is reported, which can be used by a reward function to punish corrupted or compromised nodes and reward healthy nodes. Each ``SimComponent`` object reports not only its own attributes in the state but also those of its child components. For example, a computer node will report the state of its ``FileSystem``, and the ``FileSystem`` will report the state of its files and folders. This happens by recursively calling the children's own ``describe_state`` methods.
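The recursion can be sketched with a stand-in class. ``Component`` below is hypothetical and far simpler than ``SimComponent``; the ``name`` and ``children`` keys are assumptions chosen for the example, not the keys PrimAITE actually reports.

```python
from typing import Any, Dict, List, Optional


class Component:
    """Minimal stand-in for a simulation component with child components."""

    def __init__(self, name: str, children: Optional[List["Component"]] = None) -> None:
        self.name = name
        self.children = children or []

    def describe_state(self) -> Dict[str, Any]:
        # Report this component's own attributes first.
        state: Dict[str, Any] = {"name": self.name}
        # Then nest each child's state by calling its describe_state in turn.
        if self.children:
            state["children"] = {child.name: child.describe_state() for child in self.children}
        return state


report = Component("report.txt")
folder = Component("docs", [report])
file_system = Component("file_system", [folder])
computer = Component("pc_a", [file_system])

state = computer.describe_state()
```

Calling ``describe_state`` on the computer yields one nested dictionary that contains the file system, its folder, and the file inside it.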

@@ -57,7 +57,7 @@ dev = [
"build==0.10.0",
"flake8==6.0.0",
"flake8-annotations",
"furo==2023.3.27",
"furo==2024.01.29",
"gputil==1.4.0",
"pip-licenses==4.3.0",
"pre-commit==2.20.0",
@@ -67,7 +67,7 @@ dev = [
"pytest-cov==4.0.0",
"pytest-flake8==1.1.1",
"setuptools==66",
"Sphinx==6.1.3",
"Sphinx==7.2.6",
"sphinx-copybutton==0.5.2",
"wheel==0.38.4"
]