diff --git a/docs/api.rst b/docs/api.rst index aeaef4e2..13f3a1ec 100644 --- a/docs/api.rst +++ b/docs/api.rst @@ -1,3 +1,5 @@ +:orphan: + .. only:: comment © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK diff --git a/docs/conf.py b/docs/conf.py index efd60b49..6cdc0ac4 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -15,7 +15,6 @@ import furo # noqa sys.path.insert(0, os.path.abspath("../")) - # -- Project information ----------------------------------------------------- year = datetime.datetime.now().year project = "PrimAITE" @@ -45,13 +44,17 @@ extensions = [ "sphinx_copybutton", # Adds a copy button to code blocks ] - templates_path = ["_templates"] -exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] - +exclude_patterns = [ + "_build", + "Thumbs.db", + ".DS_Store", +] # -- Options for HTML output ------------------------------------------------- # https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output html_theme = "furo" html_static_path = ["_static"] +html_theme_options = {"globaltoc_collapse": True, "globaltoc_maxdepth": 2} +html_copy_source = False diff --git a/docs/source/config.rst b/docs/source/config.rst index 575a3139..46631ab9 100644 --- a/docs/source/config.rst +++ b/docs/source/config.rst @@ -1,3 +1,7 @@ +.. only:: comment + + © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK + Primaite v3 config ****************** @@ -5,98 +9,13 @@ PrimAITE uses a single configuration file to define everything needed to train a The entire config is used by the ``PrimaiteSession`` object for users who wish to let PrimAITE handle the agent definition and training. If you wish to define custom agents and control the training loop yourself, you can use the config with the ``PrimaiteGame``, and ``PrimaiteGymEnv`` objects instead. That way, only the network configuration and agent setup parts of the config are used, and the training section is ignored. 
Configurable items -================== +################## -``training_config`` -------------------- -This section allows selecting which training framework and algorithm to use, and set some training hyperparameters. +.. toctree:: + :maxdepth: 1 -``io_settings`` ---------------- -This section configures how PrimAITE saves data during simulation and training. - -**save_final_model**: Only used if training with PrimaiteSession, if true, the policy will be saved after the final training iteration. - -**save_checkpoints**: Only used if training with PrimaiteSession, if true, the policy will be saved periodically during training. - -**checkpoint_interval**: Only used if training with PrimaiteSession and if ``save_checkpoints`` is true. Defines how often to save the policy during training. - -**save_logs**: *currently unused*. - -**save_transactions**: *currently unused*. - -**save_tensorboard_logs**: *currently unused*. - -**save_step_metadata**: Whether to save the RL agents' action, environment state, and other data at every single step. - -**save_pcap_logs**: Whether to save pcap files of all network traffic during the simulation. - -**save_sys_logs**: Whether to save system logs from all nodes during the simulation. - -``game`` --------- -This section defines high-level settings that apply across the game, currently it's used to help shape the action and observation spaces by restricting which ports and internet protocols should be considered. Here, users can also set the maximum number of steps in an episode. - -``agents`` ----------- -Agents can be scripted (deterministic and stochastic), or controlled by a reinforcement learning algorithm. Not to be confused with an RL agent, the term agent here is used to refer to an entity that sends requests to the simulated network. In this part of the config, each agent's action space, observation space, and reward function can be defined. All three are defined in a modular way. 
- -**type**: Specifies which class should be used for the agent. ``ProxyAgent`` is used for agents that receive instructions from an RL algorithm. Scripted agents like ``RedDatabaseCorruptingAgent`` and ``GreenWebBrowsingAgent`` generate their own behaviour. - -**team**: Specifies if the agent is malicious (RED), benign (GREEN), or defensive (BLUE). Currently this value is not used for anything. - -**observation space:** - * ``type``: selects which python class from the ``primaite.game.agent.observation`` module is used for the overall observation structure. - * ``options``: allows configuring the chosen observation type. The ``UC2BlueObservation`` should be used for RL Agents. - * ``num_services_per_node``, ``num_folders_per_node``, ``num_files_per_folder``, ``num_nics_per_node`` all define the shape of the observation space. The size and shape of the obs space must remain constant, but the number of files, folders, ACL rules, and other components can change within an episode. Therefore padding is performed and these options set the size of the obs space. - * ``nodes``: list of nodes that will be present in this agent's observation space. The ``node_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config. Each node can also be configured with services, and files that should be monitored. - * ``links``: list of links that will be present in this agent's observation space. The ``link_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config. - * ``acl``: configure how the agent reads the access control list on the router in the simulation. ``router_node_ref`` is for selecting which router's ACL table should be used. ``ip_address_order`` sets the encoding of ip addresses as integers within the observation space. - -**action space:** -The action space is configured to be made up of individual action types. 
Once configured, the agent can select an action type and some optional action parameters at every step. For example: The ``NODE_SERVICE_SCAN`` action takes the parameters ``node_id`` and ``service_id``. - -Description of configurable items: - * ``action_list``: a list of action modules. The options are listed in the ``primaite.game.agent.actions`` module. - * ``action_map``: (optional). Restricts the possible combinations of action type / action parameter values to reduce the overall size of the action space. By default, every possible combination of actions and parameters will be assigned an integer for the agent's ``MultiDiscrete`` action space. Instead, the ``action_map`` allows you to list the actions corresponding to each integer in the ``MultiDiscrete`` space. - * ``options``: Options that apply too all action components. - * ``nodes``: list the nodes that the agent can act on, the order of this list defines the mapping between nodes and ``node_id`` integers. - * ``max_folders_per_node``, ``max_files_per_folder``, ``max_services_per_node``, ``max_nics_per_node``, ``max_acl_rules`` all are used to define the size of the action space. - -**reward function:** -Similar to action space, this is defined as a list of components. - -Description of configurable items: - * ``reward_components`` a list of reward components from the ``primaite.game.agent.reward`` module. - * ``weight``: relative importance of this reward component. The total reward for a step is a weighted sum of all reward components. - * ``options``: list of options passed to the reward component during initialisation, the exact options required depend on the reward component. - -**agent_settings**: -Settings passed to the agent during initialisation. These depend on the agent class. - -Reinforcement learning agents use the ``ProxyAgent`` class, they accept these agent settings: - -**flatten_obs**: If true, gymnasium flattening will be performed on the observation space before sending to the agent. 
Set this to true if your agent does not support nested observation spaces. - -``simulation`` --------------- -In this section the network layout is defined. This part of the config follows a hierarchical structure. Almost every component defines a ``ref`` field which acts as a human-readable unique identifier, used by other parts of the config, such as agents. - -At the top level of the network are ``nodes`` and ``links``. - -**nodes:** - * ``type``: one of ``router``, ``switch``, ``computer``, or ``server``, this affects what other sub-options should be defined. - * ``hostname`` - a non-unique name used for logging and outputs. - * ``num_ports`` (optional, routers and switches only): number of network interfaces present on the device. - * ``ports`` (optional, routers and switches only): configuration for each network interface, including IP address and subnet mask. - * ``acl`` (Router only): Define the ACL rules at each index of the ACL on the router. the possible options are: ``action`` (PERMIT or DENY), ``src_port``, ``dst_port``, ``protocol``, ``src_ip``, ``dst_ip``. Any options left blank default to none which usually means that it will apply across all options. For example leaving ``src_ip`` blank will apply the rule to all IP addresses. - * ``services`` (computers and servers only): a list of services to install on the node. They must define a ``ref``, ``type``, and ``options`` that depend on which ``type`` was selected. - * ``applications`` (computer and servers only): Similar to services. A list of application to install on the node. - * ``network_interfaces`` (computers and servers only): If the node has multiple networking devices, the second, third, fourth, etc... must be defined here with an ``ip_address`` and ``subnet_mask``. 
- -**links:** - * ``ref``: unique identifier for this link - * ``endpoint_a_ref``: Reference to the node at the first end of the link - * ``endpoint_a_port``: The ethernet port or switch port index of the second node - * ``endpoint_b_ref``: Reference to the node at the second end of the link - * ``endpoint_b_port``: The ethernet port or switch port index on the second node + configuration/training_config.rst + configuration/io_settings.rst + configuration/game.rst + configuration/agents.rst + configuration/simulation.rst diff --git a/docs/source/configuration/agents.rst b/docs/source/configuration/agents.rst new file mode 100644 index 00000000..4d81c89d --- /dev/null +++ b/docs/source/configuration/agents.rst @@ -0,0 +1,45 @@ +.. only:: comment + + © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK + + +``agents`` +========== +Agents can be scripted (deterministic and stochastic), or controlled by a reinforcement learning algorithm. Not to be confused with an RL agent, the term agent here is used to refer to an entity that sends requests to the simulated network. In this part of the config, each agent's action space, observation space, and reward function can be defined. All three are defined in a modular way. + +**type**: Specifies which class should be used for the agent. ``ProxyAgent`` is used for agents that receive instructions from an RL algorithm. Scripted agents like ``RedDatabaseCorruptingAgent`` and ``GreenWebBrowsingAgent`` generate their own behaviour. + +**team**: Specifies if the agent is malicious (RED), benign (GREEN), or defensive (BLUE). Currently this value is not used for anything. + +**observation space:** + * ``type``: selects which python class from the ``primaite.game.agent.observation`` module is used for the overall observation structure. + * ``options``: allows configuring the chosen observation type. The ``UC2BlueObservation`` should be used for RL Agents. 
+ * ``num_services_per_node``, ``num_folders_per_node``, ``num_files_per_folder``, ``num_nics_per_node`` all define the shape of the observation space. The size and shape of the obs space must remain constant, but the number of files, folders, ACL rules, and other components can change within an episode. Therefore padding is performed and these options set the size of the obs space. + * ``nodes``: list of nodes that will be present in this agent's observation space. The ``node_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config. Each node can also be configured with services, and files that should be monitored. + * ``links``: list of links that will be present in this agent's observation space. The ``link_ref`` relates to the human-readable unique reference defined later in the ``simulation`` part of the config. + * ``acl``: configure how the agent reads the access control list on the router in the simulation. ``router_node_ref`` is for selecting which router's ACL table should be used. ``ip_address_order`` sets the encoding of ip addresses as integers within the observation space. + +**action space:** +The action space is configured to be made up of individual action types. Once configured, the agent can select an action type and some optional action parameters at every step. For example: The ``NODE_SERVICE_SCAN`` action takes the parameters ``node_id`` and ``service_id``. + +Description of configurable items: + * ``action_list``: a list of action modules. The options are listed in the ``primaite.game.agent.actions`` module. + * ``action_map``: (optional). Restricts the possible combinations of action type / action parameter values to reduce the overall size of the action space. By default, every possible combination of actions and parameters will be assigned an integer for the agent's ``MultiDiscrete`` action space. 
Instead, the ``action_map`` allows you to list the actions corresponding to each integer in the ``MultiDiscrete`` space. + * ``options``: Options that apply to all action components. + * ``nodes``: list the nodes that the agent can act on; the order of this list defines the mapping between nodes and ``node_id`` integers. + * ``max_folders_per_node``, ``max_files_per_folder``, ``max_services_per_node``, ``max_nics_per_node``, ``max_acl_rules`` are all used to define the size of the action space. + +**reward function:** +Similar to action space, this is defined as a list of components. + +Description of configurable items: + * ``reward_components``: a list of reward components from the ``primaite.game.agent.reward`` module. + * ``weight``: relative importance of this reward component. The total reward for a step is a weighted sum of all reward components. + * ``options``: list of options passed to the reward component during initialisation; the exact options required depend on the reward component. + +**agent_settings**: +Settings passed to the agent during initialisation. These depend on the agent class. + +Reinforcement learning agents use the ``ProxyAgent`` class, which accepts these agent settings: + +**flatten_obs**: If true, gymnasium flattening will be performed on the observation space before sending to the agent. Set this to true if your agent does not support nested observation spaces. diff --git a/docs/source/configuration/game.rst b/docs/source/configuration/game.rst new file mode 100644 index 00000000..797c3813 --- /dev/null +++ b/docs/source/configuration/game.rst @@ -0,0 +1,8 @@ +.. only:: comment + + © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK + + +``game`` +======== +This section defines high-level settings that apply across the game. Currently it is used to help shape the action and observation spaces by restricting which ports and internet protocols should be considered.
Here, users can also set the maximum number of steps in an episode. diff --git a/docs/source/configuration/io_settings.rst b/docs/source/configuration/io_settings.rst new file mode 100644 index 00000000..11d044bb --- /dev/null +++ b/docs/source/configuration/io_settings.rst @@ -0,0 +1,26 @@ +.. only:: comment + + © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK + + +``io_settings`` +=============== +This section configures how PrimAITE saves data during simulation and training. + +**save_final_model**: Only used if training with PrimaiteSession, if true, the policy will be saved after the final training iteration. + +**save_checkpoints**: Only used if training with PrimaiteSession, if true, the policy will be saved periodically during training. + +**checkpoint_interval**: Only used if training with PrimaiteSession and if ``save_checkpoints`` is true. Defines how often to save the policy during training. + +**save_logs**: *currently unused*. + +**save_transactions**: *currently unused*. + +**save_tensorboard_logs**: *currently unused*. + +**save_step_metadata**: Whether to save the RL agents' action, environment state, and other data at every single step. + +**save_pcap_logs**: Whether to save pcap files of all network traffic during the simulation. + +**save_sys_logs**: Whether to save system logs from all nodes during the simulation. diff --git a/docs/source/configuration/simulation.rst b/docs/source/configuration/simulation.rst new file mode 100644 index 00000000..eb13e2be --- /dev/null +++ b/docs/source/configuration/simulation.rst @@ -0,0 +1,27 @@ +.. only:: comment + + © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK + + +``simulation`` +============== +In this section the network layout is defined. This part of the config follows a hierarchical structure. Almost every component defines a ``ref`` field which acts as a human-readable unique identifier, used by other parts of the config, such as agents. 
+ +At the top level of the network are ``nodes`` and ``links``. + +**nodes:** + * ``type``: one of ``router``, ``switch``, ``computer``, or ``server``. This affects what other sub-options should be defined. + * ``hostname``: a non-unique name used for logging and outputs. + * ``num_ports`` (optional, routers and switches only): number of network interfaces present on the device. + * ``ports`` (optional, routers and switches only): configuration for each network interface, including IP address and subnet mask. + * ``acl`` (routers only): Define the ACL rules at each index of the ACL on the router. The possible options are: ``action`` (PERMIT or DENY), ``src_port``, ``dst_port``, ``protocol``, ``src_ip``, ``dst_ip``. Any option left blank defaults to none, which usually means that the rule applies across all values of that option. For example, leaving ``src_ip`` blank will apply the rule to all IP addresses. + * ``services`` (computers and servers only): a list of services to install on the node. They must define a ``ref``, ``type``, and ``options`` that depend on which ``type`` was selected. + * ``applications`` (computers and servers only): Similar to services. A list of applications to install on the node. + * ``network_interfaces`` (computers and servers only): If the node has multiple networking devices, the second, third, fourth, etc. must be defined here with an ``ip_address`` and ``subnet_mask``.
+ +**links:** + * ``ref``: unique identifier for this link + * ``endpoint_a_ref``: Reference to the node at the first end of the link + * ``endpoint_a_port``: The ethernet port or switch port index on the first node + * ``endpoint_b_ref``: Reference to the node at the second end of the link + * ``endpoint_b_port``: The ethernet port or switch port index on the second node diff --git a/docs/source/configuration/training_config.rst b/docs/source/configuration/training_config.rst new file mode 100644 index 00000000..cde6cf52 --- /dev/null +++ b/docs/source/configuration/training_config.rst @@ -0,0 +1,25 @@ +.. only:: comment + + © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK + +``training_config`` +=================== + +``rl_framework`` +---------------- +The RL (Reinforcement Learning) Framework to use in the training session + +Options available are: + +- ``SB3`` (Stable Baselines 3) +- ``RLLIB_single_agent`` (Single Agent Ray RLLib) +- ``RLLIB_multi_agent`` (Multi Agent Ray RLLib) + +``rl_algorithm`` +---------------- +The Reinforcement Learning Algorithm to use in the training session + +Options available are: + +- ``PPO`` (Proximal Policy Optimisation) +- ``A2C`` (Advantage Actor Critic) diff --git a/docs/source/dependencies.rst b/docs/source/dependencies.rst index 942ccfd8..ddea27fa 100644 --- a/docs/source/dependencies.rst +++ b/docs/source/dependencies.rst @@ -5,6 +5,8 @@ .. role:: raw-html(raw) :format: html +.. _Dependencies: + Dependencies ============ diff --git a/docs/source/request_system.rst b/docs/source/request_system.rst index 392bc792..e4c5584e 100644 --- a/docs/source/request_system.rst +++ b/docs/source/request_system.rst @@ -36,7 +36,7 @@ Technical Detail This system was achieved by implementing two classes, :py:class:`primaite.simulator.core.RequestType`, and :py:class:`primaite.simulator.core.RequestManager`.
``RequestType`` ------- +--------------- The ``RequestType`` object stores a reference to a method that executes the request, for example a node could have a request type that stores a reference to ``self.turn_on()``. Technically, this can be any callable that accepts `request, context` as it's parameters. In practice, this is often defined using ``lambda`` functions within a component's ``self._init_request_manager()`` method. Optionally, the ``RequestType`` object can also hold a validator that will permit/deny the request depending on context. @@ -60,7 +60,7 @@ A simple example without chaining can be seen in the :py:class:`primaite.simulat *ellipses (``...``) used to omit code impertinent to this explanation* Chaining RequestManagers ------------------------ +------------------------ A request function needs to be a callable that accepts ``request, context`` as parameters. Since the request manager resolves requests by invoking it with ``request, context`` as parameter, it is possible to use a ``RequestManager`` as a ``RequestType``. 
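Note on the ``request_system.rst`` hunk above: the chaining idea it describes (a manager is itself a callable taking ``request, context``, so it can be stored where a request function is expected) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not PrimAITE's implementation: ``SketchRequestType``, ``SketchRequestManager``, ``add_request`` and the request words ``"node"`` and ``"turn_on"`` are hypothetical names chosen for the example.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-ins written for illustration; these are NOT PrimAITE's classes.
Handler = Callable[[List[str], Dict[str, Any]], Any]


class SketchRequestType:
    """Holds a reference to a callable that accepts (request, context)."""

    def __init__(self, func: Handler) -> None:
        self.func = func


class SketchRequestManager:
    """Dispatches on the first word of a request and forwards the remainder."""

    def __init__(self) -> None:
        self.request_types: Dict[str, SketchRequestType] = {}

    def add_request(self, name: str, request_type: SketchRequestType) -> None:
        self.request_types[name] = request_type

    def __call__(self, request: List[str], context: Dict[str, Any]) -> Any:
        name, remainder = request[0], request[1:]
        return self.request_types[name].func(remainder, context)


log: List[str] = []

# Leaf manager: a single request type backed by a lambda.
node_manager = SketchRequestManager()
node_manager.add_request(
    "turn_on", SketchRequestType(lambda request, context: log.append("node on"))
)

# Parent manager: the child manager is a (request, context) callable,
# so it can be wrapped exactly like any other request function.
network_manager = SketchRequestManager()
network_manager.add_request("node", SketchRequestType(node_manager))

# The parent consumes "node", the child consumes "turn_on".
network_manager(["node", "turn_on"], {})
```

Each manager strips one word from the front of the request before delegating, which is what lets a hierarchy of components resolve a multi-word request without any manager knowing the full tree.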
diff --git a/docs/source/simulation.rst b/docs/source/simulation.rst index c703b299..c4bf1bf0 100644 --- a/docs/source/simulation.rst +++ b/docs/source/simulation.rst @@ -22,9 +22,9 @@ Contents simulation_components/network/nodes/host_node simulation_components/network/nodes/network_node simulation_components/network/nodes/router + simulation_components/network/nodes/switch simulation_components/network/nodes/wireless_router simulation_components/network/nodes/firewall - simulation_components/network/switch simulation_components/network/network simulation_components/system/internal_frame_processing simulation_components/system/sys_log diff --git a/docs/source/simulation_components/network/base_hardware.rst b/docs/source/simulation_components/network/base_hardware.rst index c7545810..3aa6b073 100644 --- a/docs/source/simulation_components/network/base_hardware.rst +++ b/docs/source/simulation_components/network/base_hardware.rst @@ -41,6 +41,65 @@ Node Attributes - **session_manager**: Manages user sessions within the node. - **software_manager**: Controls the installation and management of software and services on the node. +.. _Node Start up and Shut down: + +Node Start up and Shut down +--------------------------- +Nodes are powered on and off over multiple timesteps. By default, a node's ``start_up_duration`` and ``shut_down_duration`` are both 3 timesteps. + +Example code where a node is turned on: + +..
code-block:: python + + from primaite.simulator.network.hardware.base import Node + from primaite.simulator.network.hardware.node_operating_state import NodeOperatingState + + node = Node(hostname="pc_a") + + assert node.operating_state is NodeOperatingState.OFF # By default, node is instantiated in an OFF state + + node.power_on() # power on the node + + assert node.operating_state is NodeOperatingState.BOOTING # node is booting up + + for i in range(node.start_up_duration + 1): + # apply timestep until the node start up duration + node.apply_timestep(timestep=i) + + assert node.operating_state is NodeOperatingState.ON # node is in ON state + + +If the node needs to be instantiated in an on state: + + +.. code-block:: python + + from primaite.simulator.network.hardware.base import Node + from primaite.simulator.network.hardware.node_operating_state import NodeOperatingState + + node = Node(hostname="pc_a", operating_state=NodeOperatingState.ON) + + assert node.operating_state is NodeOperatingState.ON # node is in ON state + +Setting ``start_up_duration`` and/or ``shut_down_duration`` to ``0`` will allow for the ``power_on`` and ``power_off`` methods to be completed instantly without applying timesteps: + +.. 
code-block:: python + + from primaite.simulator.network.hardware.base import Node + from primaite.simulator.network.hardware.node_operating_state import NodeOperatingState + + node = Node(hostname="pc_a", start_up_duration=0, shut_down_duration=0) + + assert node.operating_state is NodeOperatingState.OFF # node is in OFF state + + node.power_on() + + assert node.operating_state is NodeOperatingState.ON # node is in ON state + + node.power_off() + + assert node.operating_state is NodeOperatingState.OFF # node is in OFF state + Node Behaviours/Functions ------------------------- diff --git a/docs/source/simulation_components/system/software.rst b/docs/source/simulation_components/system/software.rst index cd6b0aa3..7a1359f4 100644 --- a/docs/source/simulation_components/system/software.rst +++ b/docs/source/simulation_components/system/software.rst @@ -50,4 +50,5 @@ Services, Processes and Applications: data_manipulation_bot dns_client_server ftp_client_server + ntp_client_server web_browser_and_web_server_service diff --git a/docs/source/state_system.rst b/docs/source/state_system.rst index 860c9827..0bbbdd34 100644 --- a/docs/source/state_system.rst +++ b/docs/source/state_system.rst @@ -3,7 +3,7 @@ © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK Simulation State -============== +================ ``SimComponent`` objects in the simulation have a method called ``describe_state`` which return a dictionary of the state of the component. This is used to report pertinent data that could impact an agent's actions or rewards. For instance, the name and health status of a node is reported, which can be used by a reward function to punish corrupted or compromised nodes and reward healthy nodes. Each ``SimComponent`` object reports not only its own attributes in the state but also those of its child components. I.e. a computer node will report the state of its ``FileSystem`` and the ``FileSystem`` will report the state of its files and folders. 
This happens by recursively calling the childrens' own ``describe_state`` methods. diff --git a/pyproject.toml b/pyproject.toml index 3e5b959a..44ce75c6 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -57,7 +57,7 @@ dev = [ "build==0.10.0", "flake8==6.0.0", "flake8-annotations", - "furo==2023.3.27", + "furo==2024.01.29", "gputil==1.4.0", "pip-licenses==4.3.0", "pre-commit==2.20.0", @@ -67,7 +67,7 @@ dev = [ "pytest-cov==4.0.0", "pytest-flake8==1.1.1", "setuptools==66", - "Sphinx==6.1.3", + "Sphinx==7.2.6", "sphinx-copybutton==0.5.2", "wheel==0.38.4" ]
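Note on the ``state_system.rst`` hunk above: the recursive state reporting it describes, where each component reports its own attributes plus the state of its children, can be sketched as follows. This is a minimal illustration, assuming a made-up ``SketchComponent`` class with ``name``, ``health`` and ``children`` attributes; it is not PrimAITE's ``SimComponent``, and the dictionary keys are hypothetical.

```python
from typing import Any, Dict, List, Optional


class SketchComponent:
    """Hypothetical stand-in for a simulation component (not PrimAITE's SimComponent)."""

    def __init__(
        self,
        name: str,
        health: str = "GOOD",
        children: Optional[List["SketchComponent"]] = None,
    ) -> None:
        self.name = name
        self.health = health
        self.children = children or []

    def describe_state(self) -> Dict[str, Any]:
        # Report this component's own attributes first.
        state: Dict[str, Any] = {"name": self.name, "health": self.health}
        # Then recurse: each child contributes its own describe_state() output.
        if self.children:
            state["children"] = {
                child.name: child.describe_state() for child in self.children
            }
        return state


# A computer holding a file system, which in turn holds one corrupted file.
report = SketchComponent(
    "pc_a",
    children=[
        SketchComponent(
            "file_system",
            children=[SketchComponent("db.sqlite", health="CORRUPT")],
        )
    ],
).describe_state()
```

A reward function could then walk ``report`` to find unhealthy components, for example ``report["children"]["file_system"]["children"]["db.sqlite"]["health"]``.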