From afa7916db08f03407d40cf372bcd729fec69148a Mon Sep 17 00:00:00 2001 From: Marek Wolan Date: Wed, 25 Oct 2023 23:32:52 +0100 Subject: [PATCH] Updates to documentation --- docs/index.rst | 28 +- docs/source/about.rst | 704 ++++++++++++--------------- docs/source/action_system.rst | 88 ---- docs/source/config(v2).rst | 491 ------------------- docs/source/custom_agent.rst | 130 +---- docs/source/getting_started.rst | 60 ++- docs/source/migration_1.2_-_2.0.rst | 57 --- docs/source/primaite_session.rst | 359 +++++++------- docs/source/request_system.rst | 90 ++++ docs/source/simulation.rst | 6 +- docs/source/simulation_structure.rst | 6 +- docs/source/state_system.rst | 31 ++ 12 files changed, 673 insertions(+), 1377 deletions(-) delete mode 100644 docs/source/action_system.rst delete mode 100644 docs/source/config(v2).rst delete mode 100644 docs/source/migration_1.2_-_2.0.rst create mode 100644 docs/source/request_system.rst create mode 100644 docs/source/state_system.rst diff --git a/docs/index.rst b/docs/index.rst index ed66797d..fa877064 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -92,26 +92,34 @@ Head over to the :ref:`getting-started` page to install and setup PrimAITE! .. toctree:: :maxdepth: 8 - :caption: Contents: + :caption: About PrimAITE: + :hidden: + + source/about + source/dependencies + source/glossary + +.. toctree:: + :caption: Usage: :hidden: source/getting_started - source/about - source/config - source/config(v2) + source/primaite_session source/simulation source/game_layer - source/primaite_session source/custom_agent + source/config + +.. toctree:: + :caption: Developer information: + :hidden: + + source/state_system + source/request_system PrimAITE API PrimAITE Tests - source/dependencies - source/glossary - source/migration_1.2_-_2.0 -.. TODO: Add project links once public repo has been created - .. 
toctree:: :caption: Project Links: :hidden: diff --git a/docs/source/about.rst b/docs/source/about.rst index d12a59de..993dec0c 100644 --- a/docs/source/about.rst +++ b/docs/source/about.rst @@ -7,408 +7,312 @@ About PrimAITE ============== +PrimAITE is a simulation environment for training agents to protect a computer network from cyber attacks. + Features ******** PrimAITE provides the following features: -* A flexible network / system laydown based on the Python networkx framework -* Nodes and links (edges) host Python classes in order to present attributes and methods (and hence, a more representative model of a platform / system) -* A 'green agent' Information Exchange Requirement (IER) function allows the representation of traffic (protocols and loading) on any / all links. Application of IERs is based on the status of node operating systems and services -* A 'green agent' node Pattern-of-Life (PoL) function allows the representation of core behaviours on nodes (e.g. changing the Hardware state, Software State, Service state, or File System state) -* An Access Control List (ACL) function, mimicking the behaviour of a network firewall, is applied across the model, following standard ACL rule format (e.g. DENY/ALLOW, source IP, destination IP, protocol and port). Application of IERs adheres to any ACL restrictions -* Presents an OpenAI Gym interface to the environment, allowing integration with any OpenAI Gym compliant defensive agents -* Red agent activity based on 'red' IERs and 'red' PoL -* Defined reward function for use with RL agents (based on nodes status, and green / red IER success) -* Fully configurable (network / system laydown, IERs, node PoL, ACL, episode step period, episode max steps) and repeatable to suit the training requirements of agents. 
Therefore, not bound to a representation of any particular platform, system or technology -* Full capture of discrete metrics relating to agent training (full system state, agent actions taken, average reward) -* Networkx provides laydown visualisation capability - -Architecture - Nodes and Links -****************************** - -**Nodes** - -An inheritance model has been adopted in order to model nodes. All nodes have the following base attributes (Class: Node): - -* ID -* Name -* Type (e.g. computer, switch, RTU - enumeration) -* Priority (P1, P2, P3, P4 or P5 - enumeration) -* Hardware State (ON, OFF, RESETTING, SHUTTING_DOWN, BOOTING - enumeration) - -Active Nodes also have the following attributes (Class: Active Node): - -* IP Address -* Software State (GOOD, PATCHING, COMPROMISED - enumeration) -* File System State (GOOD, CORRUPT, DESTROYED, REPAIRING, RESTORING - enumeration) - -Service Nodes also have the following attributes (Class: Service Node): - -* List of Services (where service is composed of service name and port). There is no theoretical limit on the number of services that can be modelled. Services and protocols are currently intrinsically linked (i.e. a service is an application on a node transmitting traffic of this protocol type) -* Service state (GOOD, PATCHING, COMPROMISED, OVERWHELMED - enumeration) - -Passive Nodes are currently not used (but may be employed for non IP-based components such as machinery actuators in future releases). - -**Links** - -Links are modelled both as network edges (networkx) and as Python classes, in order to extend their functionality. Links include the following attributes: - -* ID -* Name -* Bandwidth (bits/s) -* Source node ID -* Destination node ID -* Protocol list (containing the loading of protocols currently running on the link) - -When the simulation runs, IERs are applied to the links in order to model traffic loading, individually assigned to each protocol. 
This allows green (background) and red agent behaviour to be modelled, and defensive agents to identify suspicious traffic patterns at a protocol / traffic loading level of fidelity. - -Information Exchange Requirements (IERs) -**************************************** - -PrimAITE adopts the concept of Information Exchange Requirements (IERs) to model both green agent (background) and red agent (adversary) behaviour. IERs are used to initiate modelling of traffic loading on the network, and have the following attributes: - -* ID -* Start step (i.e. which step in the training episode should the IER start) -* End step (i.e. which step in the training episode should the IER end) -* Source node ID -* Destination node ID -* Load (bits/s) -* Protocol -* Port -* Running status (i.e. on / off) - -The application of green agent IERs between a source and destination follows a number of rules. Specifically: - -1. Does the current simulation time step fall between IER start and end step -2. Is the source node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING) -3. Is the destination node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING) -4. Are there any Access Control List rules in place that prevent the application of this IER -5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level) - -For red agent IERs, the application of IERs between a source and destination follows a number of subtly different rules. Specifically: - -1. Does the current simulation time step fall between IER start and end step -2. 
Is the source node operational, and is the service (protocol / port) associated with the IER (a) present on that node and (b) already in a compromised state -3. Is the destination node operational, and is the service (protocol / port) associated with the IER present on that node -4. Are there any Access Control List rules in place that prevent the application of this IER -5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level) - -Assuming the rules pass, the IER is applied to all relevant links (based on use of OSPF) between source and destination. - -Node Pattern-of-Life -******************** - -Every node can be impacted (i.e. have a status change applied to it) by either green agent pattern-of-life or red agent pattern-of-life. This is distinct from IERs, and allows for attacks (and defence) to be modelled purely within the confines of a node. - -The status changes that can be made to a node are as follows: - -* All Nodes: - - * Hardware State: - - * ON - * OFF - * RESETTING - when a status of resetting is entered, the node will automatically exit this state after a number of steps (as defined by the nodeResetDuration configuration item) after which it returns to an ON state - * BOOTING - * SHUTTING_DOWN - -* Active Nodes and Service Nodes: - - * Software State: - - * GOOD - * PATCHING - when a status of patching is entered, the node will automatically exit this state after a number of steps (as defined by the osPatchingDuration configuration item) after which it returns to a GOOD state - * COMPROMISED - - * File System State: - - * GOOD - * CORRUPT (can be resolved by repair or restore) - * DESTROYED (can be resolved by restore only) - * REPAIRING - when a status of repairing is entered, the node will automatically exit this state after a number of steps (as defined by the fileSystemRepairingLimit configuration item) after which it returns to a GOOD state - * RESTORING - when a status of repairing 
is entered, the node will automatically exit this state after a number of steps (as defined by the fileSystemRestoringLimit configuration item) after which it returns to a GOOD state - -* Service Nodes only: - - * Service State (for any associated service): - - * GOOD - * PATCHING - when a status of patching is entered, the service will automatically exit this state after a number of steps (as defined by the servicePatchingDuration configuration item) after which it returns to a GOOD state - * COMPROMISED - * OVERWHELMED - -Red agent pattern-of-life has an additional feature not found in the green pattern-of-life. This is the ability to influence the state of the attributes of a node via a number of different conditions: - - * DIRECT: - - The pattern-of-life described by the configuration file item will be applied regardless of any other conditions in the network. This is particularly useful for direct red agent entry into the network. - - * IER: - - The pattern-of-life described by the configuration file item will be applied to the service on the node, only if there is an IER of the same protocol / service type incoming at the specified timestep. - - * SERVICE: - - The pattern-of-life described by the configuration file item will be applied to the node based on the state of a service. The service can either be on the same node, or a different node within the network. - -Access Control List modelling -***************************** - -An Access Control List (ACL) is modelled to provide the means to manage traffic flows in the system. This will allow defensive agents the means to turn on / off rules, or potentially create new rules, to counter an attack. - -The ACL follows a standard network firewall format. For example: - -.. 
list-table:: ACL example - :widths: 25 25 25 25 25 - :header-rows: 1 - - * - Permission - - Source IP - - Dest IP - - Protocol - - Port - * - DENY - - 192.168.1.2 - - 192.168.1.3 - - HTTPS - - 443 - * - ALLOW - - 192.168.1.4 - - ANY - - SMTP - - 25 - * - DENY - - ANY - - 192.168.1.5 - - ANY - - ANY - -All ACL rules are considered when applying an IER. Logic follows the order of rules, so a DENY or ALLOW for the same parameters will override an earlier entry. - -Observation Spaces -****************** -The observation space provides the blue agent with information about the current status of nodes and links. - -PrimAITE builds on top of Gym Spaces to create an observation space that is easily configurable for users. It's made up of components which are managed by the :py:class:`primaite.environment.observations.ObservationsHandler`. Each training scenario can define its own observation space, and the user can choose which information to inlude, and how it should be formatted. - -NodeLinkTable component ------------------------ -For example, the :py:class:`primaite.environment.observations.NodeLinkTable` component represents the status of nodes and links as a ``gym.spaces.Box`` with an example format shown below: - -An example observation space is provided below: - -.. list-table:: Observation Space example - :widths: 25 25 25 25 25 25 25 - :header-rows: 1 - - * - - - ID - - Hardware State - - Software State - - File System State - - Service / Protocol A - - Service / Protocol B - * - Node A - - 1 - - 1 - - 1 - - 1 - - 1 - - 1 - * - Node B - - 2 - - 1 - - 3 - - 1 - - 1 - - 1 - * - Node C - - 3 - - 2 - - 1 - - 1 - - 3 - - 2 - * - Link 1 - - 5 - - 0 - - 0 - - 0 - - 0 - - 10000 - * - Link 2 - - 6 - - 0 - - 0 - - 0 - - 0 - - 10000 - * - Link 3 - - 7 - - 0 - - 0 - - 0 - - 5000 - - 0 - -For the nodes, the following values are represented: - -.. 
code-block:: - - [ - ID - Hardware State (1=ON, 2=OFF, 3=RESETTING, 4=SHUTTING_DOWN, 5=BOOTING) - Operating System State (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) - File System State (0=none, 1=GOOD, 2=CORRUPT, 3=DESTROYED, 4=REPAIRING, 5=RESTORING) - Service1/Protocol1 state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) - Service2/Protocol2 state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) - ] - -(Note that each service available in the network is provided as a column, although not all nodes may utilise all services) - -For the links, the following statuses are represented: - -.. code-block:: - - [ - ID - Hardware State (0=not applicable) - Operating System State (0=not applicable) - File System State (0=not applicable) - Service1/Protocol1 state (Traffic load from this protocol on this link) - Service2/Protocol2 state (Traffic load from this protocol on this link) - ] - -NodeStatus component ----------------------- -This is a MultiDiscrete observation space that can be though of as a one-dimensional vector of discrete states. -The example above would have the following structure: - -.. code-block:: - - [ - node1_info - node2_info - node3_info - ] - -Each ``node_info`` contains the following: - -.. code-block:: - - [ - hardware_state (0=none, 1=ON, 2=OFF, 3=RESETTING, 4=SHUTTING_DOWN, 5=BOOTING) - software_state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) - file_system_state (0=none, 1=GOOD, 2=CORRUPT, 3=DESTROYED, 4=REPAIRING, 5=RESTORING) - service1_state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) - service2_state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) - ] - -In a network with three nodes and two services, the full observation space would have 15 elements. It can be written with ``gym`` notation to indicate the number of discrete options for each of the elements of the observation space. For example: - -.. code-block:: - - gym.spaces.MultiDiscrete([4,5,6,4,4,4,5,6,4,4,4,5,6,4,4]) - -.. 
note:: - NodeStatus observation component provides information only about nodes. Links are not considered. - -LinkTrafficLevels ------------------ -This component is a MultiDiscrete space showing the traffic flow levels on the links in the network, after applying a threshold to convert it from a continuous to a discrete value. -There are two configurable parameters: -* ``quantisation_levels`` determines how many discrete bins to use for converting the continuous traffic value to discrete (default is 5). -* ``combine_service_traffic`` determines whether to separately output traffic use for each network protocol or whether to combine them into an overall value for the link. (default is ``True``) - -For example, with default parameters and a network with three links, the structure of this component would be: - -.. code-block:: - - [ - link1_status - link2_status - link3_status - ] - -Each ``link_status`` is a number from 0-4 representing the network load in relation to bandwidth. - -.. code-block:: - - 0 = No traffic (0%) - 1 = low traffic (1%-33%) - 2 = medium traffic (33%-66%) - 3 = high traffic (66%-99%) - 4 = max traffic/ overwhelmed (100%) - -Using ``gym`` notation, the shape of the obs space is: ``gym.spaces.MultiDiscrete([5,5,5])``. - - -Action Spaces -************** - -The action space available to the blue agent comes in two types: - - 1. Node-based - 2. Access Control List - 3. Any (Agent can take both node-based and ACL-based actions) - -The choice of action space used during a training session is determined in the config_[name].yaml file. - -**Node-Based** - -The agent is able to influence the status of nodes by switching them off, resetting, or patching operating systems and services. In this instance, the action space is an OpenAI Gym spaces.Discrete type, as follows: - - * Dictionary item {... 
,1: [x1, x2, x3,x4] ...} - The placeholders inside the list under the key '1' mean the following: - - * [0, num nodes] - Node ID (0 = nothing, node ID) - * [0, 4] - What property it's acting on (0 = nothing, 1 = state, 2 = SoftwareState, 3 = service state, 4 = file system state) - * [0, 3] - Action on property (0 = nothing, 1 = on / scan, 2 = off / repair, 3 = reset / patch / restore) - * [0, num services] - Resolves to service ID (0 = nothing, resolves to service) - -**Access Control List** - -The blue agent is able to influence the configuration of the Access Control List rule set (which implements a system-wide firewall). In this instance, the action space is an OpenAI spaces.Discrete type, as follows: - - * Dictionary item {... ,1: [x1, x2, x3, x4, x5, x6] ...} - The placeholders inside the list under the key '1' mean the following: - - * [0, 2] - Action (0 = do nothing, 1 = create rule, 2 = delete rule) - * [0, 1] - Permission (0 = DENY, 1 = ALLOW) - * [0, num nodes] - Source IP (0 = any, then 1 -> x resolving to IP addresses) - * [0, num nodes] - Dest IP (0 = any, then 1 -> x resolving to IP addresses) - * [0, num services] - Protocol (0 = any, then 1 -> x resolving to protocol) - * [0, num ports] - Port (0 = any, then 1 -> x resolving to port) - -**ANY** -The agent is able to carry out both **Node-Based** and **Access Control List** operations. - -This means the dictionary will contain key-value pairs in the format of BOTH Node-Based and Access Control List as seen above. - -Rewards -******* - -A reward value is presented back to the blue agent on the conclusion of every step. The reward value is calculated via two methods which combine to give the total value: - - 1. Node and service status - 2. 
IER status - -**Node and service status** - -On every step, the status of each node is compared against both a reference environment (simulating the situation if the red and blue agents had not impacted the environment) -and the before and after state of the environment. If the comparison against the reference environment shows no difference, then the score provided is "AllOK". If there is a -difference with respect to the reference environment, the before and after states are compared, and a score determined. See :ref:`config` for details of reward values. - -**IER status** - -On every step, the full IER set is examined to determine whether green and red agent IERs are being permitted to run. Any red agent IERs running incur a penalty; any green agent -IERs not permitted to run also incur a penalty. See :ref:`config` for details of reward values. - -Future Enhancements -******************* - -The PrimAITE project has an ambition to include the following enhancements in future releases: - -* Integration with a suitable standardised framework to allow multi-agent integration -* Integration with external threat emulation tools, either using off-line data, or integrating at runtime +* A flexible system for defining network layouts and host configurations +* Highly configurable network hosts, including definition of software, file system, and network interfaces +* Realistic network traffic simulation, including addressing and sending packets via internet protocols such as TCP, UDP, and ICMP +* Routers with traffic routing and firewall capabilities +* Interfaces with ARCD GATE to allow training of agents +* Simulation of customisable deterministic agents +* Support for multiple agents, each having their own customisable observation space, action space, and reward function definition. + + +Structure +********* + +PrimAITE consists of a simulator and a 'game' layer that allows agents to interact with the simulator.
The simulator is modular: each component, such as a network host, link, networking device, or piece of software, is implemented as an instance of a base class, so all components support the same interface. This allows for standardised configuration using either the Python API or YAML files. +The game layer is built on top of the simulator and consumes the simulation action/state interface to allow agents to interact with the simulator. The game layer is also responsible for defining the reward function and observation space for the agents. + + +.. + Architecture - Nodes and Links + ****************************** + **Nodes** + An inheritance model has been adopted in order to model nodes. All nodes have the following base attributes (Class: Node): + * ID + * Name + * Type (e.g. computer, switch, RTU - enumeration) + * Priority (P1, P2, P3, P4 or P5 - enumeration) + * Hardware State (ON, OFF, RESETTING, SHUTTING_DOWN, BOOTING - enumeration) + Active Nodes also have the following attributes (Class: Active Node): + * IP Address + * Software State (GOOD, PATCHING, COMPROMISED - enumeration) + * File System State (GOOD, CORRUPT, DESTROYED, REPAIRING, RESTORING - enumeration) + Service Nodes also have the following attributes (Class: Service Node): + * List of Services (where service is composed of service name and port). There is no theoretical limit on the number of services that can be modelled. Services and protocols are currently intrinsically linked (i.e. a service is an application on a node transmitting traffic of this protocol type) + * Service state (GOOD, PATCHING, COMPROMISED, OVERWHELMED - enumeration) + Passive Nodes are currently not used (but may be employed for non IP-based components such as machinery actuators in future releases). + **Links** + Links are modelled both as network edges (networkx) and as Python classes, in order to extend their functionality.
Links include the following attributes: + * ID + * Name + * Bandwidth (bits/s) + * Source node ID + * Destination node ID + * Protocol list (containing the loading of protocols currently running on the link) + When the simulation runs, IERs are applied to the links in order to model traffic loading, individually assigned to each protocol. This allows green (background) and red agent behaviour to be modelled, and defensive agents to identify suspicious traffic patterns at a protocol / traffic loading level of fidelity. + Information Exchange Requirements (IERs) + **************************************** + PrimAITE adopts the concept of Information Exchange Requirements (IERs) to model both green agent (background) and red agent (adversary) behaviour. IERs are used to initiate modelling of traffic loading on the network, and have the following attributes: + * ID + * Start step (i.e. which step in the training episode should the IER start) + * End step (i.e. which step in the training episode should the IER end) + * Source node ID + * Destination node ID + * Load (bits/s) + * Protocol + * Port + * Running status (i.e. on / off) + The application of green agent IERs between a source and destination follows a number of rules. Specifically: + 1. Does the current simulation time step fall between IER start and end step + 2. Is the source node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING) + 3. Is the destination node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING) + 4. Are there any Access Control List rules in place that prevent the application of this IER + 5. 
Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level) + For red agent IERs, the application of IERs between a source and destination follows a number of subtly different rules. Specifically: + 1. Does the current simulation time step fall between IER start and end step + 2. Is the source node operational, and is the service (protocol / port) associated with the IER (a) present on that node and (b) already in a compromised state + 3. Is the destination node operational, and is the service (protocol / port) associated with the IER present on that node + 4. Are there any Access Control List rules in place that prevent the application of this IER + 5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level) + Assuming the rules pass, the IER is applied to all relevant links (based on use of OSPF) between source and destination. + Node Pattern-of-Life + ******************** + Every node can be impacted (i.e. have a status change applied to it) by either green agent pattern-of-life or red agent pattern-of-life. This is distinct from IERs, and allows for attacks (and defence) to be modelled purely within the confines of a node. 
+ The status changes that can be made to a node are as follows: + * All Nodes: + * Hardware State: + * ON + * OFF + * RESETTING - when a status of resetting is entered, the node will automatically exit this state after a number of steps (as defined by the nodeResetDuration configuration item) after which it returns to an ON state + * BOOTING + * SHUTTING_DOWN + * Active Nodes and Service Nodes: + * Software State: + * GOOD + * PATCHING - when a status of patching is entered, the node will automatically exit this state after a number of steps (as defined by the osPatchingDuration configuration item) after which it returns to a GOOD state + * COMPROMISED + * File System State: + * GOOD + * CORRUPT (can be resolved by repair or restore) + * DESTROYED (can be resolved by restore only) + * REPAIRING - when a status of repairing is entered, the node will automatically exit this state after a number of steps (as defined by the fileSystemRepairingLimit configuration item) after which it returns to a GOOD state + * RESTORING - when a status of restoring is entered, the node will automatically exit this state after a number of steps (as defined by the fileSystemRestoringLimit configuration item) after which it returns to a GOOD state + * Service Nodes only: + * Service State (for any associated service): + * GOOD + * PATCHING - when a status of patching is entered, the service will automatically exit this state after a number of steps (as defined by the servicePatchingDuration configuration item) after which it returns to a GOOD state + * COMPROMISED + * OVERWHELMED + Red agent pattern-of-life has an additional feature not found in the green pattern-of-life. This is the ability to influence the state of the attributes of a node via a number of different conditions: + * DIRECT: + The pattern-of-life described by the configuration file item will be applied regardless of any other conditions in the network.
This is particularly useful for direct red agent entry into the network. + * IER: + The pattern-of-life described by the configuration file item will be applied to the service on the node, only if there is an IER of the same protocol / service type incoming at the specified timestep. + * SERVICE: + The pattern-of-life described by the configuration file item will be applied to the node based on the state of a service. The service can either be on the same node, or a different node within the network. + Access Control List modelling + ***************************** + An Access Control List (ACL) is modelled to provide the means to manage traffic flows in the system. This will allow defensive agents the means to turn on / off rules, or potentially create new rules, to counter an attack. + The ACL follows a standard network firewall format. For example: + .. list-table:: ACL example + :widths: 25 25 25 25 25 + :header-rows: 1 + * - Permission + - Source IP + - Dest IP + - Protocol + - Port + * - DENY + - 192.168.1.2 + - 192.168.1.3 + - HTTPS + - 443 + * - ALLOW + - 192.168.1.4 + - ANY + - SMTP + - 25 + * - DENY + - ANY + - 192.168.1.5 + - ANY + - ANY + All ACL rules are considered when applying an IER. Logic follows the order of rules, so a DENY or ALLOW for the same parameters will override an earlier entry. + Observation Spaces + ****************** + The observation space provides the blue agent with information about the current status of nodes and links. + PrimAITE builds on top of Gym Spaces to create an observation space that is easily configurable for users. It's made up of components which are managed by the :py:class:`primaite.environment.observations.ObservationsHandler`. Each training scenario can define its own observation space, and the user can choose which information to include, and how it should be formatted.
+ NodeLinkTable component + ----------------------- + For example, the :py:class:`primaite.environment.observations.NodeLinkTable` component represents the status of nodes and links as a ``gym.spaces.Box`` with an example format shown below: + An example observation space is provided below: + .. list-table:: Observation Space example + :widths: 25 25 25 25 25 25 25 + :header-rows: 1 + * - + - ID + - Hardware State + - Software State + - File System State + - Service / Protocol A + - Service / Protocol B + * - Node A + - 1 + - 1 + - 1 + - 1 + - 1 + - 1 + * - Node B + - 2 + - 1 + - 3 + - 1 + - 1 + - 1 + * - Node C + - 3 + - 2 + - 1 + - 1 + - 3 + - 2 + * - Link 1 + - 5 + - 0 + - 0 + - 0 + - 0 + - 10000 + * - Link 2 + - 6 + - 0 + - 0 + - 0 + - 0 + - 10000 + * - Link 3 + - 7 + - 0 + - 0 + - 0 + - 5000 + - 0 + For the nodes, the following values are represented: + .. code-block:: + [ + ID + Hardware State (1=ON, 2=OFF, 3=RESETTING, 4=SHUTTING_DOWN, 5=BOOTING) + Operating System State (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) + File System State (0=none, 1=GOOD, 2=CORRUPT, 3=DESTROYED, 4=REPAIRING, 5=RESTORING) + Service1/Protocol1 state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) + Service2/Protocol2 state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) + ] + (Note that each service available in the network is provided as a column, although not all nodes may utilise all services) + For the links, the following statuses are represented: + .. code-block:: + [ + ID + Hardware State (0=not applicable) + Operating System State (0=not applicable) + File System State (0=not applicable) + Service1/Protocol1 state (Traffic load from this protocol on this link) + Service2/Protocol2 state (Traffic load from this protocol on this link) + ] + NodeStatus component + ---------------------- + This is a MultiDiscrete observation space that can be thought of as a one-dimensional vector of discrete states. + The example above would have the following structure: + ..
code-block:: + [ + node1_info + node2_info + node3_info + ] + Each ``node_info`` contains the following: + .. code-block:: + [ + hardware_state (0=none, 1=ON, 2=OFF, 3=RESETTING, 4=SHUTTING_DOWN, 5=BOOTING) + software_state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) + file_system_state (0=none, 1=GOOD, 2=CORRUPT, 3=DESTROYED, 4=REPAIRING, 5=RESTORING) + service1_state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) + service2_state (0=none, 1=GOOD, 2=PATCHING, 3=COMPROMISED) + ] + In a network with three nodes and two services, the full observation space would have 15 elements. It can be written with ``gym`` notation to indicate the number of discrete options for each of the elements of the observation space. For example: + .. code-block:: + gym.spaces.MultiDiscrete([4,5,6,4,4,4,5,6,4,4,4,5,6,4,4]) + .. note:: + NodeStatus observation component provides information only about nodes. Links are not considered. + LinkTrafficLevels + ----------------- + This component is a MultiDiscrete space showing the traffic flow levels on the links in the network, after applying a threshold to convert it from a continuous to a discrete value. + There are two configurable parameters: + * ``quantisation_levels`` determines how many discrete bins to use for converting the continuous traffic value to discrete (default is 5). + * ``combine_service_traffic`` determines whether to separately output traffic use for each network protocol or whether to combine them into an overall value for the link. (default is ``True``) + For example, with default parameters and a network with three links, the structure of this component would be: + .. code-block:: + [ + link1_status + link2_status + link3_status + ] + Each ``link_status`` is a number from 0-4 representing the network load in relation to bandwidth. + .. 
code-block:: + 0 = No traffic (0%) + 1 = low traffic (1%-33%) + 2 = medium traffic (33%-66%) + 3 = high traffic (66%-99%) + 4 = max traffic / overwhelmed (100%) + Using ``gym`` notation, the shape of the observation space is: ``gym.spaces.MultiDiscrete([5,5,5])``. + Action Spaces + ************** + The action space available to the blue agent comes in three types: + 1. Node-based + 2. Access Control List + 3. Any (Agent can take both node-based and ACL-based actions) + The choice of action space used during a training session is determined in the config_[name].yaml file. + **Node-Based** + The agent is able to influence the status of nodes by switching them off, resetting, or patching operating systems and services. In this instance, the action space is an OpenAI Gym spaces.Discrete type, as follows: + * Dictionary item {... ,1: [x1, x2, x3, x4] ...} + The placeholders inside the list under the key '1' mean the following: + * [0, num nodes] - Node ID (0 = nothing, node ID) + * [0, 4] - What property it's acting on (0 = nothing, 1 = state, 2 = SoftwareState, 3 = service state, 4 = file system state) + * [0, 3] - Action on property (0 = nothing, 1 = on / scan, 2 = off / repair, 3 = reset / patch / restore) + * [0, num services] - Resolves to service ID (0 = nothing, resolves to service) + **Access Control List** + The blue agent is able to influence the configuration of the Access Control List rule set (which implements a system-wide firewall). In this instance, the action space is an OpenAI Gym spaces.Discrete type, as follows: + * Dictionary item {... 
,1: [x1, x2, x3, x4, x5, x6] ...} + The placeholders inside the list under the key '1' mean the following: + * [0, 2] - Action (0 = do nothing, 1 = create rule, 2 = delete rule) + * [0, 1] - Permission (0 = DENY, 1 = ALLOW) + * [0, num nodes] - Source IP (0 = any, then 1 -> x resolving to IP addresses) + * [0, num nodes] - Dest IP (0 = any, then 1 -> x resolving to IP addresses) + * [0, num services] - Protocol (0 = any, then 1 -> x resolving to protocol) + * [0, num ports] - Port (0 = any, then 1 -> x resolving to port) + **ANY** + The agent is able to carry out both **Node-Based** and **Access Control List** operations. + This means the dictionary will contain key-value pairs in the format of BOTH Node-Based and Access Control List as seen above. + Rewards + ******* + A reward value is presented back to the blue agent on the conclusion of every step. The reward value is calculated via two methods which combine to give the total value: + 1. Node and service status + 2. IER status + **Node and service status** + On every step, the status of each node is compared against both a reference environment (simulating the situation if the red and blue agents had not impacted the environment) + and the before and after state of the environment. If the comparison against the reference environment shows no difference, then the score provided is "AllOK". If there is a + difference with respect to the reference environment, the before and after states are compared, and a score determined. See :ref:`config` for details of reward values. + **IER status** + On every step, the full IER set is examined to determine whether green and red agent IERs are being permitted to run. Any red agent IERs running incur a penalty; any green agent + IERs not permitted to run also incur a penalty. See :ref:`config` for details of reward values. 
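The IER status rule described above can be illustrated with a minimal sketch. This is not PrimAITE's actual reward code: the ``Ier`` class and the ``ier_status_reward`` function are hypothetical names introduced here for illustration, and the penalty values are placeholders rather than PrimAITE defaults. Only the ``red_ier_running`` and ``green_ier_blocked`` key names are taken from the reward section of the training config.

```python
from dataclasses import dataclass


@dataclass
class Ier:
    """Hypothetical stand-in for an Information Exchange Requirement."""

    colour: str    # "green" or "red"
    running: bool  # True if the IER was permitted to run on this step


def ier_status_reward(iers, red_ier_running=-5.0, green_ier_blocked=-10.0):
    """Sum the per-step IER penalties (illustrative values, not PrimAITE defaults)."""
    reward = 0.0
    for ier in iers:
        if ier.colour == "red" and ier.running:
            reward += red_ier_running    # a red IER was permitted to run
        elif ier.colour == "green" and not ier.running:
            reward += green_ier_blocked  # a green IER was prevented from running
    return reward


# One step with four IERs: one blocked green IER and one running red IER are penalised.
step_iers = [Ier("green", True), Ier("green", False), Ier("red", True), Ier("red", False)]
print(ier_status_reward(step_iers))
```

Under these assumptions, green IERs that run and red IERs that are blocked contribute nothing, so the IER part of the reward is zero only when all green traffic flows and all red traffic is stopped. The node and service status score would be added to this value to give the total reward for the step.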
+ Future Enhancements + ******************* + The PrimAITE project has an ambition to include the following enhancements in future releases: + * Integration with a suitable standardised framework to allow multi-agent integration + * Integration with external threat emulation tools, either using off-line data, or integrating at runtime diff --git a/docs/source/action_system.rst b/docs/source/action_system.rst deleted file mode 100644 index 88baf232..00000000 --- a/docs/source/action_system.rst +++ /dev/null @@ -1,88 +0,0 @@ -.. only:: comment - - © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK - -Actions System -============== - -``SimComponent``s in the simulation are decoupled from the agent training logic. However, they still need a managed means of accepting requests to perform actions. For this, they use ``RequestManager`` and ``Action``. - -Just like other aspects of SimComponent, the actions are not managed centrally for the whole simulation, but instead they are dynamically created and updated based on the nodes, links, and other components that currently exist. This was achieved with the following design decisions: - -- API - An 'action' contains two elements: - - 1. ``request`` - selects which action you want to take on this ``SimComponent``. This is formatted as a list of strings such as `['network', 'node', '', 'service', '', 'restart']`. - 2. ``context`` - optional extra information that can be used to decide how to process the action. This is formatted as a dictionary. For example, if the action requires authentication, the context can include information about the user that initiated the request to decide if their permissions are sufficient. - -- request - The request is a list of strings which help specify who should handle the request. The strings in the request list help RequestManagers traverse the 'ownership tree' of SimComponent. The example given above would be handled in the following way: - - 1. 
``Simulation`` receives `['network', 'node', '', 'service', '', 'restart']`. - The first element of the action is ``network``, therefore it passes the action down to its network. - 2. ``Network`` receives `['node', '', 'service', '', 'restart']`. - The first element of the action is ``node``, therefore the network looks at the node uuid and passes the action down to the node with that uuid. - 3. ``Node`` receives `['service', '', 'restart']`. - The first element of the action is ``service``, therefore the node looks at the service uuid and passes the rest of the action to the service with that uuid. - 4. ``Service`` receives ``['restart']``. - Since ``restart`` is a defined action in the service's own RequestManager, the service performs a restart. - -Technical Detail -================ - -This system was achieved by implementing two classes, :py:class:`primaite.simulator.core.Action`, and :py:class:`primaite.simulator.core.RequestManager`. - -Action ------- - -The ``Action`` object stores a reference to a method that performs the action, for example a node could have an action that stores a reference to ``self.turn_on()``. Technically, this can be any callable that accepts `request, context` as it's parameters. In practice, this is often defined using ``lambda`` functions within a component's ``self._init_request_manager()`` method. Optionally, the ``Action`` object can also hold a validator that will permit/deny the action depending on context. - -RequestManager -------------- - -The ``RequestManager`` object stores a mapping between strings and actions. It is responsible for processing the ``request`` and passing it down the ownership tree. Technically, the ``RequestManager`` is itself a callable that accepts `request, context` tuple, and so it can be chained with other action managers. - -A simple example without chaining can be seen in the :py:class:`primaite.simulator.file_system.file_system.File` class. - -.. 
code-block:: python - - class File(FileSystemItemABC): - ... - def _init_request_manager(self): - ... - request_manager.add_request("scan", Action(func=lambda request, context: self.scan())) - request_manager.add_request("repair", Action(func=lambda request, context: self.repair())) - request_manager.add_request("restore", Action(func=lambda request, context: self.restore())) - -*ellipses (``...``) used to omit code impertinent to this explanation* - -Chaining RequestManagers ------------------------ - -Since the method for performing an action needs to accept `request, context` as parameters, and RequestManager itself is a callable that accepts `request, context` as parameters, it possible to use RequestManager as an action. In fact, that is how PrimAITE deals with traversing the ownership tree. Each time an RequestManager accepts a request, it pops the first elements and uses it to decide to which Action it should send the remaining request. However, the Action could have another RequestManager as it's function, therefore the request will be routed again. Each time the request is passed to a new action manager, the first element is popped. - -An example of how this works is in the :py:class:`primaite.simulator.network.hardware.base.Node` class. - -.. code-block:: python - - class Node(SimComponent): - ... - def _init_request_manager(self): - ... - # a regular action which is processed by the Node itself - request_manager.add_request("turn_on", Action(func=lambda request, context: self.turn_on())) - - # if the Node receives a request where the first word is 'service', it will use a dummy manager - # called self._service_request_manager to pass on the reqeust to the relevant service. This dummy - # manager is simply here to map the service UUID that that service's own action manager. 
This is - # done because the next string after "service" is always the uuid of that service, so we need an - # RequestManager to pop that string before sending it onto the relevant service's RequestManager. - self._service_request_manager = RequestManager() - request_manager.add_request("service", Action(func=self._service_request_manager)) - ... - - def install_service(self, service): - self.services[service.uuid] = service - ... - # Here, the service UUID is registered to allow passing actions between the node and the service. - self._service_request_manager.add_request(service.uuid, Action(func=service._request_manager)) diff --git a/docs/source/config(v2).rst b/docs/source/config(v2).rst deleted file mode 100644 index 35233cf5..00000000 --- a/docs/source/config(v2).rst +++ /dev/null @@ -1,491 +0,0 @@ -.. only:: comment - - © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK - -.. _config: - -The Config Files Explained -========================== - -Note: This file describes the config files used in legacy PrimAITE v2.0. This file will be removed soon. - -PrimAITE uses two configuration files for its operation: - -* **The Training Config** - - Used to define the top-level settings of the PrimAITE environment, the reward values, and the session that is to be run. - -* **The Lay Down Config** - - Used to define the low-level settings of a session, including the network laydown, green / red agent information exchange requirements (IERSs) and Access Control Rules. - -Training Config: -******************* - -The Training Config file consists of the following attributes: - -**Generic Config Values** - - -* **agent_framework** [enum] - - This identifies the agent framework to be used to instantiate the agent algorithm. Select from one of the following: - - * NONE - Where a user developed agent is to be used - * SB3 - Stable Baselines3 - * RLLIB - Ray RLlib. - -* **agent_identifier** - - This identifies the agent to use for the session. 
Select from one of the following: - - * A2C - Advantage Actor Critic - * PPO - Proximal Policy Optimization - * HARDCODED - A custom built deterministic agent - * RANDOM - A Stochastic random agent - - -* **random_red_agent** [bool] - - Determines if the session should be run with a random red agent - -* **action_type** [enum] - - Determines whether a NODE, ACL, or ANY (combined NODE & ACL) action space format is adopted for the session - - -* **OBSERVATION_SPACE** [dict] - - Allows for user to configure observation space by combining one or more observation components. List of available - components is in :py:mod:`primaite.environment.observations`. - - The observation space config item should have a ``components`` key which is a list of components. Each component - config must have a ``name`` key, and can optionally have an ``options`` key. The ``options`` are passed to the - component while it is being initialised. - - This example illustrates the correct format for the observation space config item - - .. code-block:: yaml - - observation_space: - components: - - name: NODE_LINK_TABLE - - name: NODE_STATUSES - - name: LINK_TRAFFIC_LEVELS - - name: ACCESS_CONTROL_LIST - options: - combine_service_traffic : False - quantisation_levels: 99 - - - Currently available components are: - - * :py:mod:`NODE_LINK_TABLE` this does not accept any additional options - * :py:mod:`NODE_STATUSES`, this does not accept any additional options - * :py:mod:`ACCESS_CONTROL_LIST`, this does not accept additional options - * :py:mod:`LINK_TRAFFIC_LEVELS`, this accepts the following options: - - * ``combine_service_traffic`` - whether to consider bandwidth use separately for each network protocol or combine them into a single bandwidth reading (boolean) - * ``quantisation_levels`` - how many discrete bandwidth usage levels to use for encoding. This can be an integer equal to or greater than 3. - - The other configurable item is ``flatten`` which is false by default. 
When set to true, the observation space is flattened (turned into a 1-D vector). You should use this if your RL agent does not natively support observation space types like ``gym.Spaces.Tuple``. - -* **num_train_episodes** [int] - - This defines the number of episodes that the agent will train for. - - -* **num_train_steps** [int] - - Determines the number of steps to run in each episode of the training session. - - -* **num_eval_episodes** [int] - - This defines the number of episodes that the agent will be evaluated over. - - -* **num_eval_steps** [int] - - Determines the number of steps to run in each episode of the evaluation session. - - -* **time_delay** [int] - - The time delay (in milliseconds) to take between each step when running a GENERIC agent session - - -* **session_type** [text] - - Type of session to be run (TRAINING, EVALUATION, or BOTH) - -* **load_agent** [bool] - - Determine whether to load an agent from file - -* **agent_load_file** [text] - - File path and file name of agent if you're loading one in - -* **observation_space_high_value** [int] - - The high value to use for values in the observation space. This is set to 1000000000 by default, and should not need changing in most cases - -* **implicit_acl_rule** [str] - - Determines which Explicit rule the ACL list has - two options are: DENY or ALLOW. - -* **max_number_acl_rules** [int] - - Sets a limit on how many ACL rules there can be in the ACL list throughout the training session. - -**Reward-Based Config Values** - -Rewards are calculated based on the difference between the current state and reference state (the 'should be' state) of the environment. - -* **Generic [all_ok]** [float] - - The score to give when the current situation (for a given component) is no different from that expected in the baseline (i.e. 
as though no blue or red agent actions had been undertaken) - -* **Node Hardware State [off_should_be_on]** [float] - - The score to give when the node should be on, but is off - -* **Node Hardware State [off_should_be_resetting]** [float] - - The score to give when the node should be resetting, but is off - -* **Node Hardware State [on_should_be_off]** [float] - - The score to give when the node should be off, but is on - -* **Node Hardware State [on_should_be_resetting]** [float] - - The score to give when the node should be resetting, but is on - -* **Node Hardware State [resetting_should_be_on]** [float] - - The score to give when the node should be on, but is resetting - -* **Node Hardware State [resetting_should_be_off]** [float] - - The score to give when the node should be off, but is resetting - -* **Node Hardware State [resetting]** [float] - - The score to give when the node is resetting - -* **Node Operating System or Service State [good_should_be_patching]** [float] - - The score to give when the state should be patching, but is good - -* **Node Operating System or Service State [good_should_be_compromised]** [float] - - The score to give when the state should be compromised, but is good - -* **Node Operating System or Service State [good_should_be_overwhelmed]** [float] - - The score to give when the state should be overwhelmed, but is good - -* **Node Operating System or Service State [patching_should_be_good]** [float] - - The score to give when the state should be good, but is patching - -* **Node Operating System or Service State [patching_should_be_compromised]** [float] - - The score to give when the state should be compromised, but is patching - -* **Node Operating System or Service State [patching_should_be_overwhelmed]** [float] - - The score to give when the state should be overwhelmed, but is patching - -* **Node Operating System or Service State [patching]** [float] - - The score to give when the state is patching - -* **Node Operating 
System or Service State [compromised_should_be_good]** [float] - - The score to give when the state should be good, but is compromised - -* **Node Operating System or Service State [compromised_should_be_patching]** [float] - - The score to give when the state should be patching, but is compromised - -* **Node Operating System or Service State [compromised_should_be_overwhelmed]** [float] - - The score to give when the state should be overwhelmed, but is compromised - -* **Node Operating System or Service State [compromised]** [float] - - The score to give when the state is compromised - -* **Node Operating System or Service State [overwhelmed_should_be_good]** [float] - - The score to give when the state should be good, but is overwhelmed - -* **Node Operating System or Service State [overwhelmed_should_be_patching]** [float] - - The score to give when the state should be patching, but is overwhelmed - -* **Node Operating System or Service State [overwhelmed_should_be_compromised]** [float] - - The score to give when the state should be compromised, but is overwhelmed - -* **Node Operating System or Service State [overwhelmed]** [float] - - The score to give when the state is overwhelmed - -* **Node File System State [good_should_be_repairing]** [float] - - The score to give when the state should be repairing, but is good - -* **Node File System State [good_should_be_restoring]** [float] - - The score to give when the state should be restoring, but is good - -* **Node File System State [good_should_be_corrupt]** [float] - - The score to give when the state should be corrupt, but is good - -* **Node File System State [good_should_be_destroyed]** [float] - - The score to give when the state should be destroyed, but is good - -* **Node File System State [repairing_should_be_good]** [float] - - The score to give when the state should be good, but is repairing - -* **Node File System State [repairing_should_be_restoring]** [float] - - The score to give when the state 
should be restoring, but is repairing - -* **Node File System State [repairing_should_be_corrupt]** [float] - - The score to give when the state should be corrupt, but is repairing - -* **Node File System State [repairing_should_be_destroyed]** [float] - - The score to give when the state should be destroyed, but is repairing - -* **Node File System State [repairing]** [float] - - The score to give when the state is repairing - -* **Node File System State [restoring_should_be_good]** [float] - - The score to give when the state should be good, but is restoring - -* **Node File System State [restoring_should_be_repairing]** [float] - - The score to give when the state should be repairing, but is restoring - -* **Node File System State [restoring_should_be_corrupt]** [float] - - The score to give when the state should be corrupt, but is restoring - -* **Node File System State [restoring_should_be_destroyed]** [float] - - The score to give when the state should be destroyed, but is restoring - -* **Node File System State [restoring]** [float] - - The score to give when the state is restoring - -* **Node File System State [corrupt_should_be_good]** [float] - - The score to give when the state should be good, but is corrupt - -* **Node File System State [corrupt_should_be_repairing]** [float] - - The score to give when the state should be repairing, but is corrupt - -* **Node File System State [corrupt_should_be_restoring]** [float] - - The score to give when the state should be restoring, but is corrupt - -* **Node File System State [corrupt_should_be_destroyed]** [float] - - The score to give when the state should be destroyed, but is corrupt - -* **Node File System State [corrupt]** [float] - - The score to give when the state is corrupt - -* **Node File System State [destroyed_should_be_good]** [float] - - The score to give when the state should be good, but is destroyed - -* **Node File System State [destroyed_should_be_repairing]** [float] - - The score to give 
when the state should be repairing, but is destroyed - -* **Node File System State [destroyed_should_be_restoring]** [float] - - The score to give when the state should be restoring, but is destroyed - -* **Node File System State [destroyed_should_be_corrupt]** [float] - - The score to give when the state should be corrupt, but is destroyed - -* **Node File System State [destroyed]** [float] - - The score to give when the state is destroyed - -* **Node File System State [scanning]** [float] - - The score to give when the state is scanning - -* **IER Status [red_ier_running]** [float] - - The score to give when a red agent IER is permitted to run - -* **IER Status [green_ier_blocked]** [float] - - The score to give when a green agent IER is prevented from running - -**Patching / Reset Durations** - -* **os_patching_duration** [int] - - The number of steps to take when patching an Operating System - -* **node_reset_duration** [int] - - The number of steps to take when resetting a node's hardware state - -* **service_patching_duration** [int] - - The number of steps to take when patching a service - -* **file_system_repairing_limit** [int]: - - The number of steps to take when repairing the file system - -* **file_system_restoring_limit** [int] - - The number of steps to take when restoring the file system - -* **file_system_scanning_limit** [int] - - The number of steps to take when scanning the file system - -* **deterministic** [bool] - - Set to true if the agent evaluation should be deterministic. Default is ``False`` - -* **seed** [int] - - Seed used in the randomisation in agent training. 
Default is ``None`` - -The Lay Down Config -******************* - -The lay down config file consists of the following attributes: - - -* **itemType: STEPS** [int] - -* **item_type: PORTS** [int] - - Provides a list of ports modelled in this session - -* **item_type: SERVICES** [freetext] - - Provides a list of services modelled in this session - -* **item_type: NODE** - - Defines a node included in the system laydown being simulated. It should consist of the following attributes: - - * **id** [int]: Unique ID for this YAML item - * **name** [freetext]: Human-readable name of the component - * **node_class** [enum]: Relates to the base type of the node. Can be SERVICE, ACTIVE or PASSIVE. PASSIVE nodes do not have an operating system or services. ACTIVE nodes have an operating system, but no services. SERVICE nodes have both an operating system and one or more services - * **node_type** [enum]: Relates to the component type. Can be one of CCTV, SWITCH, COMPUTER, LINK, MONITOR, PRINTER, LOP, RTU, ACTUATOR or SERVER - * **priority** [enum]: Provides a priority for each node. Can be one of P1, P2, P3, P4 or P5 (which P1 being the highest) - * **hardware_state** [enum]: The initial hardware state of the node. Can be one of ON, OFF or RESETTING - * **ip_address** [IP address]: The IP address of the component in format xxx.xxx.xxx.xxx - * **software_state** [enum]: The intial state of the node operating system. Can be GOOD, PATCHING or COMPROMISED - * **file_system_state** [enum]: The initial state of the node file system. Can be GOOD, CORRUPT, DESTROYED, REPAIRING or RESTORING - * **services**: For each service associated with the node: - - * **name** [freetext]: Free-text name of the service, but must match one of the services defined for the system in the services list - * **port** [int]: Integer value of the port related to this service, but must match one of the ports defined for the system in the ports list - * **state** [enum]: The initial state of the service. 
Can be one of GOOD, PATCHING, COMPROMISED or OVERWHELMED - -* **item_type: LINK** - - Defines a link included in the system laydown being simulated. It should consist of the following attributes: - - * **id** [int]: Unique ID for this YAML item - * **name** [freetext]: Human-readable name of the component - * **bandwidth** [int]: The bandwidth (in bits/s) of the link - * **source** [int]: The ID of the source node - * **destination** [int]: The ID of the destination node - -* **item_type: GREEN_IER** - - Defines a green agent Information Exchange Requirement (IER). It should consist of: - - * **id** [int]: Unique ID for this YAML item - * **start_step** [int]: The start step (in the episode) for this IER to begin - * **end_step** [int]: The end step (in the episode) for this IER to finish - * **load** [int]: The load (in bits/s) for this IER to apply to links - * **protocol** [freetext]: The protocol to apply to the links. This must match a value in the services list - * **port** [int]: The port that the protocol is running on. This must match a value in the ports list - * **source** [int]: The ID of the source node - * **destination** [int]: The ID of the destination node - * **mission_criticality** [enum]: The mission criticality of this IER (with 5 being highest, 1 lowest) - -* **item_type: RED_IER** - - Defines a red agent Information Exchange Requirement (IER). It should consist of: - - * **id** [int]: Unique ID for this YAML item - * **start_step** [int]: The start step (in the episode) for this IER to begin - * **end_step** [int]: The end step (in the episode) for this IER to finish - * **load** [int]: The load (in bits/s) for this IER to apply to links - * **protocol** [freetext]: The protocol to apply to the links. This must match a value in the services list - * **port** [int]: The port that the protocol is running on. 
This must match a value in the ports list - * **source** [int]: The ID of the source node - * **destination** [int]: The ID of the destination node - * **mission_criticality** [enum]: Not currently used. Default to 0 - -* **item_type: GREEN_POL** - - Defines a green agent pattern-of-life instruction. It should consist of: - - * **id** [int]: Unique ID for this YAML item - * **start_step** [int]: The start step (in the episode) for this PoL to begin - * **end_step** [int]: Not currently used. Default to same as start step - * **nodeId** [int]: The ID of the node to apply the PoL to - * **type** [enum]: The type of PoL to apply. Can be one of OPERATING, OS or SERVICE - * **protocol** [freetext]: The protocol to be affected if SERVICE type is chosen. Must match a value in the services list - * **state** [enuum]: The state to apply to the node (which represents the PoL change). Can be one of ON, OFF or RESETTING (for node state) or GOOD, PATCHING or COMPROMISED (for Software State) or GOOD, PATCHING, COMPROMISED or OVERWHELMED (for service state) - -* **item_type: RED_POL** - - Defines a red agent pattern-of-life instruction. It should consist of: - - * **id** [int]: Unique ID for this YAML item - * **start_step** [int]: The start step (in the episode) for this PoL to begin - * **end_step** [int]: Not currently used. Default to same as start step - * **targetNodeId** [int]: The ID of the node to apply the PoL to - * **initiator** [enum]: What initiates the PoL. Can be DIRECT, IER or SERVICE - * **type** [enum]: The type of PoL to apply. Can be one of OPERATING, OS or SERVICE - * **protocol** [freetext]: The protocol to be affected if SERVICE type is chosen. Must match a value in the services list - * **state** [enum]: The state to apply to the node (which represents the PoL change). 
Can be one of ON, OFF or RESETTING (for node state) or GOOD, PATCHING or COMPROMISED (for Software State) or GOOD, PATCHING, COMPROMISED or OVERWHELMED (for service state) or GOOD, CORRUPT, DESTROYED, REPAIRING or RESTORING (for file system state) - * **sourceNodeId** [int] The ID of the source node containing the service to check (used for SERVICE initiator) - * **sourceNodeService** [freetext]: The service on the source node to check (used for SERVICE initiator). Must match a value in the services list for this node - * **sourceNodeServiceState** [enum]: The state of the source node service to check (used for SERVICE initiator). Can be one of GOOD, PATCHING, COMPROMISED or OVERWHELMED - -* **item_type: ACL_RULE** - - Defines an initial Access Control List (ACL) rule. It should consist of: - - * **id** [int]: Unique ID for this YAML item - * **permission** [enum]: Defines either an allow or deny rule. Value must be either DENY or ALLOW - * **source** [IP address]: Defines the source IP address for the rule in xxx.xxx.xxx.xxx format - * **destination** [IP address]: Defines the destination IP address for the rule in xxx.xxx.xxx.xxx format - * **protocol** [freetext]: Defines the protocol for the rule. Must match a value in the services list - * **port** [int]: Defines the port for the rule. Must match a value in the ports list - * **position** [int]: Defines where to place the ACL rule in the list. Lower index or (higher up in the list) means they are checked first. Index starts at 0 (Python indexes). diff --git a/docs/source/custom_agent.rst b/docs/source/custom_agent.rst index 040b4b3d..0a08ae74 100644 --- a/docs/source/custom_agent.rst +++ b/docs/source/custom_agent.rst @@ -11,132 +11,4 @@ Integrating a user defined blue agent .. note:: - If you are planning to implement custom RL agents into PrimAITE, you must use the project as a repository. If you install PrimAITE as a python package from wheel, custom agents are not supported. 
- -PrimAITE has integration with Ray RLLib and StableBaselines3 agents. All agents interface with PrimAITE through an :py:class:`primaite.agents.agent_abc.AgentSessionABC` which provides Input/Output of agent savefiles, as well as capturing and plotting performance metrics during training and evaluation. If you wish to integrate a custom blue agent, it is recommended to create a subclass of the :py:class:`primaite.agents.agent_abc.AgentSessionABC` and implement the ``__init__()``, ``_setup()``, ``_save_checkpoint()``, ``learn()``, ``evaluate()``, ``_get_latest_checkpoint``, ``load()``, and ``save()`` methods. - -Below is a barebones example of a custom agent implementation: - -.. code:: python - - # src/primaite/agents/my_custom_agent.py - - from primaite.agents.agent_abc import AgentSessionABC - from primaite.common.enums import AgentFramework, AgentIdentifier - - class CustomAgent(AgentSessionABC): - def __init__(self, training_config_path, lay_down_config_path): - super().__init__(training_config_path, lay_down_config_path) - assert self._training_config.agent_framework == AgentFramework.CUSTOM - assert self._training_config.agent_identifier == AgentIdentifier.MY_AGENT - self._setup() - - def _setup(self): - super()._setup() - self._env = Primaite( - training_config_path=self._training_config_path, - lay_down_config_path=self._lay_down_config_path, - session_path=self.session_path, - timestamp_str=self.timestamp_str, - ) - self._agent = ... # your code to setup agent - - def _save_checkpoint(self): - checkpoint_num = self._training_config.checkpoint_every_n_episodes - episode_count = self._env.episode_count - save_checkpoint = False - if checkpoint_num: - save_checkpoint = episode_count % checkpoint_num == 0 - # saves checkpoint if the episode count is not 0 and save_checkpoint flag was set to true - if episode_count and save_checkpoint: - ... - # your code to save checkpoint goes here. 
- # The path should start with self.checkpoints_path and include the episode number. - - def learn(self): - ... - # call your agent's learning function here. - - super().learn() # this will finalise learning and output session metadata - self.save() - - def evaluate(self): - ... - # call your agent's evaluation function here. - - self._env.close() - super().evaluate() - - def _get_latest_checkpoint(self): - ... - # Load an agent from file. - - @classmethod - def load(cls, path): - ... - # Create a CustomAgent object which loads model weights from file. - - def save(self): - ... - # Call your agent's function that saves it to a file - - -You will also need to modify :py:class:`primaite.primaite_session.PrimaiteSession` and :py:mod:`primaite.common.enums` to capture your new agent identifiers. - -.. code-block:: python - :emphasize-lines: 17, 18 - - # src/primaite/common/enums.py - - class AgentIdentifier(Enum): - """The Red Agent algo/class.""" - A2C = 1 - "Advantage Actor Critic" - PPO = 2 - "Proximal Policy Optimization" - HARDCODED = 3 - "The Hardcoded agents" - DO_NOTHING = 4 - "The DoNothing agents" - RANDOM = 5 - "The RandomAgent" - DUMMY = 6 - "The DummyAgent" - CUSTOM_AGENT = 7 - "Your custom agent" - -.. code-block:: python - :emphasize-lines: 3, 11, 12 - - # src/primaite_session.py - - from primaite.agents.my_custom_agent import CustomAgent - - # ... 
- - def setup(self): - """Performs the session setup.""" - if self._training_config.agent_framework == AgentFramework.CUSTOM: - _LOGGER.debug(f"PrimaiteSession Setup: Agent Framework = {AgentFramework.CUSTOM}") - if self._training_config.agent_identifier == AgentIdentifier.CUSTOM_AGENT: - self._agent_session = CustomAgent(self._training_config_path, self._lay_down_config_path) - if self._training_config.agent_identifier == AgentIdentifier.HARDCODED: - _LOGGER.debug(f"PrimaiteSession Setup: Agent Identifier =" f" {AgentIdentifier.HARDCODED}") - if self._training_config.action_type == ActionType.NODE: - # Deterministic Hardcoded Agent with Node Action Space - self._agent_session = HardCodedNodeAgent(self._training_config_path, self._lay_down_config_path) - -Finally, specify your agent in your training config. - -.. code-block:: yaml - - # ~/primaite/2.0.0/config/path/to/your/config_main.yaml - - # Training Config File - - agent_framework: CUSTOM - agent_identifier: CUSTOM_AGENT - random_red_agent: False - # ... - -Now you can :ref:`run a primaite session` with your custom agent by passing in the custom ``config_main``. + PrimAITE uses ARCD GATE for agent integration. In order to use a custom agent with PrimAITE, you must integrate it with ARCD GATE. Please look at the ARCD GATE documentation for more information. diff --git a/docs/source/getting_started.rst b/docs/source/getting_started.rst index 0801c79e..aebabf66 100644 --- a/docs/source/getting_started.rst +++ b/docs/source/getting_started.rst @@ -11,7 +11,7 @@ Getting Started Pre-Requisites -In order to get **PrimAITE** installed, you will need to have a python version between 3.8 and 3.10 installed. If you don't already have it, this is how to install it: +In order to get **PrimAITE** installed, you will need to have a python version between 3.8 and 3.11 installed. If you don't already have it, this is how to install it: .. 
code-block:: bash @@ -33,39 +33,36 @@ In order to get **PrimAITE** installed, you will need to have a python version b Install PrimAITE **************** -1. Create a primaite directory in your home directory: - - +1. Create a directory for your PrimAITE project: .. code-block:: bash :caption: Unix - mkdir ~/primaite/2.0.0 + mkdir ~/primaite/3.0.0 .. code-block:: powershell :caption: Windows (Powershell) - mkdir ~\primaite\2.0.0 + mkdir ~\primaite\3.0.0 + 2. Navigate to the primaite directory and create a new python virtual environment (venv) - - .. code-block:: bash :caption: Unix - cd ~/primaite/2.0.0 + cd ~/primaite/3.0.0 python3 -m venv .venv .. code-block:: powershell :caption: Windows (Powershell) - cd ~\primaite\2.0.0 + cd ~\primaite\3.0.0 python3 -m venv .venv attrib +h .venv /s /d # Hides the .venv directory -3. Activate the venv +3. Activate the venv .. code-block:: bash :caption: Unix @@ -78,21 +75,34 @@ Install PrimAITE .\.venv\Scripts\activate -4. Install PrimAITE using pip from PyPi +4. Install PrimAITE from your saved wheel file + +.. code-block:: bash + :caption: Unix + + pip install path/to/your/primaite.whl + +.. code-block:: powershell + :caption: Windows (Powershell) + + pip install path\to\your\primaite.whl + + +5. Install ARCD GATE from wheel file .. code-block:: bash :caption: Unix - pip install primaite + pip install path/to/your/arcd_gate-0.1.0-py3-none-any.whl .. code-block:: powershell :caption: Windows (Powershell) - pip install primaite + pip install path\to\your\arcd_gate-0.1.0-py3-none-any.whl -5. Perform the PrimAITE setup +6. Perform the PrimAITE setup .. code-block:: bash :caption: Unix @@ -110,13 +120,14 @@ Clone & Install PrimAITE for Development To be able to extend PrimAITE further, or to build wheels manually before install, clone the repository to a location of your choice: +1. Clone the repository + .. 
code-block:: bash git clone https://github.com/Autonomous-Resilient-Cyber-Defence/PrimAITE cd primaite -Create and activate your Python virtual environment (venv) - +2. Create and activate your Python virtual environment (venv) .. code-block:: bash :caption: Unix @@ -130,8 +141,7 @@ Create and activate your Python virtual environment (venv) python3 -m venv venv .\venv\Scripts\activate -Install PrimAITE with the dev extra - +3. Install PrimAITE with the dev extra .. code-block:: bash :caption: Unix @@ -144,4 +154,16 @@ Install PrimAITE with the dev extra pip install -e .[dev] +4. Install ARCD GATE from wheel file + +.. code-block:: bash + :caption: Unix + + pip install GATE/arcd_gate-0.1.0-py3-none-any.whl + +.. code-block:: powershell + :caption: Windows (Powershell) + + pip install GATE\arcd_gate-0.1.0-py3-none-any.whl + To view the complete list of packages installed during PrimAITE installation, go to the dependencies page (:ref:`Dependencies`). diff --git a/docs/source/migration_1.2_-_2.0.rst b/docs/source/migration_1.2_-_2.0.rst deleted file mode 100644 index c38fcbe9..00000000 --- a/docs/source/migration_1.2_-_2.0.rst +++ /dev/null @@ -1,57 +0,0 @@ -.. only:: comment - - © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK - -v1.2 to v2.0 Migration guide -============================ - -**1. Installing PrimAITE** - - Like before, you can install primaite from the repository by running ``pip install -e .``. But, there is now an additional setup step which does several things, like setting up user directories, copy default configs and notebooks, etc. Once you have installed PrimAITE to your virtual environment, run this command to finalise setup. - - .. code-block:: bash - - primaite setup - -**2. Running a training session** - - In version 1.2 of PrimAITE, the main entry point for training or evaluating agents was the ``src/primaite/main.py`` file. 
v2.0.0 introduced managed 'sessions' which are responsible for reading configuration files, performing training, and writing outputs. - - ``main.py`` file still runs a training session but it now uses the new `PrimaiteSession`, and it now requires you to provide the path to your config files. - - .. code-block:: bash - - python src/primaite/main.py --tc path/to/training-config.yaml --ldc path/to/laydown-config.yaml - - Alternatively, the session can be invoked via the commandline by running: - - .. code-block:: bash - - primaite session --tc path/to/training-config.yaml --ldc path/to/laydown-config.yaml - -**3. Location of configs** - - In version 1.2, training configs and laydown configs were all stored in the project repository under ``src/primaite/config``. Version 2.0.0 introduced user data directories, and now when you install and setup PrimAITE, config files are stored in your user data location. On Linux/OSX, this is stored in ``~/primaite/2.0.0/config``. On Windows, this is stored in ``C:\Users\\primaite\configs``. Upon first setup, the configs folder is populated with some default yaml files. It is recommended that you store all your custom configuration files here. - -**4. Contents of configs** - - Some things that were previously part of the laydown config are now part of the traning config. - - * Actions - - If you have custom configs which use these, you will need to adapt them by moving the configuration from the laydown config to the training config. - - Also, there are new configurable items in the training config: - - * Observations - * Agent framework - * Agent - * Deep learning framework - * random red agents - * seed - * deterministic - * hard coded agent view - - Each of these items have default values which are designed so that PrimAITE has the same behaviour as it did in 1.2.0, so you do not have to specify them. - - ACL Rules in laydown configs have a new required parameter: ``position``. 
The lower the position, the higher up in the ACL table the rule will placed. If you have custom laydowns, you will need to go through them and add a position to each ACL_RULE. diff --git a/docs/source/primaite_session.rst b/docs/source/primaite_session.rst index 8ccc9070..472a361f 100644 --- a/docs/source/primaite_session.rst +++ b/docs/source/primaite_session.rst @@ -14,199 +14,200 @@ A PrimAITE session can be ran either with the ``primaite session`` command from (See :func:`primaite.cli.session`), or by calling :func:`primaite.main.run` from a Python terminal or Jupyter Notebook. Both the ``primaite session`` and :func:`primaite.main.run` take a training config and a lay down config as parameters. +.. note:: + 🚧 *UNDER CONSTRUCTION* 🚧 - - -.. code-block:: bash - :caption: Unix CLI - - cd ~/primaite/2.0.0 - source ./.venv/bin/activate - primaite session --tc ./config/my_training_config.yaml --ldc ./config/my_lay_down_config.yaml - -.. code-block:: powershell - :caption: Powershell CLI - - cd ~\primaite\2.0.0 - .\.venv\Scripts\activate - primaite session --tc .\config\my_training_config.yaml --ldc .\config\my_lay_down_config.yaml - - -.. code-block:: python - :caption: Python - - from primaite.main import run - - training_config = - lay_down_config = - run(training_config, lay_down_config) - -When a session is ran, a session output sub-directory is created in the users app sessions directory (``~/primaite/2.0.0/sessions``). -The sub-directory is formatted as such: ``~/primaite/2.0.0/sessions//_/`` - -For example, when running a session at 17:30:00 on 31st January 2023, the session will output to: -``~/primaite/2.0.0/sessions/2023-01-31/2023-01-31_17-30-00/``. - -``primaite session`` can be ran in the terminal/command prompt without arguments. It will use the default configs in the directory ``primaite/config/example_config``. - -To run a PrimAITE session using legacy training or laydown config files, add the ``--legacy-tc`` and/or ``legacy-ldc`` options. - - - -.. 
code-block:: bash - :caption: Unix CLI - - cd ~/primaite/2.0.0 - source ./.venv/bin/activate - primaite session --tc ./config/my_legacy_training_config.yaml --legacy-tc --ldc ./config/my_legacy_lay_down_config.yaml --legacy-ldc - -.. code-block:: powershell - :caption: Powershell CLI - - cd ~\primaite\2.0.0 - .\.venv\Scripts\activate - primaite session --tc .\config\my_legacy_training_config.yaml --legacy-tc --ldc .\config\my_legacy_lay_down_config.yaml --legacy-ldc - - -.. code-block:: python - :caption: Python - - from primaite.main import run - - training_config = - lay_down_config = - run(training_config, lay_down_config, legacy_training_config=True, legacy_lay_down_config=True) - - - - -Outputs -------- - -PrimAITE produces four types of outputs: - -* Session Metadata -* Results -* Diagrams -* Saved agents (training checkpoints and a final trained agent) - - -**Session Metadata** - -PrimAITE creates a ``session_metadata.json`` file that contains the following metadata: - - * **uuid** - The UUID assigned to the session upon instantiation. - * **start_datetime** - The date & time the session started in iso format. - * **end_datetime** - The date & time the session ended in iso format. - * **learning** - * **total_episodes** - The total number of training episodes completed. - * **total_time_steps** - The total number of training time steps completed. - * **evaluation** - * **total_episodes** - The total number of evaluation episodes completed. - * **total_time_steps** - The total number of evaluation time steps completed. - * **env** - * **training_config** - * **All training config items** - * **lay_down_config** - * **All lay down config items** - - -**Results** - -PrimAITE automatically creates two sets of results from each learning and evaluation session: - -* Average reward per episode - a csv file listing the average reward for each episode of the session. 
This provides, for example, an indication of the change over a training session of the reward value -* All transactions - a csv file listing the following values for every step of every episode: - - * Timestamp - * Episode number - * Step number - * Reward value - * Action taken (as presented by the blue agent on this step). Individual elements of the action space are presented in the format AS_X - * Initial observation space (what the blue agent observed when it decided its action) - -**Diagrams** - -* For each session, PrimAITE automatically creates a visualisation of the system / network lay down configuration. -* For each learning and evaluation task within the session, PrimAITE automatically plots the average reward per episode using PlotLY and saves it to the learning or evaluation subdirectory in the session directory. - -**Saved agents** - -For each training session, assuming the agent being trained implements the *save()* function and this function is called by the code, PrimAITE automatically saves the agent state. - -**Example Session Directory Structure** - -.. 
code-block:: text - - ~/ - └── primaite/ - └── 2.0.0/ - └── sessions/ - └── 2023-07-18/ - └── 2023-07-18_11-06-04/ - ├── evaluation/ - │ ├── all_transactions_2023-07-18_11-06-04.csv - │ ├── average_reward_per_episode_2023-07-18_11-06-04.csv - │ └── average_reward_per_episode_2023-07-18_11-06-04.png - ├── learning/ - │ ├── all_transactions_2023-07-18_11-06-04.csv - │ ├── average_reward_per_episode_2023-07-18_11-06-04.csv - │ ├── average_reward_per_episode_2023-07-18_11-06-04.png - │ ├── checkpoints/ - │ │ └── sb3ppo_10.zip - │ ├── SB3_PPO.zip - │ └── tensorboard_logs/ - │ ├── PPO_1/ - │ │ └── events.out.tfevents.1689674765.METD-9PMRFB3.42960.0 - │ ├── PPO_2/ - │ │ └── events.out.tfevents.1689674766.METD-9PMRFB3.42960.1 - │ ├── PPO_3/ - │ │ └── events.out.tfevents.1689674766.METD-9PMRFB3.42960.2 - │ ├── PPO_4/ - │ │ └── events.out.tfevents.1689674767.METD-9PMRFB3.42960.3 - │ ├── PPO_5/ - │ │ └── events.out.tfevents.1689674767.METD-9PMRFB3.42960.4 - │ ├── PPO_6/ - │ │ └── events.out.tfevents.1689674768.METD-9PMRFB3.42960.5 - │ ├── PPO_7/ - │ │ └── events.out.tfevents.1689674768.METD-9PMRFB3.42960.6 - │ ├── PPO_8/ - │ │ └── events.out.tfevents.1689674769.METD-9PMRFB3.42960.7 - │ ├── PPO_9/ - │ │ └── events.out.tfevents.1689674770.METD-9PMRFB3.42960.8 - │ └── PPO_10/ - │ └── events.out.tfevents.1689674770.METD-9PMRFB3.42960.9 - ├── network_2023-07-18_11-06-04.png - └── session_metadata.json - -Loading a session ------------------ - -A previous session can be loaded by providing the **directory** of the previous session to either the ``primaite session`` command from the cli -(See :func:`primaite.cli.session`), or by calling :func:`primaite.main.run` with session_path. - -.. tabs:: - - .. code-tab:: bash +.. + .. code-block:: bash :caption: Unix CLI cd ~/primaite/2.0.0 source ./.venv/bin/activate - primaite session --load "path/to/session" + primaite session --tc ./config/my_training_config.yaml --ldc ./config/my_lay_down_config.yaml - .. code-tab:: bash + .. 
code-block:: powershell :caption: Powershell CLI cd ~\primaite\2.0.0 .\.venv\Scripts\activate - primaite session --load "path\to\session" + primaite session --tc .\config\my_training_config.yaml --ldc .\config\my_lay_down_config.yaml - .. code-tab:: python + .. code-block:: python :caption: Python from primaite.main import run - run(session_path=) + training_config = + lay_down_config = + run(training_config, lay_down_config) -When PrimAITE runs a loaded session, PrimAITE will output in the provided session directory + When a session is ran, a session output sub-directory is created in the users app sessions directory (``~/primaite/2.0.0/sessions``). + The sub-directory is formatted as such: ``~/primaite/2.0.0/sessions//_/`` + + For example, when running a session at 17:30:00 on 31st January 2023, the session will output to: + ``~/primaite/2.0.0/sessions/2023-01-31/2023-01-31_17-30-00/``. + + ``primaite session`` can be ran in the terminal/command prompt without arguments. It will use the default configs in the directory ``primaite/config/example_config``. + + To run a PrimAITE session using legacy training or laydown config files, add the ``--legacy-tc`` and/or ``legacy-ldc`` options. + + + + .. code-block:: bash + :caption: Unix CLI + + cd ~/primaite/2.0.0 + source ./.venv/bin/activate + primaite session --tc ./config/my_legacy_training_config.yaml --legacy-tc --ldc ./config/my_legacy_lay_down_config.yaml --legacy-ldc + + .. code-block:: powershell + :caption: Powershell CLI + + cd ~\primaite\2.0.0 + .\.venv\Scripts\activate + primaite session --tc .\config\my_legacy_training_config.yaml --legacy-tc --ldc .\config\my_legacy_lay_down_config.yaml --legacy-ldc + + + .. 
code-block:: python + :caption: Python + + from primaite.main import run + + training_config = + lay_down_config = + run(training_config, lay_down_config, legacy_training_config=True, legacy_lay_down_config=True) + + + + + Outputs + ------- + + PrimAITE produces four types of outputs: + + * Session Metadata + * Results + * Diagrams + * Saved agents (training checkpoints and a final trained agent) + + + **Session Metadata** + + PrimAITE creates a ``session_metadata.json`` file that contains the following metadata: + + * **uuid** - The UUID assigned to the session upon instantiation. + * **start_datetime** - The date & time the session started in iso format. + * **end_datetime** - The date & time the session ended in iso format. + * **learning** + * **total_episodes** - The total number of training episodes completed. + * **total_time_steps** - The total number of training time steps completed. + * **evaluation** + * **total_episodes** - The total number of evaluation episodes completed. + * **total_time_steps** - The total number of evaluation time steps completed. + * **env** + * **training_config** + * **All training config items** + * **lay_down_config** + * **All lay down config items** + + + **Results** + + PrimAITE automatically creates two sets of results from each learning and evaluation session: + + * Average reward per episode - a csv file listing the average reward for each episode of the session. This provides, for example, an indication of the change over a training session of the reward value + * All transactions - a csv file listing the following values for every step of every episode: + + * Timestamp + * Episode number + * Step number + * Reward value + * Action taken (as presented by the blue agent on this step). 
Individual elements of the action space are presented in the format AS_X + * Initial observation space (what the blue agent observed when it decided its action) + + **Diagrams** + + * For each session, PrimAITE automatically creates a visualisation of the system / network lay down configuration. + * For each learning and evaluation task within the session, PrimAITE automatically plots the average reward per episode using PlotLY and saves it to the learning or evaluation subdirectory in the session directory. + + **Saved agents** + + For each training session, assuming the agent being trained implements the *save()* function and this function is called by the code, PrimAITE automatically saves the agent state. + + **Example Session Directory Structure** + + .. code-block:: text + + ~/ + └── primaite/ + └── 2.0.0/ + └── sessions/ + └── 2023-07-18/ + └── 2023-07-18_11-06-04/ + ├── evaluation/ + │ ├── all_transactions_2023-07-18_11-06-04.csv + │ ├── average_reward_per_episode_2023-07-18_11-06-04.csv + │ └── average_reward_per_episode_2023-07-18_11-06-04.png + ├── learning/ + │ ├── all_transactions_2023-07-18_11-06-04.csv + │ ├── average_reward_per_episode_2023-07-18_11-06-04.csv + │ ├── average_reward_per_episode_2023-07-18_11-06-04.png + │ ├── checkpoints/ + │ │ └── sb3ppo_10.zip + │ ├── SB3_PPO.zip + │ └── tensorboard_logs/ + │ ├── PPO_1/ + │ │ └── events.out.tfevents.1689674765.METD-9PMRFB3.42960.0 + │ ├── PPO_2/ + │ │ └── events.out.tfevents.1689674766.METD-9PMRFB3.42960.1 + │ ├── PPO_3/ + │ │ └── events.out.tfevents.1689674766.METD-9PMRFB3.42960.2 + │ ├── PPO_4/ + │ │ └── events.out.tfevents.1689674767.METD-9PMRFB3.42960.3 + │ ├── PPO_5/ + │ │ └── events.out.tfevents.1689674767.METD-9PMRFB3.42960.4 + │ ├── PPO_6/ + │ │ └── events.out.tfevents.1689674768.METD-9PMRFB3.42960.5 + │ ├── PPO_7/ + │ │ └── events.out.tfevents.1689674768.METD-9PMRFB3.42960.6 + │ ├── PPO_8/ + │ │ └── events.out.tfevents.1689674769.METD-9PMRFB3.42960.7 + │ ├── PPO_9/ + │ │ └── 
events.out.tfevents.1689674770.METD-9PMRFB3.42960.8 + │ └── PPO_10/ + │ └── events.out.tfevents.1689674770.METD-9PMRFB3.42960.9 + ├── network_2023-07-18_11-06-04.png + └── session_metadata.json + + Loading a session + ----------------- + + A previous session can be loaded by providing the **directory** of the previous session to either the ``primaite session`` command from the cli + (See :func:`primaite.cli.session`), or by calling :func:`primaite.main.run` with session_path. + + .. tabs:: + + .. code-tab:: bash + :caption: Unix CLI + + cd ~/primaite/2.0.0 + source ./.venv/bin/activate + primaite session --load "path/to/session" + + .. code-tab:: bash + :caption: Powershell CLI + + cd ~\primaite\2.0.0 + .\.venv\Scripts\activate + primaite session --load "path\to\session" + + + .. code-tab:: python + :caption: Python + + from primaite.main import run + + run(session_path=) + + When PrimAITE runs a loaded session, PrimAITE will output in the provided session directory diff --git a/docs/source/request_system.rst b/docs/source/request_system.rst new file mode 100644 index 00000000..41d8eec4 --- /dev/null +++ b/docs/source/request_system.rst @@ -0,0 +1,90 @@ +.. only:: comment + + © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK + +Request System +============== + +``SimComponent`` objects in the simulation are decoupled from the agent training logic. However, they still need a managed means of accepting requests to perform actions. For this, they use ``RequestManager`` and ``RequestType``. + +Just like other aspects of ``SimComponent``, the request types are not managed centrally for the whole simulation. Instead, they are created and updated dynamically based on the nodes, links, and other components that currently exist. This is achieved in the following way: + +- API + A ``RequestType`` contains two elements: + + 1. ``request`` - selects which action you want to take on this ``SimComponent``.
This is formatted as a list of strings such as ``['network', 'node', '', 'service', '', 'restart']``. + 2. ``context`` - optional extra information that can be used to decide how to process the request. This is formatted as a dictionary. For example, if the request requires authentication, the context can include information about the user that initiated the request to decide if their permissions are sufficient. + +- request + The request is a list of strings that specifies which component should handle the request. The strings in the request list help ``RequestManager`` objects traverse the 'ownership tree' of ``SimComponent`` objects. The example given above would be handled in the following way: + + 1. ``Simulation`` receives ``['network', 'node', '', 'service', '', 'restart']``. + The first element of the request is ``network``, therefore it passes the request down to its network. + 2. ``Network`` receives ``['node', '', 'service', '', 'restart']``. + The first element of the request is ``node``, therefore the network looks at the node uuid and passes the request down to the node with that uuid. + 3. ``Node`` receives ``['service', '', 'restart']``. + The first element of the request is ``service``, therefore the node looks at the service uuid and passes the rest of the request to the service with that uuid. + 4. ``Service`` receives ``['restart']``. + Since ``restart`` is a defined request type in the service's own ``RequestManager``, the service performs a restart. + +Technical Detail +================ + +This system is implemented by two classes, :py:class:`primaite.simulator.core.RequestType` and :py:class:`primaite.simulator.core.RequestManager`. + +``RequestType`` +--------------- + +The ``RequestType`` object stores a reference to a function that executes the request. For example, a node could have a request type that stores a reference to its ``turn_on()`` method. Technically, this can be any callable that accepts ``request, context`` as its parameters.
In practice, this is often defined using ``lambda`` functions within a component's ``_init_request_manager()`` method. Optionally, the ``RequestType`` object can also hold a validator that will permit or deny the request depending on context. + +``RequestManager`` +------------------ + +The ``RequestManager`` object stores a mapping between strings and request types. It is responsible for processing the request and passing it down the ownership tree. Technically, the ``RequestManager`` is itself a callable that accepts ``request, context`` as its parameters, and so it can be chained with other request managers. + +A simple example without chaining can be seen in the :py:class:`primaite.simulator.file_system.file_system.File` class. + +.. code-block:: python + + class File(FileSystemItemABC): + ... + def _init_request_manager(self): + ... + request_manager.add_request("scan", RequestType(func=lambda request, context: self.scan())) + request_manager.add_request("repair", RequestType(func=lambda request, context: self.repair())) + request_manager.add_request("restore", RequestType(func=lambda request, context: self.restore())) + +Ellipses (``...``) mark code omitted because it is not relevant to this explanation. + +Chaining RequestManagers +------------------------ + +A request function needs to be a callable that accepts ``request, context`` as parameters. Since a request manager is itself invoked with ``request, context`` as parameters, it is possible to use a ``RequestManager`` as the function of a ``RequestType``. + +When a ``RequestManager`` accepts a request, it pops the first element and uses it to decide where it should send the remaining request. This is how PrimAITE traverses the ownership tree. If the ``RequestType`` has another ``RequestManager`` as its function, the request will be routed again. Each time the request is passed to a new request manager, the first element is popped. + +An example of how this works is in the :py:class:`primaite.simulator.network.hardware.base.Node` class.
+ +.. code-block:: python + + class Node(SimComponent): + ... + def _init_request_manager(self): + ... + # a regular action which is processed by the Node itself + request_manager.add_request("turn_on", RequestType(func=lambda request, context: self.turn_on())) + + # if the Node receives a request where the first word is 'service', it will use a dummy manager + # called self._service_request_manager to pass on the request to the relevant service. This dummy + # manager is simply here to map the service UUID to that service's own request manager. This is + # done because the next string after "service" is always the uuid of that service, so we need a + # RequestManager to pop that string before sending it on to the relevant service's RequestManager. + self._service_request_manager = RequestManager() + request_manager.add_request("service", RequestType(func=self._service_request_manager)) + ... + + def install_service(self, service): + self.services[service.uuid] = service + ... + # Here, the service UUID is registered to allow passing requests between the node and the service.
+ self._service_request_manager.add_request(service.uuid, RequestType(func=service._request_manager)) diff --git a/docs/source/simulation.rst b/docs/source/simulation.rst index 8671a2d2..5e259c6f 100644 --- a/docs/source/simulation.rst +++ b/docs/source/simulation.rst @@ -23,4 +23,8 @@ Contents simulation_components/network/network simulation_components/system/internal_frame_processing simulation_components/system/software - action_system + simulation_components/system/data_manipulation_bot + simulation_components/system/database_client_server + simulation_components/system/dns_client_server + simulation_components/system/ftp_client_server + simulation_components/system/web_browser_and_web_server_service diff --git a/docs/source/simulation_structure.rst b/docs/source/simulation_structure.rst index 2f0a56e8..6e0ab5ce 100644 --- a/docs/source/simulation_structure.rst +++ b/docs/source/simulation_structure.rst @@ -12,7 +12,7 @@ and a domain controller for managing software and users. Each node of the simulation 'tree' has responsibility for creating, deleting, and updating its direct descendants. Also, when a component's ``describe_state()`` method is called, it will include the state of its descendants. The -``apply_action()`` method can be used to act on a component or one of its descendatnts. The diagram below shows the +``apply_request()`` method can be used to act on a component or one of its descendants. The diagram below shows the relationship between components. .. image:: _static/component_relationship.png @@ -25,9 +25,9 @@ relationship between components. Actions ======= Agents can interact with the simulation by using actions. Actions are standardised with the -:py:class:`primaite.simulation.core.Action` class, which just holds a reference to two special functions. +:py:class:`primaite.simulator.core.RequestType` class, which just holds a reference to two special functions. -1.
The action function itself, it must accept a `request` parameters which is a list of strings that describe what the +1. The request function itself, which must accept a `request` parameter, a list of strings that describe what the action should do. It must also accept a `context` dict which can house additional information surrounding the action. For example, the context will typically include information about which entity intiated the action. 2. A validator function. This function should return a boolean value that decides if the request is permitted or not. diff --git a/docs/source/state_system.rst b/docs/source/state_system.rst new file mode 100644 index 00000000..b8a9624e --- /dev/null +++ b/docs/source/state_system.rst @@ -0,0 +1,31 @@ +.. only:: comment + + © Crown-owned copyright 2023, Defence Science and Technology Laboratory UK + +Simulation State +================ + +``SimComponent`` objects in the simulation have a method called ``describe_state`` which returns a dictionary of the state of the component. This is used to report pertinent data that could impact an agent's actions or rewards. For instance, the name and health status of a node are reported, which can be used by a reward function to punish corrupted or compromised nodes and reward healthy nodes. Each ``SimComponent`` reports not only its own attributes in the state but also those of its child components. For example, a computer node will report the state of its ``FileSystem``, and the ``FileSystem`` will report the state of its files and folders. This happens by recursively calling the children's own ``describe_state`` methods. + +The game layer calls ``describe_state`` on the trunk ``SimComponent`` (the top-level parent) and then passes the state to the agents once per simulation step. For this reason, every ``SimComponent`` must have a ``describe_state`` method, and they must all be linked to the trunk ``SimComponent``.
+ +This code snippet demonstrates how the state information is defined within the ``SimComponent`` class: + +.. code-block:: python + + class Node(SimComponent): + operating_state: NodeOperatingState = NodeOperatingState.OFF + services: Dict[str, Service] = {} + + def describe_state(self) -> Dict: + state = super().describe_state() + state["operating_state"] = self.operating_state.value + state["services"] = {uuid: svc.describe_state() for uuid, svc in self.services.items()} + return state + + class Service(SimComponent): + health_state: ServiceHealthState = ServiceHealthState.GOOD + def describe_state(self) -> Dict: + state = super().describe_state() + state["health_state"] = self.health_state.value + return state
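The recursive state reporting described in ``state_system.rst`` can also be tried outside PrimAITE. The sketch below is a minimal, standalone illustration and not the library's implementation: the ``SimComponent``, ``Node`` and ``Service`` classes here are simplified stand-ins written for this example, states are plain strings rather than enums, and the names ``pc_1``, ``dns_client`` and ``web_server`` are made up.

```python
from typing import Any, Dict


class SimComponent:
    """Simplified stand-in for a simulation component (assumption, not PrimAITE's class)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def describe_state(self) -> Dict[str, Any]:
        # A bare component reports only its own attributes.
        return {"name": self.name}


class Service(SimComponent):
    def __init__(self, name: str, health_state: str = "GOOD") -> None:
        super().__init__(name)
        self.health_state = health_state

    def describe_state(self) -> Dict[str, Any]:
        state = super().describe_state()
        state["health_state"] = self.health_state
        return state


class Node(SimComponent):
    def __init__(self, name: str) -> None:
        super().__init__(name)
        self.operating_state = "OFF"
        self.services: Dict[str, Service] = {}

    def describe_state(self) -> Dict[str, Any]:
        state = super().describe_state()
        state["operating_state"] = self.operating_state
        # Recurse into the children so the parent's state includes theirs.
        state["services"] = {key: svc.describe_state() for key, svc in self.services.items()}
        return state


# Build a small ownership tree and describe it from the top.
node = Node("pc_1")
node.services["dns"] = Service("dns_client")
node.services["web"] = Service("web_server", health_state="COMPROMISED")
state = node.describe_state()
print(state)
```

Calling ``describe_state`` on the parent is enough to collect the whole subtree, which is why the game layer only needs a reference to the trunk component.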