#915 - Created app dirs and set as constants in the top-level init.

- Renamed _config_values_main to training_config.py and renamed the ConfigValuesMain class to TrainingConfig. Moved training_config.py to src/primaite/config/training_config.py
- Renamed all training config yaml file keys to make creating an instance of TrainingConfig easier. Moved action_type and num_steps over to the training config.
- Decoupled the training config and lay down config.
- Refactored main.py so that it can be run from the CLI and can take a training config path and a lay down config path.
- Refactored all outputs so that they save to the session dir.
- Added some necessary setup scripts that handle creating app dirs, fronting example config files to the user, fronting demo notebooks to the user, performing clean-up in between installations etc.
- Added functions that attempt to retrieve the file paths of the user's example config files that have been fronted by the primaite setup.
- Added logging config and a getLogger function in the top-level init.
- Refactored all log entries to go through a logger that uses the primaite logging config.
- Added basic typer CLI for doing things like setup, viewing logs, viewing primaite version, running a basic session.
- Updated tests to use new features and config structures.
- Began updating docs. More to do here.
2023-06-07 22:40:16 +01:00
parent ef3cef530b
commit 273876873e
44 changed files with 1527 additions and 1356 deletions
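
The commit message says the training config yaml keys were renamed to make creating a TrainingConfig instance easier. The sketch below shows one way that could work: if the keys match the attribute names, a parsed mapping can be unpacked straight into a dataclass. Everything here is an assumption for illustration only. `TrainingConfigSketch` and `training_config_from_mapping` are hypothetical names, not the real `TrainingConfig` API. The field names are taken from the updated docs/source/config.rst. Every default except `observation_space_high_value` is a placeholder.

```python
from dataclasses import dataclass, fields


@dataclass
class TrainingConfigSketch:
    """Hypothetical stand-in for TrainingConfig (not the real class)."""

    # Field names mirror the snake_case keys documented in config.rst.
    # Defaults are placeholders, except observation_space_high_value,
    # which the docs state is 1000000000 by default.
    agent_identifier: str = "STABLE_BASELINES3_PPO"
    action_type: str = "ANY"
    num_episodes: int = 10
    num_steps: int = 256
    time_delay: int = 10
    session_type: str = "TRAINING"
    load_agent: bool = False
    agent_load_file: str = ""
    observation_space_high_value: int = 1_000_000_000


def training_config_from_mapping(raw: dict) -> TrainingConfigSketch:
    """Build a config from an already-parsed yaml mapping (hypothetical helper)."""
    known = {f.name for f in fields(TrainingConfigSketch)}
    unknown = set(raw) - known
    if unknown:
        # Old camelCase keys such as numEpisodes would be rejected here.
        raise ValueError(f"Unrecognised training config keys: {sorted(unknown)}")
    # Because keys and attribute names match, no per-key translation is needed.
    return TrainingConfigSketch(**raw)


if __name__ == "__main__":
    parsed = {"agent_identifier": "STABLE_BASELINES3_A2C", "num_steps": 128}
    print(training_config_from_mapping(parsed))
```

With camelCase keys such as numEpisodes, the same loader would need an explicit key-to-attribute mapping, which may be the friction the rename is meant to remove.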
--- a/docs/source/about.rst
+++ b/docs/source/about.rst
@@ -8,51 +8,51 @@ Features

 PrimAITE provides the following features:

-* A flexible network / system laydown based on the Python networkx framework
-* Nodes and links (edges) host Python classes in order to present attributes and methods (and hence, a more representative model of a platform / system)
-* A ‘green agent’ Information Exchange Requirement (IER) function allows the representation of traffic (protocols and loading) on any / all links. Application of IERs is based on the status of node operating systems and services
-* A ‘green agent’ node Pattern-of-Life (PoL) function allows the representation of core behaviours on nodes (e.g. Hardware state, Software State, Service state, File System state)
-* An Access Control List (ACL) function, mimicking the behaviour of a network firewall, is applied across the model, following standard ACL rule format (e.g. DENY/ALLOW, source IP, destination IP, protocol and port). Application of IERs adheres to any ACL restrictions
-* Presents an OpenAI Gym interface to the environment, allowing integration with any OpenAI Gym compliant defensive agents 
+* A flexible network / system laydown based on the Python networkx framework
+* Nodes and links (edges) host Python classes in order to present attributes and methods (and hence, a more representative model of a platform / system)
+* A ‘green agent’ Information Exchange Requirement (IER) function allows the representation of traffic (protocols and loading) on any / all links. Application of IERs is based on the status of node operating systems and services
+* A ‘green agent’ node Pattern-of-Life (PoL) function allows the representation of core behaviours on nodes (e.g. Hardware state, Software State, Service state, File System state)
+* An Access Control List (ACL) function, mimicking the behaviour of a network firewall, is applied across the model, following standard ACL rule format (e.g. DENY/ALLOW, source IP, destination IP, protocol and port). Application of IERs adheres to any ACL restrictions
+* Presents an OpenAI Gym interface to the environment, allowing integration with any OpenAI Gym compliant defensive agents
 * Red agent activity based on ‘red’ IERs and ‘red’ PoL
-* Defined reward function for use with RL agents (based on nodes status, and green / red IER success)
-* Fully configurable (network / system laydown, IERs, node PoL, ACL, episode step period, episode max steps) and repeatable to suit the training requirements of agents. Therefore, not bound to a representation of any particular platform, system or technology
-* Full capture of discrete metrics relating to agent training (full system state, agent actions taken, average reward)
-* Networkx provides laydown visualisation capability 
+* Defined reward function for use with RL agents (based on nodes status, and green / red IER success)
+* Fully configurable (network / system laydown, IERs, node PoL, ACL, episode step period, episode max steps) and repeatable to suit the training requirements of agents. Therefore, not bound to a representation of any particular platform, system or technology
+* Full capture of discrete metrics relating to agent training (full system state, agent actions taken, average reward)
+* Networkx provides laydown visualisation capability

 Architecture - Nodes and Links
 ******************************

 **Nodes**

-An inheritance model has been adopted in order to model nodes. All nodes have the following base attributes (Class: Node):
+An inheritance model has been adopted in order to model nodes. All nodes have the following base attributes (Class: Node):

-* ID
+* ID
 * Name
-* Type (e.g. computer, switch, RTU - enumeration)
-* Priority (P1, P2, P3, P4 or P5 - enumeration)
-* Hardware State (ON, OFF, RESETTING - enumeration)
+* Type (e.g. computer, switch, RTU - enumeration)
+* Priority (P1, P2, P3, P4 or P5 - enumeration)
+* Hardware State (ON, OFF, RESETTING - enumeration)

-Active Nodes also have the following attributes (Class: Active Node):
+Active Nodes also have the following attributes (Class: Active Node):

-* IP Address
-* Software State (GOOD, PATCHING, COMPROMISED - enumeration)
+* IP Address
+* Software State (GOOD, PATCHING, COMPROMISED - enumeration)
 * File System State (GOOD, CORRUPT, DESTROYED, REPAIRING, RESTORING - enumeration)

-Service Nodes also have the following attributes (Class: Service Node):
+Service Nodes also have the following attributes (Class: Service Node):

-* List of Services (where service is composed of service name and port). There is no theoretical limit on the number of services that can be modelled. Services and protocols are currently intrinsically linked (i.e. a service is an application on a node transmitting traffic of this protocol type)
+* List of Services (where service is composed of service name and port). There is no theoretical limit on the number of services that can be modelled. Services and protocols are currently intrinsically linked (i.e. a service is an application on a node transmitting traffic of this protocol type)
 * Service state (GOOD, PATCHING, COMPROMISED, OVERWHELMED - enumeration)

 Passive Nodes are currently not used (but may be employed for non IP-based components such as machinery actuators in future releases).

 **Links**

-Links are modelled both as network edges (networkx) and as Python classes, in order to extend their functionality. Links include the following attributes:
+Links are modelled both as network edges (networkx) and as Python classes, in order to extend their functionality. Links include the following attributes:

-* ID
+* ID
 * Name
-* Bandwidth (bits/s)
+* Bandwidth (bits/s)
 * Source node ID
 * Destination node ID
 * Protocol list (containing the loading of protocols currently running on the link)
@@ -62,32 +62,32 @@ When the simulation runs, IERs are applied to the links in order to model traffi
 Information Exchange Requirements (IERs)
 ****************************************

-PrimAITE adopts the concept of Information Exchange Requirements (IERs) to model both green agent (background) and red agent (adversary) behaviour. IERs are used to initiate modelling of traffic loading on the network, and have the following attributes:
+PrimAITE adopts the concept of Information Exchange Requirements (IERs) to model both green agent (background) and red agent (adversary) behaviour. IERs are used to initiate modelling of traffic loading on the network, and have the following attributes:

-* ID
-* Start step (i.e. which step in the training episode should the IER start)
-* End step (i.e. which step in the training episode should the IER end)
+* ID
+* Start step (i.e. which step in the training episode should the IER start)
+* End step (i.e. which step in the training episode should the IER end)
 * Source node ID
-* Destination node ID
-* Load (bits/s)
-* Protocol
-* Port
+* Destination node ID
+* Load (bits/s)
+* Protocol
+* Port
 * Running status (i.e. on / off)

-The application of green agent IERs between a source and destination follows a number of rules. Specifically:
+The application of green agent IERs between a source and destination follows a number of rules. Specifically:

-1. Does the current simulation time step fall between IER start and end step
-2. Is the source node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING)
-3. Is the destination node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING)
-4. Are there any Access Control List rules in place that prevent the application of this IER
-5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level)
+1. Does the current simulation time step fall between IER start and end step
+2. Is the source node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING)
+3. Is the destination node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING)
+4. Are there any Access Control List rules in place that prevent the application of this IER
+5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level)

-For red agent IERs, the application of IERs between a source and destination follows a number of subtly different rules. Specifically:
+For red agent IERs, the application of IERs between a source and destination follows a number of subtly different rules. Specifically:

-1. Does the current simulation time step fall between IER start and end step
-2. Is the source node operational, and is the service (protocol / port) associated with the IER (a) present on that node and (b) already in a compromised state
-3. Is the destination node operational, and is the service (protocol / port) associated with the IER present on that node
-4. Are there any Access Control List rules in place that prevent the application of this IER
+1. Does the current simulation time step fall between IER start and end step
+2. Is the source node operational, and is the service (protocol / port) associated with the IER (a) present on that node and (b) already in a compromised state
+3. Is the destination node operational, and is the service (protocol / port) associated with the IER present on that node
+4. Are there any Access Control List rules in place that prevent the application of this IER
 5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level)

 Assuming the rules pass, the IER is applied to all relevant links (based on use of OSPF) between source and destination.
@@ -149,7 +149,7 @@ Red agent pattern-of-life has an additional feature not found in the green patte
 Access Control List modelling
 *****************************

-An Access Control List (ACL) is modelled to provide the means to manage traffic flows in the system. This will allow defensive agents the means to turn on / off rules, or potentially create new rules, to counter an attack.
+An Access Control List (ACL) is modelled to provide the means to manage traffic flows in the system. This will allow defensive agents the means to turn on / off rules, or potentially create new rules, to counter an attack.

 The ACL follows a standard network firewall format. For example:

@@ -183,9 +183,9 @@ All ACL rules are considered when applying an IER. Logic follows the order of ru
 Observation Spaces
 ******************

-The OpenAI Gym observation space provides the status of all nodes and links across the whole system:
+The OpenAI Gym observation space provides the status of all nodes and links across the whole system:

-* Nodes (in terms of hardware state, Software State, file system state and services state) 
+* Nodes (in terms of hardware state, Software State, file system state and services state)
 * Links (in terms of current loading for each service/protocol)

 The observation space can be configured as a ``gym.spaces.Box`` or ``gym.spaces.MultiDiscrete``, by setting the ``OBSERVATIONS`` parameter in the laydown config.
--- a/docs/source/config.rst
+++ b/docs/source/config.rst
@@ -5,17 +5,22 @@ The Config Files Explained

 PrimAITE uses two configuration files for its operation:

-* config_main.yaml - used to define the top-level settings of the PrimAITE environment, and the session that is to be run.
-* config_[name].yaml - used to define the low-level settings of a session, including the network laydown, green / red agent information exchange requirements (IERSs), Access Control Rules, Action Space type, and the number of steps in each episode.
+* **The Training Config**

-config_main.yaml:
-*****************
+    Used to define the top-level settings of the PrimAITE environment, the reward values, and the session that is to be run.

-The config_main.yaml file consists of the following attributes:
+* **The Lay Down Config**
+
+    Used to define the low-level settings of a session, including the network laydown, green / red agent information exchange requirements (IERs) and Access Control Rules.
+
+The Training Config
+*******************
+
+The training config file consists of the following attributes:

 **Generic Config Values**

-* **agentIdentifier** [enum]
+* **agent_identifier** [enum]

   This identifies the agent to use for the session. Select from one of the following:

@@ -23,61 +28,68 @@ The config_main.yaml file consists of the following attributes:
   * STABLE_BASELINES3_PPO - Use a SB3 PPO agent
   * STABLE_BASELINES3_A2C - use a SB3 A2C agent

-* **numEpisodes** [int]
+* **action_type** [enum]

-   This defines the number of episodes that the agent will train or be evaluated over. Each episode consists of a number of steps (with step number defined in the config_[name].yaml file)
+   Determines whether a NODE, ACL, or ANY (combined NODE & ACL) action space format is adopted for the session

-* **timeDelay** [int]
+
+* **num_episodes** [int]
+
+   This defines the number of episodes that the agent will train or be evaluated over.
+
+* **num_steps** [int]
+
+   Determines the number of steps to run in each episode of the session
+
+
+* **time_delay** [int]

   The time delay (in milliseconds) to take between each step when running a GENERIC agent session

-* **configFilename** [filename]

-   The name of the config_[name].yaml file to use for this session
-
-* **sessionType** [text]
+* **session_type** [text]

   Type of session to be run (TRAINING or EVALUATION)

-* **loadAgent** [bool]
+* **load_agent** [bool]

   Determine whether to load an agent from file

-* **agentLoadFile** [text]
+* **agent_load_file** [text]

   File path and file name of agent if you're loading one in

-* **observationSpaceHighValue** [int]
+* **observation_space_high_value** [int]

   The high value to use for values in the observation space. This is set to 1000000000 by default, and should not need changing in most cases

 **Reward-Based Config Values**

-* **Generic [allOk]** [int]
+* **Generic [all_ok]** [int]

   The score to give when the current situation (for a given component) is no different from that expected in the baseline (i.e. as though no blue or red agent actions had been undertaken)

-* **Node Hardware State [offShouldBeOn]** [int]
+* **Node Hardware State [off_should_be_on]** [int]

   The score to give when the node should be on, but is off

-* **Node Hardware State [offShouldBeResetting]** [int]
+* **Node Hardware State [off_should_be_resetting]** [int]

   The score to give when the node should be resetting, but is off

-* **Node Hardware State [onShouldBeOff]** [int]
+* **Node Hardware State [on_should_be_off]** [int]

   The score to give when the node should be off, but is on

-* **Node Hardware State [onShouldBeResetting]** [int]
+* **Node Hardware State [on_should_be_resetting]** [int]

   The score to give when the node should be resetting, but is on

-* **Node Hardware State [resettingShouldBeOn]** [int]
+* **Node Hardware State [resetting_should_be_on]** [int]

   The score to give when the node should be on, but is resetting

-* **Node Hardware State [resettingShouldBeOff]** [int]
+* **Node Hardware State [resetting_should_be_off]** [int]

   The score to give when the node should be off, but is resetting

@@ -85,27 +97,27 @@ The config_main.yaml file consists of the following attributes:

   The score to give when the node is resetting

-* **Node Operating System or Service State [goodShouldBePatching]** [int]
+* **Node Operating System or Service State [good_should_be_patching]** [int]

   The score to give when the state should be patching, but is good

-* **Node Operating System or Service State [goodShouldBeCompromised]** [int]
+* **Node Operating System or Service State [good_should_be_compromised]** [int]

   The score to give when the state should be compromised, but is good

-* **Node Operating System or Service State [goodShouldBeOverwhelmed]** [int]
+* **Node Operating System or Service State [good_should_be_overwhelmed]** [int]

   The score to give when the state should be overwhelmed, but is good

-* **Node Operating System or Service State [patchingShouldBeGood]** [int]
+* **Node Operating System or Service State [patching_should_be_good]** [int]

   The score to give when the state should be good, but is patching

-* **Node Operating System or Service State [patchingShouldBeCompromised]** [int]
+* **Node Operating System or Service State [patching_should_be_compromised]** [int]

   The score to give when the state should be compromised, but is patching

-* **Node Operating System or Service State [patchingShouldBeOverwhelmed]** [int]
+* **Node Operating System or Service State [patching_should_be_overwhelmed]** [int]

   The score to give when the state should be overwhelmed, but is patching

@@ -113,15 +125,15 @@ The config_main.yaml file consists of the following attributes:

   The score to give when the state is patching

-* **Node Operating System or Service State [compromisedShouldBeGood]** [int]
+* **Node Operating System or Service State [compromised_should_be_good]** [int]

   The score to give when the state should be good, but is compromised

-* **Node Operating System or Service State [compromisedShouldBePatching]** [int]
+* **Node Operating System or Service State [compromised_should_be_patching]** [int]

   The score to give when the state should be patching, but is compromised

-* **Node Operating System or Service State [compromisedShouldBeOverwhelmed]** [int]
+* **Node Operating System or Service State [compromised_should_be_overwhelmed]** [int]

   The score to give when the state should be overwhelmed, but is compromised

@@ -129,15 +141,15 @@ The config_main.yaml file consists of the following attributes:

   The score to give when the state is compromised

-* **Node Operating System or Service State [overwhelmedShouldBeGood]** [int]
+* **Node Operating System or Service State [overwhelmed_should_be_good]** [int]

   The score to give when the state should be good, but is overwhelmed

-* **Node Operating System or Service State [overwhelmedShouldBePatching]** [int]
+* **Node Operating System or Service State [overwhelmed_should_be_patching]** [int]

   The score to give when the state should be patching, but is overwhelmed

-* **Node Operating System or Service State [overwhelmedShouldBeCompromised]** [int]
+* **Node Operating System or Service State [overwhelmed_should_be_compromised]** [int]

   The score to give when the state should be compromised, but is overwhelmed

@@ -145,35 +157,35 @@ The config_main.yaml file consists of the following attributes:

   The score to give when the state is overwhelmed

-* **Node File System State [goodShouldBeRepairing]** [int]
+* **Node File System State [good_should_be_repairing]** [int]

    The score to give when the state should be repairing, but is good

-* **Node File System State [goodShouldBeRestoring]** [int]
+* **Node File System State [good_should_be_restoring]** [int]

    The score to give when the state should be restoring, but is good

-* **Node File System State [goodShouldBeCorrupt]** [int]
+* **Node File System State [good_should_be_corrupt]** [int]

    The score to give when the state should be corrupt, but is good

-* **Node File System State [goodShouldBeDestroyed]** [int]
+* **Node File System State [good_should_be_destroyed]** [int]

    The score to give when the state should be destroyed, but is good

-* **Node File System State [repairingShouldBeGood]** [int]
+* **Node File System State [repairing_should_be_good]** [int]

    The score to give when the state should be good, but is repairing

-* **Node File System State [repairingShouldBeRestoring]** [int]
+* **Node File System State [repairing_should_be_restoring]** [int]

    The score to give when the state should be restoring, but is repairing

-* **Node File System State [repairingShouldBeCorrupt]** [int]
+* **Node File System State [repairing_should_be_corrupt]** [int]

    The score to give when the state should be corrupt, but is repairing

-* **Node File System State [repairingShouldBeDestroyed]** [int]
+* **Node File System State [repairing_should_be_destroyed]** [int]

    The score to give when the state should be destroyed, but is repairing

@@ -181,19 +193,19 @@ The config_main.yaml file consists of the following attributes:

    The score to give when the state is repairing

-* **Node File System State [restoringShouldBeGood]** [int]
+* **Node File System State [restoring_should_be_good]** [int]

    The score to give when the state should be good, but is restoring

-* **Node File System State [restoringShouldBeRepairing]** [int]
+* **Node File System State [restoring_should_be_repairing]** [int]

    The score to give when the state should be repairing, but is restoring

-* **Node File System State [restoringShouldBeCorrupt]** [int]
+* **Node File System State [restoring_should_be_corrupt]** [int]

    The score to give when the state should be corrupt, but is restoring

-* **Node File System State [restoringShouldBeDestroyed]** [int]
+* **Node File System State [restoring_should_be_destroyed]** [int]

    The score to give when the state should be destroyed, but is restoring

@@ -201,19 +213,19 @@ The config_main.yaml file consists of the following attributes:

    The score to give when the state is restoring

-* **Node File System State [corruptShouldBeGood]** [int]
+* **Node File System State [corrupt_should_be_good]** [int]

    The score to give when the state should be good, but is corrupt

-* **Node File System State [corruptShouldBeRepairing]** [int]
+* **Node File System State [corrupt_should_be_repairing]** [int]

    The score to give when the state should be repairing, but is corrupt

-* **Node File System State [corruptShouldBeRestoring]** [int]
+* **Node File System State [corrupt_should_be_restoring]** [int]

    The score to give when the state should be restoring, but is corrupt

-* **Node File System State [corruptShouldBeDestroyed]** [int]
+* **Node File System State [corrupt_should_be_destroyed]** [int]

    The score to give when the state should be destroyed, but is corrupt

@@ -221,19 +233,19 @@ The config_main.yaml file consists of the following attributes:

    The score to give when the state is corrupt

-* **Node File System State [destroyedShouldBeGood]** [int]
+* **Node File System State [destroyed_should_be_good]** [int]

    The score to give when the state should be good, but is destroyed

-* **Node File System State [destroyedShouldBeRepairing]** [int]
+* **Node File System State [destroyed_should_be_repairing]** [int]

    The score to give when the state should be repairing, but is destroyed

-* **Node File System State [destroyedShouldBeRestoring]** [int]
+* **Node File System State [destroyed_should_be_restoring]** [int]

    The score to give when the state should be restoring, but is destroyed

-* **Node File System State [destroyedShouldBeCorrupt]** [int]
+* **Node File System State [destroyed_should_be_corrupt]** [int]

    The score to give when the state should be corrupt, but is destroyed

@@ -245,52 +257,44 @@ The config_main.yaml file consists of the following attributes:

    The score to give when the state is scanning

-* **IER Status [redIerRunning]** [int]
+* **IER Status [red_ier_running]** [int]

   The score to give when a red agent IER is permitted to run

-* **IER Status [greenIerBlocked]** [int]
+* **IER Status [green_ier_blocked]** [int]

   The score to give when a green agent IER is prevented from running

 **Patching / Reset Durations**

-* **osPatchingDuration** [int]
+* **os_patching_duration** [int]

   The number of steps to take when patching an Operating System

-* **nodeResetDuration** [int]
+* **node_reset_duration** [int]

   The number of steps to take when resetting a node's hardware state

-* **servicePatchingDuration** [int]
+* **service_patching_duration** [int]

   The number of steps to take when patching a service

-* **fileSystemRepairingLimit** [int]:
+* **file_system_repairing_limit** [int]

   The number of steps to take when repairing the file system

-* **fileSystemRestoringLimit** [int]
+* **file_system_restoring_limit** [int]

   The number of steps to take when restoring the file system

-* **fileSystemScanningLimit** [int]
+* **file_system_scanning_limit** [int]

   The number of steps to take when scanning the file system

-config_[name].yaml:
+The Lay Down Config
 *******************

-The config_[name].yaml file consists of the following attributes:
-
-* **itemType: ACTIONS** [enum]
-
-   Determines whether a NODE or ACL action space format is adopted for the session
-
-* **itemType: STEPS** [int]
-
-   Determines the number of steps to run in each episode of the session
+The lay down config file consists of the following attributes:

 * **itemType: PORTS** [int]

--- a/docs/source/session.rst
+++ b/docs/source/session.rst
@@ -29,10 +29,10 @@ the run_generic function should be selected, and should be modified (typically)

 .. code:: python

-    agent = MyAgent(environment, max_steps)
-    for episode in range(0, num_episodes):
-        agent.learn()      
-    env.close()
+    agent = MyAgent(environment, max_steps)
+    for episode in range(0, num_episodes):
+        agent.learn()
+    env.close()
    save_agent(agent)

 Where:
@@ -51,29 +51,29 @@ environment is reset between episodes. Note that the example below should not be

 .. code:: python

-    def learn(self) :
+    def learn(self) :

-    # pre-reqs
+    # pre-reqs

-    # reset the environment
-    self.environment.reset()
-    done = False
+    # reset the environment
+    self.environment.reset()
+    done = False

-    for step in range(max_steps):
-        # calculate the action
+    for step in range(max_steps):
+        # calculate the action
        action = ...

-        # execute the environment step
-        new_state, reward, done, info = self.environment.step(action)
+        # execute the environment step
+        new_state, reward, done, info = self.environment.step(action)

-        # algorithm updates
+        # algorithm updates
        ...

-        # update to our new state
-        state = new_state
+        # update to our new state
+        state = new_state

-        # if done, finish episode
-        if done == True:
+        # if done, finish episode
+        if done == True:
            break

 **Running the session**