Package restructuring

2023-05-25 10:31:37 +01:00
parent 73004f9bf9
commit 4f0d8807d6
51 changed files with 127 additions and 74 deletions
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = source
+BUILDDIR      = build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
--- a/docs/make.bat
+++ b/docs/make.bat
@@ -0,0 +1,35 @@
+@ECHO OFF
+
+pushd %~dp0
+
+REM Command file for Sphinx documentation
+
+if "%SPHINXBUILD%" == "" (
+	set SPHINXBUILD=sphinx-build
+)
+set SOURCEDIR=source
+set BUILDDIR=build
+
+%SPHINXBUILD% >NUL 2>NUL
+if errorlevel 9009 (
+	echo.
+	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
+	echo.installed, then set the SPHINXBUILD environment variable to point
+	echo.to the full path of the 'sphinx-build' executable. Alternatively you
+	echo.may add the Sphinx directory to PATH.
+	echo.
+	echo.If you don't have Sphinx installed, grab it from
+	echo.https://www.sphinx-doc.org/
+	exit /b 1
+)
+
+if "%1" == "" goto help
+
+%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+goto end
+
+:help
+%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+
+:end
+popd
--- a/docs/source/about.rst
+++ b/docs/source/about.rst
@@ -0,0 +1,345 @@
+.. _about:
+
+About PrimAITE
+==============
+
+Features
+********
+
+PrimAITE provides the following features:
+
+* A flexible network / system laydown based on the Python networkx framework
+* Nodes and links (edges) host Python classes in order to present attributes and methods (and hence, a more representative model of a platform / system)
+* A ‘green agent’ Information Exchange Requirement (IER) function allows the representation of traffic (protocols and loading) on any / all links. Application of IERs is based on the status of node operating systems and services
+* A ‘green agent’ node Pattern-of-Life (PoL) function allows the representation of core behaviours on nodes (e.g. Operating state, Operating System state, Service state, File System state)
+* An Access Control List (ACL) function, mimicking the behaviour of a network firewall, is applied across the model, following standard ACL rule format (e.g. DENY/ALLOW, source IP, destination IP, protocol and port). Application of IERs adheres to any ACL restrictions
+* Presents an OpenAI Gym interface to the environment, allowing integration with any OpenAI Gym compliant defensive agents 
+* Red agent activity based on ‘red’ IERs and ‘red’ PoL
+* Defined reward function for use with RL agents (based on node status, and green / red IER success)
+* Fully configurable (network / system laydown, IERs, node PoL, ACL, episode step period, episode max steps) and repeatable to suit the training requirements of agents. Therefore, not bound to a representation of any particular platform, system or technology
+* Full capture of discrete metrics relating to agent training (full system state, agent actions taken, average reward)
+* Networkx provides laydown visualisation capability 
+
+Architecture - Nodes and Links
+******************************
+
+**Nodes**
+
+An inheritance model has been adopted in order to model nodes. All nodes have the following base attributes (Class: Node):
+
+* ID
+* Name
+* Type (e.g. computer, switch, RTU - enumeration)
+* Priority (P1, P2, P3, P4 or P5 - enumeration)
+* Operating State (ON, OFF, RESETTING - enumeration)
+
+Active Nodes also have the following attributes (Class: Active Node):
+
+* IP Address
+* Operating System State (GOOD, PATCHING, COMPROMISED - enumeration)
+* File System State (GOOD, CORRUPT, DESTROYED, REPAIRING, RESTORING - enumeration)
+
+Service Nodes also have the following attributes (Class: Service Node):
+
+* List of Services (where service is composed of service name and port). There is no theoretical limit on the number of services that can be modelled. Services and protocols are currently intrinsically linked (i.e. a service is an application on a node transmitting traffic of this protocol type)
+* Service state (GOOD, PATCHING, COMPROMISED, OVERWHELMED - enumeration)
+
+Passive Nodes are currently not used (but may be employed for non IP-based components such as machinery actuators in future releases).
+
+**Links**
+
+Links are modelled both as network edges (networkx) and as Python classes, in order to extend their functionality. Links include the following attributes:
+
+* ID
+* Name
+* Bandwidth (bits/s)
+* Source node ID
+* Destination node ID
+* Protocol list (containing the loading of protocols currently running on the link)
+
+When the simulation runs, IERs are applied to the links in order to model traffic loading, individually assigned to each protocol. This allows green (background) and red agent behaviour to be modelled, and defensive agents to identify suspicious traffic patterns at a protocol / traffic loading level of fidelity.
+
+Information Exchange Requirements (IERs)
+****************************************
+
+PrimAITE adopts the concept of Information Exchange Requirements (IERs) to model both green agent (background) and red agent (adversary) behaviour. IERs are used to initiate modelling of traffic loading on the network, and have the following attributes:
+
+* ID
+* Start step (i.e. which step in the training episode should the IER start)
+* End step (i.e. which step in the training episode should the IER end)
+* Source node ID
+* Destination node ID
+* Load (bits/s)
+* Protocol
+* Port
+* Running status (i.e. on / off)
+
+The application of green agent IERs between a source and destination follows a number of rules. Specifically:
+
+1. Does the current simulation time step fall between IER start and end step
+2. Is the source node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING)
+3. Is the destination node operational (both physically and at an O/S level), and is the service (protocol / port) associated with the IER (a) present on this node, and (b) in an operational state (i.e. not PATCHING)
+4. Are there any Access Control List rules in place that prevent the application of this IER
+5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level)
+
+For red agent IERs, the application of IERs between a source and destination follows a number of subtly different rules. Specifically:
+
+1. Does the current simulation time step fall between IER start and end step
+2. Is the source node operational, and is the service (protocol / port) associated with the IER (a) present on that node and (b) already in a compromised state
+3. Is the destination node operational, and is the service (protocol / port) associated with the IER present on that node
+4. Are there any Access Control List rules in place that prevent the application of this IER
+5. Are all switches in the (OSPF) path between source and destination operational (both physically and at an O/S level)
+
+Assuming the rules pass, the IER is applied to all relevant links (based on use of OSPF) between source and destination.
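The two rule sets above can be sketched as a single check. This is a minimal illustration, not PrimAITE's implementation: the `Node`, `Ier` and `ier_can_run` names are hypothetical, the step range is assumed to be inclusive, "operational at an O/S level" is assumed to mean "not PATCHING", and the ACL and switch-path checks (rules 4 and 5) are passed in as precomputed flags.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical, simplified node (not PrimAITE's class).
    operating_state: str = "ON"   # ON, OFF, RESETTING
    os_state: str = "GOOD"        # GOOD, PATCHING, COMPROMISED
    services: dict = field(default_factory=dict)  # (protocol, port) -> state


@dataclass
class Ier:
    start_step: int
    end_step: int
    protocol: str
    port: int


def ier_can_run(ier, step, source, dest, red=False,
                acl_blocks=False, path_operational=True):
    """Return True if the IER may be applied at this step."""
    key = (ier.protocol, ier.port)
    # Rule 1: step must fall within the IER window (assumed inclusive).
    if not ier.start_step <= step <= ier.end_step:
        return False
    # Rules 2 and 3: both nodes switched on, service present on both.
    for node in (source, dest):
        if node.operating_state != "ON" or key not in node.services:
            return False
    if red:
        # Red IERs need an already compromised service on the source.
        if source.services[key] != "COMPROMISED":
            return False
    else:
        # Green IERs need O/S and service out of PATCHING at both ends.
        for node in (source, dest):
            if node.os_state == "PATCHING" or node.services[key] == "PATCHING":
                return False
    # Rules 4 and 5: no blocking ACL rule, all switches on the path up.
    return (not acl_blocks) and path_operational
```

The only difference between the green and red branches is the service-state test, which mirrors the "subtly different" wording of the two rule lists.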
+
+Node Pattern-of-Life
+********************
+
+Every node can be impacted (i.e. have a status change applied to it) by either green agent pattern-of-life or red agent pattern-of-life. This is distinct from IERs, and allows for attacks (and defence) to be modelled purely within the confines of a node.
+
+The status changes that can be made to a node are as follows:
+
+* All Nodes:
+
+   * Operating State:
+
+      * ON
+      * OFF
+      * RESETTING - when a status of resetting is entered, the node will automatically exit this state after a number of steps (as defined by the nodeResetDuration configuration item) after which it returns to an ON state 
+
+* Active Nodes and Service Nodes:
+
+   * Operating System State:
+
+      * GOOD
+      * PATCHING - when a status of patching is entered, the node will automatically exit this state after a number of steps (as defined by the osPatchingDuration configuration item) after which it returns to a GOOD state
+      * COMPROMISED
+
+   * File System State:
+
+      * GOOD
+      * CORRUPT (can be resolved by repair or restore)
+      * DESTROYED (can be resolved by restore only)
+      * REPAIRING - when a status of repairing is entered, the node will automatically exit this state after a number of steps (as defined by the fileSystemRepairingLimit configuration item) after which it returns to a GOOD state
+      * RESTORING - when a status of restoring is entered, the node will automatically exit this state after a number of steps (as defined by the fileSystemRestoringLimit configuration item) after which it returns to a GOOD state
+
+* Service Nodes only:
+
+   * Service State (for any associated service):
+
+      * GOOD
+      * PATCHING - when a status of patching is entered, the service will automatically exit this state after a number of steps (as defined by the servicePatchingDuration configuration item) after which it returns to a GOOD state
+      * COMPROMISED
+      * OVERWHELMED
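The timed states in the list above (RESETTING, PATCHING, REPAIRING, RESTORING) all follow the same pattern: enter a transient state, wait a configured number of steps, then return to a resting state. A minimal sketch of that pattern follows. The `TimedState` class is hypothetical, and it assumes the counter decrements once per simulation step and the state reverts as soon as the counter reaches zero.

```python
class TimedState:
    """Hypothetical helper: a state that reverts after a fixed number of steps."""

    def __init__(self, resting_state="GOOD"):
        self.resting_state = resting_state
        self.state = resting_state
        self.steps_left = 0

    def enter(self, transient_state, duration):
        # duration would come from a config item such as osPatchingDuration.
        self.state = transient_state
        self.steps_left = duration

    def step(self):
        # Called once per simulation step.
        if self.steps_left > 0:
            self.steps_left -= 1
            if self.steps_left == 0:
                self.state = self.resting_state


os_state = TimedState(resting_state="GOOD")
os_state.enter("PATCHING", duration=3)
for _ in range(3):
    os_state.step()
print(os_state.state)
```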
+
+Red agent pattern-of-life has an additional feature not found in the green pattern-of-life. This is the ability to influence the state of the attributes of a node via a number of different conditions:
+
+   * DIRECT:
+
+   The pattern-of-life described by the configuration file item will be applied regardless of any other conditions in the network. This is particularly useful for direct red agent entry into the network.
+
+   * IER:
+
+   The pattern-of-life described by the configuration file item will be applied to the service on the node, only if there is an IER of the same protocol / service type incoming at the specified timestep.
+
+   * SERVICE:
+
+   The pattern-of-life described by the configuration file item will be applied to the node based on the state of a service. The service can either be on the same node, or a different node within the network.
+
+Access Control List modelling
+*****************************
+
+An Access Control List (ACL) is modelled to provide the means to manage traffic flows in the system. This will allow defensive agents the means to turn on / off rules, or potentially create new rules, to counter an attack.
+
+The ACL follows a standard network firewall format. For example:
+
+.. list-table:: ACL example
+   :widths: 25 25 25 25 25
+   :header-rows: 1
+
+   * - Permission
+     - Source IP
+     - Dest IP
+     - Protocol
+     - Port
+   * - DENY
+     - 192.168.1.2
+     - 192.168.1.3
+     - HTTPS
+     - 443
+   * - ALLOW
+     - 192.168.1.4
+     - ANY
+     - SMTP
+     - 25
+   * - DENY
+     - ANY
+     - 192.168.1.5
+     - ANY
+     - ANY
+
+All ACL rules are considered when applying an IER. Logic follows the order of rules, so a DENY or ALLOW for the same parameters will override an earlier entry.
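One way to read the ordering rule is "the last matching rule wins". The sketch below applies that reading to the example table. The `AclRule` and `resolve_permission` names are hypothetical, treating `ANY` as a wildcard during matching is an assumption, and the behaviour when no rule matches is not stated in this document, so the function simply returns `None` in that case.

```python
from typing import NamedTuple, Optional


class AclRule(NamedTuple):
    permission: str  # DENY or ALLOW
    source_ip: str
    dest_ip: str
    protocol: str
    port: str


def _matches(rule_value, actual):
    # Assumption: ANY in a rule matches every value.
    return rule_value == "ANY" or rule_value == actual


def resolve_permission(rules, source_ip, dest_ip, protocol, port) -> Optional[str]:
    """Walk the rules in order; a later matching rule overrides an earlier one."""
    decision = None
    for rule in rules:
        if (_matches(rule.source_ip, source_ip)
                and _matches(rule.dest_ip, dest_ip)
                and _matches(rule.protocol, protocol)
                and _matches(rule.port, port)):
            decision = rule.permission
    return decision


# The three rules from the example table, in the same order.
RULES = [
    AclRule("DENY", "192.168.1.2", "192.168.1.3", "HTTPS", "443"),
    AclRule("ALLOW", "192.168.1.4", "ANY", "SMTP", "25"),
    AclRule("DENY", "ANY", "192.168.1.5", "ANY", "ANY"),
]
```

Under this reading, SMTP traffic from 192.168.1.4 to 192.168.1.5 matches both the second and third rules, and the later DENY takes effect.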
+
+Observation Spaces
+******************
+
+The OpenAI Gym observation space provides the status of all nodes and links across the whole system:
+
+* Nodes (in terms of operating state, operating system state, file system state and services state) 
+* Links (in terms of current loading for each service/protocol)
+
+An example observation space is provided below:
+
+.. list-table:: Observation Space example
+   :widths: 25 25 25 25 25 25 25
+   :header-rows: 1
+
+   * - 
+     - ID
+     - Operating State
+     - O/S State
+     - File System State
+     - Service / Protocol A
+     - Service / Protocol B
+   * - Node A
+     - 1
+     - 1
+     - 1
+     - 1
+     - 1
+     - 1
+   * - Node B
+     - 2
+     - 1
+     - 3
+     - 1
+     - 1
+     - 1
+   * - Node C
+     - 3
+     - 2
+     - 1
+     - 1
+     - 3
+     - 2
+   * - Link 1
+     - 5
+     - 0
+     - 0
+     - 0
+     - 0
+     - 10000
+   * - Link 2
+     - 6
+     - 0
+     - 0
+     - 0
+     - 0
+     - 10000
+   * - Link 3
+     - 7
+     - 0
+     - 0
+     - 0
+     - 5000
+     - 0
+
+The observation space is a 6 x 6 Box type (OpenAI Gym Space) in this example. This is made up from the node and link information detailed below.
+
+For the nodes, the following values are represented:
+
+ * ID
+ * Operating State:
+
+    * 1 = ON
+    * 2 = OFF
+    * 3 = RESETTING
+
+ * O/S State:
+
+    * 1 = GOOD
+    * 2 = PATCHING
+    * 3 = COMPROMISED
+
+ * Service State:
+
+    * 1 = GOOD
+    * 2 = PATCHING
+    * 3 = COMPROMISED
+    * 4 = OVERWHELMED
+
+ * File System State:
+
+    * 1 = GOOD
+    * 2 = CORRUPT
+    * 3 = DESTROYED
+    * 4 = REPAIRING
+    * 5 = RESTORING
+
+(Note that each service available in the network is provided as a column, although not all nodes may utilise all services)
+
+For the links, the following statuses are represented:
+
+ * ID
+ * Operating State = N/A
+ * O/S State = N/A
+ * Protocol = loading in bits/s
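The encoding above can be illustrated by rebuilding rows of the example table from named states. This is a sketch using plain Python lists rather than a Gym `Box`; the `node_row` and `link_row` helpers are hypothetical, and filling the not-applicable link columns with 0 follows the example table rather than any stated rule.

```python
OPERATING = {"ON": 1, "OFF": 2, "RESETTING": 3}
OS = {"GOOD": 1, "PATCHING": 2, "COMPROMISED": 3}
FILE_SYSTEM = {"GOOD": 1, "CORRUPT": 2, "DESTROYED": 3, "REPAIRING": 4, "RESTORING": 5}
SERVICE = {"GOOD": 1, "PATCHING": 2, "COMPROMISED": 3, "OVERWHELMED": 4}


def node_row(node_id, operating, os_state, fs_state, service_states):
    """Encode one node: ID, three state columns, then one column per service."""
    return ([node_id, OPERATING[operating], OS[os_state], FILE_SYSTEM[fs_state]]
            + [SERVICE[s] for s in service_states])


def link_row(link_id, loads):
    """Encode one link: ID, zeros for the N/A columns, then load per protocol."""
    return [link_id, 0, 0, 0] + list(loads)


observation = [
    node_row(1, "ON", "GOOD", "GOOD", ["GOOD", "GOOD"]),
    node_row(2, "ON", "COMPROMISED", "GOOD", ["GOOD", "GOOD"]),
    node_row(3, "OFF", "GOOD", "GOOD", ["COMPROMISED", "PATCHING"]),
    link_row(5, [0, 10000]),
    link_row(6, [0, 10000]),
    link_row(7, [5000, 0]),
]
```

Each row has four fixed columns plus one per service, which gives the 6 x 6 shape of the example when two services are modelled.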
+
+Action Spaces
+**************
+
+The action space available to the blue agent comes in two types:
+
+ 1. Node-based
+ 2. Access Control List
+
+The choice of action space used during a training session is determined in the config_[name].yaml file.
+
+**Node-Based**
+
+The agent is able to influence the status of nodes by switching them off, resetting, or patching operating systems and services. In this instance, the action space is an OpenAI Gym multidiscrete type, as follows:
+
+ * [0, num nodes] - Node ID (0 = nothing, node ID)
+ * [0, 4] - What property it's acting on (0 = nothing, 1 = state, 2 = O/S state, 3 = service state, 4 = file system state)
+ * [0, 3] - Action on property (0 = nothing, 1 = on / scan, 2 = off / repair, 3 = reset / patch / restore)
+ * [0, num services] - Resolves to service ID (0 = nothing, resolves to service)
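To make the four components concrete, the sketch below decodes one node-based action into readable form. The `decode_node_action` function is hypothetical, and it assumes that a non-zero index `i` refers to entry `i - 1` of the node and service lists; the actual mapping used by PrimAITE is not given here.

```python
PROPERTIES = ["nothing", "state", "O/S state", "service state", "file system state"]
ACTIONS = ["nothing", "on / scan", "off / repair", "reset / patch / restore"]


def decode_node_action(action, node_ids, services):
    """Translate a 4-element multidiscrete action into named fields."""
    node_index, prop, act, service_index = action
    return {
        # Index 0 means "nothing"; other indices are assumed to be 1-based.
        "node": None if node_index == 0 else node_ids[node_index - 1],
        "property": PROPERTIES[prop],
        "action": ACTIONS[act],
        "service": None if service_index == 0 else services[service_index - 1],
    }


decoded = decode_node_action((2, 3, 3, 1), node_ids=[10, 20, 30], services=["TCP", "UDP"])
print(decoded)
```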
+
+**Access Control List**
+
+The blue agent is able to influence the configuration of the Access Control List rule set (which implements a system-wide firewall). In this instance, the action space is an OpenAI Gym multidiscrete type, as follows:
+
+
+ * [0, 2] - Action (0 = do nothing, 1 = create rule, 2 = delete rule)
+ * [0, 1] - Permission (0 = DENY, 1 = ALLOW)
+ * [0, num nodes] - Source IP (0 = any, then 1 -> x resolving to IP addresses)
+ * [0, num nodes] - Dest IP (0 = any, then 1 -> x resolving to IP addresses)
+ * [0, num services] - Protocol (0 = any, then 1 -> x resolving to protocol)
+ * [0, num ports] - Port (0 = any, then 1 -> x resolving to port)
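A six-element ACL action can be decoded in the same spirit. The sketch below is illustrative only: `decode_acl_action` is a hypothetical name, and it assumes index 0 means ANY while index `i` selects entry `i - 1` of the corresponding list.

```python
ACL_ACTIONS = ["do nothing", "create rule", "delete rule"]
PERMISSIONS = ["DENY", "ALLOW"]


def _pick(index, values):
    # Assumption: 0 = ANY, then 1..x index into the list.
    return "ANY" if index == 0 else values[index - 1]


def decode_acl_action(action, ips, protocols, ports):
    """Translate a 6-element multidiscrete action into an ACL rule request."""
    kind, permission, source, dest, protocol, port = action
    return {
        "action": ACL_ACTIONS[kind],
        "permission": PERMISSIONS[permission],
        "source_ip": _pick(source, ips),
        "dest_ip": _pick(dest, ips),
        "protocol": _pick(protocol, protocols),
        "port": _pick(port, ports),
    }


request = decode_acl_action(
    (1, 0, 2, 0, 1, 1),
    ips=["192.168.1.2", "192.168.1.3"],
    protocols=["HTTPS"],
    ports=["443"],
)
print(request)
```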
+
+Rewards
+*******
+
+A reward value is presented back to the blue agent on the conclusion of every step. The reward value is calculated via two methods which combine to give the total value:
+
+ 1. Node and service status
+ 2. IER status
+
+**Node and service status**
+
+On every step, the status of each node is compared against both a reference environment (simulating the situation if the red and blue agents had not impacted the environment) 
+and the before and after state of the environment. If the comparison against the reference environment shows no difference, then the score provided is "allOk". If there is a 
+difference with respect to the reference environment, the before and after states are compared, and a score determined. See :ref:`config` for details of reward values.
+
+**IER status**
+
+On every step, the full IER set is examined to determine whether green and red agent IERs are being permitted to run. Any red agent IERs running incur a penalty; any green agent
+IERs not permitted to run also incur a penalty. See :ref:`config` for details of reward values.
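The two reward components can be sketched as follows. This is a simplified reading, not the project's reward function: the helper names and the numeric values are hypothetical, the key format is inferred from names such as `offShouldBeOn` in the configuration reference, and the before/after comparison described above is left out.

```python
def node_reward_key(current, reference):
    """Name the reward entry for a state compared with the reference environment."""
    if current == reference:
        return "allOk"
    # e.g. current OFF, reference ON -> "offShouldBeOn"
    return f"{current.lower()}ShouldBe{reference.capitalize()}"


def ier_reward(red_running, green_blocked, rewards):
    """Penalise each running red IER and each blocked green IER."""
    return (red_running * rewards["redIerRunning"]
            + green_blocked * rewards["greenIerBlocked"])


# Hypothetical reward values, for illustration only.
REWARDS = {
    "allOk": 0,
    "offShouldBeOn": -10,
    "redIerRunning": -5,
    "greenIerBlocked": -10,
}

step_reward = (REWARDS[node_reward_key("OFF", "ON")]
               + ier_reward(red_running=2, green_blocked=1, rewards=REWARDS))
print(step_reward)
```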
+
+Future Enhancements
+*******************
+
+The PrimAITE project has an ambition to include the following enhancements in future releases:
+
+* Integration with a suitable standardised framework to allow multi-agent integration
+* Integration with external threat emulation tools, either using off-line data, or integrating at runtime
+* Provision of data such that agents can construct alternative observation spaces (as an alternative to the default PrimAITE observation space)
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -0,0 +1,28 @@
+# Configuration file for the Sphinx documentation builder.
+#
+# For the full list of built-in configuration values, see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+
+# -- Project information -----------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
+
+project = 'PrimAITE'
+copyright = '2022, jashort'
+author = 'jashort'
+release = '0.1.0'
+
+# -- General configuration ---------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
+
+extensions = ['sphinx_rtd_theme']
+
+templates_path = ['_templates']
+exclude_patterns = []
+
+
+
+# -- Options for HTML output -------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+
+html_theme = 'sphinx_rtd_theme'
+html_static_path = ['_static']
--- a/docs/source/config.rst
+++ b/docs/source/config.rst
@@ -0,0 +1,397 @@
+.. _config:
+
+The Config Files Explained
+==========================
+
+PrimAITE uses two configuration files for its operation:
+
+* config_main.yaml - used to define the top-level settings of the PrimAITE environment, and the session that is to be run.
+* config_[name].yaml - used to define the low-level settings of a session, including the network laydown, green / red agent information exchange requirements (IERs), Access Control Rules, Action Space type, and the number of steps in each episode.
+
+config_main.yaml:
+*****************
+
+The config_main.yaml file consists of the following attributes:
+
+**Generic Config Values**
+
+* **agentIdentifier** [enum]
+
+   This identifies the agent to use for the session. Select from one of the following:
+
+   * GENERIC - Use a user-developed agent
+   * STABLE_BASELINES3_PPO - Use an SB3 PPO agent
+   * STABLE_BASELINES3_A2C - Use an SB3 A2C agent
+
+* **numEpisodes** [int]
+
+   This defines the number of episodes that the agent will train or be evaluated over. Each episode consists of a number of steps (with step number defined in the config_[name].yaml file)
+
+* **timeDelay** [int]
+
+   The time delay (in milliseconds) to take between each step when running a GENERIC agent session
+
+* **configFilename** [filename]
+
+   The name of the config_[name].yaml file to use for this session
+
+* **sessionType** [text]
+
+   Type of session to be run (TRAINING or EVALUATION)
+
+* **loadAgent** [bool]
+
+   Determine whether to load an agent from file
+
+* **agentLoadFile** [text]
+
+   File path and file name of the agent to load
+
+* **observationSpaceHighValue** [int]
+
+   The high value to use for values in the observation space. This is set to 1000000000 by default, and should not need changing in most cases
+
+**Reward-Based Config Values**
+
+* **Generic [allOk]** [int]
+
+   The score to give when the current situation (for a given component) is no different from that expected in the baseline (i.e. as though no blue or red agent actions had been undertaken)
+
+* **Node Operating State [offShouldBeOn]** [int]
+
+   The score to give when the node should be on, but is off
+
+* **Node Operating State [offShouldBeResetting]** [int]
+
+   The score to give when the node should be resetting, but is off
+
+* **Node Operating State [onShouldBeOff]** [int]
+    
+   The score to give when the node should be off, but is on
+
+* **Node Operating State [onShouldBeResetting]** [int]
+    
+   The score to give when the node should be resetting, but is on
+
+* **Node Operating State [resettingShouldBeOn]** [int]
+    
+   The score to give when the node should be on, but is resetting
+
+* **Node Operating State [resettingShouldBeOff]** [int]
+    
+   The score to give when the node should be off, but is resetting
+
+* **Node Operating State [resetting]** [int]
+    
+   The score to give when the node is resetting
+
+* **Node Operating System or Service State [goodShouldBePatching]** [int]
+    
+   The score to give when the state should be patching, but is good
+
+* **Node Operating System or Service State [goodShouldBeCompromised]** [int]
+    
+   The score to give when the state should be compromised, but is good
+
+* **Node Operating System or Service State [goodShouldBeOverwhelmed]** [int]
+    
+   The score to give when the state should be overwhelmed, but is good
+
+* **Node Operating System or Service State [patchingShouldBeGood]** [int]
+    
+   The score to give when the state should be good, but is patching
+
+* **Node Operating System or Service State [patchingShouldBeCompromised]** [int]
+    
+   The score to give when the state should be compromised, but is patching
+
+* **Node Operating System or Service State [patchingShouldBeOverwhelmed]** [int]
+    
+   The score to give when the state should be overwhelmed, but is patching
+
+* **Node Operating System or Service State [patching]** [int]
+    
+   The score to give when the state is patching
+
+* **Node Operating System or Service State [compromisedShouldBeGood]** [int]
+    
+   The score to give when the state should be good, but is compromised
+
+* **Node Operating System or Service State [compromisedShouldBePatching]** [int]
+    
+   The score to give when the state should be patching, but is compromised
+
+* **Node Operating System or Service State [compromisedShouldBeOverwhelmed]** [int]
+    
+   The score to give when the state should be overwhelmed, but is compromised
+
+* **Node Operating System or Service State [compromised]** [int]
+    
+   The score to give when the state is compromised
+
+* **Node Operating System or Service State [overwhelmedShouldBeGood]** [int]
+    
+   The score to give when the state should be good, but is overwhelmed
+
+* **Node Operating System or Service State [overwhelmedShouldBePatching]** [int]
+    
+   The score to give when the state should be patching, but is overwhelmed
+
+* **Node Operating System or Service State [overwhelmedShouldBeCompromised]** [int]
+    
+   The score to give when the state should be compromised, but is overwhelmed
+
+* **Node Operating System or Service State [overwhelmed]** [int]
+    
+   The score to give when the state is overwhelmed
+
+* **Node File System State [goodShouldBeRepairing]** [int]
+
+    The score to give when the state should be repairing, but is good
+
+* **Node File System State [goodShouldBeRestoring]** [int]
+
+    The score to give when the state should be restoring, but is good
+
+* **Node File System State [goodShouldBeCorrupt]** [int]
+
+    The score to give when the state should be corrupt, but is good
+
+* **Node File System State [goodShouldBeDestroyed]** [int]
+
+    The score to give when the state should be destroyed, but is good
+
+* **Node File System State [repairingShouldBeGood]** [int]
+
+    The score to give when the state should be good, but is repairing
+
+* **Node File System State [repairingShouldBeRestoring]** [int]
+
+    The score to give when the state should be restoring, but is repairing
+
+* **Node File System State [repairingShouldBeCorrupt]** [int]
+
+    The score to give when the state should be corrupt, but is repairing
+
+* **Node File System State [repairingShouldBeDestroyed]** [int]
+
+    The score to give when the state should be destroyed, but is repairing
+
+* **Node File System State [repairing]** [int]
+
+    The score to give when the state is repairing
+
+* **Node File System State [restoringShouldBeGood]** [int]
+
+    The score to give when the state should be good, but is restoring
+
+* **Node File System State [restoringShouldBeRepairing]** [int]
+
+    The score to give when the state should be repairing, but is restoring
+
+* **Node File System State [restoringShouldBeCorrupt]** [int]
+
+    The score to give when the state should be corrupt, but is restoring
+
+* **Node File System State [restoringShouldBeDestroyed]** [int]
+
+    The score to give when the state should be destroyed, but is restoring
+
+* **Node File System State [restoring]** [int]
+
+    The score to give when the state is restoring
+
+* **Node File System State [corruptShouldBeGood]** [int]
+
+    The score to give when the state should be good, but is corrupt
+
+* **Node File System State [corruptShouldBeRepairing]** [int]
+
+    The score to give when the state should be repairing, but is corrupt
+
+* **Node File System State [corruptShouldBeRestoring]** [int]
+
+    The score to give when the state should be restoring, but is corrupt
+
+* **Node File System State [corruptShouldBeDestroyed]** [int]
+
+    The score to give when the state should be destroyed, but is corrupt
+
+* **Node File System State [corrupt]** [int]
+
+    The score to give when the state is corrupt
+
+* **Node File System State [destroyedShouldBeGood]** [int]
+
+    The score to give when the state should be good, but is destroyed
+
+* **Node File System State [destroyedShouldBeRepairing]** [int]
+
+    The score to give when the state should be repairing, but is destroyed
+
+* **Node File System State [destroyedShouldBeRestoring]** [int]
+
+    The score to give when the state should be restoring, but is destroyed
+
+* **Node File System State [destroyedShouldBeCorrupt]** [int]
+
+    The score to give when the state should be corrupt, but is destroyed
+
+* **Node File System State [destroyed]** [int]
+
+    The score to give when the state is destroyed
+
+* **Node File System State [scanning]** [int]
+
+    The score to give when the state is scanning
+
+* **IER Status [redIerRunning]** [int]
+    
+   The score to give when a red agent IER is permitted to run
+
+* **IER Status [greenIerBlocked]** [int]
+    
+   The score to give when a green agent IER is prevented from running
+
+**Patching / Reset Durations**
+
+* **osPatchingDuration** [int]
+
+   The number of steps to take when patching an Operating System
+
+* **nodeResetDuration** [int]
+   
+   The number of steps to take when resetting a node's operating state
+
+* **servicePatchingDuration** [int]
+   
+   The number of steps to take when patching a service
+
+* **fileSystemRepairingLimit** [int]
+
+   The number of steps to take when repairing the file system
+
+* **fileSystemRestoringLimit** [int]
+
+   The number of steps to take when restoring the file system
+
+* **fileSystemScanningLimit** [int]
+
+   The number of steps to take when scanning the file system
+
+config_[name].yaml:
+*******************
+
+The config_[name].yaml file consists of the following attributes:
+
+* **itemType: ACTIONS** [enum]
+   
+   Determines whether a NODE or ACL action space format is adopted for the session
+
+* **itemType: STEPS** [int]
+    
+   Determines the number of steps to run in each episode of the session
+
+* **itemType: PORTS** [int]
+   
+   Provides a list of ports modelled in this session
+
+* **itemType: SERVICES** [freetext]
+   
+   Provides a list of services modelled in this session
+
+* **itemType: NODE**
+    
+   Defines a node included in the system laydown being simulated. It should consist of the following attributes:
+
+     * **id** [int]: Unique ID for this YAML item
+     * **name** [freetext]: Human-readable name of the component
+     * **baseType** [enum]: Relates to the base type of the node. Can be SERVICE, ACTIVE or PASSIVE. PASSIVE nodes do not have an operating system or services. ACTIVE nodes have an operating system, but no services. SERVICE nodes have both an operating system and one or more services
+     * **nodeType** [enum]: Relates to the component type. Can be one of CCTV, SWITCH, COMPUTER, LINK, MONITOR, PRINTER, LOP, RTU, ACTUATOR or SERVER
+     * **priority** [enum]: Provides a priority for each node. Can be one of P1, P2, P3, P4 or P5 (with P1 being the highest)
+     * **hardwareState** [enum]: The initial hardware state of the node. Can be one of ON, OFF or RESETTING
+     * **ipAddress** [IP address]: The IP address of the component in format xxx.xxx.xxx.xxx
+     * **softwareState** [enum]: The initial state of the node operating system. Can be GOOD, PATCHING or COMPROMISED
+     * **fileSystemState** [enum]: The initial state of the node file system. Can be GOOD, CORRUPT, DESTROYED, REPAIRING or RESTORING
+     * **services**: For each service associated with the node:
+
+        * **name** [freetext]: Free-text name of the service, but must match one of the services defined for the system in the services list
+        * **port** [int]: Integer value of the port related to this service, but must match one of the ports defined for the system in the ports list
+        * **state** [enum]: The initial state of the service. Can be one of GOOD, PATCHING, COMPROMISED or OVERWHELMED
+     
+* **itemType: LINK**
+   
+   Defines a link included in the system laydown being simulated. It should consist of the following attributes:
+
+     * **id** [int]: Unique ID for this YAML item
+     * **name** [freetext]: Human-readable name of the component
+     * **bandwidth** [int]: The bandwidth (in bits/s) of the link
+     * **source** [int]: The ID of the source node
+     * **destination** [int]: The ID of the destination node
+
+* **itemType: GREEN_IER**
+
+   Defines a green agent Information Exchange Requirement (IER). It should consist of:
+
+     * **id** [int]: Unique ID for this YAML item
+     * **startStep** [int]: The start step (in the episode) for this IER to begin
+     * **endStep** [int]: The end step (in the episode) for this IER to finish
+     * **load** [int]: The load (in bits/s) for this IER to apply to links
+     * **protocol** [freetext]: The protocol to apply to the links. This must match a value in the services list
+     * **port** [int]: The port that the protocol is running on. This must match a value in the ports list
+     * **source** [int]: The ID of the source node
+     * **destination** [int]: The ID of the destination node
+     * **missionCriticality** [enum]: The mission criticality of this IER (with 5 being highest, 1 lowest)
+
+* **itemType: RED_IER**
+    
+   Defines a red agent Information Exchange Requirement (IER). It should consist of:
+
+     * **id** [int]: Unique ID for this YAML item
+     * **startStep** [int]: The start step (in the episode) for this IER to begin
+     * **endStep** [int]: The end step (in the episode) for this IER to finish
+     * **load** [int]: The load (in bits/s) for this IER to apply to links
+     * **protocol** [freetext]: The protocol to apply to the links. This must match a value in the services list
+     * **port** [int]: The port that the protocol is running on. This must match a value in the ports list
+     * **source** [int]: The ID of the source node
+     * **destination** [int]: The ID of the destination node
+     * **missionCriticality** [enum]: Not currently used. Default to 0
+
+* **itemType: GREEN_POL**
+     
+    Defines a green agent pattern-of-life instruction. It should consist of:
+
+      * **id** [int]: Unique ID for this YAML item
+      * **startStep** [int]: The start step (in the episode) for this PoL to begin
+      * **endStep** [int]: Not currently used. Default to same as start step
+      * **nodeId** [int]: The ID of the node to apply the PoL to
+      * **type** [enum]: The type of PoL to apply. Can be one of OPERATING, OS or SERVICE
+      * **protocol** [freetext]: The protocol to be affected if SERVICE type is chosen. Must match a value in the services list
+      * **state** [enum]: The state to apply to the node (which represents the PoL change). Can be one of ON, OFF or RESETTING (for node state) or GOOD, PATCHING or COMPROMISED (for operating system state) or GOOD, PATCHING, COMPROMISED or OVERWHELMED (for service state)
+
+* **itemType: RED_POL**
+     
+    Defines a red agent pattern-of-life instruction. It should consist of:
+
+      * **id** [int]: Unique ID for this YAML item
+      * **startStep** [int]: The start step (in the episode) for this PoL to begin
+      * **endStep** [int]: Not currently used. Default to same as start step
+      * **targetNodeId** [int]: The ID of the node to apply the PoL to
+      * **initiator** [enum]: What initiates the PoL. Can be DIRECT, IER or SERVICE
+      * **type** [enum]: The type of PoL to apply. Can be one of OPERATING, OS or SERVICE
+      * **protocol** [freetext]: The protocol to be affected if SERVICE type is chosen. Must match a value in the services list
+      * **state** [enum]: The state to apply to the node (which represents the PoL change). Can be one of ON, OFF or RESETTING (for node state) or GOOD, PATCHING or COMPROMISED (for operating system state) or GOOD, PATCHING, COMPROMISED or OVERWHELMED (for service state) or GOOD, CORRUPT, DESTROYED, REPAIRING or RESTORING (for file system state)
+      * **sourceNodeId** [int]: The ID of the source node containing the service to check (used for SERVICE initiator)
+      * **sourceNodeService** [freetext]: The service on the source node to check (used for SERVICE initiator). Must match a value in the services list for this node
+      * **sourceNodeServiceState** [enum]: The state of the source node service to check (used for SERVICE initiator). Can be one of GOOD, PATCHING, COMPROMISED or OVERWHELMED
+
+* **itemType: ACL_RULE**
+     
+    Defines an initial Access Control List (ACL) rule. It should consist of:
+
+      * **id** [int]: Unique ID for this YAML item
+      * **permission** [enum]: Defines either an allow or deny rule. Value must be either DENY or ALLOW
+      * **source** [IP address]: Defines the source IP address for the rule in xxx.xxx.xxx.xxx format
+      * **destination** [IP address]: Defines the destination IP address for the rule in xxx.xxx.xxx.xxx format
+      * **protocol** [freetext]: Defines the protocol for the rule. Must match a value in the services list
+      * **port** [int]: Defines the port for the rule. Must match a value in the ports list
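
The field constraints described above can be checked before a laydown is loaded. The following is a minimal sketch, not PrimAITE's own validation code: the function names, the example ``SERVICES`` and ``PORTS`` values and the sample rule are all hypothetical, and only the constraints stated in the ACL_RULE and GREEN_POL descriptions are enforced.

```python
import ipaddress

# Assumed example values; in practice these would come from the services
# and ports lists in the configuration.
SERVICES = ["TCP", "UDP"]
PORTS = [80, 53]

# Valid states per PoL type, as listed in the GREEN_POL description.
POL_STATES = {
    "OPERATING": {"ON", "OFF", "RESETTING"},
    "OS": {"GOOD", "PATCHING", "COMPROMISED"},
    "SERVICE": {"GOOD", "PATCHING", "COMPROMISED", "OVERWHELMED"},
}


def check_acl_rule(item, services=SERVICES, ports=PORTS):
    """Return a list of problems found in an ACL_RULE item (empty if none)."""
    problems = []
    if item.get("permission") not in ("ALLOW", "DENY"):
        problems.append("permission must be ALLOW or DENY")
    for key in ("source", "destination"):
        try:
            # Accepts only dotted-quad addresses (xxx.xxx.xxx.xxx).
            ipaddress.IPv4Address(str(item.get(key)))
        except ValueError:
            problems.append(f"{key} is not a valid IPv4 address")
    if item.get("protocol") not in services:
        problems.append("protocol is not in the services list")
    if item.get("port") not in ports:
        problems.append("port is not in the ports list")
    return problems


def check_pol_state(pol_type, state):
    """Return True if the state is valid for the given GREEN_POL type."""
    return state in POL_STATES.get(pol_type, set())


# Hypothetical ACL_RULE item, shown as the dictionary a YAML loader would yield.
rule = {
    "itemType": "ACL_RULE",
    "id": 1,
    "permission": "ALLOW",
    "source": "192.168.1.2",
    "destination": "192.168.1.3",
    "protocol": "TCP",
    "port": 80,
}
print(check_acl_rule(rule))
```

Returning a list of problems rather than raising on the first one lets every issue in an item be reported in a single pass.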
--- a/docs/source/dependencies.rst
+++ b/docs/source/dependencies.rst
@@ -0,0 +1,26 @@
+.. _dependencies:
+
+PrimAITE Dependencies
+=====================
+
+PrimAITE is built with the following versions of dependencies:
+
+* Python 3.10.9
+* PyYAML 6.0
+* numpy 1.23.5
+* networkx 2.8.8
+* gym 0.21.0
+* matplotlib 3.6.2 
+* stable_baselines_3 1.6.2
+
+The latest release of PrimAITE has been tested against the following versions of dependencies:
+
+* Python 3.10.9
+* PyYAML 6.0
+* numpy 1.23.5
+* networkx 2.8.8
+* gym 0.21.0
+* matplotlib 3.6.2 
+* stable_baselines_3 1.6.2
+
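
As a rough way to compare a local environment with the versions listed above, the installed package versions can be read with the standard library. This is a sketch only: the distribution names used as keys (for example ``stable-baselines3``) are assumptions about how each package is published, and ``compare_versions`` is a hypothetical helper, not part of PrimAITE.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the list above; the distribution names are assumed.
EXPECTED = {
    "PyYAML": "6.0",
    "numpy": "1.23.5",
    "networkx": "2.8.8",
    "gym": "0.21.0",
    "matplotlib": "3.6.2",
    "stable-baselines3": "1.6.2",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected, installed); installed is None if absent."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions().items():
        flag = "ok" if installed == wanted else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({flag})")
```

A mismatch does not necessarily mean PrimAITE will fail, only that the environment differs from the tested configuration.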
+
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -0,0 +1,42 @@
+.. PrimAITE documentation master file, created by
+   sphinx-quickstart on Thu Dec  8 09:51:18 2022.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+
+Welcome to PrimAITE's documentation
+====================================
+
+What is PrimAITE?
+------------------------
+
+PrimAITE (Primary-level AI Training Environment) is a simulation environment for training AI under the ARCD programme. It incorporates the functionality required of a Primary-level environment, as specified in the Dstl ARCD Training Environment Matrix document:
+
+* The ability to model a relevant platform / system context; 
+* The ability to model key characteristics of a platform / system by representing connections, IP addresses, ports, traffic loading, operating systems, file system, services and processes; 
+* Operates at machine-speed to enable fast training cycles. 
+
+PrimAITE aims to evolve into an ARCD environment that could be used as the follow-on from Reception-level approaches (e.g. YAWNING TITAN), and help bridge the Sim-to-Real gap into Secondary-level environments (e.g. IMAGINARY YAK).
+
+This is similar to the approach taken by FVEY international partners (e.g. AUS CyBORG, US NSA FARLAND and CAN CyGil). These environments are referenced by the Dstl ARCD Agent Training Environments Knowledge Transfer document (TR141342).
+
+What is PrimAITE built with?
+--------------------------------------
+
+* `OpenAI's Gym <https://gym.openai.com/>`_ is used as the basis for AI blue agent interaction with the PrimAITE environment
+* `Networkx <https://github.com/networkx/networkx>`_ is used as the underlying data structure for the PrimAITE environment
+* `Stable Baselines 3 <https://github.com/DLR-RM/stable-baselines3>`_ is used as a default source of RL algorithms (although PrimAITE is not limited to SB3 agents)
+
+Where next?
+------------
+
+The best place to start is :ref:`about`
+
+.. toctree::
+   :maxdepth: 8
+   :caption: Contents:
+
+   about
+   dependencies
+   config
+   session
+   results
--- a/docs/source/results.rst
+++ b/docs/source/results.rst
@@ -0,0 +1,42 @@
+.. _results:
+
+Results, Output and Logging from PrimAITE
+=========================================
+
+PrimAITE produces four types of data:
+
+* Outputs - Results
+* Outputs - Diagrams
+* Outputs - Saved agents
+* Logging
+
+Outputs can be found in the *[Install Directory]\\Primaite\\Primaite\\outputs* directory
+
+Logging can be found in the *[Install Directory]\\Primaite\\Primaite\\logs* directory
+
+**Outputs - Results**
+
+PrimAITE automatically creates two sets of results from each session, and stores them in the *Results* folder:
+
+* Average reward per episode - a csv file listing the average reward for each episode of the session. This provides, for example, an indication of the change over a training session of the reward value
+* All transactions - a csv file listing the following values for every step of every episode:
+
+	* Timestamp
+	* Episode number
+	* Step number
+	* Initial observation space (before red and blue agent actions have been taken). Individual elements of the observation space are presented in the format OSI_X_Y
+	* Resulting observation space (after the red and blue agent actions have been taken). Individual elements of the observation space are presented in the format OSN_X_Y
+	* Reward value
+	* Action space (as presented by the blue agent on this step). Individual elements of the action space are presented in the format AS_X
+
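
The column naming described above makes it straightforward to separate the observation and action values when post-processing a transactions file. The sketch below uses a small made-up sample; only the ``OSI_``, ``OSN_`` and ``AS_`` prefixes come from the description, while the remaining header names, the sample values and the helper functions are assumptions.

```python
import csv
import io

# Two made-up rows standing in for an "All transactions" file. The header
# names other than the OSI_/OSN_/AS_ columns are assumed.
SAMPLE = """timestamp,episode,step,OSI_0_0,OSI_0_1,OSN_0_0,OSN_0_1,reward,AS_0
2023-05-25 10:31:37,1,1,1,1,1,2,-0.5,0
2023-05-25 10:31:38,1,2,1,2,1,1,0.0,3
"""


def split_row(row):
    """Group one CSV row into initial obs, resulting obs and action values."""

    def pick(prefix):
        return {k: v for k, v in row.items() if k.startswith(prefix)}

    return {
        "initial": pick("OSI_"),
        "resulting": pick("OSN_"),
        "action": pick("AS_"),
    }


def load_transactions(handle):
    """Read every row from an open CSV file handle and split it by prefix."""
    return [split_row(row) for row in csv.DictReader(handle)]


rows = load_transactions(io.StringIO(SAMPLE))
print(rows[0]["initial"])
```

To read a real file, pass ``open(path, newline="")`` instead of the in-memory sample.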
+**Outputs - Diagrams**
+
+For each session, PrimAITE automatically creates a visualisation of the system / network laydown configuration, and stores it in the *Diagrams* folder.
+
+**Outputs - Saved agents**
+
+For each training session, assuming the agent being trained implements the *save()* function and this function is called by the code, PrimAITE automatically saves the agent state and stores it in the *agents* folder.
+
+**Logging**
+
+PrimAITE also provides output logs (for diagnosis) using the Python Logging package. These can be found in the *[Install Directory]\\Primaite\\Primaite\\logs* directory
--- a/docs/source/session.rst
+++ b/docs/source/session.rst
@@ -0,0 +1,88 @@
+.. _session:
+
+Running a PrimAITE Training or Evaluation Session
+=================================================
+
+The application will determine whether a Training or Evaluation session is being executed via the 'sessionType' value in the config_main.yaml file. A PrimAITE session will usually be associated with a "Use Case Profile"; this document will present:
+
+* The Use Case name, default number of steps in an episode and default number of episodes in a session. The number of steps and episodes can be modified in the configuration files
+* The system laydown being modelled
+* The objectives of the session (steady-state), the red agent and the blue agent (in a defensive role)
+* The green agent pattern-of-life profile
+* The red agent attack profile
+* The observation space definition
+* The action space definition
+* Agent integration guidance
+* Initial Access Control List settings (if applicable)
+* The reward function definition
+
+**Integrating a user defined blue agent**
+
+Integrating a blue agent with PrimAITE requires some modification of the code within the main.py file. The main.py file consists of a number of functions, each of which will invoke training for a particular agent. These are:
+
+* Generic (run_generic)
+* Stable Baselines 3 PPO (run_stable_baselines3_ppo)
+* Stable Baselines 3 A2C (run_stable_baselines3_a2c)
+
+The selection of which agent type to use is made via the config_main.yaml file. In order to train a user generated agent, 
+the run_generic function should be selected, and should be modified (typically) to be:
+
+.. code:: python
+
+    agent = MyAgent(environment, max_steps)
+    for episode in range(num_episodes):
+        agent.learn()
+    environment.close()
+    save_agent(agent)
+
+Where:
+
+* *MyAgent* is the user created agent
+* *environment* is the PrimAITE environment
+* *max_steps* is the number of steps in an episode, as defined in the config_[name].yaml file
+* *num_episodes* is the number of episodes in the session, as defined in the config_main.yaml file
+* the *.learn()* function should be defined in the user created agent
+* the *environment.close()* function is defined within PrimAITE
+* the *save_agent()* call assumes that a *save()* function has been defined in the user created agent. If not, this line can be omitted (although saving is encouraged, since it allows the agent to be saved and ported)
+
+The code below provides a suggested format for the learn() function within the user created agent.
+It's important to include the *self.environment.reset()* call within the episode loop in order that the 
+environment is reset between episodes. Note that the example below should not be considered exhaustive.
+
+.. code:: python
+
+    def learn(self):
+        # pre-reqs
+
+        # reset the environment and capture the initial state
+        state = self.environment.reset()
+        done = False
+
+        # assumes the constructor stored max_steps as self.max_steps
+        for step in range(self.max_steps):
+            # calculate the action
+            action = ...
+
+            # execute the environment step
+            new_state, reward, done, info = self.environment.step(action)
+
+            # algorithm updates
+            ...
+
+            # update to our new state
+            state = new_state
+
+            # if done, finish episode
+            if done:
+                break
+
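
To show the session loop and the learn() function working together, here is a self-contained sketch that runs without PrimAITE installed. ``StubEnvironment`` is a hypothetical stand-in that only mimics the reset()/step()/close() convention used above; it is not the PrimAITE environment, and the random action choice is a placeholder for a real algorithm.

```python
import random


class StubEnvironment:
    """Hypothetical stand-in for the PrimAITE environment (not its real API).

    step() returns (new_state, reward, done, info), matching the snippet above.
    """

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.step_count = 0

    def reset(self):
        self.step_count = 0
        return 0  # initial state

    def step(self, action):
        self.step_count += 1
        done = self.step_count >= self.episode_length
        reward = 1.0 if action == 1 else 0.0
        return self.step_count, reward, done, {}

    def close(self):
        pass


class MyAgent:
    """Hypothetical agent that picks random actions and tracks total reward."""

    def __init__(self, environment, max_steps):
        self.environment = environment
        self.max_steps = max_steps
        self.total_reward = 0.0
        self.steps_taken = 0

    def learn(self):
        # Resetting here means every episode starts from a clean environment.
        state = self.environment.reset()
        for _ in range(self.max_steps):
            action = random.choice([0, 1])
            new_state, reward, done, info = self.environment.step(action)
            self.total_reward += reward  # placeholder for algorithm updates
            self.steps_taken += 1
            state = new_state
            if done:
                break


environment = StubEnvironment(episode_length=5)
agent = MyAgent(environment, max_steps=10)
for episode in range(3):
    agent.learn()
environment.close()
print(agent.steps_taken, agent.total_reward)
```

Because the stub ends each episode after five steps, the loop exits through the ``done`` check well before ``max_steps`` is reached.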
+**Running the session**
+ 
+In order to execute a session, carry out the following steps:
+
+1. Navigate to "[Install Directory]\\Primaite\\Primaite\\"
+2. Start a console window (type "CMD" in the path window, or start a console window first and navigate to "[Install Directory]\\Primaite\\Primaite\\")
+3. Type "python main.py"
+4. The session will start with an output indicating the current episode and the average reward value for the episode
+
+