Update text in docs
@@ -98,7 +98,9 @@ Head over to the :ref:`getting-started` page to install and setup PrimAITE!
source/getting_started
source/about
source/config
source/config(v3)
source/simulation
source/game_layer
source/primaite_session
source/custom_agent
PrimAITE API <source/_autosummary/primaite>

@@ -1,9 +1,7 @@
Primaite v3 config
******************

The YAML file allows configuring a cybersecurity scenario involving a computer network and multiple agents. There are three main sections: training_config, game, and simulation.

PrimAITE uses a single configuration file to define a cybersecurity scenario. This includes the computer network and multiple agents. There are three main sections: training_config, game, and simulation.

The simulation section describes the simulated network environment with which the agents interact.
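
As a sketch, the top-level layout of such a config might look like the following. The nested keys are placeholders only, not PrimAITE's actual schema:

.. code-block:: yaml

    # Illustrative top-level structure only; real keys and values will differ.
    training_config:   # parameters controlling RL training
      ...
    game:              # agents and their observations, actions, and rewards
      ...
    simulation:        # the network the agents interact with
      ...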
docs/source/game_layer.rst
@@ -0,0 +1,48 @@
PrimAITE Game layer
*******************

The PrimAITE codebase consists of two main modules:

* ``simulator``: The simulation logic, including the network topology, the network state, and the behaviour of various hardware and software classes.
* ``game``: The agent-training infrastructure, which helps reinforcement learning agents interface with the simulation. This includes observations, actions, and rewards for RL agents, as well as scripted deterministic agents. The game layer orchestrates all the interactions between modules, including ARCD GATE.

These two components have been decoupled so that the agent training code in ARCD GATE can be reused with other simulators. The simulator and game layer communicate using the PrimAITE State API and the PrimAITE Request API. The game layer communicates with ARCD GATE using the `Farama Gymnasium Spaces API <https://gymnasium.farama.org/api/spaces/>`_.

..
    TODO: write up these APIs and link them here.

Game layer
----------

The game layer is responsible for managing agents and getting them to interface with the simulator correctly. It consists of several components:

PrimaiteSession
^^^^^^^^^^^^^^^

PrimaiteSession is the main entry point into PrimAITE. It coordinates a simulation and the agents that interact with it, sends messages to ARCD GATE to perform reinforcement learning, and keeps track of multiple agents of different types.

Agents
^^^^^^

All agents inherit from the AbstractAgent class, which mandates that they have an ObservationManager, ActionManager, and RewardManager. Agent behaviour depends on the type of agent, but there are two main types:

* RL agents' actions during each step are decided by an RL algorithm which lives inside ARCD GATE. The agent object within PrimAITE simply formats and forwards the actions decided by the RL policy.
* Deterministic agents perform all of their decision making within the PrimAITE game layer. They typically have a scripted policy, which always performs the same action, or a rule-based policy, which performs actions based on the current state of the simulation. They can have a stochastic element, and their seed will be settable.
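
As an illustration of this split, a minimal sketch of the two agent types might look like this. Everything except the AbstractAgent name and the three managers is hypothetical; the real PrimAITE interfaces will differ:

```python
from abc import ABC, abstractmethod


class AbstractAgent(ABC):
    """Sketch: every agent owns the three managers described above."""

    def __init__(self, observation_manager, action_manager, reward_manager):
        self.observation_manager = observation_manager
        self.action_manager = action_manager
        self.reward_manager = reward_manager

    @abstractmethod
    def get_action(self, observation):
        """Decide the next action from the current observation."""


class ProxyRLAgent(AbstractAgent):
    """RL agent: formats and forwards an action chosen externally (e.g. by ARCD GATE)."""

    def __init__(self, *managers):
        super().__init__(*managers)
        self.most_recent_external_action = 0

    def get_action(self, observation):
        # No decision-making here; the RL policy lives outside PrimAITE.
        return self.most_recent_external_action


class ScriptedAgent(AbstractAgent):
    """Deterministic agent: always performs the same scripted action."""

    def get_action(self, observation):
        return 3  # e.g. always the same index in some fixed action map
```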
..
    TODO: add seed to stochastic scripted agents

Observations
^^^^^^^^^^^^

An agent's observations are managed by the ObservationManager class. It generates observations based on the current simulation state dictionary. It also provides the observation space during initial setup. The data is formatted so it's compatible with Gymnasium.spaces. Observation spaces are composed of one or more components which are defined by the AbstractObservation base class.
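
To make this concrete, here is a toy observation component in the spirit of AbstractObservation. The subclass, its methods, and the state-dictionary keys are invented for illustration:

```python
from abc import ABC, abstractmethod


class AbstractObservation(ABC):
    """Sketch of a single observation-space component."""

    @abstractmethod
    def observe(self, state: dict) -> int:
        """Derive this component's observation from the simulation state dict."""


class ServiceStatusObservation(AbstractObservation):
    """Reports whether a named service is running: 1 if so, 0 otherwise.

    In a real setup this value would sit inside a gymnasium space such as
    spaces.Discrete(2); here it is just a plain int.
    """

    def __init__(self, service_name: str):
        self.service_name = service_name

    def observe(self, state: dict) -> int:
        services = state.get("services", {})
        return 1 if services.get(self.service_name) == "running" else 0
```

An ObservationManager would then combine the outputs of several such components into a single observation.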
Actions
^^^^^^^

An agent's actions are managed by the ActionManager. It converts actions selected by agents (which are typically integers chosen from a ``gymnasium.spaces.Discrete`` space) into simulation-friendly requests. It also provides the action space during initial setup. Action spaces are composed of one or more components which are defined by the AbstractAction base class.
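
As a rough sketch of that conversion. The action map below is made up; PrimAITE's real action set and request format will differ:

```python
class ToyActionManager:
    """Sketch: translate a Discrete action index into a simulator request.

    Requests are modelled here as lists of strings addressing a component
    in the simulation, ending with a verb.
    """

    def __init__(self):
        # Index -> request; the entries are purely illustrative.
        self.action_map = {
            0: ["do_nothing"],
            1: ["network", "node", "client_1", "service", "web", "restart"],
            2: ["network", "node", "client_1", "shutdown"],
        }

    def convert(self, action: int) -> list:
        return self.action_map[action]
```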
Rewards
^^^^^^^

An agent's reward function is managed by the RewardManager. It calculates rewards based on the simulation state (in a way similar to observations). Rewards can be defined as a weighted sum of small reward components. For example, an agent's reward can be based on the uptime of a database service plus the loss rate of packets between clients and a web server. The reward components are defined by the AbstractReward base class.
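
The weighted-sum idea can be sketched like this. The component functions, weights, and state keys are illustrative, not PrimAITE's real reward API:

```python
class ToyRewardManager:
    """Sketch: total reward = sum of weight * component(state)."""

    def __init__(self, components):
        # components: list of (weight, function-of-state) pairs
        self.components = components

    def calculate(self, state: dict) -> float:
        return sum(weight * component(state) for weight, component in self.components)


def database_uptime(state: dict) -> float:
    """1.0 while the database service is up, 0.0 otherwise."""
    return 1.0 if state["db_status"] == "running" else 0.0


def packet_loss_penalty(state: dict) -> float:
    """Negative reward proportional to client<->web-server packet loss."""
    return -state["packet_loss_rate"]


manager = ToyRewardManager([(0.7, database_uptime), (0.3, packet_loss_penalty)])
# e.g. database up with 10% packet loss: 0.7 * 1.0 + 0.3 * (-0.1) = 0.67
```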
@@ -1,87 +0,0 @@
Primaite Codebase Documentation
===============================

High-level structure
--------------------

The Primaite codebase consists of two main modules: the agent-training infrastructure and the simulation logic. These modules have been decoupled to allow for flexibility and modularity. The 'game' module acts as an interface between agents and the simulation.

Simulation
----------

The simulation module purely simulates a computer network. It has no concept of agents acting, but it can interact with agents by providing a 'state' dictionary (using the SimComponent describe_state() method) and by accepting requests (a list of strings).

Game layer
----------

The game layer is responsible for managing agents and getting them to interface with the simulator correctly. It consists of several components:

Observations
^^^^^^^^^^^^

The ObservationManager is responsible for generating observations from the simulator state dictionary. The data is formatted so it's compatible with Gymnasium.spaces. The ObservationManager is used by the AgentManager to generate observations for each agent.

Actions
^^^^^^^

The ActionManager is responsible for converting actions selected by agents (which comply with the Gymnasium.spaces API) into simulation-friendly requests. The ActionManager is used by the AgentManager to take actions for each agent.

Rewards
^^^^^^^

The RewardManager is responsible for calculating rewards based on the state (similar to observations). The RewardManager is used by the AgentManager to calculate rewards for each agent.

Agents
^^^^^^

The AgentManager is responsible for managing agents and their interactions with the simulator. It uses the ObservationManager to generate observations for each agent, the ActionManager to take actions for each agent, and the RewardManager to calculate rewards for each agent.

PrimaiteSession
^^^^^^^^^^^^^^^

PrimaiteSession is the main entry point into Primaite and it allows the simultaneous coordination of a simulation and agents that interact with it. It also sends messages to ARCD GATE to perform reinforcement learning. PrimaiteSession uses the AgentManager to manage agents and their interactions with the simulator.

Code snippets
-------------

Here's an example of how to create a PrimaiteSession object:

.. code-block:: python

    from primaite import PrimaiteSession

    session = PrimaiteSession()

To start the simulation, use the start() method:

.. code-block:: python

    session.start()

To stop the simulation, use the stop() method:

.. code-block:: python

    session.stop()

To get the current state of the simulation, use the describe_state() method. This is also used as input for generating observations and rewards:

.. code-block:: python

    state = session.sim.describe_state()

To get the current observation of an agent, use the get_observation() method:

.. code-block:: python

    observation = session.get_observation(agent_id)

To get the current reward of an agent, use the get_reward() method:

.. code-block:: python

    reward = session.get_reward(agent_id)

To take an action for an agent, use the take_action() method:

.. code-block:: python

    action = agent.select_action(observation)
    session.take_action(agent_id, action)