Merge branch 'dev' into feature/1386-enable-a-repeatable-or-deterministic-baseline-test

Czar Echavez
2023-07-03 16:56:44 +01:00
16 changed files with 527 additions and 79 deletions

@@ -28,6 +28,10 @@ The environment config file consists of the following attributes:
* STABLE_BASELINES3_PPO - Use a SB3 PPO agent
* STABLE_BASELINES3_A2C - Use a SB3 A2C agent
* **random_red_agent** [bool]
Determines if the session should be run with a random red agent
* **action_type** [enum]
Determines whether a NODE, ACL, or ANY (combined NODE & ACL) action space format is adopted for the session
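
As an illustration of how the two attributes above might be read and validated, here is a minimal sketch. It is not PrimAITE's actual loader: the `ActionType` enum, the `parse_session_options` helper, and the default of `False` for `random_red_agent` are all assumptions made for this example. Only the attribute names and the NODE, ACL, and ANY values come from the documentation.

```python
from enum import Enum
from typing import Tuple


class ActionType(Enum):
    """Hypothetical enum mirroring the documented action_type values."""
    NODE = "NODE"
    ACL = "ACL"
    ANY = "ANY"  # combined NODE & ACL action space


def parse_session_options(config: dict) -> Tuple[bool, ActionType]:
    """Validate the two attributes from an already-loaded config mapping.

    Assumption: random_red_agent defaults to False when absent.
    """
    random_red_agent = config.get("random_red_agent", False)
    if not isinstance(random_red_agent, bool):
        raise TypeError("random_red_agent must be a bool")

    # Look the value up by name so that an unknown value raises KeyError.
    action_type = ActionType[str(config["action_type"]).upper()]
    return random_red_agent, action_type


options = parse_session_options({"random_red_agent": True, "action_type": "ANY"})
```

Looking the enum member up by name means a misspelt `action_type` fails at load time instead of part-way through a session.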

@@ -78,10 +78,9 @@ PrimAITE automatically creates two sets of results from each session:
* Timestamp
* Episode number
* Step number
* Initial observation space (before red and blue agent actions have been taken). Individual elements of the observation space are presented in the format OSI_X_Y
* Resulting observation space (after the red and blue agent actions have been taken). Individual elements of the observation space are presented in the format OSN_X_Y
* Initial observation space (what the blue agent observed when it decided its action)
* Reward value
* Action space (as presented by the blue agent on this step). Individual elements of the action space are presented in the format AS_X
* Action taken (as presented by the blue agent on this step). Individual elements of the action space are presented in the format AS_X
**Diagrams**