.. _session:

Running a PrimAITE Training or Evaluation Session
=================================================

The application will determine whether a Training or Evaluation session is being executed via the 'sessionType' value in the config_main.yaml file. A PrimAITE session will usually be associated with a "Use Case Profile"; this document will present:

* The Use Case name, the default number of steps in an episode, and the default number of episodes in a session (the number of steps and episodes can be modified in the configuration files)
* The system laydown being modelled
* The objectives of the session (steady-state), the red agent, and the blue agent (in a defensive role)
* The green agent pattern-of-life profile
* The red agent attack profile
* The observation space definition
* The action space definition
* Agent integration guidance
* Initial Access Control List settings (if applicable)
* The reward function definition

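As noted above, the session type is selected in config_main.yaml. A minimal sketch of what that entry might look like follows; only the 'sessionType' key name comes from the text above, and the value shown is an assumption for illustration:

.. code:: yaml

    # Hypothetical fragment of config_main.yaml. Only the sessionType key
    # name is taken from this document; the value shown is assumed.
    sessionType: TRAINING
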
**Integrating a user defined blue agent**

Integrating a blue agent with PrimAITE requires some modification of the code within the main.py file. The main.py file consists of a number of functions, each of which will invoke training for a particular agent. These are:

* Generic (run_generic)
* Stable Baselines 3 PPO (run_stable_baselines3_ppo)
* Stable Baselines 3 A2C (run_stable_baselines3_a2c)

The selection of which agent type to use is made via the config_main.yaml file. In order to train a user generated agent, the run_generic function should be selected, and should be modified (typically) to be:

.. code:: python

    agent = MyAgent(environment, max_steps)

    for episode in range(num_episodes):
        agent.learn()

    env.close()
    save_agent(agent)

Where:

* *MyAgent* is the user created agent
* *environment* is the PrimAITE environment
* *max_steps* is the number of steps in an episode, as defined in the config_[name].yaml file
* *num_episodes* is the number of episodes in the session, as defined in the config_main.yaml file
* the *.learn()* function should be defined in the user created agent
* the *env.close()* function is defined within PrimAITE
* *save_agent()* assumes that a *save()* function has been defined in the user created agent; if not, this line can be omitted (although it is encouraged, since it allows the agent to be saved and ported)

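To make the shape of that training loop concrete, here is a minimal, self-contained sketch. *StubEnvironment* and the internals of *MyAgent* are illustrative stand-ins (assuming a Gym-style ``step``/``reset``/``close`` interface), not PrimAITE code:

```python
class StubEnvironment:
    """Stand-in for the PrimAITE environment (assumed Gym-style API)."""

    def reset(self):
        return 0  # return an initial state

    def step(self, action):
        # returns (new_state, reward, done, info); ends after one step here
        return 0, 1.0, True, {}

    def close(self):
        pass


class MyAgent:
    """Skeleton user created agent exposing the learn() hook main.py calls."""

    def __init__(self, environment, max_steps):
        self.environment = environment
        self.max_steps = max_steps
        self.episodes_trained = 0

    def learn(self):
        state = self.environment.reset()
        for _ in range(self.max_steps):
            action = None  # a real agent would select an action here
            state, reward, done, info = self.environment.step(action)
            if done:
                break
        self.episodes_trained += 1


# The run_generic pattern from the example above, using the stand-ins.
environment = StubEnvironment()
num_episodes = 3
agent = MyAgent(environment, max_steps=10)
for episode in range(num_episodes):
    agent.learn()
environment.close()
```

Swapping *StubEnvironment* for the real PrimAITE environment leaves the loop structure unchanged; only the agent's action selection and algorithm updates need filling in.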
The code below provides a suggested format for the learn() function within the user created agent. It is important to include the *self.environment.reset()* call within the episode loop in order that the environment is reset between episodes. Note that the example below should not be considered exhaustive.

.. code:: python

    def learn(self):

        # pre-reqs
        ...

        # reset the environment
        state = self.environment.reset()
        done = False

        for step in range(max_steps):
            # calculate the action
            action = ...

            # execute the environment step
            new_state, reward, done, info = self.environment.step(action)

            # algorithm updates
            ...

            # update to our new state
            state = new_state

            # if done, finish episode
            if done:
                break

**Running the session**

In order to execute a session, carry out the following steps:

1. Navigate to "[Install directory]\\Primaite\\Primaite\\"
2. Start a console window (type "CMD" in the path window, or start a console window first and navigate to "[Install Directory]\\Primaite\\Primaite\\")
3. Type "python main.py"
4. The session will start with an output indicating the current episode and the average reward value for the episode