#915 - Created app dirs and set them as constants in the top-level init.

- Renamed _config_values_main to training_config.py and renamed the ConfigValuesMain class to TrainingConfig. Moved training_config.py to src/primaite/config/training_config.py.
- Renamed all training config YAML file keys to make creating an instance of TrainingConfig easier. Moved action_type and num_steps over to the training config.
- Decoupled the training config and lay down config.
- Refactored main.py so that it can be run from the CLI and can take a training config path and a lay down config path.
- Refactored all outputs so that they save to the session dir.
- Added some necessary setup scripts that handle creating app dirs, fronting example config files to the user, fronting demo notebooks to the user, performing clean-up between installations, etc.
- Added functions that attempt to retrieve the file paths of a user's example config files that have been fronted by the primaite setup.
- Added a logging config and a getLogger function in the top-level init.
- Refactored all log entries to use a logger with the primaite logging config.
- Added a basic typer CLI for doing things like running setup, viewing logs, viewing the primaite version, and running a basic session.
- Updated tests to use the new features and config structures.
- Began updating the docs. More to do here.
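The logging changes described above might look roughly like this minimal sketch. The log directory, format string, and file name here are illustrative assumptions (a temp dir is used so the sketch is self-contained), not PrimAITE's actual configuration:

```python
# Hedged sketch of a top-level logging config plus a getLogger helper.
# LOG_DIR and the format string are assumptions for illustration only.
import logging
import tempfile
from pathlib import Path

LOG_DIR = Path(tempfile.gettempdir()) / "example_app_logs"  # hypothetical app dir

_FORMAT = "%(asctime)s::%(levelname)s::%(name)s::%(message)s"


def getLogger(name: str) -> logging.Logger:
    """Return a logger that writes to the app log dir using one shared config."""
    LOG_DIR.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.FileHandler(LOG_DIR / "app.log")
        handler.setFormatter(logging.Formatter(_FORMAT))
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
    return logger


log = getLogger(__name__)
log.info("session output saved to the session dir")
```

Because `logging.getLogger` caches loggers by name, repeated calls for the same name reuse the same handler rather than attaching duplicates.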
This commit is contained in:
Chris McCarthy
2023-06-07 22:40:16 +01:00
parent a8ce699df3
commit 98fc1e4c71
44 changed files with 1527 additions and 1356 deletions


@@ -29,10 +29,10 @@ the run_generic function should be selected, and should be modified (typically)
.. code:: python

    agent = MyAgent(environment, max_steps)
    for episode in range(0, num_episodes):
        agent.learn()
    env.close()
    save_agent(agent)
Where:
@@ -51,29 +51,29 @@ environment is reset between episodes. Note that the example below should not be
.. code:: python

    def learn(self):
        # pre-reqs
        # reset the environment
        self.environment.reset()
        done = False
        for step in range(max_steps):
            # calculate the action
            action = ...
            # execute the environment step
            new_state, reward, done, info = self.environment.step(action)
            # algorithm updates
            ...
            # update to our new state
            state = new_state
            # if done, finish episode
            if done:
                break
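The reset/step/done pattern above can be exercised end to end with a small stand-in environment. ``CountdownEnv`` and ``ToyAgent`` below are hypothetical names invented for illustration, not part of PrimAITE:

```python
# A runnable toy version of the learn() loop, using a hypothetical
# CountdownEnv stand-in so the reset/step/done pattern can execute.

class CountdownEnv:
    """Toy environment: the episode ends after `length` steps."""

    def __init__(self, length: int = 5):
        self.length = length
        self.remaining = length

    def reset(self):
        self.remaining = self.length
        return self.remaining  # initial observation

    def step(self, action):
        self.remaining -= 1
        done = self.remaining <= 0
        reward = 1.0 if done else 0.0
        return self.remaining, reward, done, {}


class ToyAgent:
    def __init__(self, environment, max_steps: int):
        self.environment = environment
        self.max_steps = max_steps
        self.steps_taken = 0

    def learn(self):
        state = self.environment.reset()
        done = False
        for step in range(self.max_steps):
            action = 0  # placeholder policy
            new_state, reward, done, info = self.environment.step(action)
            state = new_state
            self.steps_taken += 1
            if done:
                break


env = CountdownEnv(length=3)
agent = ToyAgent(env, max_steps=10)
agent.learn()
print(agent.steps_taken)  # → 3
```

Note that the loop stops early when the environment signals ``done``, and otherwise runs out at ``max_steps``, which is exactly the behaviour the pseudocode above describes.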
**Running the session**