#915 - Created app dirs and set as constants in the top-level init.
- Renamed _config_values_main to training_config.py and renamed the ConfigValuesMain class to TrainingConfig. Moved training_config.py to src/primaite/config/training_config.py.
- Renamed all training config yaml file keys to make creating an instance of TrainingConfig easier (sketched below). Moved action_type and num_steps over to the training config.
- Decoupled the training config and the lay down config.
- Refactored main.py so that it can be run from the CLI and can take a training config path and a lay down config path.
- Refactored all outputs so that they save to the session dir.
- Added the necessary setup scripts that handle creating app dirs, fronting example config files to the user, fronting demo notebooks to the user, performing clean-up between installations, etc.
- Added functions that attempt to retrieve the file paths of the example config files fronted by the primaite setup.
- Added a logging config and a getLogger function in the top-level init.
- Refactored all log entries to use a logger with the primaite logging config.
- Added a basic typer CLI for things like setup, viewing logs, viewing the primaite version, and running a basic session (sketched after this list).
- Updated tests to use the new features and config structures.
- Began updating the docs. More to do here.
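
The renamed YAML keys and the new TrainingConfig class suggest a flat keys-to-fields mapping. A minimal sketch of that pattern, assuming hypothetical field names and a hypothetical from_yaml helper; PrimAITE's actual TrainingConfig API may differ:

.. code:: python

    # Illustrative sketch only: the field names and the from_yaml
    # constructor are assumptions, not PrimAITE's actual API.
    from dataclasses import dataclass
    from pathlib import Path

    import yaml  # PyYAML

    @dataclass
    class TrainingConfig:
        agent_framework: str  # hypothetical key
        action_type: str      # moved into the training config in this commit
        num_steps: int        # moved into the training config in this commit

        @classmethod
        def from_yaml(cls, path: Path) -> "TrainingConfig":
            # With the top-level YAML keys renamed to match the constructor
            # arguments, instantiation becomes a direct unpack.
            with open(path) as f:
                data = yaml.safe_load(f)
            return cls(**data)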
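The commit also mentions a basic typer CLI. A minimal sketch of that kind of CLI, with command names and bodies that are illustrative assumptions rather than the actual primaite commands:

.. code:: python

    # Illustrative typer CLI sketch; not the actual primaite CLI.
    import typer

    app = typer.Typer()

    @app.command()
    def version() -> None:
        """Print the installed package version."""
        typer.echo("x.y.z")  # the real CLI would read the package metadata

    @app.command()
    def setup() -> None:
        """Create app dirs and front example configs and notebooks."""
        typer.echo("Running setup...")

    if __name__ == "__main__":
        app()
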
@@ -29,10 +29,10 @@ the run_generic function should be selected, and should be modified (typically)

.. code:: python

    agent = MyAgent(environment, max_steps)

    for episode in range(0, num_episodes):
        agent.learn()

    environment.close()
    save_agent(agent)

Where:
@@ -51,29 +51,29 @@ environment is reset between episodes. Note that the example below should not be

.. code:: python

    def learn(self):
        # pre-reqs

        # reset the environment
        self.environment.reset()
        done = False

        for step in range(max_steps):
            # calculate the action
            action = ...

            # execute the environment step
            new_state, reward, done, info = self.environment.step(action)

            # algorithm updates
            ...

            # update to our new state
            state = new_state

            # if done, finish the episode
            if done:
                break
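
As a purely hypothetical illustration of the elided action line (the example above leaves it as ``...`` on purpose), an agent holding a Gym-style environment could fall back to random sampling:

.. code:: python

    # Hypothetical only: random action from a Gym-style action space.
    action = self.environment.action_space.sample()
    new_state, reward, done, info = self.environment.step(action)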
**Running the session**