.. only:: comment

    © Crown-owned copyright 2025, Defence Science and Technology Laboratory UK

.. _extensible_rewards:

Extensible Rewards
******************

Extensible Rewards differ from the reward mechanism used in PrimAITE v3.x in that new reward
types can be added without changing the `RewardFunction` class in `rewards.py` (PrimAITE
core repository).

Changes to reward class structure.
==================================

Reward classes inherit from `AbstractReward` (a subclass of Pydantic's `BaseModel`).
Within each reward class there is a `ConfigSchema` class responsible for ensuring the config file
data is in the correct format. This also means there is little, if any, need for an `__init__`
method. A `.from_config` method is no longer required, as it is inherited from `AbstractReward`.
Each class requires a discriminator string, which is used by the `ConfigSchema` class to verify
that it hasn't previously been added to the registry.

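
The registry check described above can be illustrated with a minimal, standard-library sketch.
It is not PrimAITE's implementation: `SketchReward`, `_registry` and the error message are
hypothetical names chosen for illustration, and the sketch only shows one way a `discriminator`
class keyword could be used to reject duplicate registrations.

.. code-block:: Python

    from typing import ClassVar, Dict, Optional, Type


    class SketchReward:
        """Hypothetical stand-in for a reward base class (not PrimAITE's AbstractReward)."""

        # Maps each discriminator string to the subclass that registered it.
        _registry: ClassVar[Dict[str, Type["SketchReward"]]] = {}

        def __init_subclass__(cls, discriminator: Optional[str] = None, **kwargs) -> None:
            super().__init_subclass__(**kwargs)
            if discriminator is None:
                return  # subclasses without a discriminator are not registered
            if discriminator in cls._registry:
                raise ValueError(f"Duplicate reward discriminator: {discriminator!r}")
            cls._registry[discriminator] = cls


    class FileIntegritySketch(SketchReward, discriminator="database-file-integrity"):
        """Registered under the same string a config file would use as its type."""


    # Look the class up from the string a config file would supply.
    reward_cls = SketchReward._registry["database-file-integrity"]

    # Re-using a discriminator is rejected when the class is defined.
    try:
        class Clash(SketchReward, discriminator="database-file-integrity"):
            pass
    except ValueError as err:
        duplicate_error = str(err)

Performing the check at class-definition time means a clashing reward type fails on import
rather than part-way through an episode.
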
Inheriting from `BaseModel` removes the need for an `__init__` method but means that object
attributes need to be passed by keyword.

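
As a rough illustration of both points, the sketch below uses Pydantic directly. The
`ExampleConfigSchema` class and the option values are hypothetical; only the field names are
taken from the example reward in this section. It assumes Pydantic is installed.

.. code-block:: Python

    from pydantic import BaseModel, ValidationError


    class ExampleConfigSchema(BaseModel):
        """Hypothetical schema with the same fields as the example reward's ConfigSchema."""

        type: str = "database-file-integrity"
        node_hostname: str
        folder_name: str
        file_name: str


    # Values as they might arrive from a parsed config file (illustrative values only).
    raw_options = {
        "type": "database-file-integrity",
        "node_hostname": "database_server",
        "folder_name": "database",
        "file_name": "database.db",
    }

    # Attributes are supplied by keyword; unpacking the dict does exactly that.
    config = ExampleConfigSchema(**raw_options)

    # Positional arguments are rejected because BaseModel.__init__ accepts keywords only.
    try:
        ExampleConfigSchema("database_server", "database", "database.db")
        positional_ok = True
    except TypeError:
        positional_ok = False

    # A missing required field fails validation instead of producing a half-built object.
    try:
        ExampleConfigSchema(node_hostname="database_server")
        incomplete_ok = True
    except ValidationError:
        incomplete_ok = False
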
To add a new reward class, follow the example below. Note that the `type` attribute in the
`ConfigSchema` class should match the type used in the config file to define the reward.

.. code-block:: Python

    from typing import Dict, List

    # AbstractReward and AgentHistoryItem are provided by PrimAITE (imports omitted here).


    class DatabaseFileIntegrity(AbstractReward, discriminator="database-file-integrity"):
        """Reward function component which rewards the agent for maintaining the integrity of a database file."""

        config: "DatabaseFileIntegrity.ConfigSchema"
        location_in_state: List[str] = [""]
        reward: float = 0.0

        class ConfigSchema(AbstractReward.ConfigSchema):
            """ConfigSchema for DatabaseFileIntegrity."""

            # Must match the type used for this reward in the config file.
            type: str = "database-file-integrity"
            node_hostname: str
            folder_name: str
            file_name: str

        def calculate(self, state: Dict, last_action_response: "AgentHistoryItem") -> float:
            """Calculate the reward for the current state."""
            pass

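
The `calculate` method above is left as a stub. The following standalone sketch shows one way
such a method could read a value from a nested state dictionary using a path like
`location_in_state`. Everything in it is an assumption made for illustration: the `get_nested`
helper, the state layout, the `health_status` key and its values, and the reward values are
hypothetical and are not taken from PrimAITE.

.. code-block:: Python

    from typing import Any, Dict, List

    # Sentinel returned when the requested path is missing from the state dict.
    MISSING = object()


    def get_nested(state: Dict[str, Any], path: List[str]) -> Any:
        """Follow ``path`` key by key through nested dicts, returning MISSING on a gap."""
        current: Any = state
        for key in path:
            if not isinstance(current, dict) or key not in current:
                return MISSING
            current = current[key]
        return current


    def file_integrity_reward(state: Dict[str, Any], location_in_state: List[str]) -> float:
        """Return 1.0 for a healthy file, -1.0 for a corrupted one, otherwise 0.0."""
        file_state = get_nested(state, location_in_state)
        if file_state is MISSING or not isinstance(file_state, dict):
            return 0.0
        status = file_state.get("health_status")
        if status == "good":
            return 1.0
        if status == "corrupt":
            return -1.0
        return 0.0


    # Hypothetical state layout and path, for illustration only.
    example_state = {
        "nodes": {"database_server": {"files": {"database.db": {"health_status": "corrupt"}}}}
    }
    example_path = ["nodes", "database_server", "files", "database.db"]
    example_reward = file_integrity_reward(example_state, example_path)

Returning a neutral reward when the path is absent keeps the calculation from raising if the
file has not yet appeared in the state.
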

Changes to YAML file.
=====================

There's no longer a need to provide a `dns_server` as an option in the simulation section
of the config file.