BlueRewardMachine
Bases: RewardCalculator
The reward calculator for CAGE Challenge 4 (CC4).
Attributes:
Name | Type | Description |
---|---|---|
phase_rewards | Dict[str, Dict[str, int]] | the reward mapping for the current mission phase |
Functions
calculate_reward
calculate_reward(current_state: dict, action_dict: dict, agent_observations: dict, done: bool, state: State)
Calculate the cumulative reward based on the phase mapping.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
current_state | Dict[str, _] | the current state of all the hosts in the simulation | required |
action_dict | dict | | required |
agent_observations | Dict[str, ObservationSet] | current agent observations | required |
done | bool | has the episode ended | required |
state | State | current State object | required |
Returns:
Type | Description |
---|---|
int | sum of the rewards collected |
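As a rough illustration of what "cumulative reward based on the phase mapping" can look like, the sketch below sums entries from a mapping of the documented shape `Dict[str, Dict[str, int]]`. The zone names, the penalty values, the event list, and the `sum_phase_rewards` helper are all assumptions made for this example; it is not the CybORG implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping in the documented shape Dict[str, Dict[str, int]].
# Outer keys (zone names) and all values are illustrative assumptions.
PHASE_REWARDS: Dict[str, Dict[str, int]] = {
    "zone_a": {"LWF": -1, "ASF": -1, "RIA": -3},
    "zone_b": {"LWF": -2, "ASF": -1, "RIA": -5},
}


def sum_phase_rewards(
    phase_rewards: Dict[str, Dict[str, int]],
    events: List[Tuple[str, str]],
) -> int:
    """Sum the reward for each (zone, reward_key) event.

    Events whose zone or key is missing from the mapping contribute 0.
    """
    total = 0
    for zone, key in events:
        total += phase_rewards.get(zone, {}).get(key, 0)
    return total


# Hypothetical events observed during one step.
events = [("zone_a", "LWF"), ("zone_b", "RIA"), ("zone_a", "ASF")]
total = sum_phase_rewards(PHASE_REWARDS, events)
print(total)  # -7
```

Treating unknown zones or keys as zero is a design choice of this sketch; a stricter version could raise an error instead.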
get_phase_rewards
Gets the pre-set reward mapping for the current mission phase
Rewards Key:
- LWF = Local Work Fails
- ASF = Access Service Fails
- RIA = Red Impact/Access
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cur_mission_phase | int | the current mission phase of the episode | required |
Returns:
Type | Description |
---|---|
Dict[str, Dict[str, int]] | the phase reward mapping for the current mission phase |
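To make the return shape concrete, here is a minimal sketch of a phase lookup that returns a `Dict[str, Dict[str, int]]` keyed by the LWF/ASF/RIA reward keys. The phase numbers, the `zone_a` outer key, the values, and the `example_get_phase_rewards` function are hypothetical and chosen only for illustration; the real pre-set mapping may differ.

```python
from typing import Dict

RewardMap = Dict[str, Dict[str, int]]

# Hypothetical per-phase tables; phases, zone name, and values are assumptions.
_EXAMPLE_PHASES: Dict[int, RewardMap] = {
    0: {"zone_a": {"LWF": -1, "ASF": -1, "RIA": -3}},
    1: {"zone_a": {"LWF": -2, "ASF": -1, "RIA": -10}},
}


def example_get_phase_rewards(cur_mission_phase: int) -> RewardMap:
    """Return the illustrative reward mapping for the given mission phase."""
    try:
        return _EXAMPLE_PHASES[cur_mission_phase]
    except KeyError:
        raise ValueError(f"unknown mission phase: {cur_mission_phase}") from None


rewards = example_get_phase_rewards(1)
print(rewards["zone_a"]["RIA"])  # -10
```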