Skip to content

BlueRewardMachine

Bases: RewardCalculator

The reward calculator for CC4

Attributes:

Name Type Description
phase_rewards Dict[str, Dict[str, int]]

the reward mapping for the current mission phase

Functions

calculate_reward

calculate_reward(current_state: dict, action_dict: dict, agent_observations: dict, done: bool, state: State)

Calculate the cumulative reward based on the phase mapping.

Parameters:

Name Type Description Default
current_state Dict[str, _]

the current state of all the hosts in the simulation

required
action_dict dict
required
agent_observations Dict[str, ObservationSet]

current agent observations

required
done bool

has the episode ended

required
state State

current State object

required

Returns:

Type Description
int

sum of the rewards collected

get_phase_rewards

get_phase_rewards(cur_mission_phase)

Gets the pre-set reward mapping for the current mission phase

Rewards Key: - LWF = Local Work Fails - ASF = Access Service Fails - RIA = Red Impact/Access

Parameters:

Name Type Description Default
cur_mission_phase int

the current mission phase of the episode

required

Returns:

Type Description
Dict[str, Dict[str, int]]

the phase reward mapping for the current mission phase