
BlueFixedActionWrapper

Bases: BaseWrapper

Maintains action spaces with fixed sizes and ordering across episodes.

On initialization, this wrapper creates a sorted list of all the hosts and subnets each agent can interact with in the CC4 EnterpriseScenario.

On reset, the action space is populated using these sorted lists, translating hostnames to IP addresses where needed, such that any given action index will always correspond to a specific host. If a host does not exist in the current episode, the action will be replaced with a no-op (Sleep) action. Agents can check whether an action corresponds to an active host by consulting action_mask().
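The fixed-index behaviour described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the wrapper's implementation: `build_fixed_actions`, the host names, the verb names, and the use of strings in place of CybORG action objects are all hypothetical.

```python
def build_fixed_actions(all_hosts, active_hosts, verbs=("Analyse", "Restore")):
    """Return (labels, mask) with one fixed slot per (verb, host) pair.

    Hypothetical sketch: real CybORG actions are objects, not strings.
    """
    labels, mask = [], []
    for verb in verbs:
        # Sorting gives every host a stable position across episodes.
        for host in sorted(all_hosts):
            if host in active_hosts:
                labels.append(f"{verb} {host}")
                mask.append(True)
            else:
                # Host is absent this episode: keep the slot, fill it with a no-op.
                labels.append("Sleep")
                mask.append(False)
    return labels, mask


all_hosts = {"host_b", "host_a", "host_c"}
episode_1 = build_fixed_actions(all_hosts, active_hosts={"host_a", "host_b", "host_c"})
episode_2 = build_fixed_actions(all_hosts, active_hosts={"host_a", "host_c"})
```

In this sketch both episodes yield six slots, and index 1 always belongs to `host_b`: it is a real action in the first episode and a masked-out Sleep in the second.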

Note: This wrapper does not change the observation space. See the companion wrapper BlueFlatWrapper for vector observations of fixed length and order.

Attributes

is_padded property

is_padded: bool

Returns whether the action space has been padded with no-ops.

Functions

__init__

__init__(env: CybORG, pad_spaces: bool = False, *args: bool, **kwargs: bool)

Initialize the BlueFixedActionWrapper for blue agents.

Parameters:

Name Type Description Default
env CybORG

An instance of CybORG. Must not modify action_space.

required
pad_spaces bool

Ensure action spaces are the same size across all agents by padding each space with the Sleep action. This is a requirement for some RL libraries.

False
*args

Extra arguments are ignored.

()
**kwargs

Extra arguments are ignored.

()
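To illustrate what padding with pad_spaces means in practice, the sketch below pads every agent's action list to a common length. It is an assumption-laden illustration rather than the wrapper's code: `pad_action_lists` is a hypothetical helper, actions are represented as strings, and marking the padded slots as invalid in the mask is an assumption of this sketch.

```python
def pad_action_lists(actions_by_agent, no_op="Sleep"):
    """Pad every agent's action list to the length of the longest one."""
    target = max((len(a) for a in actions_by_agent.values()), default=0)
    padded, masks = {}, {}
    for agent, actions in actions_by_agent.items():
        extra = target - len(actions)
        padded[agent] = list(actions) + [no_op] * extra
        # Assumption: padding slots are flagged invalid so a policy can skip them.
        masks[agent] = [True] * len(actions) + [False] * extra
    return padded, masks


# Agent names and actions here are placeholders for illustration only.
padded, masks = pad_action_lists({
    "agent_small": ["Sleep", "Monitor"],
    "agent_large": ["Sleep", "Monitor", "Analyse host_a", "Restore host_a"],
})
```

With equal-length lists, a single policy network with one fixed output size can be shared across agents, which is the usual reason RL libraries require this.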

action_labels

action_labels(agent_name: str) -> list[str]

Returns an ordered list of human-readable actions.

action_mask

action_mask(agent_name: str) -> list[bool]

Returns an ordered list corresponding to whether an action is valid or not.
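One common use of such a mask is to restrict a policy to valid indices. The helper below is a minimal sketch; `sample_valid_action` is a hypothetical name, and it only assumes the mask is a list[bool] as documented above.

```python
import random


def sample_valid_action(mask, rng=random):
    """Pick a uniformly random index whose mask entry is True."""
    valid = [i for i, ok in enumerate(mask) if ok]
    if not valid:
        raise ValueError("mask has no valid actions")
    return rng.choice(valid)


# Example mask: only indices 1 and 3 are valid.
index = sample_valid_action([False, True, False, True], random.Random(0))
```

Passing a seeded `random.Random` instance keeps the sampling reproducible.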

action_space cached

action_space(agent_name: str) -> Space

Returns the discrete space corresponding to the given agent.

action_spaces cached

action_spaces() -> dict[str, Space]

Returns the discrete space for each agent, with padding if enabled.

actions

actions(agent_name: str) -> list[Action]

Returns an ordered list of CybORG actions.

get_action_space

get_action_space(agent: str) -> dict[str, list[Action | str | bool]]

Returns all information about an agent's action space.

hosts

hosts(agent_name: str) -> list[str]

Returns an ordered list of names of hosts the agent can interact with.

reset

reset(*args, **kwargs) -> tuple[dict[str, Any], dict[str, dict]]

Reset the environment and update the action space.

Parameters: All arguments are forwarded to the env provided to __init__.

Returns:

Name Type Description
observation dict[str, Any]

The observations corresponding to each agent. Forwarded directly from the env provided to __init__.

info dict[str, dict]

Information dictionaries corresponding to each agent. Each dictionary contains the key "action_mask" that maps to a list[bool] where each element corresponds to whether the action at the element's index targets a host or subnet that exists for the duration of the episode.

step

step(actions: dict[str, int | Action] = None, messages: dict[str, Any] = None, **kwargs: dict[str, Any]) -> tuple[dict[str, Any], dict[str, float], dict[str, bool], dict[str, bool], dict[str, dict]]

Take a step in the environment using action indices.

Parameters:

Name Type Description Default
actions dict[str, int]

The action index corresponding to each agent. These indices will be mapped to CybORG actions using the equivalent of actions(agent)[index]. The meaning of each action can be found using action_labels(agent)[index].

None
messages dict[str, Any]

Messages from each agent. If an agent does not specify a message, it will send an empty message.

None
**kwargs dict[str, Any]

Extra keywords are forwarded.

{}

Returns:

Name Type Description
observation dict[str, Any]

The observations corresponding to each agent. Forwarded directly from the env provided to __init__.

rewards dict[str, float]

Rewards for each agent.

terminated dict[str, bool]

Flags whether the agent finished normally.

truncated dict[str, bool]

Flags whether the agent was stopped by env.

info dict[str, dict]

Information dictionaries corresponding to each agent. Each dictionary contains the key "action_mask" that maps to a list[bool] where each element corresponds to whether the action at the element's index targets a host or subnet that exists for the duration of the episode.
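The reset and step signatures above suggest an episode loop along the following lines. This is a sketch, not an official usage example: `run_episode` is a hypothetical helper, and `StubEnv` is a made-up stand-in that mimics only the documented return shapes so the code runs without CybORG installed. With the real wrapper, `env` would be a BlueFixedActionWrapper instance.

```python
import random


class StubEnv:
    """Hypothetical stand-in that mimics the documented reset/step return shapes."""

    def __init__(self, episode_length=3):
        self.episode_length = episode_length
        self.steps = 0
        self.received = []

    def _info(self):
        return {"blue_agent_0": {"action_mask": [True, False, True]}}

    def reset(self):
        self.steps = 0
        return {"blue_agent_0": None}, self._info()

    def step(self, actions=None, messages=None, **kwargs):
        self.steps += 1
        self.received.append(actions["blue_agent_0"])
        done = self.steps >= self.episode_length
        return (
            {"blue_agent_0": None},   # observation
            {"blue_agent_0": -1.0},   # rewards
            {"blue_agent_0": False},  # terminated
            {"blue_agent_0": done},   # truncated
            self._info(),             # info
        )


def run_episode(env, agent_names, max_steps=500, seed=None):
    """Run one episode with a random masked policy and return total rewards."""
    rng = random.Random(seed)
    _observations, info = env.reset()
    totals = {name: 0.0 for name in agent_names}
    for _ in range(max_steps):
        actions = {}
        for name in agent_names:
            mask = info[name]["action_mask"]
            valid = [i for i, ok in enumerate(mask) if ok]
            # Falling back to index 0 is an assumption of this sketch.
            actions[name] = rng.choice(valid) if valid else 0
        _observations, rewards, terminated, truncated, info = env.step(actions=actions)
        for name in agent_names:
            totals[name] += rewards.get(name, 0.0)
        # Stop once every agent has either terminated or been truncated.
        if all(terminated.get(n, False) or truncated.get(n, False) for n in agent_names):
            break
    return totals


env = StubEnv()
totals = run_episode(env, ["blue_agent_0"], max_steps=10, seed=0)
```

The loop reads the mask from the info dictionary returned by the previous call, so the policy never needs to query the wrapper separately between steps.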

subnets

subnets(agent_name: str) -> list[str]

Returns an ordered list of names of subnets the agent can interact with.