BlueFlatWrapper

Bases: BlueFixedActionWrapper

Converts observation spaces to vectors of fixed size and ordering across episodes.

This is a companion wrapper to the BlueFixedActionWrapper and inherits the fixed action space and int-to-action mappings as a result.

Using the sorted host and subnet lists from BlueFixedActionWrapper, this wrapper establishes the maximum observation space for each agent. On each step, the observation vectors are populated so that each element of a vector has the same meaning across runs. This is critical for RL-based agents, whose learned policies depend on each input position keeping a fixed meaning.
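The idea of a fixed size and ordering can be sketched in a few lines. This is a minimal illustration only: the host names, the two fields, and the `flatten_observation` helper are hypothetical and do not reflect the layout BlueFlatWrapper actually uses, and plain lists stand in for `np.ndarray`.

```python
# Illustrative only: host names, fields, and encoding are hypothetical,
# not the layout BlueFlatWrapper actually uses.
FIELDS = ("malicious_process", "malicious_connection")  # one slot per field, per host


def flatten_observation(observation, sorted_hosts):
    """Map a per-host dict to a flat list with one fixed block per host."""
    vector = []
    for host in sorted_hosts:  # fixed ordering: block i always describes the same host
        host_obs = observation.get(host, {})  # hosts missing from the dict are zero-padded
        for field in FIELDS:
            vector.append(1 if host_obs.get(field, False) else 0)
    return vector


# The host list is sorted once and reused for every episode.
HOSTS = sorted(["server_0", "user_1", "user_0"])

episode_a = {"user_1": {"malicious_process": True}}
episode_b = {"server_0": {"malicious_connection": True}, "user_0": {}}

vec_a = flatten_observation(episode_a, HOSTS)
vec_b = flatten_observation(episode_b, HOSTS)
```

Both vectors have the same length and the same slot assignment, even though the two dictionaries mention different hosts.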

Functions

__init__

__init__(env: CybORG, *args: CybORG, **kwargs: CybORG)

Initialize the BlueFlatWrapper for blue agents.

Note: The padding setting is inherited from BlueFixedActionWrapper.

Args: env (CybORG): The environment to wrap.

*args, **kwargs: Extra arguments are ignored.

observation_change

observation_change(agent_name: str, observation: dict) -> np.ndarray

Converts an observation dictionary to a vector of fixed size and ordering.

Parameters:

Name Type Default
agent_name str required

Agent corresponding to the observation.

observation dict required

Observation to convert to a fixed vector.

Returns:

Name Type Description
output np.ndarray

observation_space cached

observation_space(agent_name: str) -> Space

Returns the multi-discrete space corresponding to the given agent.

observation_spaces cached

observation_spaces() -> dict[str, Space]

Returns multi-discrete spaces corresponding to each agent.
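For readers unfamiliar with the term, a multi-discrete space describes a vector in which each element has its own finite range of integer values. The sketch below is a pure-Python stand-in under that assumption; the `contains` helper, the agent names, and the sizes are hypothetical, and the wrapper itself returns `Space` objects rather than lists.

```python
# Illustrative stand-in for a multi-discrete space; the sizes below are made up.
def contains(nvec, vector):
    """True if vector has one entry per dimension, each in the range 0..nvec[i]-1."""
    return len(vector) == len(nvec) and all(
        isinstance(v, int) and 0 <= v < n for v, n in zip(vector, nvec)
    )


# Hypothetical per-agent spaces keyed by agent name, mirroring dict[str, Space].
spaces = {
    "blue_agent_0": [3, 2, 2],     # element 0 takes values 0..2, the rest are binary
    "blue_agent_1": [3, 2, 2, 2],  # a different agent may have a longer vector
}

ok = contains(spaces["blue_agent_0"], [2, 0, 1])
too_large = contains(spaces["blue_agent_0"], [3, 0, 1])  # 3 is outside 0..2
wrong_len = contains(spaces["blue_agent_0"], [0, 0])     # one element short
```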

reset

reset(*args, **kwargs) -> tuple[dict[str, Any], dict[str, Any]]

Reset the environment and update the observation space.

Args: All arguments are forwarded to the env provided to __init__.

Returns:

Name Type Description
observation dict[str, Any]

The observations corresponding to each agent, translated into a vector format.

info dict[str, dict]

Forwarded from self.env.

step

step(actions: dict[str, int | Action] = None, messages: dict[str, Any] = None, **kwargs: dict[str, Any]) -> tuple[dict[str, np.ndarray], dict[str, float], dict[str, bool], dict[str, bool], dict[str, dict]]

Take a step in the environment.

Parameters:

Name Type Description
actions dict[str, int | Action]

The action or action index corresponding to each agent. For agent name k and index v, indices are mapped to CybORG actions using the equivalent of actions(k)[v]. The meaning of each action can be found using action_labels(k)[v].

messages dict[str, Any]

Messages from each agent. If an agent does not specify a message, it will send an empty message.

**kwargs dict[str, Any]

Extra keywords are forwarded.
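The index-to-action lookup described above can be pictured as follows. This is a sketch under stated assumptions: the `ACTIONS` table and the `resolve` helper are hypothetical, strings stand in for CybORG `Action` objects, and the real tables come from BlueFixedActionWrapper.

```python
# Hypothetical lookup table; the real one comes from BlueFixedActionWrapper.
ACTIONS = {"blue_agent_0": ["Sleep", "Monitor", "Analyse user_0"]}


def resolve(actions):
    """Replace integer indices with the action at that index; pass others through."""
    resolved = {}
    for agent, choice in actions.items():
        if isinstance(choice, int):
            resolved[agent] = ACTIONS[agent][choice]  # the actions(k)[v] lookup
        else:
            resolved[agent] = choice  # already an action, so leave it unchanged
    return resolved


by_index = resolve({"blue_agent_0": 2})
passthrough = resolve({"blue_agent_0": "Monitor"})
```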

Returns:

Name Type Description
observation dict[str, np.ndarray]

Observations for each agent as vectors.

rewards dict[str, float]

Rewards for each agent.

terminated dict[str, bool]

Flags whether the agent finished normally.

truncated dict[str, bool]

Flags whether the agent was stopped by the environment.

info dict[str, dict]

Forwarded from BlueFixedActionWrapper.
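A typical interaction loop consumes the five per-agent dictionaries returned by step. The sketch below uses a hypothetical `StubEnv` that mimics only the documented return shapes, so it runs without CybORG installed; the agent names, the three-step limit, and the list-valued observations (in place of `np.ndarray`) are all assumptions.

```python
class StubEnv:
    """Hypothetical stand-in that mimics only the documented return shapes."""

    agents = ["blue_agent_0", "blue_agent_1"]

    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        obs = {a: [0, 0, 0] for a in self.agents}
        return obs, {a: {} for a in self.agents}

    def step(self, actions=None, messages=None):
        self.t += 1
        obs = {a: [0, 0, 0] for a in self.agents}
        rewards = {a: 0.0 for a in self.agents}
        terminated = {a: False for a in self.agents}
        # The stub truncates every agent once the step limit is reached.
        truncated = {a: self.t >= self.max_steps for a in self.agents}
        info = {a: {} for a in self.agents}
        return obs, rewards, terminated, truncated, info


env = StubEnv()
obs, info = env.reset()
totals = {a: 0.0 for a in env.agents}
steps = 0
done = False
while not done:
    actions = {a: 0 for a in obs}  # send action index 0 for every agent
    obs, rewards, terminated, truncated, info = env.step(actions=actions)
    for agent, reward in rewards.items():
        totals[agent] += reward
    steps += 1
    # Stop once every agent has either terminated or been truncated.
    done = all(terminated[a] or truncated[a] for a in env.agents)
```

With the real wrapper, `StubEnv()` would be replaced by a BlueFlatWrapper around a CybORG environment, and the constant index 0 by the output of a policy.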