BlueFlatWrapper

Bases: BlueFixedActionWrapper

Converts observation spaces to vectors of fixed size and ordering across episodes.

This is a companion wrapper to the BlueFixedActionWrapper and inherits the fixed action space and int-to-action mappings as a result.

Using the sorted host and subnet lists from BlueFixedActionWrapper, this wrapper establishes the maximum observation space for each agent. On each step, the observation vectors are populated so that each element of a vector has the same meaning across runs. This consistency is critical for RL-based agents, whose policies assume that each input position always carries the same kind of information.
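The idea of a fixed size and ordering can be illustrated with a toy sketch. This is not CybORG's implementation: the host names, the per-host `flagged` feature, and the helper `flatten_observation` are all hypothetical, and plain lists stand in for the `np.ndarray` that the wrapper returns.

```python
# Hypothetical illustration only: this is NOT the CybORG implementation.
# A fixed, sorted host list gives every vector element a stable meaning.
ALL_HOSTS = sorted(["web_server", "db_server", "user_host_0", "user_host_1"])


def flatten_observation(observation: dict) -> list:
    """Map a per-host dict to a fixed-length vector.

    Hosts missing from `observation` keep the padding value 0, so the
    vector length and element order never change between episodes.
    """
    vector = [0] * len(ALL_HOSTS)
    for index, host in enumerate(ALL_HOSTS):
        if host in observation:
            vector[index] = int(observation[host]["flagged"])
    return vector


# Two episodes in which different subsets of hosts are reported.
episode_a = flatten_observation({"web_server": {"flagged": True}})
episode_b = flatten_observation(
    {"db_server": {"flagged": False}, "user_host_1": {"flagged": True}}
)
```

Both vectors have one slot per entry in `ALL_HOSTS`, in the same order, regardless of which hosts appeared in the raw observation.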

Functions

__init__

__init__(env: CybORG, *args: CybORG, **kwargs: CybORG)

Initialize the BlueFlatWrapper for blue agents.

Note: The padding setting is inherited from BlueFixedActionWrapper.

Args:

env (CybORG): The environment to wrap.

*args, **kwargs: Extra arguments are ignored.

observation_change

observation_change(agent_name: str, observation: dict) -> np.ndarray

Converts an observation dictionary to a vector of fixed size and ordering.

Parameters:

Name	Type	Description	Default
`agent_name`	`str`	Agent corresponding to the observation.	required
`observation`	`dict`	Observation to convert to a fixed vector.	required

Returns:

Name	Type	Description
`output`	`np.ndarray`	The observation as a vector of fixed size and ordering.

observation_space `cached`

observation_space(agent_name: str) -> Space

Returns the multi-discrete space corresponding to the given agent.

observation_spaces `cached`

observation_spaces() -> dict[str, Space]

Returns multi-discrete spaces corresponding to each agent.
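As a rough illustration of what "multi-discrete" means here, assuming the usual Gymnasium-style definition: each element `i` of the vector takes an integer value in `[0, nvec[i])`. The sizes in `nvec` and the helper `contains` below are hypothetical; the real sizes come from `observation_space(agent_name)`.

```python
# Hypothetical per-element sizes; a real space would be obtained from
# observation_space(agent_name) rather than written by hand.
nvec = [3, 2, 2, 4]


def contains(nvec: list, vector: list) -> bool:
    """Return True if `vector` is a valid sample of the multi-discrete space.

    A sample must have one entry per element of `nvec`, and entry i must be
    an integer in the half-open range [0, nvec[i]).
    """
    return len(vector) == len(nvec) and all(
        isinstance(value, int) and 0 <= value < size
        for value, size in zip(vector, nvec)
    )


valid = contains(nvec, [2, 0, 1, 3])
too_large = contains(nvec, [3, 0, 1, 3])   # first entry must be < 3
wrong_length = contains(nvec, [0, 0, 1])   # one entry is missing
```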

reset

reset(*args, **kwargs) -> tuple[dict[str, Any], dict[str, Any]]

Reset the environment and update the observation space.

Args: All arguments are forwarded to the env provided to __init__.

Returns:

Name	Type	Description
`observation`	`dict[str, Any]`	The observations corresponding to each agent, translated into a vector format.
`info`	`dict[str, dict]`	Forwarded from self.env.

step

step(actions: dict[str, int | Action] = None, messages: dict[str, Any] = None, **kwargs: dict[str, Any]) -> tuple[dict[str, np.ndarray], dict[str, float], dict[str, bool], dict[str, bool], dict[str, dict]]

Take a step in the environment.

Parameters:

Name	Type	Description	Default
`actions`	`dict[str, int | Action]`	The action or action index corresponding to each agent. Indices will be mapped to CybORG actions using the equivalent of actions(k)[v]. The meaning of each action can be found using action_labels(k)[v].	`None`
`messages`	`dict[str, Any]`	Messages from each agent. If an agent does not specify a message, it will send an empty message.	`None`
`**kwargs`	`dict[str, Any]`	Extra keywords are forwarded.	`{}`

Returns:

Name	Type	Description
`observation`	`dict[str, np.ndarray]`	Observations for each agent as vectors.
`rewards`	`dict[str, float]`	Rewards for each agent.
`terminated`	`dict[str, bool]`	Flags whether the agent finished normally.
`truncated`	`dict[str, bool]`	Flags whether the agent was stopped by env.
`info`	`dict[str, dict]`	Forwarded from BlueFixedActionWrapper.
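The handling of action indices and missing messages described for `step` can be sketched as follows. This is a simplified stand-in, not the wrapper's code: `ACTIONS`, `EMPTY_MESSAGE`, `resolve_actions`, `fill_messages`, the agent name, and the action strings are hypothetical placeholders for the real `actions(k)` lookup and CybORG `Action` objects.

```python
# Hypothetical stand-ins for the per-agent action list returned by actions(k).
ACTIONS = {"blue_agent_0": ["Sleep", "Monitor", "Analyse host_a"]}

# Placeholder for whatever the environment treats as an empty message.
EMPTY_MESSAGE = ()


def resolve_actions(action_dict: dict) -> dict:
    """Replace integer indices with the action at that index (actions(k)[v]).

    Values that are not integers are assumed to already be actions and are
    passed through unchanged.
    """
    resolved = {}
    for agent, value in action_dict.items():
        if isinstance(value, int):
            resolved[agent] = ACTIONS[agent][value]
        else:
            resolved[agent] = value
    return resolved


def fill_messages(agents: list, messages: dict = None) -> dict:
    """Give every agent a message, using an empty one where none was sent."""
    messages = messages or {}
    return {agent: messages.get(agent, EMPTY_MESSAGE) for agent in agents}


by_index = resolve_actions({"blue_agent_0": 1})
by_object = resolve_actions({"blue_agent_0": "Sleep"})
defaulted = fill_messages(["blue_agent_0"])
```

Accepting either an index or an action object lets a policy emit plain integers while hand-written agents can still pass actions directly.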
