BlueFlatWrapper
Bases: BlueFixedActionWrapper
Converts observations to vectors of fixed size and ordering across episodes.
This is a companion wrapper to the BlueFixedActionWrapper and inherits the fixed action space and int-to-action mappings as a result.
Using the sorted host and subnet lists from BlueFixedActionWrapper, this wrapper establishes the maximum observation space for each agent. On each step, the observation vectors are populated so that each element of a vector keeps the same meaning across episodes. This is critical for RL-based agents, which learn a fixed mapping from vector positions to features.
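The fixed-ordering idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function `flatten_observation`, the `"compromised"` flag, and the host names are hypothetical and do not reflect the encoding the wrapper actually uses.

```python
# Hypothetical illustration of fixed-size, fixed-order encoding.
# This is NOT the wrapper's real encoding; names and fields are made up.

def flatten_observation(observation, sorted_hosts, max_hosts):
    """Encode one flag per host into a list of length max_hosts.

    Slot i always refers to sorted_hosts[i]. Slots beyond the known
    hosts are left at 0, which acts as padding.
    """
    vector = [0] * max_hosts
    for index, host in enumerate(sorted_hosts):
        host_info = observation.get(host, {})
        if host_info.get("compromised", False):
            vector[index] = 1
    return vector


# Sorting once gives every episode the same host-to-slot assignment.
hosts = sorted(["web_server", "db_server", "admin_pc"])

# Two observations with different keys still map to the same layout.
episode_a = {"web_server": {"compromised": True}}
episode_b = {"db_server": {"compromised": True}, "admin_pc": {"compromised": False}}

vec_a = flatten_observation(episode_a, hosts, max_hosts=5)
vec_b = flatten_observation(episode_b, hosts, max_hosts=5)
```

Because the host list is sorted and the length is fixed at the maximum, both vectors have the same length, and a given index refers to the same host in each of them.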
Functions
__init__
Initialize the BlueFlatWrapper for blue agents.
Note: The padding setting is inherited from BlueFixedActionWrapper.
Args: env (CybORG): The environment to wrap.
*args, **kwargs: Extra arguments are ignored.
observation_change
Converts an observation dictionary to a vector of fixed size and ordering.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
agent_name | str | Agent corresponding to the observation. | required |
observation | dict | Observation to convert to a fixed vector. | required |
Returns:
Name | Type | Description |
---|---|---|
output | np.ndarray | The observation as a vector of fixed size and ordering. |
observation_space
cached
Returns the multi-discrete space corresponding to the given agent.
observation_spaces
cached
Returns multi-discrete spaces corresponding to each agent.
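As a rough mental model, a multi-discrete space can be described by a list of cardinalities, one per vector element, where element i may take integer values from 0 up to (but not including) the i-th cardinality. The sketch below assumes that Gymnasium-style convention. The helper `in_multi_discrete` and the name `nvec` are illustrative and are not part of this wrapper.

```python
# Hypothetical helper: membership test for a multi-discrete space
# described by a list of per-element cardinalities (nvec).

def in_multi_discrete(vector, nvec):
    """Return True if vector has the right length and each entry
    is an integer in the half-open range [0, nvec[i])."""
    if len(vector) != len(nvec):
        return False
    return all(isinstance(v, int) and 0 <= v < n for v, n in zip(vector, nvec))


nvec = [2, 3, 2]  # element 0 has 2 values, element 1 has 3, element 2 has 2

valid = in_multi_discrete([1, 2, 0], nvec)
out_of_range = in_multi_discrete([1, 3, 0], nvec)  # 3 is not below 3
wrong_length = in_multi_discrete([1, 2], nvec)
```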
reset
Reset the environment and update the observation space.
Args: All arguments are forwarded to the env provided to init.
Returns:
Name | Type | Description |
---|---|---|
observation | dict[str, Any] | The observations corresponding to each agent, translated into a vector format. |
info | dict[str, dict] | Forwarded from self.env. |
step
step(actions: dict[str, int | Action] = None, messages: dict[str, Any] = None, **kwargs: dict[str, Any]) -> tuple[dict[str, np.ndarray], dict[str, float], dict[str, bool], dict[str, bool], dict[str, dict]]
Take a step in the environment.
Parameters:
actions : dict[str, int | Action]
The action or action index corresponding to each agent.
Indices will be mapped to CybORG actions using the equivalent of actions(k)[v].
The meaning of each action can be found using action_labels(k)[v].
messages : dict[str, Any]
Messages from each agent. If an agent does not specify a message, it will send an empty message.
**kwargs : dict[str, Any]
Extra keywords are forwarded.
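The parameter handling described above (integer indices looked up in a per-agent action list, action objects passed through, missing messages replaced by an empty one) can be sketched as follows. Everything here is a stand-in: `resolve_step_inputs`, `action_table`, and `EMPTY_MESSAGE` are hypothetical names, plain strings take the place of CybORG `Action` objects, and the real empty-message format may differ.

```python
# Hypothetical sketch of how step() inputs might be normalised.
# Strings stand in for Action objects; EMPTY_MESSAGE is an assumed placeholder.

EMPTY_MESSAGE = ()  # assumption: the real wrapper may use a different format


def resolve_step_inputs(actions, messages, action_table):
    """Map integer indices to actions and fill in missing messages.

    actions:      {agent_name: int index or action object}
    messages:     {agent_name: message} or None
    action_table: {agent_name: list of actions}, indexable by int
    """
    resolved = {}
    for agent, choice in actions.items():
        if isinstance(choice, int):
            # Equivalent in spirit to actions(k)[v] in the text above.
            resolved[agent] = action_table[agent][choice]
        else:
            # Already an action object: pass it through unchanged.
            resolved[agent] = choice

    provided = messages or {}
    filled = {agent: provided.get(agent, EMPTY_MESSAGE) for agent in actions}
    return resolved, filled


action_table = {
    "blue_agent_0": ["Sleep", "Monitor"],
    "blue_agent_1": ["Sleep", "Restore"],
}
resolved, filled = resolve_step_inputs(
    actions={"blue_agent_0": 1, "blue_agent_1": "CustomAction"},
    messages={"blue_agent_0": (1, 0)},
    action_table=action_table,
)
```

Here `blue_agent_0` supplies an index and gets the second entry of its list, while `blue_agent_1` supplies an action directly and, having sent no message, receives the placeholder empty message.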
Returns:
Name | Type | Description |
---|---|---|
observation | dict[str, np.ndarray] | Observations for each agent as vectors. |
rewards | dict[str, float] | Rewards for each agent. |
terminated | dict[str, bool] | Flags whether the agent finished normally. |
truncated | dict[str, bool] | Flags whether the agent was stopped by env. |
info | dict[str, dict] | Forwarded from BlueFixedActionWrapper. |
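A minimal sketch of consuming the five-element return value in a rollout loop is shown below. To stay self-contained it uses a hypothetical `StubEnv` that only mimics the shape of the return value; it is not the wrapper, and the reward values, horizon, and agent names are made up.

```python
# Hypothetical stand-in that mimics the shape of step()'s return value.
# It does not model CybORG; rewards and observations are dummy values.

class StubEnv:
    def __init__(self, agents, horizon):
        self.agents = agents
        self.horizon = horizon
        self.t = 0

    def step(self, actions=None, messages=None, **kwargs):
        self.t += 1
        out_of_time = self.t >= self.horizon
        observation = {a: [0, 0, 0] for a in self.agents}
        rewards = {a: -1.0 for a in self.agents}
        terminated = {a: False for a in self.agents}       # finished normally
        truncated = {a: out_of_time for a in self.agents}  # stopped by env
        info = {a: {} for a in self.agents}
        return observation, rewards, terminated, truncated, info


def rollout(env, policy, max_steps=100):
    """Accumulate per-agent reward until every agent is done."""
    totals = {a: 0.0 for a in env.agents}
    for _ in range(max_steps):
        actions = {a: policy(a) for a in env.agents}
        observation, rewards, terminated, truncated, info = env.step(actions=actions)
        for agent, reward in rewards.items():
            totals[agent] += reward
        # An agent is done if it either terminated or was truncated.
        if all(terminated[a] or truncated[a] for a in env.agents):
            break
    return totals


totals = rollout(
    StubEnv(["blue_agent_0", "blue_agent_1"], horizon=3),
    policy=lambda agent: 0,  # always pick action index 0
)
```

Treating `terminated` and `truncated` separately matters for learning code: a truncated episode was cut off by the environment, so a value-based learner would typically still bootstrap from the final observation.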