citylearn.wrappers module

class citylearn.wrappers.ClippedObservationWrapper(env: CityLearnEnv)[source]

Bases: ObservationWrapper

Wrapper for clipping observations.

Observations are clipped to be within the observation space limits.

Parameters:

env (CityLearnEnv) – CityLearn environment.

observation(observations: List[List[float]]) List[List[float]][source]

Returns observations clipped to be within the observation space limits.
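
Examples

A minimal usage sketch. The schema name passed to CityLearnEnv is an assumed example; any dataset name or schema file path accepted by CityLearnEnv works the same way.

from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import ClippedObservationWrapper

# 'citylearn_challenge_2022_phase_all' is an assumed example dataset name
env = CityLearnEnv('citylearn_challenge_2022_phase_all', central_agent=True)
env = ClippedObservationWrapper(env)

# returned observations now lie within the observation space bounds
observations, _ = env.reset()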

class citylearn.wrappers.DiscreteActionWrapper(env: CityLearnEnv, bin_sizes: List[Mapping[str, int]] = None, default_bin_size: int = None)[source]

Bases: ActionWrapper

Wrapper for action space discretization.

Parameters:
  • env (CityLearnEnv) – CityLearn environment.

  • bin_sizes (List[Mapping[str, int]], optional) – The number of bins for each active action in each building.

  • default_bin_size (int, default = 10) – The default number of bins if bin_sizes is unspecified for any active building action.

action(actions: List[float]) List[List[float]][source]

Returns undiscretized actions.

property action_space: List[MultiDiscrete]

Returns action space for discretized actions.
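
Examples

An illustration of the bin_sizes parameter, given as one mapping per building from active action name to bin count. The schema name and the electrical_storage action name are assumed examples that depend on the dataset.

from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import DiscreteActionWrapper

env = CityLearnEnv('citylearn_challenge_2022_phase_all')  # assumed example schema name

# one mapping per building; keys are active action names, values are bin counts
bin_sizes = [{'electrical_storage': 12} for _ in env.buildings]
env = DiscreteActionWrapper(env, bin_sizes=bin_sizes)

print(env.action_space)  # list of MultiDiscrete spaces, one per building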

class citylearn.wrappers.DiscreteObservationWrapper(env: CityLearnEnv, bin_sizes: List[Mapping[str, int]] = None, default_bin_size: int = None)[source]

Bases: ObservationWrapper

Wrapper for observation space discretization.

Parameters:
  • env (CityLearnEnv) – CityLearn environment.

  • bin_sizes (List[Mapping[str, int]], optional) – The number of bins for each active observation in each building.

  • default_bin_size (int, default = 10) – The default number of bins if bin_sizes is unspecified for any active building observation.

observation(observations: List[List[float]]) ndarray[source]

Returns discretized observations.

property observation_space: List[MultiDiscrete]

Returns observation space for discretized observations.

class citylearn.wrappers.DiscreteSpaceWrapper(env: CityLearnEnv, observation_bin_sizes: List[Mapping[str, int]] = None, action_bin_sizes: List[Mapping[str, int]] = None, default_observation_bin_size: int = None, default_action_bin_size: int = None)[source]

Bases: Wrapper

Wrapper for observation and action spaces discretization.

Wraps env in citylearn.wrappers.DiscreteObservationWrapper and citylearn.wrappers.DiscreteActionWrapper.

Parameters:
  • env (CityLearnEnv) – CityLearn environment.

  • observation_bin_sizes (List[Mapping[str, int]], optional) – The number of bins for each active observation in each building.

  • action_bin_sizes (List[Mapping[str, int]], optional) – The number of bins for each active action in each building.

  • default_observation_bin_size (int, default = 10) – The default number of bins if observation_bin_sizes is unspecified for any active building observation.

  • default_action_bin_size (int, default = 10) – The default number of bins if action_bin_sizes is unspecified for any active building action.

class citylearn.wrappers.NormalizedActionWrapper(env: CityLearnEnv)[source]

Bases: ActionWrapper

Wrapper for action min-max normalization.

All actions are min-max normalized between 0 and 1.

Parameters:

env (CityLearnEnv) – CityLearn environment.

action(actions: List[float]) List[List[float]][source]

Returns denormalized actions.

property action_space: List[Box]

Returns action space for normalized actions.

class citylearn.wrappers.NormalizedObservationWrapper(env: CityLearnEnv)[source]

Bases: ObservationWrapper

Wrapper for observations min-max and periodic normalization.

Temporal observations including hour, day_type and month are periodically normalized using sine/cosine transformations and then all observations are min-max normalized between 0 and 1.

Parameters:

env (CityLearnEnv) – CityLearn environment.

observation(observations: List[List[float]]) List[List[float]][source]

Returns normalized observations.

property observation_names: List[List[str]]

Names of returned observations.

Includes the three extra observations added during the cyclic transformation of hour, day_type and month.

Notes

If central_agent is True, a list of one sublist containing all building observation names is returned in the same order as buildings. The shared_observations names are only included in the first building’s observation names. If central_agent is False, a list of sublists is returned where each sublist contains one building’s observation names and the sublists are in the same order as buildings.

property observation_space: List[Box]

Returns observation space for normalized observations.

property shared_observations: List[str]

Names of common observations across all buildings i.e. observations that have the same value irrespective of the building.

Includes the three extra observations added during the cyclic transformation of hour, day_type and month.
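
Examples

A minimal usage sketch, assuming a central-agent environment and an example schema name.

from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import NormalizedObservationWrapper

env = CityLearnEnv('citylearn_challenge_2022_phase_all', central_agent=True)  # assumed example schema name
env = NormalizedObservationWrapper(env)

observations, _ = env.reset()
print(env.observation_names)  # includes the extra sine/cosine names for hour, day_type and month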

class citylearn.wrappers.NormalizedSpaceWrapper(env: CityLearnEnv)[source]

Bases: Wrapper

Wrapper for normalized observation and action spaces.

Wraps env in citylearn.wrappers.NormalizedObservationWrapper and citylearn.wrappers.NormalizedActionWrapper.

Parameters:

env (CityLearnEnv) – CityLearn environment.

class citylearn.wrappers.RLlibMultiAgentActionWrapper(env: CityLearnEnv)[source]

Bases: ActionWrapper

Action wrapper for RLlib multi-agent algorithms.

Wraps the action space so that it is returned as a gymnasium.spaces.Dict. The keys correspond to the agent IDs i.e., policy IDs in the multi-agent environment. Also converts agent actions from a dict to the data structure needed by citylearn.citylearn.CityLearnEnv.step().

Parameters:

env (CityLearnEnv) – CityLearn environment.

action(actions: Mapping[str, ndarray]) List[List[float]][source]

Parses actions into data structure for citylearn.citylearn.CityLearnEnv.step().

property action_space: Dict

Parses action space into a gymnasium.spaces.Dict.

class citylearn.wrappers.RLlibMultiAgentEnv(env_config: Mapping[str, Any])[source]

Bases: MultiAgentEnv

Wrapper for RLlib multi-agent algorithms.

Converts observation and action spaces to gymnasium.spaces.Dict. Also converts observations, actions, rewards, terminated, and truncated to dictionaries where necessary. The dictionary keys correspond to the agent IDs i.e., policy IDs in the multi-agent environment. Agent IDs are accessible through the _agent_ids property. The initialized environment is a ray.rllib.env.MultiAgentEnv object and has an env attribute that is a citylearn.citylearn.CityLearnEnv object.

Parameters:

env_config (Mapping[str, Any]) – Dictionary providing initialization parameters for the environment. Must contain env_kwargs as a key, where env_kwargs is a dict used to initialize citylearn.citylearn.CityLearnEnv. Thus, it must contain all positional arguments needed for initialization and may also contain optional initialization arguments. env_config can also contain a wrappers key that is a list of citylearn.wrappers classes to wrap citylearn.citylearn.CityLearnEnv with. Wrapping with citylearn.wrappers.ClippedObservationWrapper is recommended to avoid the simulation terminating prematurely with an error due to out-of-bound observations relative to the observation space.

Notes

This wrapper is only compatible with an environment where citylearn.citylearn.CityLearnEnv.central_agent = False and will initialize the environment as such, overriding any value for central_agent in env_kwargs.
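
Examples

A sketch of a valid env_config, assuming an example schema name. Any other CityLearnEnv initialization arguments can be added to env_kwargs.

from citylearn.wrappers import ClippedObservationWrapper, RLlibMultiAgentEnv

env_config = {
    'env_kwargs': {'schema': 'citylearn_challenge_2022_phase_all'},  # assumed example schema name
    'wrappers': [ClippedObservationWrapper],  # recommended to keep observations within bounds
}
env = RLlibMultiAgentEnv(env_config)

# observations and infos are dicts keyed by agent ID
observations, infos = env.reset()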

property buildings: List[Building]

Convenience property for citylearn.citylearn.CityLearnEnv.buildings().

evaluate(**kwargs) DataFrame[source]

Convenience method for citylearn.citylearn.CityLearnEnv.evaluate().

reset(*, seed: int = None, options: Mapping[str, Any] = None) Tuple[Mapping[str, ndarray], Mapping[str, dict]][source]

Calls citylearn.citylearn.CityLearnEnv.reset() and parses returned values into dictionaries.

step(action_dict: Mapping[str, ndarray]) Tuple[Mapping[str, ndarray], Mapping[str, float], Mapping[str, bool], Mapping[str, bool], Mapping[str, dict]][source]

Calls citylearn.citylearn.CityLearnEnv.step() and parses returned values into dictionaries.

property terminated: bool

Convenience property for citylearn.citylearn.CityLearnEnv.terminated().

property time_step: int

Convenience property for citylearn.citylearn.CityLearnEnv.time_step().

class citylearn.wrappers.RLlibMultiAgentObservationWrapper(env: CityLearnEnv)[source]

Bases: ObservationWrapper

Observation wrapper for RLlib multi-agent algorithms.

Wraps observation space and observations so that they are returned as gymnasium.spaces.Dict and dict objects respectively. The keys in these objects correspond to the agent IDs i.e., policy IDs in the multi-agent.

Parameters:

env (CityLearnEnv) – CityLearn environment.

observation(observations: List[List[float]]) Mapping[str, ndarray][source]

Parses observation into a dictionary.

property observation_space: Dict

Parses observation space into a gymnasium.spaces.Dict.

class citylearn.wrappers.RLlibMultiAgentRewardWrapper(env: CityLearnEnv)[source]

Bases: RewardWrapper

Reward wrapper for RLlib multi-agent algorithms.

Wraps reward so that it is returned as a dict mapping agent IDs to reward values.

Parameters:

env (CityLearnEnv) – CityLearn environment.

reward(reward: List[float]) Mapping[str, float][source]

Parses reward into a dict.

class citylearn.wrappers.RLlibSingleAgentWrapper(env_config: Mapping[str, Any])[source]

Bases: StableBaselines3Wrapper

Wrapper for RLlib single-agent algorithms.

Uses the same wrapper as stable-baselines3 by wrapping env in citylearn.wrappers.StableBaselines3Wrapper.

Parameters:

env_config (Mapping[str, Any]) – Dictionary providing initialization parameters for the environment. Must contain env_kwargs as a key, where env_kwargs is a dict used to initialize citylearn.citylearn.CityLearnEnv. Thus, it must contain all positional arguments needed for initialization and may also contain optional initialization arguments. env_config can also contain a wrappers key that is a list of citylearn.wrappers classes to wrap citylearn.citylearn.CityLearnEnv with. Wrapping with citylearn.wrappers.ClippedObservationWrapper is recommended to avoid the simulation terminating prematurely with an error due to out-of-bound observations relative to the observation space.

Notes

This wrapper is only compatible with an environment where citylearn.citylearn.CityLearnEnv.central_agent = True and will initialize the environment as such, overriding any value for central_agent in env_kwargs.
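
Examples

A sketch of a valid env_config mirroring the multi-agent case, assuming an example schema name.

from citylearn.wrappers import ClippedObservationWrapper, RLlibSingleAgentWrapper

env_config = {
    'env_kwargs': {'schema': 'citylearn_challenge_2022_phase_all'},  # assumed example schema name
    'wrappers': [ClippedObservationWrapper],  # recommended to keep observations within bounds
}
env = RLlibSingleAgentWrapper(env_config)

# observations are flattened into a single 1-dimensional numpy array (central agent)
observations, _ = env.reset()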

class citylearn.wrappers.StableBaselines3ActionWrapper(env: CityLearnEnv)[source]

Bases: ActionWrapper

Action wrapper for stable-baselines3 algorithms.

Wraps actions so that they are returned in a 1-dimensional numpy array. This wrapper is only compatible when the environment is controlled by a central agent i.e., citylearn.citylearn.CityLearnEnv.central_agent = True.

Parameters:

env (CityLearnEnv) – CityLearn environment.

action(actions: List[float]) List[List[float]][source]

Returns actions as a 1-dimensional numpy array.

property action_space: Box

Returns a single spaces.Box object.

class citylearn.wrappers.StableBaselines3ObservationWrapper(env: CityLearnEnv)[source]

Bases: ObservationWrapper

Observation wrapper for stable-baselines3 algorithms.

Wraps observations so that they are returned in a 1-dimensional numpy array. This wrapper is only compatible when the environment is controlled by a central agent i.e., citylearn.citylearn.CityLearnEnv.central_agent = True.

Parameters:

env (CityLearnEnv) – CityLearn environment.

observation(observations: List[List[float]]) ndarray[source]

Returns observations as a 1-dimensional numpy array.

property observation_space: Box

Returns a single spaces.Box object.

class citylearn.wrappers.StableBaselines3RewardWrapper(env: CityLearnEnv)[source]

Bases: RewardWrapper

Reward wrapper for stable-baselines3 algorithms.

Wraps reward so that it is returned as a float value. This wrapper is only compatible when the environment is controlled by a central agent i.e., citylearn.citylearn.CityLearnEnv.central_agent = True.

Parameters:

env (CityLearnEnv) – CityLearn environment.

reward(reward: List[float]) float[source]

Returns reward as a float value.

class citylearn.wrappers.StableBaselines3Wrapper(env: CityLearnEnv)[source]

Bases: Wrapper

Wrapper for stable-baselines3 algorithms.

Wraps observations and actions so that they are each returned as a 1-dimensional numpy array, and wraps reward so that it is returned as a float value.

Parameters:

env (CityLearnEnv) – CityLearn environment.
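
Examples

A short training sketch with a stable-baselines3 agent. The schema name is an assumed example, and SAC is used only for illustration; any stable-baselines3 algorithm that supports Box spaces works the same way.

from stable_baselines3 import SAC

from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import NormalizedObservationWrapper, StableBaselines3Wrapper

env = CityLearnEnv('citylearn_challenge_2022_phase_all', central_agent=True)  # assumed example schema name
env = NormalizedObservationWrapper(env)  # optional, but keeps inputs in [0, 1]
env = StableBaselines3Wrapper(env)

model = SAC('MlpPolicy', env)
model.learn(total_timesteps=1000)  # short run for illustration only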

class citylearn.wrappers.TabularQLearningActionWrapper(env: CityLearnEnv, bin_sizes: List[Mapping[str, int]] = None, default_bin_size: int = None)[source]

Bases: ActionWrapper

Action wrapper for citylearn.agents.q_learning.TabularQLearning agent.

Wraps env in citylearn.wrappers.DiscreteActionWrapper.

Parameters:
  • env (CityLearnEnv) – CityLearn environment.

  • bin_sizes (List[Mapping[str, int]], optional) – The number of bins for each active action in each building.

  • default_bin_size (int, default = 10) – The default number of bins if bin_sizes is unspecified for any active building action.

action(actions: List[float]) List[List[int]][source]

Returns discretized actions.

property action_space: List[Discrete]

Returns action space for discretized actions.

set_combinations() List[List[int]][source]

Returns all combinations of discrete actions.

class citylearn.wrappers.TabularQLearningObservationWrapper(env: CityLearnEnv, bin_sizes: List[Mapping[str, int]] = None, default_bin_size: int = None)[source]

Bases: ObservationWrapper

Observation wrapper for citylearn.agents.q_learning.TabularQLearning agent.

Wraps env in citylearn.wrappers.DiscreteObservationWrapper.

Parameters:
  • env (CityLearnEnv) – CityLearn environment.

  • bin_sizes (List[Mapping[str, int]], optional) – The number of bins for each active observation in each building.

  • default_bin_size (int, default = 10) – The default number of bins if bin_sizes is unspecified for any active building observation.

observation(observations: List[List[int]]) List[List[int]][source]

Returns discretized observations.

property observation_space: List[Discrete]

Returns observation space for discretized observations.

set_combinations() List[List[int]][source]

Returns all combinations of discrete observations.

class citylearn.wrappers.TabularQLearningWrapper(env: CityLearnEnv, observation_bin_sizes: List[Mapping[str, int]] = None, action_bin_sizes: List[Mapping[str, int]] = None, default_observation_bin_size: int = None, default_action_bin_size: int = None)[source]

Bases: Wrapper

Wrapper for citylearn.agents.q_learning.TabularQLearning agent.

Discretizes observation and action spaces.

Parameters:
  • env (CityLearnEnv) – CityLearn environment.

  • observation_bin_sizes (List[Mapping[str, int]], optional) – The number of bins for each active observation in each building.

  • action_bin_sizes (List[Mapping[str, int]], optional) – The number of bins for each active action in each building.

  • default_observation_bin_size (int, default = 10) – The default number of bins if observation_bin_sizes is unspecified for any active building observation.

  • default_action_bin_size (int, default = 10) – The default number of bins if action_bin_sizes is unspecified for any active building action.
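
Examples

A sketch of discretizing both spaces for the tabular Q-learning agent, assuming an example schema name and example observation/action names. The wrapped environment is then ready to be passed to citylearn.agents.q_learning.TabularQLearning.

from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import TabularQLearningWrapper

env = CityLearnEnv('citylearn_challenge_2022_phase_all')  # assumed example schema name

# one mapping per building; keys are active observation/action names (assumed examples)
observation_bin_sizes = [{'hour': 24} for _ in env.buildings]
action_bin_sizes = [{'electrical_storage': 12} for _ in env.buildings]

env = TabularQLearningWrapper(
    env,
    observation_bin_sizes=observation_bin_sizes,
    action_bin_sizes=action_bin_sizes,
)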