citylearn.agents.base module

class citylearn.agents.base.Agent(observation_names: List[List[str]], observation_space: List[gym.spaces.box.Box], action_space: List[gym.spaces.box.Box], building_information: List[Mapping[str, Any]], **kwargs)

Bases: citylearn.base.Environment

property action_dimension: List[int]

Number of returned actions.

property action_space: List[gym.spaces.box.Box]

Format of valid actions.

property actions: List[List[List[Any]]]

Action history/time series.

add_to_buffer(*args, **kwargs)

Update the replay buffer.

Notes

This implementation does nothing. It is retained so that all agents expose the same API during simulation.
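The no-op pattern described in this note can be sketched as follows. This is a minimal illustration only: `NoOpAgent`, `BufferedAgent`, and `run_step` are hypothetical stand-ins, not CityLearn classes, and the tuple layout of a transition is an assumption.

```python
from collections import deque
from typing import Any, Deque, Tuple


class NoOpAgent:
    """Hypothetical stand-in for a base agent (not the CityLearn class)."""

    def add_to_buffer(self, *args: Any, **kwargs: Any) -> None:
        # Intentionally does nothing; it only keeps the call signature uniform.
        pass


class BufferedAgent(NoOpAgent):
    """Hypothetical learning agent that actually stores transitions."""

    def __init__(self, capacity: int = 1000) -> None:
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer: Deque[Tuple[Any, ...]] = deque(maxlen=capacity)

    def add_to_buffer(self, *args: Any, **kwargs: Any) -> None:
        self.buffer.append(args)


def run_step(agent: NoOpAgent, transition: Tuple[Any, ...]) -> None:
    # The simulation loop can call add_to_buffer without checking the agent type.
    agent.add_to_buffer(*transition)


# Assumed transition layout: (observations, actions, reward, next_observations, done).
transition = ([0.1, 0.2], [0.5], -1.0, [0.2, 0.3], False)
plain_agent = NoOpAgent()
buffered_agent = BufferedAgent(capacity=2)
for agent in (plain_agent, buffered_agent):
    run_step(agent, transition)
```

Because both agents accept the same call, a simulation loop does not need a special case for agents that do not learn from experience.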

property building_information: List[Mapping[str, Any]]

Building metadata.

next_time_step()

Advance to next time_step value.

Notes

Override in subclass for custom implementation when advancing to next time_step.

property observation_names: List[List[str]]

Names of active observations that can be used to map observation values.
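As an illustration of that mapping, the sketch below pairs each building's observation names with its observation values. The names and values are made-up examples, and `label_observations` is a hypothetical helper that is not part of the documented API.

```python
from typing import Dict, List

# Illustrative data only: one inner list per building.
observation_names: List[List[str]] = [
    ["hour", "outdoor_dry_bulb_temperature"],
    ["hour", "outdoor_dry_bulb_temperature", "electrical_storage_soc"],
]
observations: List[List[float]] = [
    [13.0, 24.5],
    [13.0, 24.5, 0.4],
]


def label_observations(
    names: List[List[str]], values: List[List[float]]
) -> List[Dict[str, float]]:
    """Return one name-to-value dictionary per building."""
    if len(names) != len(values):
        raise ValueError("names and values must cover the same buildings")
    labeled = []
    for building_names, building_values in zip(names, values):
        if len(building_names) != len(building_values):
            raise ValueError("each building needs one value per name")
        labeled.append(dict(zip(building_names, building_values)))
    return labeled


labeled = label_observations(observation_names, observations)
```

Each entry of `labeled` can then be queried by name, for example `labeled[1]["electrical_storage_soc"]`.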

property observation_space: List[gym.spaces.box.Box]

Format of valid observations.

reset()

Reset environment to initial state.

Calls reset_time_step.

Notes

Override in subclass for custom implementation when resetting environment.

select_actions(observations: List[List[float]], deterministic: Optional[bool] = None) → List[List[float]]

Provide actions for current time step.

Return randomly sampled actions from action_space.

Parameters
  • observations (List[List[float]]) – Environment observations

  • deterministic (bool, default: False) – Whether to return purely exploitative, deterministic actions.

Returns

actions – Action values

Return type

List[List[float]]
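The random sampling described for this method can be sketched without the gym dependency. This is an assumption-laden illustration: `Bounds` and `sample_random_actions` are hypothetical names, and each (low, high) pair stands in for the bounds of one building's Box action space.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a Box space: (low, high) bounds per action dimension.
Bounds = Tuple[Sequence[float], Sequence[float]]


def sample_random_actions(
    action_bounds: List[Bounds], rng: random.Random
) -> List[List[float]]:
    """Draw one uniformly random action vector per building."""
    actions = []
    for low, high in action_bounds:
        # One value per action dimension, each within its own bounds.
        actions.append([rng.uniform(lo, hi) for lo, hi in zip(low, high)])
    return actions


rng = random.Random(0)  # seeded so that repeated runs give the same draws
action_bounds: List[Bounds] = [
    ([-1.0], [1.0]),             # building with one action
    ([-0.5, -1.0], [0.5, 1.0]),  # building with two actions
]
actions = sample_random_actions(action_bounds, rng)
```

The result has one inner list per building, matching the nested shape given in the method signature.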

set_encoders() → List[List[citylearn.preprocessing.Encoder]]

Get observation value transformers/encoders for use in agent algorithm.

The encoder classes are defined in the preprocessing.py module and include PeriodicNormalization for cyclic observations, OnehotEncoding for categorical observations, RemoveFeature for observations that do not apply given the available storage systems and devices, and Normalize for observations with known minimum and maximum boundaries.

Returns

encoders – Encoder classes for observations ordered with respect to active_observations.

Return type

List[List[Encoder]]
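To make the four encoder roles concrete, the sketch below shows one common way to implement each idea as a plain function. These functions are hypothetical illustrations of the general techniques, not the classes in preprocessing.py, whose exact formulas and interfaces may differ.

```python
import math
from typing import Any, List, Sequence


def periodic_normalization(value: float, period: float) -> List[float]:
    """Encode a cyclic value as a (sin, cos) pair so the cycle wraps smoothly."""
    angle = 2.0 * math.pi * value / period
    return [math.sin(angle), math.cos(angle)]


def onehot_encoding(value: Any, classes: Sequence[Any]) -> List[float]:
    """Encode a categorical value as a vector with a single 1.0 entry."""
    return [1.0 if value == candidate else 0.0 for candidate in classes]


def normalize(value: float, minimum: float, maximum: float) -> List[float]:
    """Min-max scale a bounded value to the [0, 1] range."""
    return [(value - minimum) / (maximum - minimum)]


def remove_feature(value: Any) -> List[float]:
    """Drop an observation that does not apply by emitting no features."""
    return []


# Assumed example observations: hour of day, day type, temperature, unused value.
encoded: List[float] = []
encoded += periodic_normalization(6.0, period=24.0)
encoded += onehot_encoding(2, classes=[1, 2, 3])
encoded += normalize(5.0, minimum=0.0, maximum=10.0)
encoded += remove_feature(0.0)
```

Concatenating the per-observation outputs in the same order as the observations yields the flat feature vector an agent algorithm would consume.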