citylearn.agents.marlisa module

class citylearn.agents.marlisa.MARLISA(*args, regression_buffer_capacity: int = None, start_regression_time_step: int = None, regression_frequency: int = None, information_sharing: bool = None, pca_compression: float = None, iterations: int = None, **kwargs)

Bases: SAC

property batch_size: int

Batch size.

property coordination_variables_history: List[float]
get_encoded_regression_targets(index: int, observations: List[float]) → float
get_encoded_regression_variables(index: int, observations: List[float]) → List[float]
get_exploration_prediction(observations: List[List[float]]) → List[List[float]]

Return randomly sampled actions from action_space multiplied by action_scaling_coefficient.
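As a rough illustration of the behavior described above, the sketch below draws uniform random actions within per-building bounds and multiplies them by a scaling coefficient. It does not use CityLearn's classes: the function name sample_exploration_actions, the bound lists, and the coefficient value of 0.5 are assumptions made for this example only.

```python
import numpy as np

def sample_exploration_actions(lows, highs, action_scaling_coefficient=0.5, seed=None):
    """Sample one scaled random action vector per building.

    lows, highs: per-building lists of lower and upper action bounds
    (hypothetical stand-ins for each building's action space).
    """
    rng = np.random.default_rng(seed)
    actions = []
    for low, high in zip(lows, highs):
        low = np.asarray(low, dtype=float)
        high = np.asarray(high, dtype=float)
        sample = rng.uniform(low, high)  # uniform draw within the bounds
        actions.append((action_scaling_coefficient * sample).tolist())
    return actions

# Two buildings: the first has two controllable devices, the second has one.
lows = [[-1.0, -1.0], [-1.0]]
highs = [[1.0, 1.0], [1.0]]
exploration_actions = sample_exploration_actions(lows, highs, seed=0)
```

With bounds of [-1, 1] and a coefficient of 0.5, every returned action lies within [-0.5, 0.5].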

get_exploration_prediction_with_information_sharing(observations: List[List[float]]) → Tuple[List[List[float]], List[List[float]]]
get_exploration_prediction_without_information_sharing(observations: List[List[float]]) → Tuple[List[List[float]], List[List[float]]]
get_post_exploration_prediction(observations: List[List[float]], deterministic: bool) → List[List[float]]

Sample actions from the policy once the exploration period has ended.
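The sketch below shows what a deterministic flag commonly means for a SAC-style policy: the deterministic action is the squashed mean, while the stochastic action is a squashed Gaussian sample. This reflects general SAC practice and is an assumption, not a reading of the MARLISA source; squashed_gaussian_action and its arguments are hypothetical names.

```python
import numpy as np

def squashed_gaussian_action(mean, log_std, deterministic, rng=None):
    """Return a tanh-squashed action from a Gaussian policy head.

    mean, log_std: policy outputs for one building (assumed inputs).
    deterministic: if True, skip sampling and use the mean.
    """
    mean = np.asarray(mean, dtype=float)
    if deterministic:
        raw = mean
    else:
        if rng is None:
            rng = np.random.default_rng()
        std = np.exp(np.asarray(log_std, dtype=float))
        raw = mean + std * rng.standard_normal(mean.shape)
    return np.tanh(raw)  # squash into [-1, 1]

greedy = squashed_gaussian_action([0.2, -0.4], [-1.0, -1.0], deterministic=True)
sampled = squashed_gaussian_action(
    [0.2, -0.4], [-1.0, -1.0], deterministic=False, rng=np.random.default_rng(0)
)
```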

get_post_exploration_prediction_with_information_sharing(observations: List[List[float]], deterministic: bool) → Tuple[List[List[float]], List[List[float]]]
get_post_exploration_prediction_without_information_sharing(observations: List[List[float]], deterministic: bool) → Tuple[List[List[float]], List[List[float]]]
get_regression_variables(index: int, observations: List[float], actions: List[float]) → List[float]
property hidden_dimension: List[float]

Hidden dimension.

property information_sharing: bool
property iterations: int
property pca_compression: float
predict_demand(index: int, observations: List[float], actions: List[float]) → float
property regression_buffer_capacity: int
property regression_frequency: int
reset()

Reset environment to initial state.

Calls reset_time_step.

Notes

Override in subclass for custom implementation when resetting environment.

set_energy_coefficients()
set_networks()
set_pca()
set_regression_encoders() → List[List[Encoder]]

Get observation value transformers/encoders for use in MARLISA agent internal regression model.

The encoder classes are defined in the preprocessing.py module and include PeriodicNormalization for cyclic observations, OnehotEncoding for categorical observations, RemoveFeature for observations that do not apply given the available storage systems and devices, and Normalize for observations with known minimum and maximum boundaries.

Returns:

encoders – Encoder classes for observations ordered with respect to active_observations.

Return type:

List[List[Encoder]]
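To make the encoder roles above concrete, here is a minimal sketch of the three transformations in plain Python. The functions periodic_encode, onehot_encode, minmax_normalize and encode_observation are hypothetical stand-ins written for this example; they are not the classes in preprocessing.py, and the observation names, class labels and bounds are assumed values.

```python
import math

def periodic_encode(value, period):
    """Map a cyclic value (e.g. hour of day) to a sine/cosine pair."""
    angle = 2.0 * math.pi * value / period
    return [math.sin(angle), math.cos(angle)]

def onehot_encode(value, classes):
    """Map a categorical value to a one-hot vector over the given classes."""
    return [1.0 if value == c else 0.0 for c in classes]

def minmax_normalize(value, minimum, maximum):
    """Scale a bounded value to [0, 1] using known boundaries."""
    return (value - minimum) / (maximum - minimum)

def encode_observation(hour, day_type, temperature):
    """Encode one observation; a removed feature would simply be omitted."""
    features = []
    features += periodic_encode(hour, period=24)
    features += onehot_encode(day_type, classes=[1, 2, 3, 4, 5, 6, 7])
    features.append(minmax_normalize(temperature, minimum=-10.0, maximum=40.0))
    return features

encoded = encode_observation(hour=0, day_type=2, temperature=15.0)
```

The encoded vector is longer than the raw observation because the cyclic and categorical values expand into several features each.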

property start_regression_time_step: int
update(observations: List[List[float]], actions: List[List[float]], reward: List[float], next_observations: List[List[float]], terminated: bool, truncated: bool)

Update replay buffer.

Parameters:
  • observations (List[List[float]]) – Previous time step observations.

  • actions (List[List[float]]) – Previous time step actions.

  • reward (List[float]) – Current time step reward.

  • next_observations (List[List[float]]) – Current time step observations.

  • terminated (bool) – Indication that episode has ended.

  • truncated (bool) – If episode truncates due to a time limit or a reason that is not defined as part of the task MDP.
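The parameter shapes above suggest one transition per building per time step. The sketch below stores such transitions in one fixed-capacity buffer per building. It is an assumption-laden illustration, not CityLearn's implementation: the class PerBuildingReplayBuffers and its methods are hypothetical, and any training or regression step the real update may perform is left out.

```python
from collections import deque
import random

class PerBuildingReplayBuffers:
    """Keep one bounded replay buffer per building (hypothetical helper)."""

    def __init__(self, building_count, capacity):
        # deque(maxlen=...) silently drops the oldest transition when full
        self.buffers = [deque(maxlen=capacity) for _ in range(building_count)]

    def push(self, observations, actions, reward, next_observations, terminated):
        """Split the per-building lists and store one tuple in each buffer."""
        rows = zip(self.buffers, observations, actions, reward, next_observations)
        for buffer, o, a, r, n in rows:
            buffer.append((o, a, r, n, terminated))

    def sample(self, index, batch_size, rng=random):
        """Draw a batch without replacement from one building's buffer."""
        return rng.sample(list(self.buffers[index]), batch_size)

buffers = PerBuildingReplayBuffers(building_count=2, capacity=2)
for step in range(3):
    buffers.push(
        observations=[[float(step)], [float(step)]],
        actions=[[0.1], [-0.1]],
        reward=[-1.0, -2.0],
        next_observations=[[float(step + 1)], [float(step + 1)]],
        terminated=False,
    )
batch = buffers.sample(index=0, batch_size=2, rng=random.Random(0))
```

After three pushes with a capacity of two, each buffer holds only the two most recent transitions.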

class citylearn.agents.marlisa.MARLISARBC(env: CityLearnEnv, rbc: RBC = None, **kwargs: Any)

Bases: MARLISA, SACRBC

Uses citylearn.agents.rbc.RBC to select actions during exploration, then switches to citylearn.agents.marlisa.MARLISA once exploration ends.

Parameters:
  • env (CityLearnEnv) – CityLearn environment.

  • rbc (RBC) – citylearn.agents.rbc.RBC or child class, used to select actions during exploration.

  • **kwargs (Any) – Other keyword arguments used to initialize super class.
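The hand-off between the rule-based controller and the learned policy can be pictured with the sketch below. It is a generic illustration that does not call CityLearn: select_actions, end_exploration_time_step, and the two predictor callables are hypothetical names, and the exact boundary comparison used by the library is not shown on this page.

```python
def select_actions(time_step, end_exploration_time_step, rule_based_predict,
                   policy_predict, observations):
    """Use the rule-based controller while exploring, then the learned policy."""
    # Assumed boundary: the exploration phase includes the end time step.
    if time_step <= end_exploration_time_step:
        return rule_based_predict(observations)
    return policy_predict(observations)

# Toy predictors standing in for an RBC and a trained policy.
def toy_rbc(observations):
    return [[0.0] for _ in observations]

def toy_policy(observations):
    return [[0.5] for _ in observations]

observations = [[21.0, 0.3], [19.5, 0.7]]
early = select_actions(10, 100, toy_rbc, toy_policy, observations)
late = select_actions(101, 100, toy_rbc, toy_policy, observations)
```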

get_exploration_prediction_without_information_sharing(observations: List[List[float]]) → Tuple[List[List[float]], List[List[float]]]