citylearn.agents.marlisa module
- class citylearn.agents.marlisa.MARLISA(*args, regression_buffer_capacity: Optional[int] = None, start_regression_time_step: Optional[int] = None, regression_frequency: Optional[int] = None, information_sharing: Optional[bool] = None, pca_compression: Optional[float] = None, iterations: Optional[int] = None, **kwargs)[source]
Bases:
citylearn.agents.sac.SAC
- add_to_buffer(observations: List[List[float]], actions: List[List[float]], reward: List[float], next_observations: List[List[float]], done: bool)[source]
Update replay buffer.
- Parameters
observations (List[List[float]]) – Previous time step observations.
actions (List[List[float]]) – Previous time step actions.
reward (List[float]) – Current time step reward.
next_observations (List[List[float]]) – Current time step observations.
done (bool) – Indication that episode has ended.
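Conceptually, each call stores one transition tuple and the oldest transitions are discarded once capacity is reached. A minimal sketch of that behavior, assuming a simple fixed-capacity buffer (the actual MARLISA buffer additionally encodes and normalizes observations before storing them):

```python
from collections import deque

class ReplayBuffer:
    """Minimal fixed-capacity replay buffer (illustrative only)."""

    def __init__(self, capacity: int):
        # A deque with maxlen silently drops the oldest transition
        # once the capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, observations, actions, reward, next_observations, done):
        # Store one (s, a, r, s', done) transition per time step.
        self.buffer.append((observations, actions, reward, next_observations, done))

    def __len__(self):
        return len(self.buffer)
```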
- property batch_size: int
Batch size.
- property coordination_variables_history: List[float]
- get_exploration_actions(observations: List[List[float]]) List[List[float]] [source]
Return randomly sampled actions from action_space multiplied by action_scaling_coefficient.
- Returns
actions – Action values.
- Return type
List[List[float]]
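The sampling step can be sketched as follows, assuming each agent's action space is represented as a list of (low, high) bounds (CityLearn actually uses Gym Box spaces, so this is an illustrative simplification):

```python
import random

def sample_exploration_actions(action_bounds, action_scaling_coefficient):
    # Uniformly sample each action dimension within its bounds, then
    # scale toward zero so early exploratory actions stay conservative.
    return [
        [action_scaling_coefficient * random.uniform(low, high)
         for low, high in bounds]
        for bounds in action_bounds
    ]
```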
- get_exploration_actions_with_information_sharing(observations: List[List[float]]) Tuple[List[List[float]], List[List[float]]] [source]
- get_exploration_actions_without_information_sharing(observations: List[List[float]]) Tuple[List[List[float]], List[List[float]]] [source]
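The with/without information sharing variants differ in whether each agent's policy input is augmented with coordination variables communicated by the other agents. A hypothetical sketch of the shared case (function and variable names here are illustrative, not the library's API):

```python
def select_actions_with_information_sharing(policies, observations, coordination_variables):
    # Illustrative only: each agent conditions its action on its own
    # observations plus coordination variables (e.g. quantities shared
    # by the other agents); the real MARLISA loop iterates this exchange.
    actions = []
    for policy, obs, coords in zip(policies, observations, coordination_variables):
        actions.append(policy(obs + coords))
    return actions
```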
- get_post_exploration_actions(observations: List[List[float]], deterministic: bool) List[List[float]] [source]
Sample actions from the learned policy after the exploration period has ended.
- get_post_exploration_actions_with_information_sharing(observations: List[List[float]], deterministic: bool) Tuple[List[List[float]], List[List[float]]] [source]
- get_post_exploration_actions_without_information_sharing(observations: List[List[float]], deterministic: bool) Tuple[List[List[float]], List[List[float]]] [source]
- get_regression_variables(index: int, observations: List[float], actions: List[float]) List[float] [source]
Return observation and action values used as input to the MARLISA agent's internal regression model.
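In its simplest plausible form, the regression input for one building is a concatenation of that building's observation and action values; the actual feature selection and encoding in MARLISA is more involved, so treat this as a sketch:

```python
def regression_variables(observations, actions):
    # Illustrative assumption: one regression input row is the building's
    # observation values followed by its action values.
    return list(observations) + list(actions)
```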
- property information_sharing: bool
- property iterations: int
- property pca_compression: float
- property regression_buffer_capacity: int
- property regression_frequency: int
- reset()[source]
Reset environment to initial state.
Calls reset_time_step.
Notes
Override in subclass for custom implementation when resetting environment.
- set_regression_encoders() List[List[citylearn.preprocessing.Encoder]] [source]
Get observation value transformers/encoders for use in MARLISA agent internal regression model.
The encoder classes are defined in the preprocessing.py module and include PeriodicNormalization for cyclic observations, OnehotEncoding for categorical observations, RemoveFeature for observations that are inapplicable given the available storage systems and devices, and Normalize for observations with known minimum and maximum boundaries.
- Returns
encoders – Encoder classes for observations ordered with respect to active_observations.
- Return type
List[Encoder]
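The idea behind PeriodicNormalization can be illustrated with a small sketch (not the library's implementation): a cyclic quantity such as hour of day is mapped to a sine/cosine pair so that values on either side of the wrap-around point stay numerically close.

```python
import math

def periodic_normalization(x, x_max):
    # Project a cyclic value onto the unit circle so that, e.g.,
    # hour 23 and hour 1 end up close in feature space.
    angle = 2.0 * math.pi * (x / x_max)
    return math.sin(angle), math.cos(angle)
```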
- property start_regression_time_step: int