Open In Colab

QuickStart

Install CityLearn from PyPI with the pip command:

[ ]:
!pip install CityLearn==2.0.0
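Since the install pins a specific release, it can be useful to confirm which version actually ended up in the environment. A minimal check using the standard library (this helper is not part of CityLearn):

```python
from importlib import metadata

def installed_version(package: str):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Prints the installed CityLearn version, or None if it is not installed.
print(installed_version("CityLearn"))
```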

Centralized RBC

Run the following to simulate an environment controlled by a centralized rule-based control (RBC) agent for a single episode:

[2]:
from citylearn.agents.rbc import BasicRBC as RBCAgent
from citylearn.citylearn import CityLearnEnv, EvaluationCondition

dataset_name = 'citylearn_challenge_2022_phase_1'
env = CityLearnEnv(dataset_name, central_agent=True, simulation_end_time_step=1000)
model = RBCAgent(env)
model.learn(episodes=1)

# print cost functions at the end of episode
kpis = model.env.evaluate(baseline_condition=EvaluationCondition.WITHOUT_STORAGE_BUT_WITH_PARTIAL_LOAD_AND_PV)
kpis = kpis.pivot(index='cost_function', columns='name', values='value')
kpis = kpis.dropna(how='all')
display(kpis)
name Building_1 Building_2 Building_3 Building_4 Building_5 District
cost_function
annual_peak_average NaN NaN NaN NaN NaN 1.095048
carbon_emissions_total 1.102594 1.121684 1.160362 1.288392 1.156426 1.165892
cost_total 1.052283 1.050133 1.099890 1.238812 1.061484 1.100520
daily_one_minus_load_factor_average NaN NaN NaN NaN NaN 1.006849
daily_peak_average NaN NaN NaN NaN NaN 1.126686
discomfort_delta_average 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
discomfort_delta_maximum 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
discomfort_delta_minimum 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
electricity_consumption_total 1.154071 1.201622 1.220946 1.351182 1.251914 1.235947
monthly_one_minus_load_factor_average NaN NaN NaN NaN NaN 0.995422
ramping_average NaN NaN NaN NaN NaN 1.157437
zero_net_energy 1.069359 1.094072 1.096655 1.109892 1.110463 1.096088
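The evaluate call returns the KPIs as a long-format table with one row per (cost_function, name) pair, and the pivot/dropna steps reshape it into the matrix shown above. A minimal sketch of that reshaping with hypothetical values (the record structure mirrors the code above, not CityLearn internals):

```python
import pandas as pd

# Hypothetical long-format KPI records shaped like the output of env.evaluate():
# district-level KPIs are reported only under the 'District' name.
records = [
    {'cost_function': 'cost_total', 'name': 'Building_1', 'value': 1.05},
    {'cost_function': 'cost_total', 'name': 'District', 'value': 1.10},
    {'cost_function': 'ramping_average', 'name': 'District', 'value': 1.16},
]
kpis = pd.DataFrame(records)

# Same reshaping as the QuickStart: cost functions as rows, names as columns.
kpis = kpis.pivot(index='cost_function', columns='name', values='value')
kpis = kpis.dropna(how='all')  # drop cost functions with no values at all
print(kpis)
```

District-only KPIs such as ramping_average come out as NaN in the building columns, which matches the tables in this QuickStart.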

Decentralized-Independent SAC

Run the following to simulate an environment controlled by decentralized-independent SAC agents for two episodes, where deterministic_finish=True makes the final episode deterministic:

[3]:
from citylearn.agents.sac import SAC as RLAgent
from citylearn.citylearn import CityLearnEnv, EvaluationCondition

dataset_name = 'citylearn_challenge_2022_phase_1'
env = CityLearnEnv(dataset_name, central_agent=False, simulation_end_time_step=1000)
model = RLAgent(env)
model.learn(episodes=2, deterministic_finish=True)

# print cost functions at the end of episode
kpis = model.env.evaluate(baseline_condition=EvaluationCondition.WITHOUT_STORAGE_BUT_WITH_PARTIAL_LOAD_AND_PV)
kpis = kpis.pivot(index='cost_function', columns='name', values='value')
kpis = kpis.dropna(how='all')
display(kpis)
name Building_1 Building_2 Building_3 Building_4 Building_5 District
cost_function
annual_peak_average NaN NaN NaN NaN NaN 0.999520
carbon_emissions_total 1.000830 1.003118 1.006803 1.008576 1.005219 1.004909
cost_total 1.000736 1.003527 1.005652 1.007746 1.004541 1.004441
daily_one_minus_load_factor_average NaN NaN NaN NaN NaN 0.997970
daily_peak_average NaN NaN NaN NaN NaN 1.002592
discomfort_delta_average 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
discomfort_delta_maximum 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
discomfort_delta_minimum 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
electricity_consumption_total 1.000747 1.003289 1.006059 1.008239 1.005223 1.004711
monthly_one_minus_load_factor_average NaN NaN NaN NaN NaN 0.998779
ramping_average NaN NaN NaN NaN NaN 0.999844
zero_net_energy 1.000396 1.002922 1.014362 1.016329 1.016463 1.010094
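As the table shows, some KPIs are defined per building while others exist only at the district level (NaN in every building column). One way to separate the two groups, sketched on a small hypothetical frame with the same structure:

```python
import pandas as pd

# Hypothetical pivoted KPI table mirroring the structure above: building-level
# KPIs have values in the building columns, district-only KPIs are NaN there.
nan = float('nan')
kpis = pd.DataFrame(
    {
        'Building_1': [1.0008, nan],
        'Building_2': [1.0031, nan],
        'District': [1.0049, 0.9998],
    },
    index=['carbon_emissions_total', 'ramping_average'],
)

building_columns = [c for c in kpis.columns if c != 'District']

# District-only KPIs are the rows that are NaN for every building.
district_only = kpis[kpis[building_columns].isna().all(axis=1)]
building_level = kpis.drop(district_only.index)
print(district_only.index.tolist())   # ['ramping_average']
print(building_level.index.tolist())  # ['carbon_emissions_total']
```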

Decentralized-Cooperative MARLISA

Run the following to simulate an environment controlled by decentralized-cooperative MARLISA agents for two episodes, where deterministic_finish=True makes the final episode deterministic:

[4]:
from citylearn.agents.marlisa import MARLISA as RLAgent
from citylearn.citylearn import CityLearnEnv, EvaluationCondition

dataset_name = 'citylearn_challenge_2022_phase_1'
env = CityLearnEnv(dataset_name, central_agent=False, simulation_end_time_step=1000)
model = RLAgent(env)
model.learn(episodes=2, deterministic_finish=True)

# print cost functions at the end of episode
kpis = model.env.evaluate(baseline_condition=EvaluationCondition.WITHOUT_STORAGE_BUT_WITH_PARTIAL_LOAD_AND_PV)
kpis = kpis.pivot(index='cost_function', columns='name', values='value')
kpis = kpis.dropna(how='all')
display(kpis)
name Building_1 Building_2 Building_3 Building_4 Building_5 District
cost_function
annual_peak_average NaN NaN NaN NaN NaN 0.996312
carbon_emissions_total 1.001109 1.001338 1.002362 1.001276 1.001262 1.001469
cost_total 1.000774 1.000223 1.002054 1.001097 1.000936 1.001017
daily_one_minus_load_factor_average NaN NaN NaN NaN NaN 0.999645
daily_peak_average NaN NaN NaN NaN NaN 1.000204
discomfort_delta_average 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
discomfort_delta_maximum 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
discomfort_delta_minimum 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
electricity_consumption_total 1.001689 1.002693 1.003330 1.001814 1.001687 1.002243
monthly_one_minus_load_factor_average NaN NaN NaN NaN NaN 0.999536
ramping_average NaN NaN NaN NaN NaN 1.000448
zero_net_energy 1.000717 1.001723 1.001401 1.000740 1.000561 1.001029
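Because each KPI is evaluated against the baseline_condition, a value of 1.0 means no change relative to the baseline and smaller is better. A quick comparison of the district-level carbon_emissions_total values copied from the three runs above (values are from the tables in this QuickStart; the comparison itself is just illustrative):

```python
# District-level carbon_emissions_total from the RBC, SAC and MARLISA runs above.
carbon_emissions_total = {
    'RBC': 1.165892,
    'SAC': 1.004909,
    'MARLISA': 1.001469,
}

# Lower is better: the KPI is a ratio against the baseline condition.
best_agent = min(carbon_emissions_total, key=carbon_emissions_total.get)
print(best_agent)  # MARLISA
```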

Stable Baselines3 Reinforcement Learning Algorithms

Install Stable Baselines3:

[ ]:
!pip install stable-baselines3==1.7.0

Before the environment is ready for use in Stable Baselines3, it needs to be wrapped. First, wrap the environment in the NormalizedObservationWrapper (see docs) to ensure that observations served to the agent are min-max normalized between [0, 1] and that cyclical observations, e.g. hour, are encoded using the cosine transformation.

Next, wrap with the StableBaselines3Wrapper (see docs), which ensures observations, actions and rewards are served in a manner compatible with the Stable Baselines3 interface.

The following Stable Baselines3 example uses the baeda_3dem dataset, which supports building temperature dynamics.

⚠️ NOTE: central_agent in the env must be True when using Stable Baselines3, as it does not support multi-agent control.
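With a central agent, the per-building action lists are effectively combined into one flat action vector that a single agent must output. A toy illustration of the difference (plain Python, not CityLearn internals; building names and values are made up):

```python
# Hypothetical per-building actions, e.g. storage device control signals.
per_building_actions = {
    'Building_1': [0.1, -0.2],
    'Building_2': [0.3],
    'Building_3': [0.0, 0.5],
}

# Decentralized setting: one action list per building, one agent each.
decentralized = list(per_building_actions.values())

# Centralized setting: one flat action vector for a single agent.
centralized = [a for actions in decentralized for a in actions]
print(centralized)  # [0.1, -0.2, 0.3, 0.0, 0.5]
```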

[6]:
from stable_baselines3.sac import SAC
from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import NormalizedObservationWrapper, StableBaselines3Wrapper

dataset_name = 'baeda_3dem'
env = CityLearnEnv(dataset_name, central_agent=True, simulation_end_time_step=1000)
env = NormalizedObservationWrapper(env)
env = StableBaselines3Wrapper(env)
model = SAC('MlpPolicy', env)
model.learn(total_timesteps=env.time_steps*2)

# evaluate
observations = env.reset()

while not env.done:
    actions, _ = model.predict(observations, deterministic=True)
    observations, _, _, _ = env.step(actions)

kpis = env.evaluate()
kpis = kpis.pivot(index='cost_function', columns='name', values='value')
kpis = kpis.dropna(how='all')
display(kpis)
name Building_1 Building_2 Building_3 Building_4 District
cost_function
annual_peak_average NaN NaN NaN NaN 0.572397
cost_total 1.018358 0.988815 0.737863 0.589145 0.833545
daily_one_minus_load_factor_average NaN NaN NaN NaN 0.942072
daily_peak_average NaN NaN NaN NaN 0.706729
discomfort_delta_average -0.377245 -0.009167 3.860405 2.633747 1.526935
discomfort_delta_maximum 2.292152 2.667698 6.868553 8.035797 4.966050
discomfort_delta_minimum -6.134180 -3.742098 -3.700124 -2.782747 -4.089787
discomfort_proportion 0.290196 0.545936 0.875761 0.683599 0.598873
discomfort_too_cold_proportion 0.254902 0.256184 0.003654 0.005806 0.130136
discomfort_too_hot_proportion 0.035294 0.289753 0.872107 0.677794 0.468737
electricity_consumption_total 1.043795 1.060913 0.764151 0.648842 0.879425
monthly_one_minus_load_factor_average NaN NaN NaN NaN 0.859535
ramping_average NaN NaN NaN NaN 1.019842
zero_net_energy 1.043196 1.060596 0.764151 0.514950 0.845723
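In the table above, discomfort_proportion appears to be the sum of the too-cold and too-hot proportions. A quick arithmetic check on the Building_1 values copied from the table, assuming that relationship holds:

```python
import math

# Building_1 discomfort values copied from the table above.
too_cold = 0.254902
too_hot = 0.035294
discomfort_proportion = 0.290196

# The overall discomfort proportion matches the sum of the two parts.
assert math.isclose(too_cold + too_hot, discomfort_proportion, abs_tol=1e-6)
print(too_cold + too_hot)
```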