SSL Assignment Report 1
1 Abstract
Our chosen assignment [3] involves forecasting the energy consumption and production time-series for
prosumers (consumers who also produce energy, mainly via solar panels) on the Enefit platform in
Estonia. We focus on relevant data features such as prosumer installed capacity, client metadata, his-
torical and forecast weather data from nearby stations, natural gas prices, and electricity market prices.
We model this time-series data using Gated Recurrent Units (GRUs), LSTMs with Attention
(LSTM-Attn), MLPs, and XGBoost [5], [9], [10]. We also open-source our results and experiments at
[10].
Contents
1 Abstract
2 Dataset Preprocessing
  2.1 Data Loading and Cleaning
  2.2 Data Merging and Alignment
  2.3 Feature Extraction and Engineering
  2.4 Normalization and Data Splitting
3 Models Used
  3.1 MLP Model – Base Model
  3.2 LSTM with Attention – Improved Sequential Modeling
  3.3 GRU Model – Enhanced Temporal Dynamics
  3.4 XGBoost Gradient Boosting – Best Performing Model
4 Results
  4.1 MLP
  4.2 LSTM-Attn
  4.3 GRU (window = 50)
  4.4 GRU (window = 100)
  4.5 XGBoost
5 Future Work
  5.1 Data Augmentation
    5.1.1 Common Time Series Augmentation Techniques
    5.1.2 Generative Time Series Synthetization Techniques
    5.1.3 Evaluating Synthesized Data Quality
  5.2 Models
    5.2.1 Boosting-based Methods
    5.2.2 Traditional Statistical Models
    5.2.3 Transformer Models
A Dataset Description
  A.1 train.csv
  A.2 gas_prices.csv
  A.3 client.csv
  A.4 electricity_prices.csv
  A.5 forecast_weather.csv
  A.6 historical_weather.csv
B Weather Visualization
2 Dataset Preprocessing
Of the plethora of features described in Appendix A, we pruned most of them to save compute
and to evaluate the models with the most intuitive and descriptive features available:
• train.csv: core energy target data (datetime, target, is_consumption, data_block_id).
• client.csv: client metadata (installed_capacity, county, product_type).
• forecast_weather.csv: weather forecasts (e.g. direct_solar_radiation, surface_solar_radiation_downwards).
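As a sketch of how these files are combined, the snippet below joins client metadata onto the hourly targets. The column names follow the dataset description in Appendix A, but the toy values and the left-join choice are illustrative assumptions, not the exact pipeline:

```python
import pandas as pd

# Toy stand-ins for train.csv and client.csv (columns per Appendix A;
# the values here are made up for illustration, not real competition data).
train = pd.DataFrame({
    "datetime": pd.to_datetime(["2023-01-01 00:00", "2023-01-01 01:00"]),
    "target": [1.5, 0.0],
    "is_consumption": [1, 0],
    "data_block_id": [0, 0],
    "county": [1, 1],
    "product_type": [3, 3],
})
client = pd.DataFrame({
    "data_block_id": [0],
    "county": [1],
    "product_type": [3],
    "installed_capacity": [952.89],
})

# Attach client metadata (installed capacity) to every hourly target row
# that shares the same segment keys.
merged = train.merge(client, on=["data_block_id", "county", "product_type"],
                     how="left")
```

On the full CSVs the same kind of merge would run per data block, with weather features joined in afterwards on their own location and time keys.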
3 Models Used
In this project, four distinct models have been employed to forecast energy consumption and production.
Each model leverages a unique architecture tailored to capture the data’s temporal and feature-based
characteristics. They are described below in the order of experimentation and performance:
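To illustrate the kind of recurrence the sequential models build on, here is a minimal numpy sketch of a single GRU cell; the weight shapes, toy dimensions, and random parameters are ours for illustration, not the trained model's:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, params):
    """One GRU step: gates decide how much of the old state to keep."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(x @ Wz + h @ Uz + bz)              # update gate
    r = sigmoid(x @ Wr + h @ Ur + br)              # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh + bh)  # candidate state
    return (1.0 - z) * h + z * h_tilde             # interpolated new state

# Tiny demo: 4 input features, hidden size 3, random weights.
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
params = (rng.normal(size=(d_in, d_h)), rng.normal(size=(d_h, d_h)), np.zeros(d_h),
          rng.normal(size=(d_in, d_h)), rng.normal(size=(d_h, d_h)), np.zeros(d_h),
          rng.normal(size=(d_in, d_h)), rng.normal(size=(d_h, d_h)), np.zeros(d_h))
h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):  # run 5 time steps
    h = gru_cell(x, h, params)
```

Because the new state is a convex combination of the previous state and a tanh candidate, the hidden activations stay bounded in (-1, 1), which is part of what makes GRUs stable over long windows.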
4 Results
In this section we report each model's forecast performance and show its prediction vs. ground-truth
curves together with training/validation loss plots.
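Alongside the plots, forecast quality can be summarized with a scalar error such as mean absolute error, the competition's evaluation metric; a minimal sketch with toy values:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between forecasts and ground truth."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))

# Toy prediction-vs-ground-truth comparison (values are illustrative).
truth = [10.0, 12.0, 9.0, 0.0]
preds = [11.0, 12.0, 7.0, 1.0]
print(mae(truth, preds))  # 1.0
```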
4.1 MLP
[Figure: (a) Prediction vs. Ground Truth; (b) Training and Validation Loss]
4.2 LSTM-Attn
[Figure: (a) Prediction vs. Ground Truth; (b) Training and Validation Loss]
4.3 GRU (window = 50)
[Figure: (a) Prediction vs. Ground Truth; (b) Training and Validation Loss]
4.4 GRU (window = 100)
[Figure: (a) Prediction vs. Ground Truth; (b) Training and Validation Loss]
4.5 XGBoost
5 Future Work
5.1 Data Augmentation
We want to employ several augmentation techniques to make our models more robust and prevent
overfitting. Time series are particularly hard to augment because their temporal structure is
inherently tied to their behavior. We therefore split the augmentation techniques into two avenues:
common transformation-based techniques and generative synthetization techniques.
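For the transformation-based avenue, a minimal numpy sketch of three common augmentations (jittering, scaling, and window slicing) follows; the noise levels and window ratio are illustrative defaults, not tuned values:

```python
import numpy as np

rng = np.random.default_rng(42)

def jitter(series, sigma=0.03):
    """Add small Gaussian noise to each time step."""
    return series + rng.normal(0.0, sigma, size=series.shape)

def scale(series, sigma=0.1):
    """Multiply the whole series by one random factor."""
    return series * rng.normal(1.0, sigma)

def window_slice(series, ratio=0.9):
    """Crop a random contiguous window covering `ratio` of the series."""
    n = len(series)
    w = int(n * ratio)
    start = rng.integers(0, n - w + 1)
    return series[start:start + w]

# Apply each augmentation to a toy periodic signal.
signal = np.sin(np.linspace(0, 4 * np.pi, 200))
augmented = [jitter(signal), scale(signal), window_slice(signal)]
```

Jittering and scaling preserve the series length, while window slicing shortens it, so sliced samples would need re-windowing before being fed to the fixed-length models.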
5.2 Models
5.2.1 Boosting-based methods
LightGBM [6] is known to be a faster and less resource-intensive variant of gradient boosting than
XGBoost; being able to run experiments more quickly at comparable accuracy would be very useful.
CatBoost [13] uses oblivious decision trees (the same split criterion is applied across an entire
tree level) and sophisticated built-in handling of categorical features, which should leverage
features such as county or is_business and capture complex interactions.
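As a reminder of the core mechanism these libraries accelerate, here is a minimal numpy sketch of gradient boosting for squared loss using one-feature decision stumps fit to the residuals; it is a didactic toy under our own assumptions, not a stand-in for either library:

```python
import numpy as np

def fit_stump(X, residual):
    """Find the one-feature threshold split minimizing squared error."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            pred = np.where(left, residual[left].mean(), residual[~left].mean())
            err = ((residual - pred) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, residual[left].mean(), residual[~left].mean())
    return best[1:]

def boost(X, y, n_rounds=20, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the residual."""
    pred = np.full(len(y), y.mean())
    stumps = []
    for _ in range(n_rounds):
        j, t, lv, rv = fit_stump(X, y - pred)
        pred += lr * np.where(X[:, j] <= t, lv, rv)
        stumps.append((j, t, lv, rv))
    return pred, stumps

# Toy regression problem: the boosted ensemble should beat the mean predictor.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]
pred, _ = boost(X, y)
```

LightGBM and XGBoost replace the exhaustive split search with histogram-based and sparsity-aware tricks, and grow full trees rather than stumps, but the fit-residuals-then-shrink loop is the same.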
References
[1] Mikolaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande,
Luis C. Cobo, and Karen Simonyan. High fidelity speech synthesis with adversarial networks. arXiv
preprint arXiv:1909.11646, 2019.
[2] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the
22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages
785–794. ACM, 2016.
[3] Enefit. Enefit - Predict Energy Behavior of Prosumers. Kaggle Competition, 2023. Accessed:
2025-04-17.
[4] F. Haddad. How to evaluate the quality of the synthetic data – measuring from the perspective of
fidelity, utility, and privacy. AWS Machine Learning Blog, 2022. Accessed: 2025-04-17.
[5] hyd. 1st place solution. Kaggle Discussion Forum, Enefit - Predict Energy Behavior of Prosumers
Competition, 2024. Accessed: 2025-04-17.
[6] Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan
Liu. LightGBM: A highly efficient gradient boosting decision tree. In Isabelle Guyon, Ulrike von
Luxburg, Sébastien Bengio, Hanna M. Wallach, Rob Fergus, Sanjay Vishwanathan, and Roman
Garnett, editors, Advances in Neural Information Processing Systems 30, pages 3146–3154. Curran
Associates, Inc., 2017.
[7] Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, and Ole Winther. Autoen-
coding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
[8] Mad Devs. Basic Data Augmentation Method Applied to Time Series. Mad Devs Blog, Jan 2024.
Published: 2024-01-15. Accessed: 2025-04-17.
[9] Matt Motoki. 6th place solution. Kaggle Discussion Forum, Enefit - Predict Energy Behavior of
Prosumers Competition, 2024. Accessed: 2025-04-17.
[10] Jakob Khalil Musone and Thomas Deckers. Enefit - Predict Energy Behavior of Prosumers.
https://github.com/Musone/Predict-Energy-Behavior-of-Prosumers, 2025. GitHub reposi-
tory. Accessed: 2025-04-17.
[11] Neptune.ai. ARIMA vs Prophet vs LSTM for Time Series Prediction. Neptune.ai Blog, Jan 2025.
Published: 2025-01-24. Accessed: 2025-04-17.
[12] Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. A time series is worth
64 words: Long-term forecasting with transformers. In International Conference on Learning Rep-
resentations, 2023. arXiv preprint arXiv:2211.14730.
[13] Liudmila Ostroumova Prokhorenkova, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush,
and Andrey Gulin. CatBoost: unbiased boosting with categorical features. In Advances in Neural
Information Processing Systems, pages 6639–6649, 2018.
[14] Jinsung Yoon, Daniel Jarrett, and Mihaela van der Schaar. Time-series generative adversarial
networks. In Advances in Neural Information Processing Systems (NeurIPS), volume 32, 2019.
A Dataset Description
This appendix details the structure and parameters of the dataset files provided for the competition. All
datasets use EET/EEST time (UTC+2/UTC+3), with datetime columns typically indicating the start
of a 1-hour interval, except for some instantaneous weather measurements noted below.
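A small sketch of that time convention, converting local Estonian timestamps to UTC (assuming the Europe/Tallinn zone; the example dates are arbitrary):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Interpret dataset timestamps as local Estonian time, then express them
# in UTC. Europe/Tallinn is EET (UTC+2) in winter and EEST (UTC+3) in summer.
local_summer = datetime(2023, 6, 1, 12, 0, tzinfo=ZoneInfo("Europe/Tallinn"))
local_winter = datetime(2023, 1, 1, 12, 0, tzinfo=ZoneInfo("Europe/Tallinn"))

utc_summer = local_summer.astimezone(ZoneInfo("UTC"))
utc_winter = local_winter.astimezone(ZoneInfo("UTC"))
print(utc_summer.hour, utc_winter.hour)  # 9 10
```

The DST offset change matters when joining weather files keyed in UTC against target files keyed in local time.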
A.1 train.csv
Summary: Core training data with hourly energy targets per segment.
A.3 client.csv
Summary: Client segment info, including installed solar capacity.
[Table 4: Dataset Parameters: client.csv (continued)]
A.6 historical weather.csv
Summary: Observed historical weather data. Units/definitions may differ from forecasts.
B Weather Visualization