
Scott Underwood

01/27/2023

MS&E 226 Project Part I

This dataset contains weather data and solar irradiance data along with power generation for a
solar photovoltaic array. I have set aside a randomly selected test dataset consisting of 20% of the data,
and a training dataset consisting of the remaining 80% of the data. My main concern with the data is
that the source of the data provides little to no context on where the data points come from. It is
unclear if they are all sampled from the same location or different locations, or what the time scale is
between the samples (i.e. are they sampled each day at the same time, different time, etc.). All of the
data provided seems realistic, but the lack of context is reason for concern. Perhaps through analysis we
will be able to glean insight into some of these questions, such as whether the data comes from one
location or from several.
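
A minimal sketch of the 80/20 random split described above, using only the Python standard library. The function name, the seed, and the toy rows are illustrative assumptions, not the code actually used for this project.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=226):
    """Shuffle row indices once and hold out a fixed fraction as the test set."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) keeps the split reproducible
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

# Toy stand-in for the real table: 10 hypothetical rows.
rows = [{"row_id": i} for i in range(10)]
train, test = train_test_split(rows)
print(len(train), len(test))  # 8 2
```

The test rows are set aside once and not touched again until the final evaluation, so the seed should stay fixed across analysis sessions.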

With the increasing penetration of renewable energy on the electric grid, there is a growing
need for predicting the power generation of these sources, which are unpredictable and subject to the
weather conditions. This dataset will hopefully be used to glean insight into how different weather and
solar irradiance conditions impact solar power generation. This information can then be used to predict
solar power generation in the future using weather forecasts, which will be critical to proper power
planning and electric grid operation. One potential issue with using a model derived from a dataset for
prediction is that there are scientific equations relating to solar irradiance that can be used to calculate
power directly, something I’ve worked on doing at the Pacific Northwest National Laboratory (PNNL).
While it’s important to explore both methods of predicting solar power generation, my worry is that
the solar power generation equations will predict power generation more accurately than predictive
models based on historical data.

One covariate that I wish had been collected in this dataset is the location of all these measurements,
whether it be a categorical variable such as the city or state, or a continuous variable like the
coordinates. It’s important to know whether these data points come from the same location. The time of
day and date of measurement would also be nice to have, as different times of day and times of year
correspond to different levels of solar irradiance. While the effects of these other variables are likely
(hopefully) captured in the covariates that are present, I would have collected those fields as well to
provide a more informative dataset.

The main response variable of interest is the power generation, ‘generated_power_kw’, which is
a continuous variable. Being able to predict the power generation given the other covariates is
important in grid planning, as mentioned above, which makes it the logical choice to model as a
continuous response variable. There isn’t an obvious choice for a binary response variable in this
dataset, but I am going to use ‘precipitation’ as the binary response variable, with a value of 0
corresponding to 0.0 precipitation, and a value of 1 corresponding to > 0.0 precipitation from the
‘total_precipitation_sfc’ column. The covariates, mostly containing solar irradiance or weather data,
should be able to accurately predict whether or not it is raining.
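
The binary response defined above can be sketched as follows. The column name ‘total_precipitation_sfc’ comes from the dataset; the function name and the sample values are hypothetical.

```python
def precipitation_indicator(total_precipitation_sfc):
    """Return 1 if any precipitation was recorded, otherwise 0."""
    return 1 if total_precipitation_sfc > 0.0 else 0

# Hypothetical values of the 'total_precipitation_sfc' column.
samples = [0.0, 0.1, 0.0, 2.3]
precipitation = [precipitation_indicator(v) for v in samples]
print(precipitation)  # [0, 1, 0, 1]
```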

Based on my prior experience in the field, there are a few variables present that I wouldn’t expect
to impact solar power generation. Namely, pressure and wind speed theoretically should not impact
solar power generation, as the solar power generation equations rely solely on solar irradiance and
temperature data. I’m not going to exclude these variables, however, as it’s possible that they are
correlated with power generation (although this would likely be due to interactions with other covariates).
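
To illustrate what an equation that relies only on irradiance and temperature can look like, here is a sketch of a common simplified photovoltaic model: output scales linearly with irradiance relative to 1000 W/m^2 and is derated linearly as cell temperature rises above 25 degrees C. This is an assumed textbook-style form, not necessarily the equations I have in mind above, and the rated power and temperature coefficient are placeholder values.

```python
def pv_power_kw(irradiance_w_m2, cell_temp_c, rated_kw=1.0, temp_coeff_per_c=-0.004):
    """Simplified PV output: linear in irradiance, linearly derated with cell temperature.

    rated_kw and temp_coeff_per_c are placeholder assumptions; real values
    depend on the specific array and module type.
    """
    irradiance_ratio = irradiance_w_m2 / 1000.0  # relative to the 1000 W/m^2 reference
    temp_factor = 1.0 + temp_coeff_per_c * (cell_temp_c - 25.0)  # relative to 25 C reference
    return rated_kw * irradiance_ratio * temp_factor

print(pv_power_kw(1000.0, 25.0))  # 1.0 (rated output at reference conditions)
print(pv_power_kw(500.0, 25.0))   # 0.5 (half the irradiance, half the power)
```

Pressure and wind speed do not appear anywhere in this form, which is why I would not expect them to matter directly.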

I would be interested in potentially adding interactions between temperature and radiation, as
well as between radiation and each of angle of incidence, zenith, and azimuth. These are the main
factors that I expect to impact predicted solar power generation, and I’d expect that their
interactions would have a significant impact. For example, I’d expect angle of incidence, zenith,
and azimuth to have a larger impact on power generation if radiation is also high. There are no null
or NA values in my dataset, so I plan to use every entry.
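
A sketch of how such interaction terms could be built as products of two covariates. Only ‘shortwave_radiation_backwards_sfc’ is a column name confirmed in this report; ‘temperature’ and ‘zenith’ are hypothetical stand-ins for the actual column names.

```python
def add_interactions(row, pairs):
    """Return a copy of the row with one product column per (a, b) pair."""
    out = dict(row)
    for a, b in pairs:
        out[f"{a}_x_{b}"] = row[a] * row[b]
    return out

RADIATION = "shortwave_radiation_backwards_sfc"

# Hypothetical row; 'temperature' and 'zenith' are assumed column names.
row = {"temperature": 20.0, "zenith": 30.0, RADIATION: 500.0}
pairs = [("temperature", RADIATION), ("zenith", RADIATION)]

expanded = add_interactions(row, pairs)
print(expanded["temperature_x_" + RADIATION])  # 10000.0
print(expanded["zenith_x_" + RADIATION])       # 15000.0
```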

Plotting the covariates against the continuous response variable ‘generated_power_kw’ in
Figure 1, I found that angle of incidence, zenith, precipitation, and snowfall have a strong negative
correlation with the power generation (although the relationship doesn’t appear to be linear in the
case of angle of incidence and zenith). The field ‘shortwave_radiation_backwards_sfc’ appears to have
a slight positive correlation, although not a strong one. Lastly, azimuth appears to have a parabolic
relationship, with the highest generated power corresponding to an azimuth angle of about 180 degrees.
My main surprise is that cloud cover doesn’t seem to have a stronger correlation with power generation.
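
One caveat when turning these visual impressions into numbers: a linear (Pearson) correlation coefficient can be close to zero for a covariate whose relationship with the response peaks in the middle of its range, as azimuth's does around 180 degrees. The sketch below uses synthetic data, not the project data, to show this; measuring distance from 180 degrees instead recovers a strong association.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic example: power peaks at an azimuth of 180 degrees.
azimuth = list(range(90, 271, 10))
power = [-(a - 180) ** 2 for a in azimuth]

r_linear = pearson(azimuth, power)                          # essentially zero
r_folded = pearson([abs(a - 180) for a in azimuth], power)  # strongly negative
print(round(r_linear, 3), r_folded < -0.9)
```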

Figure 1: Plot of generated_power_kw vs. covariates

Plotting the covariates against the binary response variable ‘precipitation’ in Figure 2 (with jitter
added to show the density of the points, since the precipitation variable is binary) yields less
obvious results, but a few correlations can still be seen. Namely, it appears that
‘mean_sea_level_pressure_MSL’ and ‘shortwave_radiation_backwards_sfc’ are negatively correlated
with ‘precipitation’ (higher values correspond to a value of zero for precipitation) and
‘relative_humidity_2_m_above_gnd’ appears to be positively correlated. These make sense, as
precipitation tends to come with higher humidity levels and lower solar intensity and pressure. I’d also
expect to see a correlation between some of the cloud cover variables and precipitation, but they don’t
show an obvious pattern in the plots displayed.
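
For reference, jitter here means adding a small random offset to the 0/1 response before plotting so that overlapping points spread out. A minimal sketch follows; the width and seed are arbitrary assumptions, not the values used to produce Figure 2.

```python
import random

def jitter(values, width=0.1, seed=0):
    """Add uniform noise in [-width, width] to each value, for plotting only."""
    rng = random.Random(seed)
    return [v + rng.uniform(-width, width) for v in values]

precip = [0, 1, 0, 0, 1]  # hypothetical binary response values
plotted = jitter(precip)

# The offsets are small enough that rounding recovers the original labels.
print([round(p) for p in plotted] == precip)  # True
```

The jittered values are only for display; any modeling should use the original 0/1 variable.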

Figure 2: Plot of precipitation vs. covariates (with jitter)

One concern with the data after looking at some of the plots is the abnormal patterns in some
of the covariate values. For example, azimuth has a couple of gaps between 150 and 200 degrees where
there are no data points, which seems unlikely given the density of the rest of the range. Additionally,
the low and medium cloud cover variables show a strong vertical line around 10%. These abnormal
behaviors may indicate some sort of bias in the data collection and will be something to look for as we
investigate the dataset further.
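
One way such gaps could be checked numerically rather than by eye is to bin the covariate and list the bins with no observations. The sketch below is an assumption about how that check might look; the bin width and the sample azimuth values are made up for illustration.

```python
def empty_bins(values, low, high, bin_width):
    """Return (start, end) for every bin in [low, high) that contains no values."""
    n_bins = int((high - low) / bin_width)
    counts = [0] * n_bins
    for v in values:
        if low <= v < high:
            index = min(int((v - low) // bin_width), n_bins - 1)
            counts[index] += 1
    return [(low + i * bin_width, low + (i + 1) * bin_width)
            for i, c in enumerate(counts) if c == 0]

# Hypothetical azimuth readings in degrees.
azimuth = [151, 153, 158, 171, 176, 183, 188, 197]
gaps = empty_bins(azimuth, low=150, high=200, bin_width=5)
print(gaps)  # [(160, 165), (165, 170), (190, 195)]
```

On the real data, the bin width would need to be small relative to the gap but wide enough that dense regions never produce empty bins by chance.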

Looking at the mean value of all the covariates for precipitation levels of 0 and 1 reveals trends
that weren’t visible in the plots in Figure 2. Namely, total cloud cover is more than twice as high for
precipitation level 1 as for precipitation level 0 (79 vs. 30). There are other slight differences between
the mean values of covariates between the two precipitation levels, but none that jump out that
weren’t mentioned in analysis of the plots. Looking at the variance between the two levels, the
shortwave radiation has roughly 2.5 times as much variance for precipitation level 0 as for
precipitation level 1 (7,881 vs. 3,194). Additionally, the generated power has about twice as much
variance for precipitation level 0 as for precipitation level 1 (89,251 vs. 43,183).
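
The per-level means and variances above are grouped summaries. A sketch of that computation using the standard library is below; the column name ‘total_cloud_cover’ and the toy values are hypothetical, and `statistics.variance` computes the sample variance, which may or may not match the convention behind the numbers quoted above.

```python
from statistics import mean, variance

def summarize_by_group(rows, group_key, value_key):
    """Mean and sample variance of value_key within each level of group_key."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[value_key])
    return {g: {"mean": mean(v), "variance": variance(v)} for g, v in groups.items()}

# Toy rows; 'total_cloud_cover' is an assumed column name.
rows = [
    {"precipitation": 0, "total_cloud_cover": 10},
    {"precipitation": 0, "total_cloud_cover": 30},
    {"precipitation": 0, "total_cloud_cover": 50},
    {"precipitation": 1, "total_cloud_cover": 70},
    {"precipitation": 1, "total_cloud_cover": 80},
    {"precipitation": 1, "total_cloud_cover": 90},
]
summary = summarize_by_group(rows, "precipitation", "total_cloud_cover")
print(summary[0])  # {'mean': 30, 'variance': 400}
print(summary[1])  # {'mean': 80, 'variance': 100}
```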

While there is concern about the lack of knowledge of how the data was collected and
where/when it is from, overall there seems to be informative data in the dataset. There also appear to
be relevant correlations, no null or NA values, and realistic-looking data. In further parts of the
project, we will look at uncovering more correlations and interactions between covariates and response
variables and use this information to inform predictions about solar power generation.
