Rainfall Prediction Using ML
Devi prasad ponnapula
99220042100
Computer Science and Engineering
Kalasalingam Academy of Research and Education, Krishnankoil
99220042100@klu.ac.in

Teja nagendra prasad chakali
99220041148
Computer Science and Engineering
Kalasalingam Academy of Research and Education, Krishnankoil
99220041148@klu.ac.in

Ram mohan reddy emani
99220041066
Computer Science and Engineering
Kalasalingam Academy of Research and Education, Krishnankoil
99220041066@klu.ac.in

Abstract: A Random Forest Classifier is used to forecast rainfall from historical weather data. The continuous rainfall data are converted into a binary rain/no-rain classification in order to identify the weather conditions that precede probable rain. To improve the accuracy and generalization of the model, the preprocessed dataset is divided into training and testing sets. The features that affect rainfall are identified, and their importance is visualized so that their contributions can be understood. Along with the accuracy score, a classification report and a confusion matrix provide performance metrics that show how well the model differentiates rainy from non-rainy days.
An alerting system built on the Twilio API sends the rainfall predictions by SMS and can be included in a wider weather alert system. If the model predicts rain, it sends the user a text message with the forecast and asks them to take precautionary measures in time. Because it provides real-time rain forecasts on which decisions can be based, the model may be applied to flood monitoring, agricultural planning, and disaster control.
Future development may add further data sources and integrate IoT sensors so that data is gathered automatically.

Keywords: Rainfall prediction, Random Forest Classifier, weather forecasting, feature importance, Twilio API, SMS alert system, flood monitoring, binary classification, weather data analysis, real-time notifications

I. INTRODUCTION

A related study proposes FCMM-RPS, a methodological framework based on Fuzzy Cognitive Maps (FCM) that integrates a modified butterfly optimization algorithm (MBOA) to model the FCM and applies it as a rainfall prediction system. The methodology preprocesses the weather data and applies a predictive model based on FCM, whose parameters are then fine-tuned with MBOA to improve forecast accuracy. On a dataset of 25,919 samples, the system achieves a maximum prediction accuracy of 94.22% and is reported to surpass the best known existing techniques. The system has real scope for real-time rainfall prediction in smart city environments. [1]

"A Rain Prediction Using Machine Learning Techniques" is a research paper on the application of machine learning models to rainfall prediction: Multiple Linear Regression (MLR), Support Vector Regression (SVR), and Lasso Regression. The authors emphasize that rainfall is nonlinear and that accurate forecasts matter for agriculture in the long run and for preventing disasters. The dataset covers 1901 to 2015, and PCA is used to reduce the number of features. Experimental results show that SVR handles nonlinearities better than MLR and yields the most accurate predictions. [2]

In "Machine Learning Techniques for Rainfall Prediction: A Review," Parmar, Mistree, and Sompura survey machine learning techniques applied to rainfall prediction. The paper highlights the nonlinear nature of rainfall data, which poses challenges for forecasting. It reviews statistical and machine learning models such as ARIMA, ANN, and SVM, and remarks that ANN is especially powerful in rainfall forecasting because it can handle nonlinear patterns. The paper also summarizes studies in tabular form according to accuracy and prediction attributes, and concludes that machine learning techniques, particularly ANN, are promising. [3]
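As a rough illustration of the rain/no-rain framing and the evaluation metrics named in the abstract, the following Python sketch converts rainfall amounts into binary labels and derives a confusion matrix with accuracy, precision, and recall. The 0.0 mm threshold, the sample values, and all function names are assumptions made for this example; the paper does not state its threshold, and the numbers below are not its results.

```python
from collections import Counter


def to_rain_label(rainfall_mm, threshold_mm=0.0):
    """Map a continuous rainfall amount to 1 (rain) or 0 (no rain).

    The threshold is an assumed value, not one taken from the paper.
    """
    return 1 if rainfall_mm > threshold_mm else 0


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    counts = Counter(zip(y_true, y_pred))
    return {
        "tp": counts[(1, 1)],  # rain predicted, rain observed
        "tn": counts[(0, 0)],  # no rain predicted, no rain observed
        "fp": counts[(0, 1)],  # rain predicted, no rain observed
        "fn": counts[(1, 0)],  # no rain predicted, rain observed
    }


def summarize(cm):
    """Derive accuracy, precision, and recall from the confusion counts."""
    total = sum(cm.values())
    predicted_rain = cm["tp"] + cm["fp"]
    actual_rain = cm["tp"] + cm["fn"]
    return {
        "accuracy": (cm["tp"] + cm["tn"]) / total if total else 0.0,
        "precision": cm["tp"] / predicted_rain if predicted_rain else 0.0,
        "recall": cm["tp"] / actual_rain if actual_rain else 0.0,
    }


# Made-up daily rainfall amounts (mm) and made-up model predictions.
rainfall_mm = [0.0, 2.4, 0.0, 11.8, 0.0, 0.6]
y_true = [to_rain_label(mm) for mm in rainfall_mm]
y_pred = [0, 1, 1, 1, 0, 0]

cm = confusion_counts(y_true, y_pred)
metrics = summarize(cm)
print(cm)
print(metrics)
```

A false positive here corresponds to an unnecessary alert and a false negative to a missed one, which is why precision and recall are worth reading alongside plain accuracy.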
In "Automatic System for Rainfall Monitoring and Prediction with IoT and Machine Learning," Edwin Salcedo Aliaga presents a low-cost IoT-based system for monitoring rainfall and related climate data in Bolivia. Arduino-style weather stations with sensors for rainfall, temperature, humidity, and solar radiation send their data over GSM/GPRS to a central web application. To predict weather trends, with possible uses in agriculture and disaster preparation, the system also incorporates machine learning, specifically ARIMA. The paper discusses the system's dependency on GSM infrastructure and the plans to upgrade the existing system. [4]

The IoT-based lava flood early warning system consists of two main subsystems: flood detection and disaster communication. The flood detection subsystem employs a rainfall intensity gauge, modified to measure cold lava flows, while a vibration sensor detects the presence of cold lava. The collected data are analyzed with a fuzzy decision tree to predict volcanic floods. The result is then sent to the parties concerned over several channels, including SMS, WhatsApp messaging, and radio communication. The system was run in a practical setting on Mount Merapi. The trial suggests that it can issue predictions and alerts on time, which improves disaster preparedness and can save lives by triggering evacuation as early as possible. [5]

Another work describes a decision support system (DSS) for agriculture that uses SMS to update farmers with weather forecasts and agricultural information. This web-based system is aimed at farmers in rural areas, supplying them with timely and relevant weather forecasts and agricultural information for better planning and decision making and, in turn, higher productivity. Farmers can set SMS alerts to be notified of weather changes, which gives them the climate information they need to boost agricultural output. [6]

ALGORITHMS USED:

The system relies mainly on the Random Forest Classifier to predict rainfall. Random Forest is an ensemble learning algorithm that combines multiple decision trees to improve prediction accuracy and robustness. The individual trees in the forest make different predictions based on different subsets of the data and features, and the final prediction is obtained by aggregating the results of these trees, in classification tasks usually by majority vote. This reduces the likelihood of overfitting, especially on complex datasets. Random Forest is also a good choice for this problem because it can handle large datasets and estimate feature importance. Feature importance helps determine which conditions most significantly affect the probability of rainfall: temperature, humidity, or atmospheric pressure. In this project, the classifier is trained on weather data split into training and test sets, and the model is then assessed with metrics such as the accuracy score, the classification report, and the confusion matrix.
Usability is improved by integrating the model with the Twilio API to provide real-time SMS alerts based on its predictions. This turns the model's output from purely analytical into actionable form, since notifications about forecasted rainfall can be sent out. The system therefore becomes a useful decision-making aid in agriculture, flood monitoring, and public safety applications, where timely information may be crucial for planning and preventive action.

II. LITERATURE SURVEY

In "Optimizing Convolutional Neural Networks for Document Image Classification," the authors compare models optimized for document images with models optimized for natural images. In an experimental comparison of network architectures, input preprocessing, and data augmentation, they show how modifications such as shear transformations, larger input sizes, and a tailored architecture improve classification accuracy. The resulting CNN achieves state-of-the-art accuracy of 91.03% on the RVL-CDIP dataset. The analysis shows that CNNs trained on document images learn spatially specific features for layout elements and thereby improve performance on document-based tasks. [7]

Another paper describes an IoT-based early warning system (EWS) that detects lava floods by monitoring rainfall intensity and seismic activity around active volcanoes. The EWS consists of rain gauges and vibration sensors with IoT connectivity, and the data are analyzed using Kalman filters and a fuzzy decision tree. Field tests on Mount Merapi, Indonesia, showed that the EWS can detect both rainfall intensity and ground vibrations and can warn local disaster response teams via SMS, WhatsApp, and radio. Messages can thus be delivered quickly enough to reduce the impact of such disasters in volcanic areas. [8]

A new model for forecasting extreme convective precipitation is also presented. Its ResNet-Attention-BiLSTM architecture addresses most of the limitations identified in previous techniques: residual connections preserve the important properties of the radar observations, attention mechanisms mark the important parts of the radar data, and a BiLSTM captures the related information across time sequences. In tests against previous methods, the model achieved significant improvements in mean absolute error as well as stable predictions at all altitudes.
These additions improve accuracy in cases of severe weather. [9]

The paper "Rainfall Prediction System" presents a machine learning approach to predicting rainfall for three dry regions in Sri Lanka: Anuradhapura, Vavuniya, and Maha Illuppallama. Historical rainfall data for 2021-2023 were used, and multiple models were applied, including SARIMA, Decision Trees, Random Forest, GBR, XGBoost, and LSTM, with GBR coming out well above the others in accuracy and reliability. The GBR model performed best for Vavuniya, with an MSE of 7.45, an MAE of 1.07, and an R^2 of 0.93. Scaled up with MLOps, the system provides practical advantages in agricultural planning and water management for the target areas. [10]

A metaheuristics-based rainfall prediction system using the FCMM-RPS technique introduces an advanced FCM-based approach in which the weather data are preprocessed, an FCM serves as the predictive model, and the FCM parameters are tuned by MBOA to maximize prediction performance. The model was tested on 25,919 samples and reached a maximum prediction accuracy of 94.22%, outperforming the existing techniques it was compared with. It is a promising system for real-time rainfall forecasting in smart city environments. [11]

III. EXISTING SYSTEM

1. Statistical Models
Overview: These models, such as ARIMA and STL, use historical data trends and seasonality to forecast rainfall.
Architecture: They are mathematical formulas that capture patterns in time-series data.
Advantages: Easy to use, interpretable, and very efficient for simple linear trends.
Disadvantages: Less effective for complex or nonlinear relationships, and sensitive to missing values.
2. Machine Learning Models
Overview: These models include Random Forest, Support Vector Machines (SVM), and Gradient Boosting Decision Trees (GBDT), which rely on a set of weather features to classify or forecast the probability of rainfall.
Architecture: The underlying patterns across multiple variables are recognized through feature engineering, and the prediction is optimized using cross-validation and hyperparameter tuning.
Advantages: High accuracy, interpretable feature importance, and flexibility in handling nonlinear data.
Disadvantages: They require careful tuning, are sensitive to noise, and may be costly to compute.
3. Deep Learning Models
Overview: The models in this class include Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN), which capture complex relationships and temporal dependencies in the weather data.
Architecture: The RNN learns time-dependent patterns, whereas the CNN extracts spatial features from satellite or radar images. A variant of the RNN known as Long Short-Term Memory (LSTM) is highly effective on sequential data.
Advantages: High precision on complex datasets, and flexibility across different data sources.
Disadvantages: They require large datasets and high computational resources, and can be harder to interpret.

IV. PROPOSED SYSTEM

The proposed system combines accuracy with usability, turning machine learning rainfall prediction into a real-time alert service. Its model classifies each day as rainy or not rainy by processing historical weather data together with current data. This lets the system interpret weather conditions in a richer way, treating temperature, humidity, and atmospheric pressure as precursors to rainfall. The feature importance analysis enabled by the Random Forest algorithm gives insight into the most influential weather factors, which in turn leads to more accurate and focused predictions.
The model trains quickly, runs quickly on new data, and therefore gives timely predictions. A distinctive feature of the system is its use of the Twilio API, which provides real-time SMS notification. When rain is predicted, an SMS alert is automatically sent to the registered user, so that an immediate response and preparation are possible. This is most useful to agricultural users and residents of flood-prone areas, for whom faster access to rainfall predictions can help prevent crop damage, manage resources, and support early warning systems.
Compared with currently existing systems, the proposed model aims to offer higher prediction accuracy together with an actionable alert mechanism, giving communities and industries timely information about weather conditions. Subsequent iterations could use data from IoT sensors to further increase predictive accuracy through fully automated, real-time data acquisition, for even timelier rainfall predictions and alerts.

Fig 1.1 Machine Translation Flow Chart
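The alerting step described for the proposed system can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the message wording, the function names, the placeholder phone number, and the environment variable names are all assumptions of this sketch. The only Twilio calls used are `Client(...)` and `client.messages.create(...)` from the Twilio Python helper library, imported lazily so that the dry run at the bottom works without the library or any credentials.

```python
import os


def build_alert_message(location="your area"):
    """Compose the SMS text; the wording and default location are illustrative."""
    return (
        f"Rain alert for {location}: rain is predicted. "
        "Please take precautionary measures in time."
    )


def send_sms_via_twilio(body, to_number):
    """Send one SMS through Twilio and return the message SID.

    The environment variable names below are assumptions of this sketch.
    """
    from twilio.rest import Client  # lazy import: only needed for real sends

    client = Client(
        os.environ["TWILIO_ACCOUNT_SID"],
        os.environ["TWILIO_AUTH_TOKEN"],
    )
    message = client.messages.create(
        body=body,
        from_=os.environ["TWILIO_FROM_NUMBER"],
        to=to_number,
    )
    return message.sid


def alert_if_rain(predicted_label, to_number, send=send_sms_via_twilio):
    """Send an alert only when the classifier output is 1 (rain).

    Returns True if an alert was sent, False otherwise.
    """
    if predicted_label != 1:
        return False
    send(build_alert_message(), to_number)
    return True


# Dry run with a stand-in sender, so no SMS is actually sent.
sent = []


def fake_send(body, to_number):
    sent.append((to_number, body))


rainy_day_alerted = alert_if_rain(1, "+10000000000", send=fake_send)
dry_day_alerted = alert_if_rain(0, "+10000000000", send=fake_send)
print(rainy_day_alerted, dry_day_alerted, sent)
```

Passing the sender in as a parameter keeps the prediction logic testable without network access; in a deployment the default `send_sms_via_twilio` would be used with real credentials and a registered recipient number.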
V. METHODOLOGY

With accuracy and usability both built into the system, rainfall prediction becomes a real-time, machine learning based alert process. The model classifies each day as rainy or not rainy by processing historical weather data together with current data. This lets the system interpret weather conditions in a richer way, treating temperature, humidity, and atmospheric pressure as precursors to rainfall. The feature importance analysis enabled by the Random Forest algorithm gives insight into the most influential weather factors, which in turn leads to more accurate and focused predictions.
The model trains quickly, runs quickly on new data, and therefore gives timely predictions. It uses the Twilio API, which provides real-time SMS notification. When rain is predicted, the system automatically sends an SMS notice to the registered user, making an immediate response and preparation possible. This is most useful to agricultural users and residents of flood-prone areas, for whom faster access to rainfall predictions can help prevent crop damage, support resource management, and aid early warning systems.
Compared with currently existing systems, the model proposed in this paper aims to provide higher prediction accuracy together with an actionable alert mechanism, giving communities and industries timely information about weather conditions. Subsequent iterations could use data from IoT sensors to further increase predictive accuracy through fully automated, real-time data acquisition, for even timelier rainfall predictions and alerts.

VI. RESULT

The results show that predictions based on historical weather data are useful for distinguishing rainy from non-rainy days. The Random Forest Classifier generalized reasonably well, with sufficient accuracy on the test set of unseen data. The other performance metrics, the classification report and the confusion matrix, also show well-balanced precision and recall, which limits problems with false positives and false negatives. Reliable rain prediction is especially worthwhile in places where faulty alerts could waste resources or close off opportunities for preemptive action.
Beyond accuracy, the system has practical value through real-time alerting via the Twilio API. Alerts are sent to users as text messages as soon as rain is forecast, which gives them enough time to react to changing weather. The notification capability, combined with machine learning, illustrates the model's ability to fuse predictive analytics with real-time delivery, so that information arrives on time and in an easily usable form. Future enhancements that integrate real-time external weather sources should improve accuracy and let the system adapt to the situation, making it a valuable asset for personalized and actionable weather forecasting.

VII. CONCLUSION

The proposed rainfall prediction system is a sound approach to predicting impending rainfall and alerting users to it. It uses a Random Forest Classifier to learn, with high accuracy, the complex relationships between the main weather variables. Careful preprocessing and selection of relevant features focus the model on the factors that most influence rainfall, which supports reliability and robustness across various weather scenarios.
Perhaps the most visible feature is the use of the Twilio API to send real-time SMS updates. This bridges the gap between predictive data and practical application, ensuring that rain predictions are more than an exercise in gaining insight. With timely alerts, negative effects can be minimized, because people in agriculture, disaster management, and many other weather-dependent sectors can be notified in advance.
This project shows the present and future opportunities for machine learning and cloud-based communication services to enhance traditional means of weather prediction. Although the current version is accurate enough, future editions can include more sources of data, such as IoT-enabled weather sensors, to make the system more accurate and responsive. The enhanced rainfall prediction system would give communities and industries an easily accessible tool for real-time weather intelligence, supporting proactive planning and resilience.

VIII. REFERENCES

[1] M. Mohammed, R. Kolapalli, N. Golla, and S. S. Maturi, "Prediction Of Rainfall Using Machine Learning Techniques," International Journal of Scientific & Technology Research, vol. 9, no. 1, pp. 3236–3240, Jan. 2020.

[2] M. Mohammed, R. Kolapalli, N. Golla, and S. S. Maturi, "Prediction Of Rainfall Using Machine Learning Techniques," International Journal of Scientific & Technology Research, vol. 9, no. 1, pp. 3236–3240, Jan. 2020.

[3] A. Parmar, K. Mistree, and M. Sompura, "Machine Learning Techniques for Rainfall Prediction: A Review," 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), 2017, pp. 1-8.