Crop Prediction
degree of
B.Tech.
In
DECLARATION
Date:
Certificate
Signature: Signature:
ACKNOWLEDGEMENT
It gives us a great sense of pleasure to present the report of the B.Tech. project undertaken
during B.Tech. Second Year. We owe a special debt of gratitude to our project supervisor,
Subhash Singh Parihar, Department of Computer Science and Engineering, Pranveer Singh
Institute of Technology, Kanpur, for his constant support and guidance throughout the course
of our work. His sincerity, thoroughness and perseverance have been a constant source of
inspiration for us. It is only because of his cognizant efforts that our endeavours have seen
the light of day.
We also take the opportunity to acknowledge the contribution of Professor Dr. Vishal Nagar,
Dean, Computer Science & Engineering Department, Pranveer Singh Institute of Technology,
Kanpur, for his full support and assistance during the development of the project.
We would also like to acknowledge the contribution of all faculty
members of the department for their kind assistance and cooperation during the development
of our project. Last but not least, we acknowledge our friends for their contribution to the
completion of the project.
Signature: Signature:
Name: Shreya Shukla Name: Himanshi Singh
Roll No.: 2101641520133 Roll No.: 2101641520074
Signature: Signature:
Name: Ranu Pandey Name: Krati Itondia
Roll No.: 2101641520114 Roll No.: 2101641520081
Signature:
Name: Muskan Shah
Roll No.: 2101641520093
ABSTRACT
Online examination is one of the crucial parts of an online education system. It is
efficient and fast, and it reduces the amount of material resources required. An
examination system is developed based on the web. This paper describes the
principle of the system, presents the main functions of the system, analyses the
auto-generating test paper algorithm, and discusses the security of the system.
It is undeniable that online education provides ample benefits to young
learners. Nevertheless, there are also many negative implications of online
education. Limited collaborative learning and increased time and effort are
among these negative implications. This study examines the
implications of online education among students, especially in a private higher
learning institution, and its effect on the Malaysian national education system.
Information was collected through surveys and interviews, together with
secondary data, and was analysed using SPSS. The study found that there are
various serious issues regarding online education and its effect on the quality
of the Malaysian education system to a certain extent. Several problems have been
identified, and these issues have to be solved in order to sustain the quality of
education for future generations. Furthermore, the Ministry of Higher Education
(MOHE) should formulate a standard policy, closely monitor the implementation
of online education, evaluate and review the methods used in teaching, and
upgrade them to maintain the quality of online education in private higher education
institutions.
TABLE OF CONTENTS
DECLARATION
CERTIFICATE
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
LIST OF ABBREVIATIONS
1. INTRODUCTION
1.1 MOTIVATION
1.2 BACKGROUND OF PROBLEM
1.2.1 Current System
1.2.2 Issues in Current System
1.2.2.1 Functionality Issues
1.2.2.2 Security Issues
1.3 PROBLEM STATEMENT
1.4 PROPOSED WORK
1.5 ORGANIZATION OF REPORT
3. IMPLEMENTATION
3.1 Code
3.2 Figures of Project
ENHANCEMENTS
5.1 Conclusion
REFERENCES
LIST OF FIGURES
In design methodology
Architecture diagram
Flow chart
In implementation
Login page
Admin panel
Addition
Add exam
Question designing
Student details
Sign in
Examination page
Submission
Results to admin
Results to student
LIST OF SYMBOLS
% Percentage
Arrow symbol for flow of data
Result of diagram
Process
LIST OF ABBREVIATIONS
ID Identification
UI User Interface
CHAPTER 1:
INTRODUCTION
One of the most essential aspects of human survival is agriculture, which is the main source of
food. Unfortunately, most farmers in our country still use conventional farming methods, in
which investigating soil and crop data manually is a laborious process. This
problem could be solved by using modern farming methods. Because the agriculture sector
contributes substantially to the country's economic growth, it is necessary to introduce the latest
technologies such as IoT and automation in agriculture to improve crop production. IoT-based
agriculture results in effective crop health monitoring without human involvement in the
field. The Internet of Things is the network of physical objects embedded with sensors; in
agriculture these objects monitor the irrigation system and atmospheric conditions such as
temperature and humidity. IoT technology is used for
collecting information about conditions such as weather, rainfall, humidity, temperature, and soil
moisture. Wireless sensor networks are used for monitoring the farm conditions, and
microcontrollers are used to control and automate the farm processes. To view the conditions
remotely in the form of images and video, wireless cameras are used. A
smartphone allows farmers to stay updated on the current conditions of their agricultural
land using IoT at any time and from any part of the world. IoT technology can minimize cost and
enhance the productivity of traditional farming. The use of cloud services and a
graphical user interface will make crop health monitoring very easy. Farmers need not
understand how the data is processed; the GUI will make it easier to take correct decisions.
environmental parameters such as rainfall percentage, atmospheric humidity, temperature,
etc. The ATMEGA328P microcontroller and sensor nodes with a wireless transceiver module
based on the ZigBee protocol are used in designing the system. A web application and database
enable retrieving and storing the data. In this experiment, sensor node failure and
energy efficiency are monitored. An experiment was conducted on a smart agriculture greenhouse
monitoring system based on ZigBee technology. The system performs data acquisition,
processing, transmission, and reception functions. The objective of the experiment is to
understand the greenhouse environment system, where the system is efficient in managing
the environmental area, reduces the cost of farming and also saves energy. The gateway
has a Linux operating system and a Cortex-A8 processor, which acts as the core. Overall, the
design implements remote smart monitoring and control of the greenhouse, replaces
the old wired technology with wireless, and reduces manpower cost. Operations and fulfillment
are suitable places to prove efficiency gains. Researchers studied the work of a rural farming
community that replaces some of the traditional techniques. The sensor nodes have different
external sensors, namely soil moisture, soil pH, atmospheric humidity, and temperature
sensors, connected to them. Based on the soil moisture, the sensor node activates a motor for water
discharge during periods of water scarcity and switches it off after the required amount of
water is discharged. This leads to conservation of water. The soil pH is sent to the base
layer, which in turn informs the farmer about the soil pH via SMS using a GSM
module. This information helps the farmers to reduce the amount of fertilizers used. The
development of rice crop monitoring using IoT is proposed to provide a helping hand in
real-time monitoring and increasing rice production. The automated control of water discharge for
irrigation and the ultimate supply of information is implemented using a wireless sensor
network.
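The moisture-driven switching described above can be sketched in a few lines. This is only an illustration, not the controller used in the cited work: the class name PumpController and both threshold values are assumptions made for the example. Two thresholds are used instead of one so that the pump does not toggle rapidly when the reading hovers around a single cut-off.

```python
from dataclasses import dataclass


@dataclass
class PumpController:
    """Switches an irrigation pump on or off from soil-moisture readings.

    Both thresholds are illustrative assumptions, not values from the report.
    """
    dry_threshold: float = 30.0  # % moisture below which the pump starts
    wet_threshold: float = 45.0  # % moisture above which the pump stops
    pump_on: bool = False

    def update(self, moisture_percent: float) -> bool:
        """Return the pump state after taking one reading into account."""
        if moisture_percent < self.dry_threshold:
            self.pump_on = True
        elif moisture_percent > self.wet_threshold:
            self.pump_on = False
        # Between the two thresholds the previous state is kept (hysteresis).
        return self.pump_on


controller = PumpController()
readings = [50.0, 35.0, 28.0, 33.0, 44.0, 47.0, 40.0]
states = [controller.update(r) for r in readings]
print(states)
```

The pump starts only once the soil is clearly dry (28%) and keeps running until the soil is clearly wet (47%), which corresponds to "switches off after the required amount of water is discharged".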
The parameters of a crop are measured using sensors for temperature, soil condition and humidity. These
data are compared with pre-determined values, and the crop condition is accordingly reported
to the farmer remotely using GSM, thus reducing physical effort. This information about the
crops is sent to the farmer as a text message so that he or she can spend
the time saved on more productive work. Combining traditional methods with this
technology will result in agricultural modernization. The modern farmer is unable to identify
how the various environmental parameters such as humidity and temperature affect the crop.
Despite the rapid spread of mobile connectivity and mobile internet in the country, efficient
and cheap methods to exploit them to increase efficiency and productivity remain out of
reach. Thus one of the most important challenges is the lack of proper monitoring and control
mechanisms for efficient farming. This paper explains the development of a prototype of an
efficient plant growth monitoring system, which provides data about the
environmental parameters surrounding the plant that are vital to the plant's growth. The
proposed idea is a cost-effective system that receives data about the conditions
surrounding the plants from various sensors in the system.
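As a rough illustration of comparing sensor data with pre-determined values and notifying the farmer, the sketch below checks each reading against an acceptable range and composes the text that would be passed to the GSM/SMS layer. The parameter names, the ranges and the message wording are all assumptions made for the example; the actual sending of the SMS is left out.

```python
# Acceptable (low, high) range per parameter. The values are illustrative only.
EXPECTED_RANGES = {
    "temperature_c": (18.0, 32.0),
    "humidity_percent": (40.0, 80.0),
    "soil_moisture_percent": (30.0, 60.0),
}


def check_readings(readings, expected=EXPECTED_RANGES):
    """Return a list of human-readable alerts for out-of-range readings."""
    alerts = []
    for name, value in readings.items():
        if name not in expected:
            continue  # no pre-determined range for this parameter
        low, high = expected[name]
        if value < low:
            alerts.append(f"{name} is low: {value} (expected at least {low})")
        elif value > high:
            alerts.append(f"{name} is high: {value} (expected at most {high})")
    return alerts


def build_message(alerts):
    """Compose the text that would be handed to the GSM/SMS layer."""
    if not alerts:
        return "All monitored parameters are within range."
    return "Crop alert: " + "; ".join(alerts)


sample = {
    "temperature_c": 36.5,
    "humidity_percent": 55.0,
    "soil_moisture_percent": 22.0,
}
alerts = check_readings(sample)
message = build_message(alerts)
print(message)
```

Keeping the ranges in one dictionary means they can be adjusted per crop without touching the checking logic.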
MOTIVATION
Farming has been one of the founding factors of human civilization for more than
10,000 years. Back then, only minimal hand tools were used for activities such as
harvesting and sowing. Today, after thousands of years, we have advanced immensely in
also being witnessed, hence, demand for food supply has increased drastically. In such a fast
growing era it becomes even more important than before to monitor the agricultural sector,
because even a single problem in a crop season due to any threat can affect the supply.
Therefore it becomes very important to have a smart agricultural system for monitoring crop
Thus, by implementing the following methods suggested by the authors conventional
1. By developing smart applications which can capture images of crops and analyse their
health conditions.
2. Using drones, which will monitor the whole field without human intervention,
help in detecting the plants affected by pests, and sprinkle pesticides over the affected area.
3. Using wireless rovers for examining the soil conditions, and also useful for identifying
4. Using various AI sensors for keeping track of the moisture content of the soil and notifying
Plant disease detection can be achieved successfully using deep learning. Deep learning has
played a very significant role in identifying plant defects or diseases. Deep learning
techniques have proven to be a strong tool, as they have the capacity to handle immense
datasets, which improves the chances of better detection. With the help of image
processing, deep learning algorithms such as CNNs can be used to recognize
various patterns in the dataset. CNN is quite adaptable and has a less complex structure than
BACKGROUND OF PROBLEM
crop conditions in near-real-time, most of which rely on maps of anomalies of metrics from
crop growth dynamics. These methods require a seamless and comparative historical archive
of metrics and a real-time satellite data processing capacity to produce biophysical products
using dedicated algorithms. The differences are then qualitatively interpreted as crop growth
classes. There are three ways to present these metric differences: (1) an anomaly map at a
specific date, indicating spatial variations and offering comparisons across large regions; (2)
aggregated profiles of current and reference years to reflect the development of crops over the
growing season for the specific spatial extent, as derived from the VI time series, showing the
start, length, ascending and descending slope, and peak of crop greenness; and (3) spatial
clustering maps in which pixels reflecting similar crop development conditions are grouped.
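The first presentation, an anomaly map at a specific date, can be sketched as a per-pixel difference between the current value of a metric and its mean over the reference years, followed by a qualitative class. The sketch assumes NDVI as the metric and uses a made-up tolerance of 0.05 and made-up pixel values; the function names are hypothetical and operational systems use their own indicators and class limits.

```python
def anomaly_map(current, history):
    """Per-pixel difference between the current value and the multi-year mean.

    current: 2-D list of NDVI values for the date of interest.
    history: list of 2-D lists, one per reference year, same shape as current.
    """
    rows, cols = len(current), len(current[0])
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            mean = sum(year[r][c] for year in history) / len(history)
            row.append(current[r][c] - mean)
        result.append(row)
    return result


def classify(anomaly, tolerance=0.05):
    """Map a numeric anomaly to a qualitative class; tolerance is an assumed cut-off."""
    if anomaly < -tolerance:
        return "below average"
    if anomaly > tolerance:
        return "above average"
    return "average"


# Two reference years and the current date for a 2 x 2 pixel area (made-up values).
history = [
    [[0.60, 0.50], [0.40, 0.70]],
    [[0.64, 0.54], [0.44, 0.66]],
]
current = [[0.50, 0.53], [0.55, 0.68]]

classes = [[classify(a) for a in row] for row in anomaly_map(current, history)]
print(classes)
```

This also shows why a seamless historical archive is required: without the reference years there is no mean to compare against.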
losses, and drought assessments are incorporated in most CMSs as part of the crop condition
evaporation rates can propagate from a meteorological drought into an agricultural drought,
leading to a reduction in crop yield or even to complete crop failures. In this regard, many
drought indices have been developed for detecting meteorological droughts caused by climate
variabilities, such as the standardized precipitation index (SPI), the standardized precipitation
evapotranspiration index (SPEI) and the Palmer drought severity index (PDSI).
indices have been developed, such as the vegetation condition index (VCI) and mean
vegetation condition index (MVCI). Given that drought events cause soil moisture drying and
land surface temperature (LST) changes, the temperature condition index (TCI) based on
LSTs, soil moisture agricultural drought index (SMADI), evaporative stress index (ESI) and
hydrothermal weather index (HWI) have been proposed to determine agricultural drought
conditions. The LST is a timely response indicator that can reflect crop stress before
substantial visual symptoms arise. The popular VHI is the combination of VCI and TCI
diseases or pests. Nutrients have been reported as another stress factor after water stress at the
global scale. Inversion algorithms involving CCC or leaf nitrogen content (LNC) have been
developed to detect nutrient stress in wheat and other crops. Although many visible-band VIs
have been developed to relate to chlorophyll or nitrogen content, red-edge bands, special
bands located between the red and near-infrared bands, have been proven to be more
sensitive to chlorophyll content. The advantage of the red-edge band in detecting the nutrient
status or chlorophyll content has been continuously demonstrated, but it is only effective for
dense crops. Sentinel-2 satellites have three red-edge bands, making it possible to detect
Many metrics have been developed to identify types of diseases and pests, assess the
corresponding infection severities, and map their distributions at the plot or regional scale.
However, prior knowledge is needed to identify the types of local disease/pest or other
stresses that occur in the field. In areas for which prior knowledge is lacking, it is thus
difficult to proactively achieve reliable and precise assessment, as a variety of signs and
plant damage caused by crop diseases and pests can also be caused by other factors, such as
nutrition deficiencies, thus leading to challenges when attempting to separate the actual stress
factor. For instance, the photochemical reflectance index (PRI) is used not only for wheat
yellow rust detection, but has also commonly been used to detect water stress, frost stress and
damage, and nitrogen content and stress. New indicators and metrics are needed to
PROBLEM STATEMENT
Limitations in current crop monitoring methods have been identified in the previous sections.
Some limitations, including in situ data accessibility and knowledge-based analysis, might
reduce the applicability of crop monitoring and lead to uncertain and undesirable
consequences.
Existing methods usually require new ground-truth data for each new setting to parameterize
algorithms and models and assess their accuracies. The field sampling requirements prevent
most global systems from obtaining crop area estimates and yield prediction components
(Table 1), as collaboration with local institutions is required to conduct field work and access
to in situ data for training and calibrating algorithms and modelling outside national
strategy to leverage partner investments and to ensure that data are curated with a standard
protocol. GEOGLAM embraces the Global Earth Observation System of Systems (GEOSS)
Data Sharing Principles that encourage full open sharing of data, including both EO data
and in situ data, although open sharing of in situ data is still a challenge, as it sometimes
involves privacy issues. Nevertheless, the GEOGLAM Joint Experiment for Crop
Assessment and Monitoring (JECAM) initiative demonstrates a best practice method of data
scientific data papers. However, the ground-truth dataset is not yet fully publicly available
due to restrictions imposed by the in situ providers. Moreover, in the foreseeable future, it is
unrealistic to expect the full sharing of ground-truth data with increasing trade tensions and
strained global cooperation. Therefore, the requirement for ground-truth data in crop
monitoring should encourage crop monitoring activities at the domestic and local levels.
As in situ data collection is one of the major challenges for crop monitoring, closing the
ground-truth data gaps and improving the data collection efficiency are essential for
strengthening the reliability of crop monitoring. However, the acquisition of field data,
especially at large scales, is time-consuming, costly and labour-intensive. To address this
issue, crowdsourcing might provide an alternative and efficient solution for acquiring
field-based data. Crowdsourcing information has become a widespread data acquisition method in
environmental and resource monitoring, serving as a potential solution for closing the
ground-truth data gaps. With the wide use of mobile phones, smartphone sensors, such as
cameras, satellite positioning, and photoreceptors, have become major platforms for
crowdsourcing information collection. A mobile global positioning system (GPS)-video-
geographic information systems (GIS) application (called a GVG app) can collect such data
as crop types, planting dates, irrigation and expected yields with corresponding geolocation
information. Convolutional neural networks (CNNs) have been used to automatically identify
crop types from Google Street photos or GVG photos.
Data collection of the actual crop yield is not only labour-intensive and costly, but also
difficult to implement efficiently. It relies on the grain harvest of samples in the field
with uncertainties in both sampling and unavoidable grain losses during harvest. A new
method for field yield data measurement involving AI and computer vision to count the
numbers of spikes, seed numbers per spike and the sizes of seeds for weight determination
(Fig. 4) is urgently needed for integration into GVG.
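Once spike counts, seeds per spike and seed weight are available, combining them into a yield estimate is simple arithmetic. The sketch below uses the common yield-component relation (spikes per unit area × grains per spike × weight per grain). It assumes that seed size has already been converted to a thousand-grain weight; the function names and the input numbers are invented for illustration.

```python
def grain_yield_g_per_m2(spikes_per_m2, grains_per_spike, thousand_grain_weight_g):
    """Estimate grain yield in g/m^2 from the three yield components."""
    grains_per_m2 = spikes_per_m2 * grains_per_spike
    return grains_per_m2 * thousand_grain_weight_g / 1000.0


def g_per_m2_to_t_per_ha(value_g_per_m2):
    """Convert g/m^2 to t/ha (1 ha = 10,000 m^2 and 1 t = 1,000,000 g)."""
    return value_g_per_m2 / 100.0


# Illustrative counts that an image-based method might produce for one plot.
estimate = grain_yield_g_per_m2(
    spikes_per_m2=400, grains_per_spike=30, thousand_grain_weight_g=40.0
)
print(estimate, "g/m^2 =", g_per_m2_to_t_per_ha(estimate), "t/ha")
```

An estimate of this kind does not account for grain lost during harvest, so it would still need to be checked against harvested samples.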
in the form of regular bulletins. Therefore, analysts must specialize in the specific region for
which they have expertise in regional agroclimatic conditions and management practices if
they are to understand how the crop indicators generated by the system describe the actual
yield variations in that region. In this case, the personal knowledge, views, or preferences of
analysts all affect their working practices.
Alternatively, crop monitoring should be inclusive of users and provide user-driven services.
All components and functions of CropWatch, including the self-calibration abilities of
models and the collaborative analyses of indicators, were transferred to APIs in the
CropWatch-Cloud, which enables users to carry out self-serviced crop monitoring by selecting
their preferred indicators for the user's area of interest. This allows users to complete crop
monitoring independently and autonomously from the data download to the final synthesized
analysis.
For example, with the support of a customized CropWatch for Mozambique local conditions,
officials in the Mozambique Ministry of Agriculture and Rural Development (MARD) who
are responsible for crop monitoring and early warning can apply specific programming language
environments to call APIs and organize processing workflows and can also set up self-
defined projects/systems for any areas of interest in their country by invoking the appropriate
APIs. As users from MARD have defined the modules themselves, calibrated and used the
tools, MARD enhances the capability and reliability of crop monitoring for Mozambique
without additional investment in storage and computational resources. This effort was
recognized as one of the best rural solutions in 2020 by the International Fund for
Agricultural Development and one of the good practices in South–South and Triangular
Cooperation for Sustainable Development.
Furthermore, it would be better for users to obtain crop information from their own systems
or from different sources to ensure the reliability and representativeness of information and to
prevent unconscious biases. This is why, immediately after the global food crisis of 2008, the
Group of Twenty (G20) Agriculture Ministers launched a crop monitoring initiative with
international participation, i.e. GEOGLAM during the French G20 Presidency in 2011. The
objectives of GEOGLAM were to increase market transparency, improve food security and
stabilize commodity prices by producing and disseminating crop information and enhancing
crop monitoring capacities. The dissemination of global or regional crop information from
various hosts, including CropWatch, increases the availability and transparency of food-
related information by providing regularly released bulletins and reports.
PROPOSED WORK
Crop prediction is an important application of machine learning that can help farmers make
informed decisions about their crops. The proposed work of crop prediction through ML
involves using various machine learning algorithms to predict the yield of different crops
The first step in this process is to gather data on various factors that can affect crop yield,
such as soil moisture, temperature, rainfall, and nutrient levels. This data can be collected
through various sources such as weather stations, soil sensors, and satellite imagery.
Once the data has been collected, it can be processed and analyzed using machine learning
algorithms such as linear regression, decision trees, random forests, and neural networks.
These algorithms can be trained on historical data to identify patterns and relationships
The next step is to use the trained model to predict crop yields for future growing seasons.
This can help farmers make informed decisions about when to plant, what crops to plant, and
One important aspect of this work is the need for accurate and up-to-date data. This can be a
challenge in some regions where data may be scarce or unreliable. In addition, the models
may need to be customized for different crop types and growing conditions.
CHAPTER 2:
DESIGN METHODOLOGY
Data is a very important part of any machine learning system. As the climate
changes from place to place, it was necessary to get data at the district level.
Historical data about the crop and the climate of a particular region was needed to
implement the system. This data was gathered from different government
websites. The data about the crops was gathered from www.data.gov.in and
the data about the climate was gathered from www.imd.gov.in. The climatic
parameters which affect the crop the most are precipitation, temperature, cloud
cover, vapour pressure, and wet day frequency. So, the data about these climatic
parameters was gathered at a monthly level.
Dataset Collection: In this phase, we collect data from various sources and
prepare datasets. The prepared dataset is used for descriptive and diagnostic
analytics. There are several online data sources such as Data.gov.in and
indiastat.org. The yearly data of a crop for at least ten years will be used. These
datasets usually exhibit the behaviour of chaotic time series. The primary and
necessary data are combined. Random Forests for Global and Regional Crop
Yield Predictions.
Data Partitioning: The entire dataset is partitioned into two parts: for example,
75% of the dataset is used for training the model and 25% of the data is set aside
to test the model.
Machine Learning Algorithms:
Supervised learning: Supervised machine learning algorithms can apply what has
been learned in the past to new data, using labelled examples, to predict future
events. After sufficient training, the system can provide targets for any new input.
The learning algorithm can also compare its output with the correct, intended
output and find errors in order to modify the model accordingly.
Unsupervised learning: In comparison, unsupervised machine learning algorithms
are used when the information used to train is neither labelled nor classified.
Unsupervised learning studies how systems can infer a function to describe a
hidden structure from unlabelled data. The system does not figure out the right
output, but it explores the data and can draw inferences from datasets to describe
hidden structures.
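The 75%/25% partition mentioned under Data Partitioning can be sketched with the standard library alone. The function name, the fixed seed and the record layout below are assumptions made for the example; they are not taken from the project code.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Twenty made-up yearly records for one crop.
records = [{"year": 2000 + i, "yield_t_per_ha": 2.0 + 0.1 * i} for i in range(20)]
train, test = train_test_split(records)
print(len(train), len(test))
```

A random shuffle is the simplest choice. For strongly time-dependent data, holding out the most recent years instead may give a more realistic estimate of how the model performs on future seasons.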
Data Collection: Collecting relevant data is the first step in any ML-based
prediction model. The data can include historical data on crop yield, weather
patterns, soil quality, fertilizer usage, and other factors that may impact crop
production.
Data Analysis: In this step, exploratory data analysis (EDA) techniques are
applied to understand the relationships between different features and crop yields.
This step helps identify the most relevant features for crop yield prediction.
Model Selection: Once the relevant features are identified, the next step is to
select an appropriate ML model for crop yield prediction.
Some popular models include linear regression, decision trees, random forests,
and neural networks.
Model Training: The selected model is trained using the preprocessed data, and
the training process involves iteratively adjusting the model's parameters to
minimize the error between the predicted crop yield and the actual yield.
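To make the Model Training step concrete, the sketch below fits a one-feature linear model by batch gradient descent, repeatedly adjusting the slope and intercept to reduce the mean squared error between predicted and actual yield. It is a minimal illustration under stated assumptions: the data points, learning rate and epoch count are invented, and the report's own implementation relies on scikit-learn models rather than a hand-written loop.

```python
def mean_squared_error(xs, ys, w, b):
    """Average squared difference between predictions w * x + b and targets."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up data: rainfall (hundreds of mm) against yield (t/ha), lying on y = 0.5x + 1.
rainfall = [1.0, 2.0, 3.0, 4.0, 5.0]
crop_yield = [1.5, 2.0, 2.5, 3.0, 3.5]

initial_error = mean_squared_error(rainfall, crop_yield, 0.0, 0.0)
w, b = fit_linear(rainfall, crop_yield)
final_error = mean_squared_error(rainfall, crop_yield, w, b)
print(round(w, 3), round(b, 3), final_error < initial_error)
```

The learning rate controls the step size: too large a value makes the updates diverge, and too small a value needs many more epochs to reach the same error.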
Fig 2. A data flow diagram (DFD) maps out the flow of information for any
process or system.
Fig. 1. Proposed Approach. Fig. 1 shows the proposed approach: how the
data is summarized, how the Random Forest algorithm is applied, and how the
result is calculated.
Firstly, the research questions are defined. When research questions are ready, databases are
used to select the relevant studies. The databases that were used in this study are Science
Direct, Scopus, Web of Science, Springer Link, Wiley, and Google Scholar. After the
selection of relevant studies, they were filtered and assessed using a set of exclusion and
quality criteria. All the relevant data from the selected studies are extracted, and eventually,
the extracted data were synthesized in response to the research questions. The approach we
followed can be split up into three parts: plan review, conduct review, and report review.
The first stage is planning the review. In this stage, research questions are identified, a
protocol is developed, and eventually, the protocol is validated to see if the approach is
feasible. In addition to the research questions, publication venues, initial search strings, and
publication selection criteria are also defined. When all of this information is defined, the
protocol is revised one more time to see if it represents a proper review protocol.
The second stage is conducting the review, which is represented in fig. When
conducting the review, the publications were selected by going through all the
databases. The data was extracted, which means that their information regarding
authors, year of publication, type of publication, and more information regarding
the research questions were stored. After all the necessary data was extracted
correctly, the data was synthesized in order to provide an overview of the relevant
papers published so far.
CHAPTER 3
IMPLEMENTATION
3.1 Pseudo Code of the Proposed System
2. Barplotting
3. Head data
4. Data information
6. Functions of data
5. Tail data
3.2 Code Format
import pyttsx3  # Converts text into speech.
import pandas as pd  # Loads and handles tabular data.
# scikit-learn is a widely used Python machine learning library. It contains
# efficient tools for machine learning and statistical modelling, including
# classification, regression, clustering and dimensionality reduction.
from sklearn import preprocessing
from sklearn.neighbors import KNeighborsClassifier  # k-nearest-neighbours (KNN) classifier.
import numpy as np  # Array operations.
import PySimpleGUI as sg  # Builds the graphical user interface.

# Configure the speech engine: speech rate and voice.
engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
rate = engine.getProperty('rate')
engine.setProperty('rate', rate-20)
engine.setProperty('voice',voices[0].id)

def speak(audio):
    # Call this function whenever the program should say something aloud.
    engine.say(audio)
    engine.runAndWait()

# Many machine learning algorithms require numerical input, so categorical
# columns must be represented as numbers. Mapping each value to a number is
# known as label encoding, and sklearn's LabelEncoder does this for us.
le = preprocessing.LabelEncoder()
# NOTE: the listing does not show how `excel` is loaded. The line below is an
# assumption added for illustration; 'crop_data.xlsx' is a hypothetical file name.
excel = pd.read_excel('crop_data.xlsx')
crop = le.fit_transform(list(excel["CROP"]))  # Map the crop names to numerical labels.
NITROGEN = list(excel["NITROGEN"])  # Nitrogen column as a list.
PHOSPHORUS = list(excel["PHOSPHORUS"])  # Phosphorus column as a list.
POTASSIUM = list(excel["POTASSIUM"])  # Potassium column as a list.
TEMPERATURE = list(excel["TEMPERATURE"])  # Temperature column as a list.
HUMIDITY = list(excel["HUMIDITY"])  # Humidity column as a list.
PH = list(excel["PH"])  # Soil pH column as a list.
RAINFALL = list(excel["RAINFALL"])  # Rainfall column as a list.
# NOTE: `features` is not defined in the listing either. Stacking the seven
# columns in this order is an assumption that matches the input order used
# for prediction later in the program.
features = np.array([NITROGEN, PHOSPHORUS, POTASSIUM, TEMPERATURE, HUMIDITY, PH, RAINFALL])
features = features.transpose()  # One row per sample, one column per feature.
print(features.shape)  # Shape of the features after transposing.
# The number of rows in features must match the number of labels in crop.
print(crop.shape)
# The number of neighbours (K) is the core deciding factor. K is generally an odd
# number if the number of classes is 2. When K=1, the algorithm is known as
# the nearest neighbour algorithm.
model = KNeighborsClassifier(n_neighbors=3)
# Fit the model on the training data with fit(); predictions are made later
# with predict().
model.fit(features, crop)
# Define the layout of the graphical user interface. It consists of text
# labels, input fields and buttons. The font type, font size, field size and
# text colour are set for each element.
layout = [
    [sg.Text('Crop Recommendation Assistant', font=("Helvetica", 30), text_color='yellow')],
    [sg.Text('Please enter the following details :-', font=("Helvetica", 20))],
    [sg.Text('Enter ratio of Nitrogen in the soil :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1))],
    [sg.Text('Enter ratio of Phosphorus in the soil :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1))],
    [sg.Text('Enter ratio of Potassium in the soil :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1))],
    [sg.Text('Enter average Temperature value around the field :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1)),
     sg.Text('*C', font=("Helvetica", 20))],
    [sg.Text('Enter average percentage of Humidity around the field :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1)),
     sg.Text('%', font=("Helvetica", 20))],
    [sg.Text('Enter PH value of the soil :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1))],
    [sg.Text('Enter average amount of Rainfall around the field :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1)),
     sg.Text('mm', font=("Helvetica", 20))],
    [sg.Text(size=(50, 1), font=("Helvetica", 20), text_color='yellow', key='-OUTPUT1-')],
    [sg.Button('Submit', font=("Helvetica", 20)), sg.Button('Quit', font=("Helvetica", 20))],
]
window = sg.Window('Crop Recommendation Assistant', layout)
while True:
    event, values = window.read()
    if event == sg.WINDOW_CLOSED or event == 'Quit':
        # If the user presses the Quit button, the program ends.
        break
    print(values[0])
    nitrogen_content = values[0]
    # Taking input from the user about nitrogen content in the soil.
    phosphorus_content = values[1]
    # Taking input from the user about phosphorus content in the soil.
    potassium_content = values[2]
    # Taking input from the user about potassium content in the soil.
    temperature_content = values[3]
    # Taking input from the user about the surrounding temperature.
    humidity_content = values[4]
    # Taking input from the user about the surrounding humidity.
    ph_content = values[5]
    # Taking input from the user about the ph level of the soil.
    rainfall = values[6]
    # Taking input from the user about the rainfall.
    predict1 = np.array([nitrogen_content, phosphorus_content,
                         potassium_content, temperature_content,
                         humidity_content, ph_content, rainfall], dtype=float)
    # Converting all the data collected from the user into an array to make
    # further predictions.
    print(predict1)
    # Printing the data after being converted into an array.
    predict1 = predict1.reshape(1, -1)
    # Reshaping the input data into a 2D array with a single row, which is
    # the form that model.predict() expects.
    print(predict1)
    # Printing the input data value after being reshaped.
    predict1 = model.predict(predict1)
    # Applying the user input data to the model.
    print(predict1)
    # Finally printing out the results.
    # Earlier the crop names were converted into numerical form so that the
    # machine learning model could be applied easily. Here the numerical
    # value is changed back into the name of the crop so that it can be
    # printed when required.
    crop_names = {
        0: 'Apple(सेब)',
        1: 'Banana(केला)',
        2: 'Blackgram(काला चना)',
        3: 'Chickpea(काबुली चना)',
        4: 'Coconut(नारियल)',
        5: 'Coffee(कॉफ़ी)',
        6: 'Cotton(कपास)',
        7: 'Grapes(अंगूर)',
        8: 'Jute(जूट)',
        9: 'Kidneybeans(राज़में)',
        10: 'Lentil(मसूर की दाल)',
        11: 'Maize(मक्का)',
        12: 'Mango(आम)',
        13: 'Mothbeans(मोठबीन)',
        14: 'Mungbeans(मूंग)',
        15: 'Muskmelon(खरबूजा)',
        16: 'Orange(संतरा)',
        17: 'Papaya(पपीता)',
        18: 'Pigeonpeas(कबूतर के मटर)',
        19: 'Pomegranate(अनार)',
        20: 'Rice(चावल)',
        21: 'Watermelon(तरबूज)',
    }
    # model.predict() returns an array with one element; an unknown label
    # falls back to an empty string, as in the original default.
    crop_name = crop_names.get(int(predict1[0]), str())
    phosphorus_value = float(phosphorus_content)
    # float() is used so that decimal input is accepted as well.
    # Here the phosphorus values are divided into three categories.
    if phosphorus_value <= 50:
        phosphorus_level = 'low'
    elif phosphorus_value <= 100:
        phosphorus_level = 'neither too low nor too high'
    else:
        phosphorus_level = 'high'
    print(crop_name)
    print(humidity_level)
    print(temperature_level)
    print(rainfall_level)
    print(nitrogen_level)
    print(phosphorus_level)
    print(potassium_level)
    print(phlevel)
    # Making the program speak, in front of the user, about the data that it
    # has received about the field.
    speak("Sir, according to the data that you provided to me, the ratio of "
          "nitrogen in the soil is " + nitrogen_level
          + ". The ratio of phosphorus in the soil is " + phosphorus_level
          + ". The ratio of potassium in the soil is " + potassium_level
          + ". The temperature level around the field is " + temperature_level
          + ". The humidity level around the field is " + humidity_level
          + ". The ph type of the soil is " + phlevel
          + ". The amount of rainfall is " + rainfall_level)
    window['-OUTPUT1-'].update('The best crop that you can grow : ' + crop_name)
    # Suggesting the best crop after prediction.
    speak("The best crop that you can grow is " + crop_name)
    # Speaking the name of the predicted crop.
window.close()
3.3 Results
To predict the crop yield rate, an application was created. The application has three parts:
managing datasets, testing datasets, and analyzing datasets. In managing datasets, the
datasets of previous years can be obtained and converted into a supported format.
CHAPTER 4
TESTING/RESULT AND ANALYSIS
Data collection is a way to keep track of past occurrences so that the data can
be used to find repetitive patterns. The ‘Crop Recommendation’ dataset is
collected from the Kaggle website. The dataset takes into account 22 different
crops and seven features: (i) Nitrogen content ratio (N), (ii) Phosphorus
content ratio (P), (iii) Potassium content ratio (K), (iv) Temperature expressed
in degree Celsius, (v) Percentage of Relative Humidity, (vi) pH value and
(vii) Rainfall measured in millimeters.
Data preprocessing is the process of transforming raw data into a form that
learning algorithms can use to find insights or forecast outcomes. In this
project, the preprocessing step is to check for missing values. Getting every
data point for every record in a dataset is difficult. Empty cells, values like
null, or a specific character such as a question mark might all indicate that
data is missing. The dataset used in the project did not have any missing values.
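The missing-value check described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the column names, the set of markers treated as missing, and the sample rows are made up for the example and are not taken from the project's code or dataset.

```python
import csv
import io

# Markers treated as "missing"; this set is an assumption for illustration.
MISSING_MARKERS = {"", "null", "nan", "?"}


def count_missing(rows):
    """Return a dict mapping each column name to its number of missing cells."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            # A cell is missing if it is absent or matches one of the markers.
            is_missing = value is None or value.strip().lower() in MISSING_MARKERS
            counts[column] = counts.get(column, 0) + (1 if is_missing else 0)
    return counts


# A tiny made-up sample, not rows from the real dataset.
SAMPLE_CSV = """N,P,K,temperature,humidity,ph,rainfall,label
90,42,43,20.8,82.0,6.5,202.9,rice
85,,41,21.7,80.3,?,226.6,rice
60,55,44,null,82.3,7.8,263.9,rice
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing_counts = count_missing(sample_rows)
print(missing_counts)
```

A dataset with no missing values, as reported for this project, would give a count of zero for every column.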
42
4.3. Train and Test
The dataset is split into a training dataset and a testing dataset using the
train_test_split() method of the scikit-learn module. The 2200 records in the
dataset are divided 80/20: 1760 records for training and 440 records for testing.
Scoring, often known as prediction, is the act of creating values from new input
data using a trained machine learning model. Calculating the score of each model
with the model.score() method over the training dataset shows how well the model
has learned.
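The 80/20 split and the accuracy score can be illustrated with a small plain-Python sketch. It is a stand-in written for this explanation, not scikit-learn's implementation: the helper names split_train_test and accuracy are hypothetical, and the records are placeholder integers rather than real dataset rows. For a classifier, model.score() in scikit-learn reports the mean accuracy, which is what the accuracy helper computes here.

```python
import random


def split_train_test(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of records and split it into (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that equal the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Placeholder integers standing in for the 2200 dataset rows.
records = list(range(2200))
train, test = split_train_test(records)
print(len(train), len(test))  # 1760 440

# Three of the four made-up predictions match the true labels.
print(accuracy([0, 1, 2, 2], [0, 1, 2, 1]))  # 0.75
```

Fixing the seed keeps the split reproducible, so that different models are compared on the same training and testing records.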
4.7 Result
CHAPTER 5
CONCLUSION AND FUTURE ENHANCEMENTS
5.1 Conclusion
A comparative study of three supervised machine learning models (KNN, Decision Tree, and
Random Forest) was carried out to predict the best-suited crop for a particular piece of
land, which can help farmers grow crops more efficiently. We concluded that the crop
prediction dataset showed the best accuracy with the Random Forest Classifier, at 99.32%
under both the Entropy and Gini criteria. In contrast, K-Nearest Neighbor has the lowest
accuracy among the three, at 97.04%, and the accuracy of the Decision Tree Classifier lies
between those of KNN and the Random Forest Classifier. For the Decision Tree, the Gini
criterion gave a better accuracy of 98.86% compared to the Entropy criterion. In the future,
new data from the fields can be collected to get a clearer picture of the soil, and other
machine learning and deep learning algorithms such as ANN or CNN can be incorporated to
classify more varieties of crops.
The KNN algorithm is a popular machine learning algorithm used in crop prediction. It is
used to predict crop yield and production based on historical data of weather, soil, and crop
types. The algorithm works by finding the k-nearest neighbors of a new data point, and then
predicting the outcome based on the majority vote of those neighbors.
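The majority-vote idea described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the KNeighborsClassifier used in the project: the function name knn_predict is hypothetical, Euclidean distance is assumed as the distance measure, and the two-feature points and labels are made up for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of query by majority vote among its k nearest points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up two-feature points (for example temperature, rainfall), not real data.
train_points = [(20.0, 200.0), (22.0, 210.0), (21.0, 190.0),
                (30.0, 60.0), (32.0, 55.0), (31.0, 65.0)]
train_labels = ["rice", "rice", "rice", "maize", "maize", "maize"]

prediction = knn_predict(train_points, train_labels, (21.5, 205.0), k=3)
print(prediction)  # rice
```

Because the vote depends on raw distances, features with large numeric ranges (such as rainfall in millimeters) dominate the result unless the features are scaled first.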
In conclusion, the KNN algorithm is a promising approach for crop prediction. However, the
accuracy of the model depends heavily on the quality and quantity of the data used for
training. Additionally, other factors such as weather changes, pest attacks, and changes in
soil composition can affect crop yield, and these factors should therefore also be
considered in the crop prediction model.
5.2 Future Enhancements
All the required data can be collected from the GPS location of a piece of land and, with
access to the government's rain forecasting system, crops can be predicted from the GPS
location alone. The model can also be developed further to help avoid food surpluses and
shortages. We believe the proposed system will help farmers make the right decision about
which crop to cultivate. A farmer can plant different crops in different districts based on
the system's recommendations, so every farmer gets the chance to maximize yield and profit
by using the system. Our main goal is to produce more with less, since, as a developing
country, we are using almost all our resources to keep up to date with the rest of the
world. In addition, any contribution to agriculture can benefit the country as well as its
people.
The proposed model is constructed using AI algorithms to reduce the losses that farmers
suffer on their farms due to lack of knowledge of cultivation in different soil and
weather conditions. The model is created using machine learning (SVM) and deep learning
(LSTM, RNN) techniques. After analyzing the prediction parameters, the model predicts, from
among the available crops, the best crops to grow on the land at lower expense. To the best
of our knowledge, no existing work uses the same techniques to predict crops. Hence, it is
concluded that this work improves on the accuracy of existing work that used other
techniques for crop prediction. The accuracy is calculated as 97%. The work has wide scope
for extension in the future and can be implemented and interfaced with a flexible,
multi-purpose application. Farmers need to be educated so that they can get clear
information about the best crop yield on their mobile phones. With this, even if the farmer
is at home, the work can be managed at that moment, without facing any loss later. Such
progress in the agriculture field will be highly valuable and will further help farmers in
the production of crops.