Crop Prediction

Uploaded by

ranupander617

MONITORING CROP HEALTH PROJECT

Report submitted in partial fulfilment of the requirement for the degree of

B.Tech.

in

Computer Science & Engineering


By
Krati Itondia (2101641520081)
Ranu Pandey (2101641520114)
Muskan Shah (2101641520093)
Himanshi Singh (2101641520074)
Shreya Shukla (2101641520133)

Under the guidance of


Priyanka Arya
(Associate Professor)
Project Id: CS_AI_2B_04

Pranveer Singh Institute of Technology, Kanpur


Dr A P J A K Technical University
Lucknow

DECLARATION

This is to certify that the report entitled “Monitoring Crop Health”, which is
submitted by us in partial fulfilment of the requirement for the award of the
degree of B.Tech. in Computer Science and Engineering to Pranveer Singh Institute
of Technology, Kanpur, affiliated to Dr. A P J A K Technical University, Lucknow,
comprises only our own work, and due acknowledgement has been made in the text to
all other material used.

Date:

Krati Itondia (2101641520081)


Himanshi Singh (2101641520074)
Muskan Shah (2101641520093)
Ranu Pandey (2101641520114)
Shreya Shukla (2101641520133)

Certificate

This is to certify that the report entitled “Monitoring Crop Health”, which is
submitted by Shreya Shukla, Himanshi Singh, Ranu Pandey, Muskan Shah and Krati
Itondia in partial fulfilment of the requirement for the award of the degree of
B.Tech. in Computer Science & Engineering to Pranveer Singh Institute of
Technology, Kanpur, affiliated to Dr. A P J A K Technical University, Lucknow, is
a record of the candidates' own work carried out by them under my supervision.
The matter embodied in this thesis is original and has not been submitted for the
award of any other degree.

Signature:
Dr. Vishal Nagar
Dean, CSE Department,
PSIT, Kanpur

Signature:
Priyanka Arya
Associate Professor, CSE Department,
PSIT, Kanpur

ACKNOWLEDGEMENT

It gives us a great sense of pleasure to present the report of the B.Tech. project undertaken
during B.Tech. Second Year. We owe a special debt of gratitude to our project supervisor,
Subhash Singh Parihar, Department of Computer Science and Engineering, Pranveer Singh
Institute of Technology, Kanpur, for his constant support and guidance throughout the course
of our work. His sincerity, thoroughness and perseverance have been a constant source of
inspiration for us. It is only through his cognizant efforts that our endeavours have seen the
light of day.

We also take the opportunity to acknowledge the contribution of Professor Dr. Vishal Nagar,
Dean, Computer Science & Engineering Department, Pranveer Singh Institute of Technology,
Kanpur, for his full support and assistance during the development of the project.

We would also not like to miss the opportunity to acknowledge the contribution of all faculty
members of the department for their kind assistance and cooperation during the development
of our project. Last but not least, we acknowledge our friends for their contribution to the
completion of the project.

Signature:
Name: Shreya Shukla
Roll No.: 2101641520133

Signature:
Name: Himanshi Singh
Roll No.: 2101641520074

Signature:
Name: Ranu Pandey
Roll No.: 2101641520114

Signature:
Name: Krati Itondia
Roll No.: 2101641520081

Signature:
Name: Muskan Shah
Roll No.: 2101641520093

ABSTRACT

This Online Examination System is a software solution that allows any industry
or institute to arrange, conduct and manage examinations in an online
environment. It can be run over the Internet, an intranet or a Local Area
Network. Some of the problems faced in manual examination systems are delays in
result processing, difficulty in filing, and difficulty in filtering records.
The chance of losing records is high, and searching records is also difficult.
Maintenance of the system is also very difficult and takes a lot of time and
effort. Physical classroom learning is no longer applicable for the current
younger generation. Internet and distance learning, generally known as online
education, plays a vital role in the country's education system.

Online examination is one of the crucial parts of an online education system. It
is efficient, fast, and reduces the large amount of material resources required.
An examination system is developed based on the web. This paper describes the
principle of the system, presents its main functions, analyses the
auto-generating test paper algorithm, and discusses the security of the system.
It is undeniable that online education provides ample benefits to young
learners. Nevertheless, online education also has many negative implications,
among them limited collaborative learning and an increase in time and effort.
This study examines the implications of online education among students,
especially in a private higher learning institution, and its effect on the
Malaysian national education system. Information was collected through surveys
and interviews, together with secondary data, and analysed using SPSS. The study
found that there are various serious issues regarding online education and its
effect on the quality of the Malaysian education system to a certain extent.
Several problems have been identified, and these issues have to be solved in
order to sustain the quality of education for future generations. Furthermore,
the Ministry of Higher Education (MOHE) should formulate a standard policy,
closely monitor the implementation of online education, evaluate and review the
teaching methods used, and upgrade them to maintain the quality of online
education in private higher education institutions.

TABLE OF CONTENTS

DECLARATION
CERTIFICATE
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
LIST OF ABBREVIATIONS

CHAPTER NO. TITLE

1. INTRODUCTION
1.1 MOTIVATION
1.2 BACKGROUND OF PROBLEM
1.2.1 Current System
1.2.2 Issues in Current System
1.2.2.1 Functionality Issues
1.2.2.2 Security Issues
1.3 PROBLEM STATEMENT
1.4 PROPOSED WORK
1.5 ORGANIZATION OF REPORT

2. DESIGN METHODOLOGY
2.1 Diagram
2.1.1 Architecture diagram
2.1.2 DFD (data flow diagram)
2.1.3 ERD (entity relationship diagram)
2.1.4 Flow chart

3. IMPLEMENTATION
3.1 Code
3.2 Figures of Project

4. TESTING/RESULT AND ANALYSIS

5. CONCLUSION AND FUTURE ENHANCEMENTS
5.1 Conclusion
5.2 Future Enhancements

REFERENCES

LIST OF FIGURES

• In design methodology
    • Architecture diagram
    • DFD (Data Flow Diagram)
    • ERD (Entity Relationship Diagram)
    • Flow chart

• In implementation
    • Login page
    • Admin panel
    • Addition
    • Add exam
    • Question designing
    • Student details
    • Sign in
    • Exam starts reminder
    • Examination page
    • Submission
    • Results to admin
    • Results to student

LIST OF SYMBOLS

% Percentage
Arrow symbol for flow of data

Data input and output

Result of diagram

Symbol for decision making

Process

LIST OF ABBREVIATIONS

• ID     Identification
• TMA    Text-Mark Assignment
• UI     User Interface
• PCA    Principal Component Analysis
• LMS    Learning Management Systems

CHAPTER 1:
INTRODUCTION
Agriculture, the main source of food, is one of the most essential aspects of human survival.
Unfortunately, most farmers in our country still farm in the conventional way, in which data
about soil and crops has to be investigated manually, which is a hectic process. This problem
could be solved by using modern farming methods. Because the agriculture sector contributes a
great deal to the country's economy, it is necessary to introduce the latest technologies,
such as IoT and automation, into agriculture to improve crop production and help develop the
economy. Automation in agriculture enables effective crop health monitoring without human
involvement in the field. The Internet of Things is the network of physical objects embedded
with sensors, software, and electronic components such as microcontrollers; the sensors and
microcontrollers themselves cannot be connected to the internet directly. Crop productivity
depends on a decent irrigation system and on atmospheric conditions such as temperature and
humidity. IoT technology is used to collect information about conditions such as weather,
rainfall, humidity, temperature, and soil moisture. Wireless sensor networks are used to
monitor farm conditions, and microcontrollers are used to control and automate farm
processes. Wireless cameras are used to view the conditions remotely in the form of images and
video. A smartphone allows farmers to stay updated on the ongoing conditions of their
agricultural land using IoT at any time and from any part of the world. IoT technology can
minimize cost and enhance the productivity of traditional farming. The use of cloud services
and a graphical user interface (GUI) will make health monitoring very easy. Farmers need not
understand how the data is processed; the GUI will make it easier to take correct decisions.

Researchers developed a wireless sensor network to observe farming conditions and to increase
crop production and quality. Sensors are used to monitor
environmental parameters such as rainfall percentage, atmospheric humidity and temperature.
The system is designed around an ATmega328P microcontroller and sensor nodes with wireless
transceiver modules that support the ZigBee protocol. A web application and a database are
used to retrieve and store the data. In this experiment, sensor node failure and energy
efficiency are monitored.

Another experiment was conducted on a smart agriculture greenhouse monitoring system based on
ZigBee technology. The system performs data acquisition, processing, transmission, and
reception. The objective of the experiment is to understand the greenhouse environment; the
system manages the environmental area efficiently, reduces the cost of farming and saves
energy. The gateway runs a Linux operating system on a Cortex-A8 processor, which acts as the
core. Overall, the design implements remote smart monitoring and control of the greenhouse,
replaces the old wired technology with wireless technology, and reduces manpower cost.
Operations and fulfillment are suitable places to prove efficiency gains.

Researchers also studied the work of a rural farming community that replaces some of the
traditional techniques. The sensor nodes have different external sensors connected to them,
namely soil moisture, soil pH, atmospheric humidity, and temperature sensors. Based on the
soil moisture reading, the sensor node activates a motor to discharge water during periods of
water scarcity and switches it off after the required amount of water has been discharged,
which conserves water. The soil pH is sent to the base layer, which in turn informs the
farmer about the soil pH via SMS using a GSM module. This information helps farmers reduce
the amount of fertilizer used. A rice crop monitoring system using IoT has also been proposed
to help with real-time monitoring and to increase rice production. The automated control of
water discharge for irrigation and the delivery of information are implemented using a
wireless sensor network.

The parameters of a crop are determined using sensors for temperature, soil and humidity.
These data are compared with pre-determined values, and the crop condition is reported to the
farmer remotely using GSM, which reduces physical effort. The information about the crops is
sent to the farmer as a phone message so that he or she can spend time on more productive
work. Combining traditional methods with this technology will result in agricultural
modernization. The modern farmer is unable to identify how environmental parameters such as
humidity and temperature affect the crop. Despite the rapid spread of mobile connectivity and
mobile internet in the country, efficient and cheap methods to exploit them to increase
efficiency and productivity remain out of reach. Thus one of the most important challenges is
the lack of proper monitoring and control mechanisms for efficient farming. This report
explains the development of a prototype of an efficient plant growth monitoring system that
provides data about the environmental parameters surrounding the plant, which are vital to
the plant's growth. The proposed idea is a cost-effective system that receives data about the
conditions surrounding the plants from various sensors in the system.
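
The comparison of sensor readings against pre-determined values can be illustrated with a short sketch. It is only an illustration under stated assumptions: the parameter names, the threshold ranges and the functions `check_readings` and `build_message` are hypothetical and are not taken from the project code, and the message is printed here where a real system would hand it to a GSM module.

```python
# Hypothetical acceptable ranges (assumed values); real limits depend on
# the crop, the soil and the season.
THRESHOLDS = {
    "temperature_c": (15.0, 35.0),
    "humidity_pct": (40.0, 80.0),
    "soil_moisture_pct": (30.0, 70.0),
}


def check_readings(readings, thresholds=THRESHOLDS):
    """Return one alert string for every reading outside its range."""
    alerts = []
    for name, value in readings.items():
        if name not in thresholds:
            continue  # no pre-determined range for this parameter
        low, high = thresholds[name]
        if value < low:
            alerts.append(f"{name} too low: {value} (minimum {low})")
        elif value > high:
            alerts.append(f"{name} too high: {value} (maximum {high})")
    return alerts


def build_message(readings):
    """Compose the text that would be sent to the farmer."""
    alerts = check_readings(readings)
    if not alerts:
        return "All monitored parameters are within range."
    return "Crop alert: " + "; ".join(alerts)


# Hypothetical sample reading from one sensor node.
sample = {"temperature_c": 38.2, "humidity_pct": 55.0, "soil_moisture_pct": 22.0}
print(build_message(sample))
```

Keeping the thresholds in one table means that a different crop only needs a different table, not different code.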

MOTIVATION

Farming has been one of the founding factors of human civilization for more than 10,000
years. Back then, only a few hand tools were used for activities such as harvesting and
sowing. Today, after thousands of years, we have advanced immensely in terms of technology as
well as lifestyle. Along with technical advancement, the population is also growing, and
hence the demand for food has increased drastically. In such a fast-growing era it becomes
even more important than before to monitor the agricultural sector, because even a single
problem in a crop season, due to any threat, can affect the supply. Therefore it becomes very
important to have a smart agricultural system for monitoring crop yield and protecting it
from any mishap.

Thus, conventional methods can be improved by implementing the following methods suggested by
the authors:

1. Developing smart applications which can capture images of crops and analyse their health
conditions.

2. Using drones, which monitor the whole field without human intervention, help in detecting
the plants affected by pests and sprinkle pesticides over the affected area.

3. Using wireless rovers for examining the soil conditions and for identifying creepers which
damage the crops.

4. Using various AI sensors for keeping track of the moisture content of the soil and
notifying us by email when there is a deficiency.

Plant disease detection can be achieved successfully using deep learning, which has played a
very significant role in identifying plant defects and diseases. Deep learning techniques
have proven to be a strong tool because they can handle very large datasets, which improves
the chances of better detection. One such deep learning algorithm is the convolutional neural
network (CNN). With the help of image processing, a CNN can be used to recognize various
patterns in the dataset. A CNN is quite adaptable and has a less complex structure than other
algorithms. It also requires fewer parameters for training.
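
To make the pattern-recognition idea concrete, the sketch below implements the basic operation of a CNN layer, a 2D convolution followed by a ReLU activation, in plain Python. It is a toy illustration and not the network used in this project: the 4x4 image, the 2x2 kernel and the function names are assumptions chosen for the example, and a trained CNN would learn its kernel values from data.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            # Slide the same small kernel over every position of the image.
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy 4x4 image (assumed data): dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written kernel that responds where brightness rises from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d(image, kernel))
for row in feature_map:
    print(row)
```

The kernel has only four weights, and the same four weights are reused at every image position. This weight sharing is the reason a convolutional layer needs far fewer parameters than a fully connected layer over the same image.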

BACKGROUND OF PROBLEMS

• Limitations on monitoring crop conditions


Similar methods are employed across global, regional and national crop monitoring systems
(CMSs) for analysing crop conditions in near-real-time. Most of them rely on maps of
anomalies of metrics from the average values to investigate spatial variations, or on
temporal development to reflect crop growth dynamics. These methods require a seamless and
comparative historical archive of metrics and a real-time satellite data processing capacity
to produce biophysical products using dedicated algorithms. The differences are then
qualitatively interpreted as crop growth classes. There are three ways to present these
metric differences: (1) an anomaly map at a specific date, indicating spatial variations and
offering comparisons across large regions; (2) aggregated profiles of current and reference
years to reflect the development of crops over the growing season for the specific spatial
extent, as derived from the vegetation index (VI) time series, showing the start, length,
ascending and descending slope, and peak of crop greenness; and (3) spatial clustering maps
in which pixels reflecting similar crop development conditions are grouped.
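
The anomaly-from-average idea can be sketched for a single pixel as follows. This is a minimal illustration under assumptions: the NDVI values are invented, the 10% tolerance is an arbitrary choice for the example, and the class labels are not those of any particular crop monitoring system.

```python
def anomaly(current, history):
    """Return the absolute and percentage departure from the historical mean."""
    mean = sum(history) / len(history)  # assumes a non-zero mean
    diff = current - mean
    return diff, 100.0 * diff / mean


def classify(percent_diff, tolerance=10.0):
    """Interpret the departure qualitatively as a crop growth class."""
    if percent_diff < -tolerance:
        return "below average"
    if percent_diff > tolerance:
        return "above average"
    return "average"


# Hypothetical NDVI of one pixel on the same date in five reference years.
ndvi_history = [0.60, 0.64, 0.62, 0.66, 0.58]
ndvi_now = 0.50

diff, pct = anomaly(ndvi_now, ndvi_history)
print(f"departure = {diff:.3f} ({pct:.1f}%), class = {classify(pct)}")
```

Applying the same calculation to every pixel of an image gives an anomaly map for that date, which is the first of the three presentations listed above.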

• Limitations on detecting crop stress driven by drought

Drought is the major natural disaster that causes the most extensive crop stress and yield
losses, and drought assessments are incorporated in most CMSs as part of the crop condition
component or as an individual component. A lack of precipitation combined with higher
evaporation rates can propagate from a meteorological drought into an agricultural drought,
leading to a reduction in crop yield or even to complete crop failures. In this regard, many
drought indices have been developed for detecting meteorological droughts caused by climate
variabilities, such as the standardized precipitation index (SPI), the standardized
precipitation evapotranspiration index (SPEI) and the Palmer drought severity index (PDSI).

Drought conditions directly affect the morphology, greenness, photosynthesis, biomass
accumulation, and evapotranspiration of crops. Many vegetation-based agricultural drought
indices have been developed, such as the vegetation condition index (VCI) and mean vegetation
condition index (MVCI). Given that drought events cause soil moisture drying and land surface
temperature (LST) changes, the temperature condition index (TCI) based on LSTs, soil moisture
agricultural drought index (SMADI), evaporative stress index (ESI) and hydrothermal weather
index (HWI) have been proposed to determine agricultural drought conditions. The LST is a
timely response indicator that can reflect crop stress before substantial visual symptoms
arise. The popular vegetation health index (VHI) is the combination of VCI and TCI,
reflecting both biophysical and environmental conditions.
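
As an illustration of how VCI, TCI and VHI relate to each other, the sketch below uses the commonly cited definitions: VCI scales the current NDVI between its historical minimum and maximum, TCI does the same for LST with the scale inverted, and VHI is a weighted sum of the two. The equal weighting (`alpha = 0.5`) and all input values are assumptions made for the example, not figures from this report.

```python
def vci(ndvi, ndvi_min, ndvi_max):
    """Vegetation condition index in percent (0 = worst, 100 = best)."""
    return 100.0 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)


def tci(lst, lst_min, lst_max):
    """Temperature condition index in percent; hotter surfaces score lower."""
    return 100.0 * (lst_max - lst) / (lst_max - lst_min)


def vhi(vci_value, tci_value, alpha=0.5):
    """Vegetation health index as a weighted combination of VCI and TCI."""
    return alpha * vci_value + (1.0 - alpha) * tci_value


# Hypothetical values for one pixel; LST is in kelvin.
v = vci(ndvi=0.35, ndvi_min=0.20, ndvi_max=0.80)
t = tci(lst=312.0, lst_min=290.0, lst_max=315.0)
h = vhi(v, t)
print(f"VCI = {v:.1f}, TCI = {t:.1f}, VHI = {h:.1f}")
```

Low values of all three indices indicate stressed vegetation. The historical minimum and maximum must come from the same location and the same time of year for the scaling to be meaningful.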

• Limitations on determining the impacts of nutrients, diseases and pests on crop stress

If crop stress is not caused by adverse weather, it is likely to be caused by nutrient
stress, diseases or pests. Nutrients have been reported as another stress factor after water
stress at the global scale. Inversion algorithms involving canopy chlorophyll content (CCC)
or leaf nitrogen content (LNC) have been developed to detect nutrient stress in wheat and
other crops. Although many visible-band VIs have been developed to relate to chlorophyll or
nitrogen content, red-edge bands, special bands located between the red and near-infrared
bands, have been proven to be more sensitive to chlorophyll content. The advantage of the
red-edge band in detecting the nutrient status or chlorophyll content has been continuously
demonstrated, but it is only effective for dense crops. Sentinel-2 satellites have three
red-edge bands, making it possible to detect chlorophyll content using imagery from these
satellites.
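
One widely used red-edge index is the normalized difference red edge index (NDRE), which contrasts near-infrared and red-edge reflectance. The sketch below is an illustration only: the text above does not name a specific index, so the choice of NDRE is an assumption, and the reflectance values are invented.

```python
def ndre(nir, red_edge):
    """Normalized difference red edge index for one pixel."""
    total = nir + red_edge
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red_edge) / total


# Hypothetical surface reflectance pairs (near-infrared, red-edge).
pixels = [(0.45, 0.30), (0.50, 0.20), (0.30, 0.28)]

values = [ndre(nir, re) for nir, re in pixels]
for (nir, re), value in zip(pixels, values):
    print(f"NIR = {nir:.2f}, red-edge = {re:.2f}, NDRE = {value:.3f}")
```

Higher values are generally read as higher chlorophyll content. With Sentinel-2 imagery, the near-infrared and red-edge inputs would come from the corresponding bands of the same scene.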

Many metrics have been developed to identify types of diseases and pests, assess the
corresponding infection severities, and map their distributions at the plot or regional
scale. However, prior knowledge is needed to identify the types of local disease/pest or
other stresses that occur in the field. In areas for which prior knowledge is lacking, it is
thus difficult to proactively achieve reliable and precise assessment, as a variety of signs
and plant damage caused by crop diseases and pests can also be caused by other factors, such
as nutrition deficiencies, thus leading to challenges when attempting to separate the actual
stress factor. For instance, the photochemical reflectance index (PRI) is used not only for
wheat yellow rust detection, but has also commonly been used to detect water stress, frost
stress and damage, and nitrogen content and stress. New indicators and metrics are needed to
distinguish the various causes of stress and to quantify different severities.

PROBLEM STATEMENT

Limitations in current crop monitoring methods have been identified in the previous sections.
Some limitations, including in situ data accessibility and knowledge-based analysis, might
reduce the applicability of crop monitoring and lead to uncertain and undesirable
consequences.

Existing methods usually require new ground-truth data for each new setting to parameterize
algorithms and models and assess their accuracies. The field sampling requirements prevent
most global systems from obtaining crop area estimates and yield prediction components
(Table 1), as collaboration with local institutions is required to conduct field work, and
access to in situ data for training and calibrating algorithms and modelling outside national
boundaries is still challenging. GEOGLAM has implemented an in situ data coordination
strategy to leverage partner investments and to ensure that data are curated with a standard
protocol. GEOGLAM embraces the Global Earth Observation System of Systems (GEOSS) Data
Sharing Principles that encourage full open sharing of data, including both EO data and in
situ data, although open sharing of in situ data is still a challenge, as it sometimes
involves privacy issues. Nevertheless, the GEOGLAM Joint Experiment for Crop Assessment and
Monitoring (JECAM) initiative demonstrates a best practice method of data sharing to enhance
the availability of in situ data through intercomparison projects or scientific data papers.
However, the ground-truth dataset is not yet fully publicly available due to restrictions
imposed by the in situ providers. Moreover, in the foreseeable future, it is unrealistic to
expect the full sharing of ground-truth data with increasing trade tensions and strained
global cooperation. Therefore, the requirement for ground-truth data in crop monitoring
should encourage crop monitoring activities at the domestic and local levels.

As in situ data collection is one of the major challenges for crop monitoring, closing the
ground-truth data gaps and improving the data collection efficiency are essential for
strengthening the reliability of crop monitoring. However, the acquisition of field data,
especially at large scales, is time- and cost-consuming and labour-intensive. To address this
issue, crowdsourcing might provide an alternative and efficient solution for acquiring
field-based data. Crowdsourcing information has become a widespread data acquisition method
in environmental and resource monitoring, serving as a potential solution for closing the
ground-truth data gaps. With the wide use of mobile phones, smartphone sensors, such as
cameras, satellite positioning, and photoreceptors, have become major platforms for
crowdsourcing information collection. A mobile global positioning system (GPS)-video-
geographic information systems (GIS) application (called a GVG app) can collect such data
as crop types, planting dates, irrigation and expected yields with corresponding geolocation
information. Convolutional neural networks (CNNs) have been used to automatically identify
crop types from Google Street photos or GVG photos.

Data collection of the actual crop yield is not only labour-intensive and costly, but also
difficult to implement efficiently. It relies on the grain harvest of samples in the field,
with uncertainties in both sampling and unavoidable grain losses during harvest. A new
method for field yield data measurement involving AI and computer vision to count the
numbers of spikes, seed numbers per spike and the sizes of seeds for weight determination
(Fig. 4) is urgently needed for integration into GVG.

The reliability of crop information is essential, as such information serves as an important
resource factor with significant economic value and consequences. There is a lack of
transparent and standardized methods for synthesizing various information in crop
monitoring to support decision making, thus affecting food prices. Instead, knowledge-based
analyses are mostly applied in crop monitoring activities, especially in the process of
generating actionable reports. Analysts explore the indicators provided by the system and
identify the indicators that best explain the actual crop growth and crop stress conditions.
They then select robust methods to conduct accurate crop acreage estimates, yield
predictions, and production forecasts for specific agroclimatic regions and publish the
results in the form of regular bulletins. Therefore, analysts must specialize in the specific
region for which they have expertise in regional agroclimatic conditions and management
practices if they are to understand how the crop indicators generated by the system describe
the actual yield variations in that region. In this case, the personal knowledge, views, or
preferences of analysts all affect their working practices.

Alternatively, crop monitoring should be inclusive of users and provide user-driven services.
All components and functions of CropWatch, including the self-calibration abilities of
models and the collaborative analyses of indicators, were transferred to APIs in the
CropWatch-Cloud, which enables users to carry out self-serviced crop monitoring by selecting
their preferred indicators for their area of interest. This allows users to complete crop
monitoring independently and autonomously, from the data download to the final synthesized
analysis.

For example, with the support of a CropWatch customized for local conditions in Mozambique,
officials in the Mozambique Ministry of Agriculture and Rural Development (MARD) who are
responsible for crop monitoring and early warning can apply specific programming language
environments to call APIs and organize processing workflows, and can also set up self-
defined projects/systems for any areas of interest in their country by invoking the
appropriate APIs. As users from MARD have defined the modules themselves and calibrated and
used the tools, MARD enhances the capability and reliability of crop monitoring for
Mozambique without additional investment in storage and computational resources. This effort
was recognized as one of the best rural solutions in 2020 by the International Fund for
Agricultural Development and one of the good practices in South–South and Triangular
Cooperation for Sustainable Development.

Furthermore, it would be better for users to obtain crop information from their own systems
or from different sources to ensure the reliability and representativeness of information and
to prevent unconscious biases. This is why, after the global food crisis of 2008, the Group
of Twenty (G20) Agriculture Ministers launched a crop monitoring initiative with
international participation, i.e. GEOGLAM, during the French G20 Presidency in 2011. The
objectives of GEOGLAM were to increase market transparency, improve food security and
stabilize commodity prices by producing and disseminating crop information and enhancing
crop monitoring capacities. The dissemination of global or regional crop information from
various hosts, including CropWatch, increases the availability and transparency of food-
related information by providing regularly released bulletins and reports.

PROPOSED WORK

Crop prediction is an important application of machine learning that can help farmers make
informed decisions about their crops. The proposed work of crop prediction through ML
involves using various machine learning algorithms to predict the yield of different crops
based on various input parameters.

The first step in this process is to gather data on various factors that can affect crop
yield, such as soil moisture, temperature, rainfall, and nutrient levels. This data can be
collected through various sources such as weather stations, soil sensors, and satellite
imagery.

Once the data has been collected, it can be processed and analyzed using machine learning
algorithms such as linear regression, decision trees, random forests, and neural networks.
These algorithms can be trained on historical data to identify patterns and relationships
between the input parameters and crop yield.

The next step is to use the trained model to predict crop yields for future growing seasons.
This can help farmers make informed decisions about when to plant, what crops to plant, and
how much fertilizer and water to use.

One important aspect of this work is the need for accurate and up-to-date data. This can be a
challenge in some regions where data may be scarce or unreliable. In addition, the models
may need to be customized for different crop types and growing conditions.
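
The train-then-predict workflow described in this section can be sketched with the simplest of the listed algorithms, linear regression on a single input. The example is an illustration under assumptions: the rainfall and yield figures are invented and exactly linear, and `fit_line` and `predict` are hypothetical helper names, not functions from the project code.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted line to a new input value."""
    return slope * x + intercept


# Hypothetical historical data: seasonal rainfall (mm) and yield (t/ha).
rainfall_mm = [400, 500, 600, 700, 800]
yield_t_ha = [1.8, 2.2, 2.6, 3.0, 3.4]

slope, intercept = fit_line(rainfall_mm, yield_t_ha)
forecast = predict(650, slope, intercept)
print(f"yield = {slope:.4f} * rainfall + {intercept:.2f}")
print(f"forecast for 650 mm of rainfall: {forecast:.2f} t/ha")
```

A practical model would use several inputs (soil moisture, temperature, nutrient levels) and usually a non-linear learner such as a random forest, but the pattern of fitting on history and predicting for a new season is the same.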

CHAPTER 2:
DESIGN METHODOLOGY

Data is a very important part of any machine learning system. As the climate
changes from place to place, it was necessary to get data at the district level.
Historical data about the crop and the climate of a particular region was needed
to implement the system. This data was gathered from different government
websites: the data about the crops was gathered from www.data.gov.in and the
data about the climate from www.imd.gov.in. The climatic parameters which affect
the crop the most are precipitation, temperature, cloud cover, vapour pressure
and wet day frequency, so the data about these climatic parameters was gathered
at a monthly level.

Dataset Collection: In this phase, we collect data from various sources and
prepare datasets. The resulting dataset is used for analytics (descriptive and
diagnostic). There are several online sources of abstracts, such as Data.gov.in
and indiastat.org. The yearly abstracts of a crop for at least ten years will be
used. These datasets usually exhibit the behaviour of chaotic time series. The
primary and necessary abstracts were combined. Random Forests for Global and
Regional Crop Yield Predictions.

Data Partitioning: The entire dataset is partitioned into two parts: for
example, 75% of the dataset is used for training the model and 25% is set aside
to test the model.

Machine Learning Algorithms:

Supervised learning: Supervised machine learning algorithms apply what has been
learned in the past to new data, using labelled examples to predict future
events. After sufficient training, the system can provide targets for any new
input. The learning algorithm can also compare its output with the correct,
intended output and find errors in order to modify the model accordingly.

Unsupervised learning: In contrast, unsupervised machine learning algorithms are
used when the information used to train is neither labelled nor classified.
Unsupervised learning studies how systems can infer a function to describe a
hidden structure from unlabelled data. The system does not figure out the right
output, but it explores the data and can draw inferences from datasets to
describe hidden structures from unlabelled data.
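
The 75%/25% data partitioning mentioned above can be sketched in a few lines. This is an illustration under assumptions: the twenty yearly records are invented, and the function name `train_test_split` and the fixed seed are choices made for the example.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle the rows reproducibly and split them into train and test parts."""
    shuffled = list(rows)  # copy so the caller's list is left unchanged
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical yearly records: (year, yield in t/ha).
records = [(year, round(2.0 + 0.05 * (year - 2004), 2)) for year in range(2004, 2024)]

train, test = train_test_split(records)
print(len(train), "training rows,", len(test), "test rows")
```

For yearly time-series data, an alternative is a chronological split in which the most recent years form the test set, so that the model is always evaluated on seasons later than those it was trained on.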

Data Collection: Collecting relevant data is the first step in any ML-based
prediction model. The data can include historical data on crop yield, weather
patterns, soil quality, fertilizer usage, and other factors that may impact crop
production.

Data Preprocessing: The collected data needs to be preprocessed to remove any
inconsistencies, errors, or missing values. This step involves data cleaning,
normalization, and feature extraction.
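
Two of the preprocessing operations named above, handling missing values and normalization, can be sketched as follows. The example is an illustration only: the rainfall values are invented, and mean imputation and min-max scaling are assumed choices, since the report does not state which techniques were used.

```python
def fill_missing(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale the values linearly to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical annual rainfall in mm with one missing year.
rainfall = [820.0, None, 640.0, 910.0, 730.0]

cleaned = fill_missing(rainfall)
scaled = min_max_normalize(cleaned)
print(cleaned)
print([round(s, 3) for s in scaled])
```

Normalization puts features measured in different units, such as millimetres of rain and degrees of temperature, on a common scale before model training.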

Data Analysis: In this step, exploratory data analysis (EDA) techniques are
applied to understand the relationships between different features and crop yields.
This step helps identify the most relevant features for crop yield prediction.
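
One simple way to judge how relevant a feature is to crop yield is the Pearson correlation coefficient. The sketch below is an illustration under assumptions: the feature names and data values are invented, and correlation is only one of several EDA techniques that could be applied here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical observations for five seasons.
features = {
    "rainfall_mm": [400, 500, 600, 700, 800],
    "temperature_c": [30, 26, 31, 27, 29],
}
crop_yield = [1.8, 2.3, 2.5, 3.1, 3.4]

# Rank the features by the strength of their linear relationship with yield.
ranking = sorted(
    features,
    key=lambda name: abs(pearson(features[name], crop_yield)),
    reverse=True,
)
for name in ranking:
    print(f"{name}: r = {pearson(features[name], crop_yield):+.3f}")
```

Correlation only captures linear relationships, so a feature with a low coefficient may still be useful to a non-linear model.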

Model Selection: Once the relevant features are identified, the next step is to
select an appropriate ML model for crop yield prediction.

Some popular models include linear regression, decision trees, random forests,
and neural networks.

Model Training: The selected model is trained using the preprocessed data, and
the training process involves iteratively adjusting the model's parameters to
minimize the error between the predicted crop yield and the actual yield.

Model Evaluation: The trained model's performance is evaluated using a
separate validation dataset.
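The steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the records are synthetic, the column names (`rainfall_mm`, `temperature_c`, `fertilizer_kg`, `yield_t_ha`) and the yield formula are hypothetical, and a random forest regressor is picked as one of the popular models listed above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Data collection: synthetic records stand in for historical data here.
rng = np.random.default_rng(42)
n = 300
data = pd.DataFrame({
    "rainfall_mm": rng.uniform(50, 300, n),
    "temperature_c": rng.uniform(10, 40, n),
    "fertilizer_kg": rng.uniform(0, 200, n),
})
# Hypothetical yield: depends on rainfall and fertilizer, plus noise.
data["yield_t_ha"] = (
    0.01 * data["rainfall_mm"] + 0.02 * data["fertilizer_kg"] + rng.normal(0, 0.2, n)
)
# Inject a few missing values so that the cleaning step has something to do.
data.loc[data.index[::50], "rainfall_mm"] = np.nan

# Data preprocessing: drop incomplete rows.
clean = data.dropna().reset_index(drop=True)
feature_cols = ["rainfall_mm", "temperature_c", "fertilizer_kg"]

# Data analysis: correlation of every feature with the target.
correlations = clean[feature_cols].corrwith(clean["yield_t_ha"])

# Keep a separate validation set aside before fitting anything.
X_train, X_val, y_train, y_val = train_test_split(
    clean[feature_cols], clean["yield_t_ha"], test_size=0.25, random_state=0
)

# Normalization: the scaler is fitted on the training data only.
scaler = StandardScaler().fit(X_train)

# Model selection and training: a random forest regressor.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(scaler.transform(X_train), y_train)

# Model evaluation on the separate validation set.
val_mae = mean_absolute_error(y_val, model.predict(scaler.transform(X_val)))
print(correlations.round(2))
print("validation MAE:", round(val_mae, 3))
```

Tree-based models do not strictly need normalized inputs; the scaling step is kept only to mirror the preprocessing step described above.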

DFD (Data Flow Diagram)


A data flow diagram (DFD) is a graphical or visual representation using a standardized set of
symbols and notations to describe a business's operations through data movement.

Fig 2. A data flow diagram (DFD) maps out the flow of information for any
process or system.

Fig. 1. Proposed Approach. Fig. 1 shows the proposed approach: how the data is
summarized, how the Random Forest algorithm is applied, and how the result is
calculated.

Firstly, the research questions are defined. When the research questions are ready, databases
are used to select the relevant studies. The databases that were used in this study are Science
Direct, Scopus, Web of Science, Springer Link, Wiley, and Google Scholar. After the
selection of relevant studies, they were filtered and assessed using a set of exclusion and
quality criteria. All the relevant data from the selected studies were extracted, and eventually
the extracted data were synthesized in response to the research questions. The approach we
followed can be split up into three parts: plan review, conduct review, and report review.

The first stage is planning the review. In this stage, research questions are identified, a
protocol is developed, and eventually, the protocol is validated to see if the approach is
feasible. In addition to the research questions, publication venues, initial search strings, and
publication selection criteria are also defined. When all of this information is defined, the
protocol is revised one more time to see if it represents a proper review protocol.

The second stage is conducting the review, which is represented in fig. When
conducting the review, the publications were selected by going through all the
databases. The data was then extracted, which means that information regarding
authors, year of publication, type of publication, and further details relevant
to the research questions was stored. After all the necessary data was extracted
correctly, the data was synthesized in order to provide an overview of the relevant
papers published so far.

CHAPTER 3
IMPLEMENTATION
3.1 Pseudo Code of the Proposed System

1. Import numpy and pandas and import the matplotlib.pyplot.

2. Barplotting

3. Head data

4. Data information

5. Tail data

6. Functions of data
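The listed steps can be sketched with pandas as follows. This is an illustration only: the small table and its values are made up, and reading "Functions of data" as the summary statistics returned by `describe()` is an assumption. The helper `plot_mean_rainfall` is a hypothetical name; it imports matplotlib.pyplot only when it is called.

```python
import pandas as pd

# A tiny stand-in table (assumption); the report loads its data from a file.
df = pd.DataFrame({
    "CROP": ["rice", "rice", "maize", "maize", "jute", "jute"],
    "RAINFALL": [202.9, 226.7, 84.8, 92.1, 175.0, 168.3],
    "TEMPERATURE": [20.9, 21.8, 22.6, 18.9, 24.1, 25.3],
})

print(df.head())       # head data: the first rows
df.info()              # data information: columns, dtypes, non-null counts
print(df.describe())   # summary statistics of the numeric columns
print(df.tail(2))      # tail data: the last rows

# Values for a bar plot: mean rainfall per crop.
mean_rainfall = df.groupby("CROP")["RAINFALL"].mean()
print(mean_rainfall)


def plot_mean_rainfall(series):
    """Draw a bar plot of the given series (requires matplotlib)."""
    import matplotlib.pyplot as plt

    ax = series.plot(kind="bar")
    ax.set_ylabel("Mean rainfall (mm)")
    plt.tight_layout()
    plt.show()
```

Calling `plot_mean_rainfall(mean_rainfall)` draws one bar per crop.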

3.2 Code Format

import pyttsx3  # Text-to-speech library, used to convert text into speech.
import pandas as pd  # Data loading and manipulation.
from sklearn import preprocessing
# Scikit-learn (sklearn) is a powerful machine learning library for Python. It
# contains efficient tools for machine learning and statistical modeling,
# including classification, regression, clustering and dimensionality reduction.
from sklearn.neighbors import KNeighborsClassifier  # KNN classifier from sklearn.
import numpy as np  # Array handling.
import PySimpleGUI as sg  # Used to build the graphical user interface.

excel = pd.read_excel('Crop.xlsx', header=0)  # Load the dataset from the Excel file.
print(excel)  # Print the loaded data.
print(excel.shape)  # Check the shape (rows, columns) of the data.

engine = pyttsx3.init('sapi5')  # Initialise the speech engine (SAPI5 is the Windows driver).
# Define the speech rate and the type of voice.
voices = engine.getProperty('voices')
rate = engine.getProperty('rate')
engine.setProperty('rate', rate - 20)
engine.setProperty('voice', voices[0].id)


def speak(audio):
    # Call this function whenever the program should speak something aloud.
    engine.say(audio)
    engine.runAndWait()

le = preprocessing.LabelEncoder()
# Many machine learning algorithms require numerical input data, so categorical
# columns have to be represented as numbers. Mapping each distinct value to an
# integer is known as label encoding, and sklearn does this with LabelEncoder.

crop = le.fit_transform(list(excel["CROP"]))  # Map the crop names to integer labels.

# Read each feature column of the dataset into its own list.
NITROGEN = list(excel["NITROGEN"])
PHOSPHORUS = list(excel["PHOSPHORUS"])
POTASSIUM = list(excel["POTASSIUM"])
TEMPERATURE = list(excel["TEMPERATURE"])
HUMIDITY = list(excel["HUMIDITY"])
PH = list(excel["PH"])
RAINFALL = list(excel["RAINFALL"])

# Combine all the features into one array (one row per feature).
features = np.array([NITROGEN, PHOSPHORUS, POTASSIUM, TEMPERATURE, HUMIDITY,
                     PH, RAINFALL])

# Transpose so that there is one row per sample and one column per feature.
features = features.transpose()
print(features.shape)  # Shape of the features after the transpose.
print(crop.shape)
# The number of rows in features must match the length of crop, otherwise the
# model cannot be fitted.

model = KNeighborsClassifier(n_neighbors=3)
# The number of neighbors is the core deciding factor. K is generally an odd
# number if the number of classes is 2. When K=1, the algorithm is known as
# the nearest neighbor algorithm.
model.fit(features, crop)
# The model is fitted on the whole dataset with fit(); predict() is called
# later on the values entered by the user.
# Layout of the graphical user interface. It consists of some text, buttons
# and blank fields to take input; the font type, font size, field size and
# colour of the text are set for each element.
layout = [
    [sg.Text('Crop Recommendation Assistant', font=("Helvetica", 30),
             text_color='yellow')],
    [sg.Text('Please enter the following details :-', font=("Helvetica", 20))],
    [sg.Text('Enter ratio of Nitrogen in the soil :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1))],
    [sg.Text('Enter ratio of Phosphorus in the soil :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1))],
    [sg.Text('Enter ratio of Potassium in the soil :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1))],
    [sg.Text('Enter average Temperature value around the field :',
             font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1)),
     sg.Text('*C', font=("Helvetica", 20))],
    [sg.Text('Enter average percentage of Humidity around the field :',
             font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1)),
     sg.Text('%', font=("Helvetica", 20))],
    [sg.Text('Enter PH value of the soil :', font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1))],
    [sg.Text('Enter average amount of Rainfall around the field :',
             font=("Helvetica", 20)),
     sg.Input(font=("Helvetica", 20), size=(20, 1)),
     sg.Text('mm', font=("Helvetica", 20))],
    [sg.Text(size=(50, 1), font=("Helvetica", 20), text_color='yellow',
             key='-OUTPUT1-')],
    [sg.Button('Submit', font=("Helvetica", 20)),
     sg.Button('Quit', font=("Helvetica", 20))],
]
window = sg.Window('Crop Recommendation Assistant', layout)

while True:
    event, values = window.read()
    if event == sg.WINDOW_CLOSED or event == 'Quit':
        # End the program when the user closes the window or presses Quit.
        break
    print(values[0])
    nitrogen_content = values[0]     # Nitrogen content in the soil.
    phosphorus_content = values[1]   # Phosphorus content in the soil.
    potassium_content = values[2]    # Potassium content in the soil.
    temperature_content = values[3]  # Surrounding temperature.
    humidity_content = values[4]     # Surrounding humidity.
    ph_content = values[5]           # pH level of the soil.
    rainfall = values[6]             # Rainfall.
    try:
        # Convert the data collected from the user into an array for prediction.
        predict1 = np.array([nitrogen_content, phosphorus_content,
                             potassium_content, temperature_content,
                             humidity_content, ph_content, rainfall], dtype=float)
    except ValueError:
        # An empty or non-numeric field cannot be converted; ask the user again.
        window['-OUTPUT1-'].update('Please enter a numeric value in every field.')
        continue
    print(predict1)  # The data after being converted into an array.
    # The model expects a 2-D array: one row with seven feature values.
    predict1 = predict1.reshape(1, -1)
    print(predict1)  # The input data after being reshaped.
    predict1 = model.predict(predict1)  # Apply the model to the user input.
    print(predict1)  # Print the predicted label.
    # The crop names were converted into numerical form so that the machine
    # learning model could be applied. The predicted number is now mapped back
    # to the name of the crop so that it can be displayed.
    crop_names = {
        0: 'Apple(सेब)',
        1: 'Banana(केला)',
        2: 'Blackgram(काला चना)',
        3: 'Chickpea(काबुली चना)',
        4: 'Coconut(नारियल)',
        5: 'Coffee(कॉफ़ी)',
        6: 'Cotton(कपास)',
        7: 'Grapes(अंगूर)',
        8: 'Jute(जूट)',
        9: 'Kidneybeans(राज़में)',
        10: 'Lentil(मसूर की दाल)',
        11: 'Maize(मक्का)',
        12: 'Mango(आम)',
        13: 'Mothbeans(मोठबीन)',
        14: 'Mungbeans(मूंग)',
        15: 'Muskmelon(खरबूजा)',
        16: 'Orange(संतरा)',
        17: 'Papaya(पपीता)',
        18: 'Pigeonpeas(कबूतर के मटर)',
        19: 'Pomegranate(अनार)',
        20: 'Rice(चावल)',
        21: 'Watermelon(तरबूज)',
    }
    crop_name = crop_names.get(int(predict1[0]), '')

    # The inputs are compared as floats so that decimal values such as 6.5 are
    # accepted, and every value falls into exactly one category.

    # Humidity is divided into three categories: low, medium and high humid.
    if float(humidity_content) < 34:
        humidity_level = 'low humid'
    elif float(humidity_content) < 67:
        humidity_level = 'medium humid'
    else:
        humidity_level = 'high humid'

    # Temperature is divided into three categories: cool, warm and hot.
    if float(temperature_content) < 7:
        temperature_level = 'cool'
    elif float(temperature_content) < 26:
        temperature_level = 'warm'
    else:
        temperature_level = 'hot'

    # Rainfall is divided into three categories: low, moderate and heavy.
    if float(rainfall) < 101:
        rainfall_level = 'low'
    elif float(rainfall) < 201:
        rainfall_level = 'moderate'
    else:
        rainfall_level = 'heavy'

    # Nitrogen is divided into three categories.
    if float(nitrogen_content) < 51:
        nitrogen_level = 'low'
    elif float(nitrogen_content) < 101:
        nitrogen_level = 'not too low but also not too high'
    else:
        nitrogen_level = 'high'

    # Phosphorus is divided into three categories.
    if float(phosphorus_content) < 51:
        phosphorus_level = 'low'
    elif float(phosphorus_content) < 101:
        phosphorus_level = 'not too low but also not too high'
    else:
        phosphorus_level = 'high'

    # Potassium is divided into three categories.
    if float(potassium_content) < 51:
        potassium_level = 'low'
    elif float(potassium_content) < 101:
        potassium_level = 'not too low but also not too high'
    else:
        potassium_level = 'high'

    # The pH value is divided into three categories.
    if float(ph_content) < 6:
        phlevel = 'acidic'
    elif float(ph_content) < 9:
        phlevel = 'neutral'
    else:
        phlevel = 'alkaline'

    print(crop_name)
    print(humidity_level)
    print(temperature_level)
    print(rainfall_level)
    print(nitrogen_level)
    print(phosphorus_level)
    print(potassium_level)
    print(phlevel)

    # Speak a summary of the data that the program has received from the user.
    speak("Sir according to the data that you provided to me. The ratio of "
          "nitrogen in the soil is " + nitrogen_level
          + ". The ratio of phosphorus in the soil is " + phosphorus_level
          + ". The ratio of potassium in the soil is " + potassium_level
          + ". The temperature level around the field is " + temperature_level
          + ". The humidity level around the field is " + humidity_level
          + ". The ph type of the soil is " + phlevel
          + ". The amount of rainfall is " + rainfall_level)
    # Show the best crop after prediction.
    window['-OUTPUT1-'].update('The best crop that you can grow : ' + crop_name)
    # Speak the name of the predicted crop.
    speak("The best crop that you can grow is " + crop_name)

window.close()
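The program above maps the predicted number back to a crop name by hand. The sketch below illustrates why such a mapping follows alphabetical order: LabelEncoder numbers the classes in sorted order, and its inverse_transform() method recovers the original name. The short crop list here is made up for illustration and is not the project's dataset.

```python
from sklearn.preprocessing import LabelEncoder

# A made-up list of crop labels, in no particular order.
crops = ["rice", "apple", "maize", "banana", "rice", "apple"]

le = LabelEncoder()
encoded = le.fit_transform(crops)

print(list(le.classes_))             # ['apple', 'banana', 'maize', 'rice']
print(encoded.tolist())              # [3, 0, 2, 1, 3, 0]
print(le.inverse_transform([2])[0])  # maize
```

Because the classes are sorted, a hand-written table of numbers and names stays correct only as long as the set of crop names in the data does not change.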

3.3 Results

To predict the crop yield rate, an application is created. This application includes three
parts: the first is managing datasets, the second is testing datasets and the third is analyzing
datasets. In managing datasets, the datasets of previous years can be obtained and converted
into a supported format.

CHAPTER 4
TESTING/RESULT AND ANALYSIS

4.1 Collecting the Raw Data

Data collection is the practice of accumulating records of past occurrences so
that repetitive patterns can be found in the data. The ‘Crop Recommendation’
dataset is collected from the Kaggle website. The dataset takes into account 22
different crops and seven features: (i) Nitrogen content ratio (N), (ii)
Phosphorus content ratio (P), (iii) Potassium content ratio (K), (iv)
Temperature expressed in degree Celsius, (v) Percentage of Relative Humidity,
(vi) pH value and (vii) Rainfall measured in millimeters.

4.2. Data Preprocessing

The process of modifying raw data into a form that learning algorithms can use
to find insights or forecast outcomes is called data preprocessing. In this
project, the data preprocessing step is to find missing values. Getting every
data point for every record in a dataset is tough. Empty cells, values like null
or a specific character, such as a question mark, might all indicate that data is
missing. The dataset used in the project didn’t have any missing values.
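A check for missing values of the kind described above can be sketched with pandas as follows. The small table and its values are made up for illustration, including the question mark used as a missing-value marker.

```python
import numpy as np
import pandas as pd

# Made-up records with an empty cell, a null and a "?" placeholder.
df = pd.DataFrame({
    "N": [90, 85, np.nan, 74],
    "PH": ["6.5", "?", "7.0", "5.8"],
    "RAINFALL": [202.9, np.nan, 263.9, 242.8],
})

# Convert the text column to numbers; anything that is not a number,
# such as "?", becomes NaN.
df["PH"] = pd.to_numeric(df["PH"], errors="coerce")

missing_per_column = df.isnull().sum()
print(missing_per_column)
print("any missing:", bool(df.isnull().values.any()))
```

For a dataset without missing values, every count printed by `isnull().sum()` is zero.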

4.3. Train and Test

The dataset is split into a training dataset and a testing dataset using the
train_test_split() method of the scikit-learn module. The 2200 records in the
dataset are divided so that 80% of them (1760 records) form the training dataset
and 20% of them (440 records) form the testing dataset.
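The 80/20 split can be sketched as follows. The arrays are random placeholders with the dimensions described above (2200 rows, 7 features, 22 crops); an equal number of rows per crop, the random_state value and the use of stratify are assumptions made for this illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 2200 rows with 7 features, 22 crop labels of 100 rows each.
X = np.random.default_rng(0).random((2200, 7))
y = np.repeat(np.arange(22), 100)

# test_size=0.2 keeps 20% of the rows for testing; stratify=y keeps the crop
# proportions the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```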

4.4. Fitting the model

Modifying the model’s parameters to increase accuracy is referred to as fitting.
To construct a machine learning model, an algorithm is run on data for
which the target variable is known. The model’s accuracy is determined by
comparing the model's outputs to the target variable's actual, observed values.
Model fit describes how well a machine learning model generalizes to data
comparable to that with which it was trained. A good model fit refers to a model
that properly approximates the output when given unknown inputs.

4.5. Checking the score over a training dataset

Scoring, often known as prediction, is the act of creating values from new input
data using a trained machine learning model. Calculating the score of each model
over the training dataset with the model.score() method shows how well the
model has learned.

4.6. Predicting the model

When forecasting the likelihood of a specific result, “prediction” refers to the
output of an algorithm after it has been trained on a previous dataset and applied
to new data. The predictions are made by calling the predict() method on the test
feature dataset. It gives the output as an array of predicted values.
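The fitting, scoring and predicting steps described in this chapter can be sketched as follows. This is a minimal illustration on synthetic data generated with make_blobs, not the project's dataset; the numbers it prints say nothing about the results of the project.

```python
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: 600 samples, 7 features, 4 classes.
X, y = make_blobs(n_samples=600, centers=4, n_features=7, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fitting: the model learns from data whose target values are known.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Score over the training dataset: how well the model has learned.
train_score = model.score(X_train, y_train)

# Predicting: predict() returns an array of predicted values for the test features.
predictions = model.predict(X_test)
test_accuracy = accuracy_score(y_test, predictions)

print("training score:", train_score)
print("first predictions:", predictions[:5])
print("test accuracy:", test_accuracy)
```

The score on the training data is usually optimistic, which is why the accuracy on the held-out test data is reported as well.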

4.7 Result
CHAPTER 5
CONCLUSION AND FUTURE
ENHANCEMENTS

5.1 Conclusion

A comparative study of three different supervised machine learning models (KNN,
Decision Tree, and Random Forest) was done to predict the best-suited crop for a particular
piece of land, which can help farmers to grow crops more efficiently. We concluded that, on
the crop prediction dataset, the Random Forest Classifier showed the best accuracy, 99.32%,
with both the Entropy and the Gini criterion. In contrast, K-Nearest Neighbor had the lowest
accuracy among the three, 97.04%, and the accuracy of the Decision Tree Classifier lay
between those of KNN and the Random Forest Classifier. For the Decision Tree, the Gini
criterion gave a better accuracy, 98.86%, than the Entropy criterion. In the future, new data
from the fields can be collected to get a clearer picture of the soil, and other machine
learning and deep learning algorithms such as ANN or CNN can be incorporated to classify
more varieties of crops.
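A comparison of this kind can be sketched as follows. The data is synthetic (generated with make_classification), so the printed accuracies are not the values reported above, and the hyperparameters are assumptions made for this illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 1000 samples, 7 features, 4 classes.
X, y = make_classification(
    n_samples=1000, n_features=7, n_informative=5, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# The three model families, with both split criteria for the tree-based ones.
models = {
    "KNN (k=3)": KNeighborsClassifier(n_neighbors=3),
    "Decision Tree (gini)": DecisionTreeClassifier(criterion="gini", random_state=0),
    "Decision Tree (entropy)": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "Random Forest (gini)": RandomForestClassifier(criterion="gini", random_state=0),
    "Random Forest (entropy)": RandomForestClassifier(criterion="entropy", random_state=0),
}

# Fit every model on the same training data and score it on the same test data.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)
    print(f"{name}: {accuracies[name]:.4f}")
```

Using the same split for every model keeps the comparison fair, since all of them see exactly the same training and test rows.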

The KNN algorithm is a popular machine learning algorithm used in crop prediction. It is
used to predict crop yield and production based on historical data of weather, soil, and crop
types. The algorithm works by finding the k-nearest neighbors of a new data point, and then
predicting the outcome based on the majority vote of those neighbors.
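The majority vote can be sketched in a few lines of plain Python. The function name `knn_predict`, the two features and the sample values are hypothetical and serve only as an illustration; Euclidean distance is assumed as the distance measure.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    # Keep the labels of the k nearest points and take the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up (temperature, rainfall) samples with crop labels.
train_points = [(20, 200), (22, 220), (21, 210), (25, 80), (27, 90), (26, 85)]
train_labels = ["rice", "rice", "rice", "maize", "maize", "maize"]

print(knn_predict(train_points, train_labels, (23, 205)))  # rice
print(knn_predict(train_points, train_labels, (26, 95)))   # maize
```

Because the distance is computed on raw values, a feature with a large numeric range such as rainfall dominates the result unless the features are scaled first.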

In conclusion, the KNN algorithm is a promising approach for crop prediction. However, the
accuracy of the model depends heavily on the quality and quantity of the data used for
training. Additionally, other factors such as weather changes, pest attacks, and changes in
soil composition can affect crop yield, and therefore these factors should also be
considered in the crop prediction model.

5.2 Future Enhancements
By collecting all the required data from the GPS location of a piece of land and by obtaining
access to the rain forecasting system of the government, we can predict crops just by giving
the GPS location. We can also develop the model to help avoid crises of food surplus and
shortage. We believe the proposed system will be able to help farmers take the right decision
about cultivating the right crop. A farmer can plant different crops in different districts
based on the system's recommendations, so every farmer will get the chance of maximizing
their yield and profit by using the system. Our main goal is to produce more with less
because, as a developing country, we are using almost all our resources to keep up to date
with the rest of the world. In addition, any sort of contribution to agriculture can be
beneficial for the country as well as its people.
The proposed model is constructed using AI algorithms to reduce the losses that farmers
suffer on their farms due to a lack of knowledge of cultivation in different soil and
weather conditions. The model is created using machine learning (SVM) and deep
learning (LSTM, RNN) techniques. After analyzing the prediction parameters, the model
predicts, from among the crops available, the best crops to grow on the land at lower
expense. To the best of our knowledge, there is no existing work that uses the same
techniques for predicting crops. Hence, it is concluded that the accuracy of this work is
an improvement over existing work that used other techniques for the prediction of crops.
The accuracy is calculated as 97%. The work has considerable scope for extension in the
future and can be implemented and interfaced with a flexible, multi-purpose application.
Farmers need to be educated, and they can then get clear information regarding the best
crop yield on their mobile phones. With this, even if the farmer is at home, the work can be
managed at that moment, without facing any loss later. Such progress in the agribusiness
field will be highly appreciable and will further help farmers in the production of crops.

REFERENCES
[1] Dahikar S and Rode S V 2014 Agricultural crop yield prediction using artificial neural
network approach International Journal of Innovative Research in Electrical, Electronics,
Instrumentation and Control Engineering vol 2 Issue 1 pp 683-6.

[2] Suresh A, Ganesh P and Ramalatha M 2018 Prediction of major crop yields of
Tamilnadu using K-means and Modified KNN 2018 3rd International Conference on
Communication and Electronics Systems (ICCES) pp 88-93 doi:
10.1109/CESYS.2018.8723956.

[3] Medar R, Rajpurohit V S and Shweta S 2019 Crop yield prediction using machine
learning techniques IEEE 5th International Conference for Convergence in Technology
(I2CT) pp 1-5 doi: 10.1109/I2CT45611.2019.9033611.

[4] Nishant P S, Venkat P S, Avinash B L and Jabber B 2020 Crop yield prediction based
on Indian agriculture using machine learning 2020 International Conference for Emerging
Technology (INCET) pp 1-4 doi: 10.1109/INCET49848.2020.9154036.

[5] Kalimuthu M, Vaishnavi P and Kishore M 2020 Crop prediction using machine
learning 2020 Third International Conference on Smart Systems and Inventive Technology
(ICSSIT) pp 926-32 doi: 10.1109/ICSSIT48917.2020.9214190.

[6] Geetha V, Punitha A, Abarna M, Akshaya M, Illakiya S and Janani A P 2020 An
effective crop prediction using random forest algorithm 2020 International Conference on
System, Computation, Automation and Networking (ICSCAN) pp 1-5 doi:
10.1109/ICSCAN49426.2020.9262311.

[7] Pande S M, Ramesh P K, Anmol A, Aishwaraya B R, Rohilla K and Shaurya K 2021
Crop recommender system using machine learning approach 2021 5th International
Conference on Computing Methodologies and Communication (ICCMC) pp 1066-71 doi:
10.1109/ICCMC51019.2021.9418351.

[8] Sellam V, and Poovammal E 2016 Prediction of crop yield using regression analysis
Indian Journal of Science and Technology vol 9(38) pp 1-5.
[9] Bharath S, Yeshwanth S, Yashas B L and Vidyaranya R Javalagi 2020 Comparative Analysis of
Machine Learning Algorithms in The Study of Crop and Crop yield Prediction International Journal
of Engineering Research & Technology (IJERT) NCETESFT – 2020 vol 8 Issue 14.
[10] Mahendra N, Vishwakarma D, Nischitha K, Ashwini and Manjuraju M. R 2020 Crop
prediction using machine learning approaches, International Journal of Engineering
Research & Technology (IJERT) vol 9 Issue 8 (August 2020).
[11] Gulati P and Jha S K 2020 Efficient crop yield prediction in India using machine
learning techniques International Journal of Engineering Research & Technology (IJERT)
ENCADEMS – 2020 vol 8 Issue 10.
[12] Gupta A, Nagda D, Nikhare P, Sandbhor A, 2021, Smart crop prediction using IoT
and machine learning International Journal of Engineering Research & Technology
(IJERT) NTASU – 2020 vol 9 Issue 3.
