1822 B.E Cse Batchno 336
CROP RECOMMENDATION SYSTEM USING MACHINE
LEARNING ALGORITHM
Submitted in partial fulfillment of the requirements for
the award of Bachelor of Engineering Degree in
Computer Science and Engineering
By
LAKSHMAN KUMAR SERU
(38110515)
SAI MAANAS GANDHAM
(38110482)
April 2022
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
BONAFIDE CERTIFICATE
This is to certify that this Project Report is the bonafide work of Lakshman
Kumar Seru (38110515) and Sai Maanas Gandham (38110482), who carried
out the project entitled “CROP RECOMMENDATION SYSTEM USING MACHINE
LEARNING ALGORITHM” under my supervision from November 2021 to April
2022.
Internal Guide
DECLARATION
DATE:
ACKNOWLEDGEMENT
No. Title
1 Abstract
2 Introduction
2.1 Objective
3 Literature Survey
3.1 Existing System
3.2 Proposed System
4 Aim and Scope of the Present Investigation
5 Experimental or Materials and Methods; Algorithms Used
6 Results and Discussions
7 Conclusion
8 References
9 Source Code
1. ABSTRACT
2. INTRODUCTION
2.1 OBJECTIVE
3. LITERATURE SURVEY
To maintain nutrient levels in the soil in case of deficiency, fertilizers are added to
the soil. A common problem among Indian farmers is that they choose an
approximate amount of fertilizer and apply it manually. Excessive or insufficient
application of fertilizer can harm plant life and reduce the yield. This paper
gives an overview of various data mining frameworks applied to agricultural soil
datasets for fertilizer recommendation.
Authors : M.C.S.Geetha
Agriculture is the most critical application area, especially in developing nations
like India. The use of information technology in agriculture can change the way
decisions are made and help farmers achieve better yields. This paper brings
together the work of several authors in a single place, so it is valuable for
specialists who want an overview of the current state of data mining systems and
applications in the farming field.
This paper presents the idea behind AgroNutri, an Android application that helps
convey the crop-specific amount of fertilizer to be applied. The idea is to calculate
the amount of NPK fertilizers to be applied based on the blanket recommendation
for the crop of interest. The application takes the crop chosen by the farmer as
input and provides the corresponding result to the farmer. As future scope for
AgroNutri, GPRS can be included so that nutrients are suggested according to
location.
[4]Title: Machine Learning: Applications in Indian Agriculture, 2016.
Agriculture is a field that has lagged in the adoption of technologies and their
advancements. Indian farmers should keep up with global practices. Machine
learning is a concept that can be applied to every field, on all inputs and outputs.
It has effectively established its capability over conventional algorithms of
computer science and statistics. Machine learning algorithms have improved the
accuracy of artificial intelligence machines, including the sensor-based systems
used in precision farming. This paper reviews the different applications of machine
learning in the farming sector. It also gives insight into the difficulties faced by
Indian farmers and how they can be resolved using these techniques.
Authors: Uwe A. Schneider, Petr Havlik, Erwin Schmid, Hugo Valin,
Aline Mosnier, Michael Obersteiner, Hannes Bottcher, Rastislav
Skalský, Juraj Balkovič, Timm Sauer, Steffen Fritz
Over the coming decades humanity will demand more food from less land and
fewer water resources. This study evaluates the food production impacts of four
alternative development scenarios from the Millennium Ecosystem Assessment and
the Special Report on Emission Scenarios. Partially and jointly considered are land
and water supply impacts from population growth and technical change, and
forest and agricultural demand shifts from population growth and economic
development. The income impacts on food demand are computed with dynamic
elasticities. Global agricultural area increases by up to 14% between 2010 and
2030. Deforestation restrictions strongly impact the price of land and water
resources but have little consequence for the global level of food production and
food prices. While projected income changes have the highest partial impact on
per capita food consumption levels, population growth leads to the highest
increase in total food production. The impact of technical change is amplified or
mitigated by adaptations of land management intensities.
[6]Title: Brief history of agricultural systems modelling, 2016.
[7]Title: A Smart Agricultural Model by Integrating Iot, Mobile and
Cloud-based Big Data Analytics, 2017.
In the agricultural field, system models play a significant role in improving
agro-environmental and economic conditions. Field and farm experiments provide
information and help identify suitable and effective management practices. Models
can identify management options for land managers across space and time, as
long as the required soil, management, climate, and economic information is
available. Decision Support Systems (DSSs) are used to produce information for
pest management and farm management. These systems do not use advanced
techniques to process the data. Therefore, intelligent system concepts are used to
make decisions about the problem. Such models play a crucial role in the
understanding of agronomic results, and their use as decision support systems
for farmers is increasing.
Authors: Olakunle Elijah, Tharek Abdul Rahman, Igbafe Orikumhi, Chee Yen
Leow, Nour Hindia.
An overview of IoT and DA in agriculture is presented in this paper. Several
areas related to the deployment of IoT in agriculture are discussed in detail.
The literature review shows that a great deal of work is ongoing in the
development of IoT technology that can be used to increase the operational
efficiency and productivity of plants and livestock. The benefits of IoT and DA,
and the open challenges, are identified and discussed in this paper. IoT is
expected to offer several benefits to the agriculture sector. However, there are
still a number of issues to be addressed to make it affordable for small and
medium-scale farmers. The key issues are security and cost. It is expected that
as competition increases in the agricultural sector
[9]Title: Circulation Mode Selection Based on Cost Analysis, 2017.
First, if every farmer and each typical production base pool their resources by
forming cooperatives, they will achieve economies of scale. Furthermore,
producers will have a more favourable position in their dealings with
downstream firms (wholesalers or retailers). Second, the main customers of the
wholesale market are not nearby residents who buy small quantities of products
but lower-level distributors or retailers. An upgraded transportation mode
supports the intensive circulation of fresh agricultural products, which helps
promote joint logistics for the fresh produce chain, strengthens resource
utilization and improves the quality of logistics services. This in turn improves
the overall circulation of agricultural products. Finally, attention should be paid
to analysing the cost control of typical products so as to achieve good control
over the circulation process.
A support vector machine (SVM) is used to realize the self-learning of a fuzzy
inference system (FIS). Based on a fast modified varying metric method (MDFP)
and a support vector machine identifier (SVMI), an SVM-FIS self-learning controller
for the three-phase induction machine adjustable speed system has been
designed. The proposed controller combines the advantages of FIS, namely
independence from the plant model, strong robustness and adaptive self-learning
ability, with the learning ability and generalization performance of SVM. The design
processes of the SVM-FIS, MDFP, and SVMI algorithms are described in
detail. Simulation results show the feasibility, correctness and effectiveness of the
proposed control strategy, including excellent static and dynamic performance
and strong anti-interference ability.
[11]Title: Machine Learning Facilitated Rice Prediction in Bangladesh, 2015.
In this study, a self-organising map (SOM) was used to cluster the
relationships between the input variables. After that, the chi-square test
was used to establish the degree of dependence between the related variable
values. It was found that daily extreme weather conditions, such as maximum
and minimum variation in temperature, precipitation, humidity and wind speed,
were the main drivers of crop growth, yield and wine quality.
We use different kernel functions in the CPPI models to describe the relationship
between fractional winter wheat area and MODIS EVI time series data. We tried
three linear and non-linear kernel functions: linear regression, artificial
neural network, and support vector machine. For areas like DT, where multiple crop
types have similar phenology cycles, ANN-CPPI is found to perform
best. The two crop types, namely winter wheat and rapeseed, can be
separated well. These tests provide alternative solutions for the use of CPPI in
mixed areas.
3.1 EXISTING SYSTEM
The computational and data demands of structural price forecasting generally far
exceed what is routinely available in developing countries. Consequently,
researchers often rely on parsimonious representations of price processes for their
forecasting needs. Contemporary parsimonious price forecasting relies
heavily on time series modelling. In time series modelling, past observations of
the same variable are collected and analyzed to develop a model describing the
underlying relationship. During the past few decades, much effort has been
devoted to the development and improvement of time series forecasting models.
Time series modelling requires less onerous data input for regular and up-to-date
price forecasting. Nevertheless, there is a need for better classification, which
could be provided by an ensemble or hybrid classification model.
● Efficiency is low.
● Existing systems that recommend crops are either hardware-based, and
therefore costly to maintain, or not easily accessible.
● Despite many solutions that have been recently proposed, there are still
open challenges in creating a user-friendly application with respect to crop
recommendation.
● A large amount of repeated work.
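To make the time series idea described in this section concrete, the following is a minimal sketch of a moving-average forecast, in which the next price is estimated from the most recent past observations of the same variable. The function name, the window size and the price values are illustrative assumptions and are not taken from any existing system.

```python
from statistics import mean


def moving_average_forecast(prices, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(prices) < window:
        raise ValueError("not enough observations for the chosen window")
    return mean(prices[-window:])


# Hypothetical monthly prices for one commodity (illustrative values only).
past_prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]
next_price = moving_average_forecast(past_prices, window=3)
print(next_price)
```

A model of this kind needs only the price history itself, which is why it suits settings where richer structural data is unavailable.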
3.2 PROPOSED SYSTEM
In the proposed system, data analysis technology is used to track changes in the
crop yield rate. The concept of this project is to implement a crop selection method
that helps solve many problems faced by agriculture and farmers. This can
benefit the Indian economy by maximizing the yield rate of crop production.
Different types of land condition are taken into account, and the quality of the
crops is identified using a ranking process. This process also reports the
proportion of low-quality and high-quality crops. Using an ensemble of classifiers
paves the way to better decisions on predictions, because multiple classifiers are
consulted. Further, a ranking process is applied for decision making in order to
select among the classifiers' results. The system is also used to predict the cost
of fertilizers. This project uses an ensemble of classifiers such as Decision Tree
and Random Forest, together with a ranking technique.
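The combination of an ensemble with a ranking step can be sketched as follows. The report does not specify how the ranking is computed, so this sketch assumes that classifiers are ranked by validation accuracy and that the ranking is used to break ties in a majority vote. The classifier names, accuracies and predicted crops are hypothetical.

```python
from collections import Counter


def rank_classifiers(validation_accuracy):
    """Return classifier names ordered from best to worst validation accuracy."""
    return sorted(validation_accuracy, key=validation_accuracy.get, reverse=True)


def ensemble_predict(predictions, ranking):
    """Majority vote; a tie is resolved in favour of the best-ranked classifier."""
    votes = Counter(predictions.values())
    top_count = max(votes.values())
    tied = {label for label, count in votes.items() if count == top_count}
    for name in ranking:
        if predictions[name] in tied:
            return predictions[name]
    raise ValueError("ranking does not cover any classifier in predictions")


# Hypothetical validation accuracies and per-classifier predictions.
validation_accuracy = {"decision_tree": 0.88, "random_forest": 0.93, "bagging": 0.90}
ranking = rank_classifiers(validation_accuracy)
predictions = {"decision_tree": "Maize", "random_forest": "Paddy", "bagging": "Maize"}
print(ranking)
print(ensemble_predict(predictions, ranking))
```

Here two of the three classifiers agree, so the vote decides the result; the ranking only matters when the vote is tied.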
4. AIM AND SCOPE OF THE PRESENT INVESTIGATION
4.1 Aim:
The aim of the project is to build an ML model that takes crop data, trains
itself using Machine Learning techniques and algorithms (Random Forest,
Decision Tree), and predicts the yield and the fertilizer best suited to a crop in a
virtual environment, considering the overall factors that contribute to the
yield.
Secondly, we aim to learn the required tech stack and use it to build the model as
a Python application, and lastly to execute it and obtain output on the yield and
the best fertilizer for the crop.
4.2 Scope:
This project aims to predict crop yield with greater accuracy than previously
published models, and a mobile or web application can also be built on top of the
model.
5. EXPERIMENTAL OR MATERIALS AND METHODS;
ALGORITHMS USED
Decision Tree is a supervised learning technique that can be used for both
classification and regression problems, but it is mostly preferred for solving
classification problems. It is a tree-structured classifier, where internal nodes
represent the features of a dataset, branches represent the decision
rules and each leaf node represents the outcome.
In a Decision tree, there are two types of nodes: the Decision Node and the Leaf
Node.
Decision nodes are used to make a decision and have multiple branches,
whereas Leaf nodes are the output of those decisions and do not contain any
further branches.
The decisions or tests are performed on the basis of the features of the given
dataset. The tree is a graphical representation of all the possible solutions to a
problem/decision based on given conditions.
It is called a decision tree because, similar to a tree, it starts with the root node,
which expands into further branches and constructs a tree-like structure.
In order to build a tree, we use the CART algorithm, which stands for Classification
and Regression Tree algorithm.
A decision tree simply asks a question and, based on the answer (Yes/No),
further splits the tree into subtrees.
The complete process can be better understood using the algorithm below:
Step-1: Begin the tree with the root node, say S, which contains the complete
dataset.
Step-2: Find the best attribute in the dataset using an Attribute Selection Measure
(ASM).
Step-3: Divide S into subsets that contain the possible values of the best
attribute.
Step-4: Generate the decision tree node that contains the best attribute.
Step-5: Recursively make new decision trees using the subsets of the dataset
created in Step-3. Continue this process until a stage is reached where the nodes
cannot be classified further; such a final node is called a leaf node.
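The five steps above can be sketched in plain Python. This is a simplified illustration rather than the project's implementation: it assumes numeric features, uses Gini impurity as the Attribute Selection Measure, and asks yes/no questions of the form "feature <= threshold" in the style of CART. The feature columns (rainfall, soil pH) and the sample rows are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Step-2: pick the yes/no question (feature <= threshold) with the lowest
    weighted Gini impurity; return None when no question improves on the parent."""
    best, best_score = None, gini(labels)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            yes = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            no = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not yes or not no:
                continue
            score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best


def build_tree(rows, labels):
    """Steps 1 and 3-5: split recursively until a node cannot be improved."""
    split = best_split(rows, labels)
    if split is None:
        # Leaf node: predict the most common label among the remaining rows.
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    feature, threshold = split
    yes = [(r, y) for r, y in zip(rows, labels) if r[feature] <= threshold]
    no = [(r, y) for r, y in zip(rows, labels) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "yes": build_tree([r for r, _ in yes], [y for _, y in yes]),
        "no": build_tree([r for r, _ in no], [y for _, y in no]),
    }


def predict(tree, row):
    """Follow the yes/no answers from the root down to a leaf."""
    while "leaf" not in tree:
        branch = "yes" if row[tree["feature"]] <= tree["threshold"] else "no"
        tree = tree[branch]
    return tree["leaf"]


# Hypothetical rows: [rainfall in mm, soil pH]; labels are crop names.
rows = [[200, 6.5], [220, 6.0], [60, 7.5], [50, 7.0], [120, 5.5], [130, 5.8]]
labels = ["Paddy", "Paddy", "Millet", "Millet", "Maize", "Maize"]
tree = build_tree(rows, labels)
print(predict(tree, [210, 6.2]))
```

In practice a library implementation such as scikit-learn's DecisionTreeClassifier would normally be used; the sketch only shows how the recursion in Steps 1 to 5 fits together.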
5.4 SYSTEM REQUIREMENTS:
Python:
History of Python:
Python was developed by Guido van Rossum in the late eighties and early
nineties at the National Research Institute for Mathematics and Computer Science
in the Netherlands.
Python is derived from many other languages, including ABC, Modula-3, C, C++,
Algol-68, SmallTalk, and Unix shell and other scripting languages.
Python is copyrighted. Its source code is available under the Python Software
Foundation License, an open-source license that is compatible with the GNU
General Public License (GPL).
Python Features:
● Easy-to-read − Python code is clearly structured and easy on the eyes.
● A broad standard library − The bulk of Python's library is very portable and
cross-platform compatible on UNIX, Windows, and Macintosh.
● Portable − Python can run on a wide variety of hardware platforms and has
the same interface on all platforms.
5.6 MODULES:
● Admin Login
● Metadata
● Data Pre-processing
● Crop Prediction Module
● Crop Recommendation Module
MODULES DESCRIPTION:
Admin Login:
This is the first activity. The admin needs to provide the correct contact number
and the password that were entered during registration in order to log in to the
webpage. If the information provided by the admin matches the data in the
database table, the login succeeds; otherwise a login-failed message is
displayed and the admin needs to re-enter the correct information.
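The login check described above can be sketched as follows. The report does not say how credentials are stored, so this sketch assumes an in-memory dictionary standing in for the database table and salted PBKDF2 hashes in place of plain-text passwords. The function names and the sample contact number are hypothetical.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def register(users, contact_number, password):
    """Store the salted hash for a new user (users stands in for the table)."""
    users[contact_number] = hash_password(password)


def login(users, contact_number, password):
    """Return True only if the contact number exists and the password matches."""
    record = users.get(contact_number)
    if record is None:
        return False
    salt, stored_digest = record
    _, candidate = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored_digest)


users = {}
register(users, "9876543210", "s3cret")
print(login(users, "9876543210", "s3cret"))
print(login(users, "9876543210", "wrong"))
```

When `login` returns False, the application would show the login-failed message and ask for the details again.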
Metadata:
All the main data used in the dataset is initialized with numbers for use in the
algorithm. In this metadata, every crop name is assigned a number, which makes
the data easy to use in the algorithm. The numbers are not duplicated: each
number is given to exactly one crop, and the same number is never given to
another crop. The metadata consists of more than a hundred crops that are
grown all over India.
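A minimal sketch of such a metadata table is shown below: each distinct crop name receives exactly one integer code. The function name and the short crop list are illustrative assumptions; the actual metadata covers more than a hundred crops.

```python
def build_crop_metadata(crop_names):
    """Assign a unique integer code to every distinct crop name."""
    metadata = {}
    for name in crop_names:
        if name not in metadata:
            # The next free code is simply the number of crops seen so far.
            metadata[name] = len(metadata)
    return metadata


# Illustrative crop names; repeated names must not receive a second code.
crop_names = ["Paddy", "Maize", "Flax", "Peas", "Maize", "Varagu"]
crop_to_id = build_crop_metadata(crop_names)
id_to_crop = {code: name for name, code in crop_to_id.items()}
print(crop_to_id)
```

Keeping the reverse mapping `id_to_crop` makes it possible to turn a predicted code back into a crop name.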
Data Pre-processing:
Here the raw crop data is cleaned and the metadata is attached to it, so that
crop names are replaced by their integer codes and the data is easy to train on.
We first load the metadata, then attach it to the data and replace the
corresponding values with their metadata codes. Unwanted data is then removed
from the list, and the data is divided into training and test sets.
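The pre-processing steps can be sketched as follows, assuming records are dictionaries and that rows with missing values or unknown crops count as unwanted data. The field names, the 25% test fraction and the sample records are hypothetical choices made for illustration.

```python
import random


def preprocess(records, crop_to_id):
    """Drop incomplete rows and replace each crop name with its metadata code."""
    cleaned = []
    for record in records:
        if any(value is None or value == "" for value in record.values()):
            continue  # unwanted row: at least one field is missing
        if record["crop"] not in crop_to_id:
            continue  # unwanted row: crop is not in the metadata
        row = dict(record)
        row["crop"] = crop_to_id[record["crop"]]
        cleaned.append(row)
    return cleaned


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle reproducibly and split the rows into train and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


crop_to_id = {"Paddy": 0, "Maize": 1, "Flax": 2}
records = [
    {"rainfall_mm": 200, "soil_ph": 6.5, "crop": "Paddy"},
    {"rainfall_mm": 120, "soil_ph": 5.5, "crop": "Maize"},
    {"rainfall_mm": None, "soil_ph": 6.0, "crop": "Flax"},
    {"rainfall_mm": 130, "soil_ph": 5.8, "crop": "Maize"},
    {"rainfall_mm": 90, "soil_ph": 6.8, "crop": "Flax"},
]
cleaned = preprocess(records, crop_to_id)
train, test = train_test_split(cleaned)
print(len(cleaned), len(train), len(test))
```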
The obtained result helps farmers know the yield of a crop, so that they can
choose a better crop that gives a high yield, and it also informs them about the
efficient use of agricultural land. In this way we can help farmers grow the crop
that gives them a better yield.
In this module, we have proposed a model that addresses these issues. The
novelty of the proposed system is that it guides farmers to maximize crop yield
and also suggests the most profitable crop for the specific region.
ADVANTAGES
● To represent complete systems using object oriented concepts
● To establish an explicit coupling between concepts and executable code
● To take into account the scaling factors that are inherent to complex and
critical systems
● To create a modelling language usable by both humans and machines
various processing carried out on this data, and the output data is
generated by this system.
2. The data flow diagram (DFD) is one of the most important modeling tools. It
is used to model the system components. These components are the
system process, the data used by the process, an external entity that
interacts with the system and the information flows in the system.
3. DFD shows how the information moves through the system and how it is
modified by a series of transformations. It is a graphical technique that
depicts information flow and the transformations that are applied as data
moves from input to output.
4. DFD is also known as bubble chart. A DFD may be used to represent a
system at any level of abstraction. DFD may be partitioned into levels that
represent increasing information flow and functional detail.
Use case diagrams give an overview of the usage requirements for a system. They
are useful for presentations to management and/or project stakeholders, but
for actual development you will find that use cases provide significantly
more value because they describe “the meat” of the actual requirements.
A use case describes a sequence of actions that provides something of
measurable value to an actor and is drawn as a horizontal ellipse.
5.9.1 INPUT DESIGN:
The input design is the link between the information system and the user. It
comprises developing the specifications and procedures for data preparation, and
the steps necessary to put transaction data into a usable form for
processing. This can be achieved by instructing the computer to read data from a
written or printed document, or by having people key the data directly into
the system. The design of input focuses on controlling the amount of input
required, controlling errors, avoiding delay, avoiding extra steps and keeping
the process simple. The input is designed so that it provides security
and ease of use while retaining privacy. Input design considered the following
things:
OBJECTIVES:
2. It is achieved by creating user-friendly screens for data entry to handle large
volumes of data. The goal of designing input is to make data entry easier and
free from errors. The data entry screen is designed in such a way that all the data
manipulations can be performed. It also provides record viewing facilities.
3. When the data is entered, it is checked for validity. Data can be entered with
the help of screens. Appropriate messages are provided as and when needed so
that the user is never left in a maze at any instant. Thus the objective of input
design is to create an input layout that is easy to follow.
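The validity check mentioned in objective 3 can be sketched as below. The field names and the allowed ranges are hypothetical; the sketch only assumes that form values arrive as strings and that each invalid field should produce a clear message for the user.

```python
def validate_input(form, limits):
    """Return (values, errors): parsed numbers and a message per invalid field."""
    values, errors = {}, {}
    for field, (low, high) in limits.items():
        raw = form.get(field, "").strip()
        try:
            value = float(raw)
        except ValueError:
            errors[field] = f"{field} must be a number"
            continue
        if not low <= value <= high:
            errors[field] = f"{field} must be between {low} and {high}"
            continue
        values[field] = value
    return values, errors


# Hypothetical fields and ranges for a crop recommendation form.
limits = {"rainfall_mm": (0, 5000), "soil_ph": (0, 14)}
form = {"rainfall_mm": "850", "soil_ph": "abc"}
values, errors = validate_input(form, limits)
print(values)
print(errors)
```

Returning all errors at once, rather than stopping at the first, lets the screen show every problem in a single round trip.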
A quality output is one that meets the requirements of the end user and
presents the information clearly. In any system, the results of processing are
communicated to the users and to other systems through outputs. In output design
it is determined how the information is to be displayed for immediate need and
also as hard copy output. It is the most important and direct source of
information for the user. Efficient and intelligent output design improves the
system’s relationship with the user and supports decision-making.
The output form of an information system should accomplish one or more of the
following objectives.
5.10 SYSTEM STUDY:
FEASIBILITY STUDY:
The feasibility of the project is analyzed in this phase, and a business proposal is
put forth with a very general plan for the project and some cost estimates. During
system analysis, the feasibility study of the proposed system is carried out.
This is to ensure that the proposed system is not a burden to the company. For
feasibility analysis, some understanding of the major requirements for the system
is essential.
Three key considerations involved in the feasibility analysis are
♦ ECONOMICAL FEASIBILITY
♦ TECHNICAL FEASIBILITY
♦ SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact that the system will have on
the organization. The amount of funds that the company can pour into the research
and development of the system is limited. The expenditures must be justified.
The developed system is well within the budget, and this was achieved
because most of the technologies used are freely available. Only the customized
products had to be purchased.
TECHNICAL FEASIBILITY:
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand
on the available technical resources, as this would lead to high demands being
placed on the client. The developed system must have modest requirements, as
only minimal or null changes are required for implementing this system.
SOCIAL FEASIBILITY:
The aspect of study is to check the level of acceptance of the system by the user.
This includes the process of training the user to use the system efficiently. The
user must not feel threatened by the system, instead must accept it as a necessity.
The level of acceptance by the users solely depends on the methods that are
employed to educate the user about the system and to make him familiar with it.
His level of confidence must be raised so that he is also able to make some
constructive criticism, which is welcomed, as he is the final user of the system.
SYSTEM TESTING:
The purpose of testing is to discover errors. Testing is the process of trying to
discover every conceivable fault or weakness in a work product. It provides a way
to check the functionality of components, sub-assemblies, assemblies and/or a
finished product. It is the process of exercising software with the intent of ensuring
that the software system meets its requirements and user expectations and does
not fail in an unacceptable manner. There are various types of test. Each test type
addresses a specific testing requirement.
TYPES OF TESTS:
Unit testing:
Unit testing involves the design of test cases that validate that the internal program
logic is functioning properly, and that program inputs produce valid outputs. All
decision branches and internal code flow should be validated. It is the testing of
individual software units of the application. It is done after the completion of an
individual unit and before integration. This is structural testing that relies on
knowledge of the unit's construction and is invasive. Unit tests perform basic tests
at component level and test a specific business process, application, and/or system
configuration. Unit tests ensure that each unique path of a business
Integration testing:
Integration tests are designed to test integrated software components to determine
if they actually run as one program. Testing is event driven and is more concerned
with the basic outcome of screens or fields. Integration tests demonstrate that
although the components were individually satisfactory, as shown by successful
unit testing, the combination of components is correct and consistent. Integration
testing is specifically aimed at exposing the problems that arise from the
combination of components.
Functional test:
Functional tests provide systematic demonstrations that functions tested are
available as specified by the business and technical requirements, system
documentation, and user manuals.
System Test:
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An
example of system testing is the configuration-oriented system integration
test. System testing is based on process descriptions and flows, emphasizing
pre-driven process links and integration points.
Black Box Testing:
Black Box Testing is testing the software without any knowledge of the inner
workings, structure or language of the module being tested. Black box tests, like
most other kinds of tests, must be written from a definitive source document, such
as a specification or requirements document. It is testing in which the software
under test is treated as a black box: you cannot “see” into it. The test provides
inputs and responds to outputs without considering how the software works.
Unit Testing:
Unit testing is usually conducted as part of a combined code and unit test phase of
the software lifecycle, although it is not uncommon for coding and unit testing to be
conducted as two distinct phases.
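As an illustration of a unit test at the component level, the sketch below uses Python's built-in unittest module. The function under test, format_label, is a hypothetical helper that builds a label in the same format as the prediction labels in the source code chapter; it is not part of the original listing.

```python
import unittest


def format_label(crop, duration):
    """Build the text shown to the user for a predicted crop (hypothetical helper)."""
    if not crop:
        raise ValueError("crop name must not be empty")
    return f"Crop: {crop} Duration of cultivation: {duration}"


class FormatLabelTest(unittest.TestCase):
    def test_valid_input_produces_expected_label(self):
        self.assertEqual(
            format_label("Flax", "120-140"),
            "Crop: Flax Duration of cultivation: 120-140",
        )

    def test_empty_crop_is_rejected(self):
        with self.assertRaises(ValueError):
            format_label("", "120-140")


# Run the two test cases and report the outcome without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FormatLabelTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Each test exercises one path through the unit: one for valid input and one for the error branch.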
Features to be tested:
Integration Testing:
Software integration testing is the incremental integration testing of two or more
integrated software components on a single platform to produce failures caused by
interface defects.
The task of the integration test is to check that components or software
applications, e.g. components in a software system or – one step up – software
applications at the company level – interact without error.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
Acceptance Testing:
User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional
requirements.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
in order to select the classifiers results. This system is used to predict the cost
of the crop yield.
6.1 SCREENSHOTS
7. CONCLUSION
This open attitude determines the degree and scope of information sharing. Big
data analysis technology can effectively improve crop yield production and keep
it up to date. This project proposes a novel intelligent system for agricultural crop
price prediction. The key idea is to use an ensemble of classifiers for prediction.
Using an ensemble of classifiers paves the way to better decisions on
predictions, because multiple classifiers are consulted. Further, a
ranking process is applied for decision making in order to select among the
classifiers' results. The system is used to predict the crop rate for future use.
8. REFERENCES
[1] Manpreet Kaur, Heena Gulati, Harish Kundra, “Data Mining in Agriculture on
Crop Price Prediction: Techniques and Applications”, International Journal of
Computer Applications, Volume 99– No.12, August 2014.
[2] J. Meng, “Research on the cost of agricultural products circulation and its
control under the new normal economic development,” Commercial Times, no. 23,
pp. 145-147, 2016.
[3] A. Kaloxylos et al., “Farm management systems and the future Internet era,”
Comput. Electron. Agricult., vol. 89, pp. 130–144, Nov. 2012.
[4] N. N. Li, T. S. Li, Z. S. Yu, Y. Rui, Y. Y. Miao, and Y. S. Li, “Factors influencing
farmers’ adoption of new technology based on Logistic-ISM model-a case study of
potato planting technology in Dingxi City, Gansu Province,” Progress in
Geography, vol. 33, no. 4, pp. 542-551, 2014.
[5] Y. Wang, "A neural network adaptive control based on rapid learning method
and its application," Advances In Modeling and Analysis, Vol. 46(3), pp.
27-34,1994.
9. SOURCE CODE
import pickle

import numpy as np
import pandas as pd
from flask import Flask, render_template, request
from sklearn.ensemble import (RandomForestClassifier, BaggingClassifier,
                              AdaBoostClassifier, VotingClassifier)

# Create the Flask application used by the route decorators below.
app = Flask(__name__)

# Load the trained crop classifier from disk.
#forest = pickle.load(open('boosting.pkl','rb'))
crop = pickle.load(open('crop.pkl','rb'))

@app.route('/')
@app.route('/index')
def index():
    return render_template('index.html')

@app.route('/analysis')
def analysis():
    return render_template('analysis.html')

@app.route('/chart')
def chart():
    return render_template('chart.html')

#@app.route('/future')
#def future():
#    return render_template('future.html')

@app.route('/login')
def login():
    return render_template('login.html')

@app.route('/upload')
def upload():
    return render_template('upload.html')

@app.route('/preview',methods=["POST"])
def preview():
    if request.method == 'POST':
        # Read the uploaded CSV file and index it by the 'Id' column.
        dataset = request.files['datasetfile']
        df = pd.read_csv(dataset,encoding = 'unicode_escape')
        df.set_index('Id', inplace=True)
        # ... (the remainder of preview() is not reproduced in this listing)

#@app.route('/home')
#def home():
#    return render_template('home.html')

@app.route('/prediction', methods = ['GET', 'POST'])
def prediction():
    return render_template('prediction.html')

#@app.route('/upload')
#def upload_file():
#    return render_template('BatchPredict.html')

@app.route('/predict',methods=['POST'])
def predict():
    # ... (the code that builds int_feature from the form is not reproduced)
    final_features = [np.array(int_feature)]
    y_pred=crop.predict(final_features)
    # The predicted class is mapped to a display label by a long if/elif
    # chain; only the following fragments of that chain are reproduced.
    # ...
    elif y_pred[0] == 'Paddy CR Dhan 501 (IET 19189)':
    # ...
        label="Crop: Paddy CR Dhan 401 (REETA) Duration of cultivation: 105-124 "
    # ...
        label="Crop: Cashewnut Duration of cultivation: 1030-1035"
    # ...
        label="Crop: Maize PMH 5 (JH 3110) Duration of cultivation: 95-105"
    # ...
    elif y_pred[0] == 'Varagu':
    # ...
        label="Crop: Millet HHB 226 (MH 1479) Duration of cultivation: 65-76"
    # ...
        label="Crop: Flax Duration of cultivation: 120-140"
    # ...
        label="Crop: Peas Duration of cultivation: 50-100"
    # ...
    return render_template('prediction.html', prediction_text=label)

#@app.route('/performance')
#def performance():
#    return render_template('performance.html')

if __name__ == "__main__":
    app.run(debug=True)
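The long if/elif chain in predict() could alternatively be expressed as a lookup table, which is easier to extend as crops are added. The sketch below is a possible restructuring, not the project's code. The durations are copied from the labels visible in the listing, while the assumption that the predicted class name equals the crop name shown in the label, the names CROP_DURATION and label_for, and the fallback text are all hypothetical.

```python
# Durations copied from the labels in the listing; the keys assume that the
# classifier's class name is the same as the crop name shown in the label.
CROP_DURATION = {
    "Paddy CR Dhan 401 (REETA)": "105-124",
    "Cashewnut": "1030-1035",
    "Maize PMH 5 (JH 3110)": "95-105",
    "Millet HHB 226 (MH 1479)": "65-76",
    "Flax": "120-140",
    "Peas": "50-100",
}


def label_for(predicted_crop):
    """Return the display label for a predicted crop name."""
    duration = CROP_DURATION.get(predicted_crop)
    if duration is None:
        # Hypothetical fallback for crops that are missing from the table.
        return f"Crop: {predicted_crop} Duration of cultivation: not available"
    return f"Crop: {predicted_crop} Duration of cultivation: {duration}"


print(label_for("Flax"))
print(label_for("Unknown crop"))
```

With such a table, predict() would only need to call label_for(y_pred[0]) before rendering the template.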