209 IJAEMa MARCH 2022
DEPARTMENT OF CSE, HOLY MARY INSTITUTE OF TECHNOLOGY & SCIENCE, HYDERABAD, TELANGANA-501 301
An architecture developed in [11] uses input; selects needed features;
classification and association rule mining is applied and
Bangladesh has rice as its high-production crop. Statistical methodologies
have been used to predict its crop production. Shakil Ahamed [12] applied
clustering and classification to make recommendations on yield and planting
of crops. Factors influencing crop yield were considered. They are
b. Biotic factors: soil pH and salinity

harmful for the plants and society, this procedure has to be changed to find
out the suitable crop for the soil. The nutrient
crop production are examined to determine the need for a data mining system
that detects the crops suited to the soil by analyzing the features of the
soil. This helps in recommending crops that regulate the nutrition levels of
the soil. A farmer can learn which crop is suitable for that area and can
also get 100% yield based on the soil test results. System makes
of a class is unrelated to the presence (or absence) of any other feature.
Yet, despite this, it appears robust and efficient. Its performance is
comparable to other supervised learning techniques. Various reasons have been
advanced in the literature. In this tutorial, we highlight an explanation
based on the representation bias. The naive Bayes classifier is a linear
classifier, as are linear discriminant analysis, logistic regression and the
linear SVM (support vector machine). The difference lies in the method of
estimating the parameters of the classifier (the learning bias).

While the naive Bayes classifier is widely used in the research world, it is
not widespread among practitioners who want to obtain usable results. On the
one hand, researchers find that it is very easy to program and implement, its
parameters are easy to estimate, learning is very fast even on very large
databases, and its accuracy is reasonably good in comparison to other
approaches. On the other hand, end users do not obtain a model that is easy
to interpret and deploy, and they do not see the interest of such a
technique. Thus, we introduce a new presentation of the results of the
learning process. The classifier is easier to understand, and its deployment
is also made easier.

In the first part of this tutorial, we present some theoretical aspects of
the naive Bayes classifier. Then, we implement the approach on a dataset with
Tanagra. We compare the obtained results (the parameters of the model) to
those obtained with other linear approaches such as logistic regression,
linear discriminant analysis and the linear SVM. We note that the results are
highly consistent. This largely explains the good performance of the method
in comparison to others. In the second part, we use various tools on the same
dataset (Weka 3.6.0, R 2.9.2, Knime 2.1.1, Orange 2.0b and RapidMiner 4.6.0).
We try above all to understand the obtained results.

RANDOM FOREST
Random forests or random decision forests are an ensemble learning method for
classification, regression and other tasks that operates by constructing a
multitude of decision trees at training time. For classification tasks, the
output of the random forest is the class selected by most trees. For
regression tasks, the mean or average prediction of the individual trees is
returned. Random decision forests correct for decision trees' habit of
overfitting to their training set. Random forests generally outperform
decision trees, but their accuracy is lower than gradient boosted trees.
However, data characteristics can affect their performance.
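
The aggregation rule of a random forest (the class selected by most trees for
classification, the mean of the tree predictions for regression) can be
sketched in a few lines of Python. This is only an illustrative sketch of the
aggregation step, not the implementation used by the proposed system: it
assumes the individual trees have already produced their predictions, and the
function names, crop labels and yield values are hypothetical.

```python
from collections import Counter
from statistics import mean


def forest_classify(tree_votes):
    """Return the class label predicted by the largest number of trees."""
    # most_common(1) gives a list holding the single (label, count) pair
    # with the highest count; ties resolve to the label seen first.
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_predictions):
    """Return the average of the numeric predictions of the trees."""
    return mean(tree_predictions)


# Hypothetical outputs of individual trees for one soil sample
# (illustrative values only, not taken from the paper).
votes = ["rice", "wheat", "rice", "maize", "rice"]
yields = [2.0, 3.0, 2.5, 2.5]

print(forest_classify(votes))
print(forest_regress(yields))
```

Training the individual trees is deliberately left out here; only the way
their outputs are combined is shown.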
The first algorithm for random decision forests was created in 1995 by Tin
Kam Ho[1] using the random subspace method, which, in Ho's formulation, is a
way to implement the "stochastic discrimination" approach to classification
proposed by Eugene Kleinberg.

An extension of the algorithm was developed by Leo Breiman and Adele Cutler,
who registered "Random Forests" as a trademark in 2006 (as of 2019, owned by
Minitab, Inc.). The extension combines Breiman's "bagging" idea and random
selection of features, introduced first by Ho[1] and later independently by
Amit and Geman[13], in order to construct a collection of decision trees with
controlled variance.

Random forests are frequently used as "black box" models in businesses, as
they generate reasonable predictions across a wide range of data while
requiring little configuration.

CONCLUSION

Nowadays farmers face many problems in the agricultural field concerning crop
production, and they lack proper information on how to improve crop
production for what they invest and on what to cultivate. The proposed system
helps farmers to know which crop is right to grow in that field. The proposed
system predicts crops using various data mining techniques, especially a
Naïve Bayesian algorithm, to get accurate results. This system is also useful
to agricultural departments for predicting the right crop at the right time,
which gives efficient results. Automation of this kind will be useful to
farmers and to the agricultural field. The goals achieved by the developed
system are: the manual work of the agricultural department is simplified and
reduced, large volumes of data can be stored, and it provides a smooth
workflow.

REFERENCES

[1] Lokesh K., Shakti J., Sneha Wilson, Tharini M. S., “Automated crop
prediction based on efficient soil nutrient estimation using sensor network”,
National Conference on Product Design (NCPD 2016), July 2016.

[2] Rakesh Kumar, M. P. Singh, Prabhat Kumar and J. P. Singh, “Crop Selection
Method to Maximize Crop Yield Rate using Machine Learning Technique”,
International Conference on Smart Technologies and Management for Computing,
Communication, Controls, Energy and Materials (ICSTM), 2015.