Abstract
Data mining is a popular knowledge discovery technique. Within data mining, decision trees are among the simple and powerful decision-making models. One limitation of decision trees lies in the data sources they handle: if the data given as input to a decision tree is imbalanced, the efficiency of the tree drops drastically. We propose a decision tree structure that mimics human learning by balancing the data source to some extent. In this paper, we propose a novel method based on a sampling strategy. Extensive experiments, using the C4.5 decision tree as the base classifier, show that the performance measures of our method are comparable to those of state-of-the-art methods.
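The abstract does not spell out the sampling strategy, so the following is only a minimal sketch of one generic way to balance a data source "to some extent" before training a tree: random oversampling of the minority class. It is not the authors' method. The function name `oversample_minority`, the `target_ratio` parameter, and the toy data are all hypothetical and chosen for illustration.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, target_ratio=1.0, seed=0):
    """Randomly duplicate rows of under-represented classes.

    Each class is grown until it has at least
    round(target_ratio * majority_class_size) examples.
    Illustrative only: this is plain random oversampling,
    not the sampling strategy proposed in the paper.
    """
    if not 0.0 < target_ratio <= 1.0:
        raise ValueError("target_ratio must be in (0, 1]")
    rng = random.Random(seed)
    counts = Counter(labels)
    majority_size = max(counts.values())
    wanted = int(round(target_ratio * majority_size))

    new_rows, new_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original rows belonging to this class.
        pool = [r for r, l in zip(rows, labels) if l == cls]
        # Classes already at or above the target get no extra rows.
        for _ in range(max(0, wanted - n)):
            new_rows.append(rng.choice(pool))
            new_labels.append(cls)
    return new_rows, new_labels


# Hypothetical toy data: 8 majority ("neg") and 2 minority ("pos") examples.
rows = [[i] for i in range(10)]
labels = ["neg"] * 8 + ["pos"] * 2
balanced_rows, balanced_labels = oversample_minority(
    rows, labels, target_ratio=0.75
)
print(Counter(balanced_labels))
```

With `target_ratio` below 1.0 the classes are only partially balanced, which is one possible reading of balancing "to some extent". The resampled rows and labels would then be passed to a decision tree learner such as C4.5, the base classifier named in the abstract.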
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Battula, B.P., Bhattacharyya, D., Prasad, C.V.P.R., Kim, Th. (2015). A Novel Prototype Decision Tree Method Using Sampling Strategy. In: Gervasi, O., et al. Computational Science and Its Applications -- ICCSA 2015. ICCSA 2015. Lecture Notes in Computer Science, vol 9155. Springer, Cham. https://doi.org/10.1007/978-3-319-21404-7_7
DOI: https://doi.org/10.1007/978-3-319-21404-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21403-0
Online ISBN: 978-3-319-21404-7
eBook Packages: Computer Science (R0)