Abstract
This paper presents a constructive method for association rule extraction, in which knowledge of the data is encoded into an SVM classification tree (SVMT) and linguistic association rules are extracted by decoding the trained SVMT. The rule extraction method over the SVMT (SVMT-rule), in the spirit of decision-tree rule extraction, extracts rules not only from the SVMs but also from the decision-tree structure of the SVMT. The rules obtained by SVMT-rule thus have the better comprehensibility of decision-tree rules while retaining the good classification accuracy of SVM. Moreover, owing to the strong generalization ability that the SVMT gains from aggregating a group of SVMs, SVMT-rule can perform very robust classification on datasets whose class distribution is seriously, even overwhelmingly, imbalanced. Experiments on a synthetic Gaussian dataset, seven benchmark cancer diagnosis datasets, and one cell-phone fraud detection application highlight the utility of SVMT and SVMT-rule for comprehensible and effective knowledge discovery, as well as the superior properties of SVMT-rule compared with purely support-vector-based rule extraction. (A version of the SVMT Matlab software is available online at http://kcir.kedri.info)
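The idea of decoding a tree of classifiers into linguistic rules can be illustrated with a minimal sketch. It is not the authors' SVMT-rule algorithm, whose details the abstract does not give. It assumes, purely for illustration, that each internal node holds a linear decision function f(x) = w·x + b standing in for a trained SVM, and that a rule is the conjunction of node conditions along one root-to-leaf path. The names `Leaf`, `Node`, `extract_rules`, `classify`, the feature names, and the class labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union


@dataclass
class Leaf:
    label: str  # class label assigned to samples that reach this leaf


@dataclass
class Node:
    # Hypothetical stand-in for a trained linear SVM: f(x) = w . x + b
    w: Tuple[float, ...]
    b: float
    pos: Union["Node", Leaf]  # branch taken when f(x) >= 0
    neg: Union["Node", Leaf]  # branch taken when f(x) < 0


def condition(node: Node, positive: bool, names: Sequence[str]) -> str:
    """Render one node decision as a readable inequality."""
    terms = " + ".join(f"{wi:g}*{n}" for wi, n in zip(node.w, names))
    op = ">=" if positive else "<"
    return f"({terms} + {node.b:g} {op} 0)"


def extract_rules(tree, names: Sequence[str], path: Tuple[str, ...] = ()) -> List[str]:
    """Walk every root-to-leaf path and emit one IF-THEN rule per leaf."""
    if isinstance(tree, Leaf):
        antecedent = " AND ".join(path) if path else "TRUE"
        return [f"IF {antecedent} THEN class = {tree.label}"]
    rules = extract_rules(tree.pos, names, path + (condition(tree, True, names),))
    rules += extract_rules(tree.neg, names, path + (condition(tree, False, names),))
    return rules


def classify(tree, x: Sequence[float]) -> str:
    """Route a sample through the tree using the sign of each node's f(x)."""
    while isinstance(tree, Node):
        score = sum(wi * xi for wi, xi in zip(tree.w, x)) + tree.b
        tree = tree.pos if score >= 0 else tree.neg
    return tree.label


# Toy tree with made-up weights; a real SVMT would learn these from data.
toy_tree = Node(
    w=(1.0, 0.0), b=-2.0,
    pos=Leaf("fraud"),
    neg=Node(w=(0.0, 1.0), b=-5.0, pos=Leaf("fraud"), neg=Leaf("normal")),
)

for rule in extract_rules(toy_tree, ["x1", "x2"]):
    print(rule)
```

Each printed rule corresponds to exactly one leaf, so a sample classified by `classify` satisfies the antecedent of exactly one rule. That one-to-one mapping is what gives tree-derived rules their readability in this toy setting.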
About this article
Cite this article
Pang, S., Kasabov, N. Encoding and decoding the knowledge of association rules over SVM classification trees. Knowl Inf Syst 19, 79–105 (2009). https://doi.org/10.1007/s10115-008-0147-1
DOI: https://doi.org/10.1007/s10115-008-0147-1