Data Complexity and Its Effect on EBRB System Accuracy

Xian, Yiqing; Zeng, Guoyan; Liu, Jun

doi:10.1007/978-3-031-77571-0_80

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 1212)

Included in the following conference series:

International Conference on Ubiquitous Computing and Ambient Intelligence

Abstract

In recent years, the relationship between data quality and machine learning models has attracted increasing attention. This work quantifies dataset quality using complexity measures and analyzes experimentally how these measures affect the accuracy of the Extended Belief Rule Base (EBRB) model, a representative advanced rule-based system that has attracted considerable attention in the last few years. The aim is to identify the complexity measures that most strongly affect the accuracy of the EBRB system and to which that accuracy is most sensitive. In the experimental section, we validate the impact of the complexity measures on the EBRB model across 108 classification datasets and perform a sensitivity analysis to determine the five complexity measures with the greatest impact on its performance. On this basis, we provide guidance to help decision-makers with data preprocessing for the EBRB model, thereby improving its accuracy.

Notes

1. https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/inspection/_permutation_importance.py
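
The note above points to the permutation importance routine in scikit-learn, which suggests that the sensitivity analysis ranks complexity measures by how much a model's error grows when one measure is shuffled. The following is a minimal sketch of that idea only, not the authors' procedure. It uses the Python standard library, swaps in a small k-nearest-neighbour regressor as the surrogate model, and runs on synthetic data. The measure names (`measure_a`, `measure_b`, `measure_c`), the coefficients, and the 80/28 split are all hypothetical; only the count of 108 rows mirrors the number of datasets mentioned in the abstract.

```python
import random
import statistics


def knn_predict(train_X, train_y, x, k=5):
    """Predict by averaging the targets of the k nearest training rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    return statistics.fmean(y for _, y in dists[:k])


def mse(train_X, train_y, X, y):
    """Mean squared error of the k-NN surrogate on the rows of X."""
    preds = [knn_predict(train_X, train_y, x) for x in X]
    return statistics.fmean((p - t) ** 2 for p, t in zip(preds, y))


def permutation_importance(train_X, train_y, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when a single column of X is shuffled."""
    rng = random.Random(seed)
    baseline = mse(train_X, train_y, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(train_X, train_y, shuffled, y) - baseline)
        importances.append(statistics.fmean(increases))
    return importances


# Synthetic stand-in data (assumption): one row per dataset, one column per
# complexity measure, and a made-up accuracy that depends strongly on
# measure_a, weakly on measure_b, and not at all on measure_c.
rng = random.Random(42)
names = ["measure_a", "measure_b", "measure_c"]
rows = [[rng.random() for _ in names] for _ in range(108)]
accuracy = [0.9 - 0.5 * a - 0.1 * b + rng.gauss(0, 0.01) for a, b, _ in rows]

# Hold out some rows so importance is measured on data the model did not see.
train_X, test_X = rows[:80], rows[80:]
train_y, test_y = accuracy[:80], accuracy[80:]

importances = permutation_importance(train_X, train_y, test_X, test_y)
ranking = sorted(zip(names, importances), key=lambda p: p[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

Taking the first five entries of such a ranking would correspond to a "top five" selection. With scikit-learn installed, `sklearn.inspection.permutation_importance` offers the same kind of computation for any fitted estimator.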

Acknowledgements

The authors would like to extend their sincere thanks to the National-Local Joint Engineering Laboratory of System Credibility Automatic Verification for its generous support.

Author information

Authors and Affiliations

National-Local Joint Engineering Laboratory of System Credibility Automatic Verification, Southwest Jiaotong University, Chengdu, China
Yiqing Xian & Guoyan Zeng
School of Computing, Ulster University, Belfast, Northern Ireland, UK
Jun Liu

Corresponding author

Correspondence to Jun Liu.

Editor information

Editors and Affiliations

Castilla-La Mancha University, Ciudad Real, Spain
José Bravo
Ulster University, Belfast, UK
Chris Nugent
Ulster University, Belfast, UK
Ian Cleland

About this paper

Cite this paper

Xian, Y., Zeng, G., Liu, J. (2024). Data Complexity and Its Effect on EBRB System Accuracy. In: Bravo, J., Nugent, C., Cleland, I. (eds) Proceedings of the International Conference on Ubiquitous Computing and Ambient Intelligence (UCAmI 2024). UCAmI 2024. Lecture Notes in Networks and Systems, vol 1212. Springer, Cham. https://doi.org/10.1007/978-3-031-77571-0_80

DOI: https://doi.org/10.1007/978-3-031-77571-0_80
Published: 21 December 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-77570-3
Online ISBN: 978-3-031-77571-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
