Abstract
In this paper, we argue that search heuristics for inductive rule learning algorithms typically trade off consistency and coverage, and we investigate this trade-off by determining optimal parameter settings for five different parametrized heuristics. This empirical comparison yields several interesting results. Of considerable practical importance are the default values that we establish for these heuristics, which we show to outperform the commonly used instantiations of the same heuristics. We also gain some theoretical insights. For example, we note that it is important to relate the rule coverage to the class distribution, but that the true positive rate should be weighted more heavily than the false positive rate. We also find that the optimal parameter settings of these heuristics effectively implement quite similar preference criteria.
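To make the trade-off concrete, the following minimal sketch uses the m-estimate as an example of a parametrized heuristic. The abstract does not name the five heuristics studied, so the choice of the m-estimate, the function names, and the toy rule counts below are illustrative assumptions, not the paper's experimental setup. The parameter m controls how strongly a rule's precision is pulled toward the prior class distribution: small values favor consistency, larger values favor coverage.

```python
def m_estimate(p, n, P, N, m):
    """m-estimate of a rule's precision.

    p, n: positives and negatives covered by the rule.
    P, N: total positives and negatives in the training data.
    m:    trade-off parameter (m = 0 gives plain precision).
    Assumes p + n + m > 0.
    """
    prior = P / (P + N)  # prior probability of the positive class
    return (p + m * prior) / (p + n + m)


def best_rule(rules, P, N, m):
    """Return the name of the rule with the highest m-estimate."""
    return max(rules, key=lambda r: m_estimate(r[1], r[2], P, N, m))[0]


# Hypothetical candidates: (name, covered positives, covered negatives).
P, N = 100, 100
rules = [
    ("specific", 5, 0),   # perfectly consistent, low coverage
    ("general", 60, 10),  # slightly inconsistent, high coverage
]

for m in (0, 1, 10, 100):
    print(m, best_rule(rules, P, N, m))
```

With these toy counts, the consistent but narrow rule is preferred for m = 0 and m = 1, while the broader rule is preferred for m = 10 and m = 100, which is the kind of parameter-dependent preference shift that the search for good default values addresses.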
Copyright information
© 2008 Springer Berlin Heidelberg
About this paper
Cite this paper
Janssen, F., Fürnkranz, J. (2008). An Empirical Investigation of the Trade-Off between Consistency and Coverage in Rule Learning Heuristics. In: Boulicaut, JF., Berthold, M.R., Horváth, T. (eds) Discovery Science. DS 2008. Lecture Notes in Computer Science, vol 5255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88411-8_7
DOI: https://doi.org/10.1007/978-3-540-88411-8_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88410-1
Online ISBN: 978-3-540-88411-8
eBook Packages: Computer Science (R0)