Decision Tree Induction Methods for Distributed Environment

  • Conference paper
Man-Machine Interactions

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 59)

Abstract

As the amount of information grows rapidly, there is an overwhelming interest in efficient distributed computing systems, including Grids, public-resource computing systems, P2P systems, and cloud computing. In this paper we take a detailed look at the problem of modeling and optimizing network computing systems for parallel decision tree induction methods. First, we present a comprehensive discussion of these induction methods, with a special focus on their parallel versions. Next, we propose a generic optimization model of a network computing system that can be used for a distributed implementation of parallel decision trees. To illustrate our work, we provide results of numerical experiments showing that the distributed approach enables a significant improvement in system throughput.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Walkowiak, K., Woźniak, M. (2009). Decision Tree Induction Methods for Distributed Environment. In: Cyran, K.A., Kozielski, S., Peters, J.F., Stańczyk, U., Wakulicz-Deja, A. (eds) Man-Machine Interactions. Advances in Intelligent and Soft Computing, vol 59. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00563-3_20

  • DOI: https://doi.org/10.1007/978-3-642-00563-3_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00562-6

  • Online ISBN: 978-3-642-00563-3

  • eBook Packages: Engineering, Engineering (R0)
