Abstract
Automated Machine Learning (AutoML) deals with finding well-performing machine learning models and their corresponding configurations without the need for machine learning experts. However, in an online learning scenario, where an AutoML instance executes on evolving data streams, the question of the best model and its configuration with respect to changes in the data distribution remains open. Algorithms developed for online learning settings rely on few, homogeneous models and consider neither data mining pipelines nor the adaptation of their configuration. We therefore introduce EvoAutoML, an evolution-based online learning framework consisting of heterogeneous and connectable models that supports large and diverse configuration spaces and adapts to the online learning scenario. We present experiments with an implementation of EvoAutoML on a diverse set of synthetic and real datasets, and show that our proposed approach outperforms state-of-the-art online algorithms as well as strong ensemble baselines in a traditional test-then-train evaluation.
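The two ideas the abstract combines — test-then-train (prequential) evaluation and an evolving population of heterogeneous online models — can be illustrated with a toy sketch. This is not the authors' EvoAutoML implementation: the model classes, the mutation operator, and the `sampling_period` below are illustrative assumptions only.

```python
import random


class MajorityClass:
    """Online baseline: predicts the most frequent label seen so far."""
    def __init__(self):
        self.counts = {}

    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else 0

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


class ThresholdModel:
    """Online model with one mutable hyperparameter (the threshold)."""
    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, x):
        return int(x > self.threshold)

    def learn(self, x, y):
        pass  # keeps no state beyond its hyperparameter

    def mutate(self):
        return ThresholdModel(self.threshold + random.uniform(-0.05, 0.05))


def evolving_test_then_train(stream, population, sampling_period=50):
    """Test-then-train loop over a heterogeneous population: each arriving
    example first scores every model, then trains it; every
    `sampling_period` steps the worst model is replaced by a mutated
    copy of the best (only models exposing `mutate` can reproduce)."""
    correct = [0] * len(population)
    n = 0
    for x, y in stream:
        for i, model in enumerate(population):
            correct[i] += int(model.predict(x) == y)  # test first ...
            model.learn(x, y)                         # ... then train
        n += 1
        if n % sampling_period == 0:
            best = max(range(len(population)), key=correct.__getitem__)
            worst = min(range(len(population)), key=correct.__getitem__)
            if best != worst and hasattr(population[best], "mutate"):
                population[worst] = population[best].mutate()
                correct[worst] = correct[best]
    best = max(range(len(population)), key=correct.__getitem__)
    return population[best], correct[best] / n
```

On a stream labeled by `y = int(x > 0.7)`, the population drifts toward threshold models near 0.7 while the majority-class baseline is evolved away; this mirrors, in miniature, how an evolutionary online AutoML system searches a configuration space while the stream is being processed.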
Notes
- 1. Changes in data distributions or patterns are also referred to as concept drift [36].
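The concept drift mentioned in note 1 can be made concrete with a toy monitor. Adaptive-windowing detectors such as ADWIN [4] maintain a variable-length window with statistical guarantees; the sketch below is a deliberately simplified fixed-window stand-in, and the `window` and `threshold` parameters are illustrative assumptions.

```python
from collections import deque


def detect_drift(stream, window=60, threshold=0.25):
    """Toy drift monitor (simplified stand-in for adaptive-windowing
    detectors such as ADWIN): keeps a fixed window of the most recent
    values and flags a drift when the means of its two halves diverge
    by more than `threshold`. Returns the time steps at which drifts
    were reported."""
    buf = deque(maxlen=window)
    drifts = []
    for t, v in enumerate(stream):
        buf.append(v)
        if len(buf) == window:
            half = window // 2
            older = sum(list(buf)[:half]) / half
            newer = sum(list(buf)[half:]) / half
            if abs(newer - older) > threshold:
                drifts.append(t)
                buf.clear()  # restart monitoring after reporting a drift
    return drifts
```

On a stream whose mean jumps from 0.1 to 0.9 at step 100, the monitor reports a single drift shortly after the change point; an online AutoML system would use such a signal to trigger model or configuration adaptation.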
References
Agrawal, R., Imielinski, T., Swami, A.N.: Database mining: a performance perspective. IEEE TKDE 5(6), 914–925 (1993)
Alberg, D., Last, M., Kandel, A.: Knowledge discovery in data streams with regression tree methods. Wiley Interdisc. DMKD 2(1), 69–78 (2012)
Bahri, M., Bifet, A., Gama, J., Gomes, H.M., Maniu, S.: Data stream analysis: foundations, major tasks and tools. Wiley Interdisc. DMKD 11(3), e1405 (2021)
Bifet, A., Gavaldà, R.: Learning from time-changing data with adaptive windowing. In: SIAM SDM, pp. 443–448 (2007)
Bifet, A., Gavaldà, R.: Adaptive learning from evolving data streams. In: Adams, N.M., Robardet, C., Siebes, A., Boulicaut, J.-F. (eds.) IDA 2009. LNCS, vol. 5772, pp. 249–260. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03915-7_22
Bifet, A., Gavaldà, R., Holmes, G., Pfahringer, B.: Machine Learning for Data Streams: With Practical Examples in MOA. MIT Press, Cambridge (2018)
Bifet, A., Holmes, G., Kirkby, R., Pfahringer, B.: MOA: massive online analysis. JMLR 11, 1601–1604 (2010)
Bifet, A., Holmes, G., Pfahringer, B.: Leveraging bagging for evolving data streams. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) ECML PKDD 2010. LNCS (LNAI), vol. 6321, pp. 135–150. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15880-3_15
Bifet, A., Read, J., Žliobaitė, I., Pfahringer, B., Holmes, G.: Pitfalls in benchmarking data stream classification and how to avoid them. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8188, pp. 465–479. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40988-2_30
Breiman, L.: Random forests. ML 45(1), 5–32 (2001)
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth (1984)
Brochu, E., Cora, V.M., de Freitas, N.: A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. CoRR (2010)
Celik, B., Vanschoren, J.: Adaptation strategies for automated machine learning on evolving data. CoRR abs/2006.06480 (2020)
Domingos, P.M., Hulten, G.: Mining high-speed data streams. In: Ramakrishnan, R., Stolfo, S.J., Bayardo, R.J., Parsa, I. (eds.) SIGKDD, pp. 71–80. ACM (2000)
Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M., Hutter, F.: Auto-Sklearn 2.0: the next generation. CoRR (2020)
Feurer, M., Klein, A., Eggensperger, K., et al.: Efficient and robust automated machine learning. In: Cortes, C., Lawrence, N.D., Lee, D.D. (eds.) Advances in Neural Information Processing Systems 28: NIPS, pp. 2962–2970 (2015)
Gama, J., Medas, P., Castillo, G., Rodrigues, P.: Learning with drift detection. In: Bazzan, A.L.C., Labidi, S. (eds.) SBIA 2004. LNCS (LNAI), vol. 3171, pp. 286–295. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28645-5_29
Gijsbers, P., Vanschoren, J.: GAMA: genetic automated machine learning assistant. J. Open Sour. Softw. 4(33), 1132 (2019)
Gomes, H.M., et al.: Adaptive random forests for evolving data stream classification. ML 106(9–10), 1469–1495 (2017)
Gomes, H.M., Read, J., Bifet, A.: Streaming random patches for evolving data stream classification. In: ICDM. IEEE (2019)
Hall, M.A., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P.: The WEKA data mining software: an update. SIGKDD Explor. 11(1), 10–18 (2009)
Hulten, G., Spencer, L., Domingos, P.: Mining time-changing data streams. In: SIGKDD, pp. 97–106. ACM (2001)
Hutter, F., Kotthoff, L., Vanschoren, J. (eds.): Automated Machine Learning - Methods, Systems, Challenges. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05318-5
Imbrea, A.: An empirical comparison of automated machine learning techniques for data streams. B.S. thesis, University of Twente (2020)
Kotthoff, L., Thornton, C., Hoos, H.H., Hutter, F., Leyton-Brown, K.: Auto-WEKA 2.0: automatic model selection and hyperparameter optimization in WEKA. JMLR 18, 25:1–25:5 (2017)
Le, T.T., Fu, W., Moore, J.H.: Scaling tree-based automated machine learning to biomedical big data with a feature set selector. Bioinformatics 36(1), 250–256 (2020)
Montiel, J., et al.: River: machine learning for streaming data in Python (2020)
Montiel, J., Read, J., Bifet, A., Abdessalem, T.: Scikit-multiflow: a multi-output streaming framework. JMLR 19, 72:1–72:5 (2018)
Oza, N.C.: Online bagging and boosting. In: ICSMC, pp. 2340–2345. IEEE (2005)
Oza, N.C., Russell, S.J.: Experimental comparisons of online and batch versions of bagging and boosting. In: Lee, D., Schkolnick, M., Provost, F.J., Srikant, R. (eds.) ACM SIGKDD, pp. 359–364. ACM (2001)
Oza, N.C., Russell, S.J.: Online bagging and boosting. In: Richardson, T.S., Jaakkola, T.S. (eds.) Workshop on AISTATS (2001)
Real, E., Aggarwal, A., Huang, Y., Le, Q.V.: Regularized evolution for image classifier architecture search. In: AAAI, pp. 4780–4789. AAAI (2019)
van Rijn, J.N., Holmes, G., Pfahringer, B., Vanschoren, J.: Having a blast: meta-learning and heterogeneous ensembles for data streams. In: Aggarwal, C.C., Zhou, Z., Tuzhilin, A., Xiong, H., Wu, X. (eds.) ICDM, pp. 1003–1008 (2015)
Stetsenko, P.: Machine learning with Python and H2O (2020). http://docs.h2o.ai/h2o/latest-stable/h2o-docs/booklets/PythonBooklet.pdf
Thornton, C., Hutter, F., Hoos, H.H., Leyton-Brown, K.: Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In: SIGKDD, pp. 847–855. ACM (2013)
Widmer, G., Kubat, M.: Learning in the presence of concept drift and hidden contexts. ML 23(1), 69–101 (1996)
Zöller, M., Huber, M.F.: Survey on automated machine learning. CoRR abs/1904.12054 (2019)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Kulbach, C., Montiel, J., Bahri, M., Heyden, M., Bifet, A. (2022). Evolution-Based Online Automated Machine Learning. In: Gama, J., Li, T., Yu, Y., Chen, E., Zheng, Y., Teng, F. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2022. Lecture Notes in Computer Science(), vol 13280. Springer, Cham. https://doi.org/10.1007/978-3-031-05933-9_37
DOI: https://doi.org/10.1007/978-3-031-05933-9_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-05932-2
Online ISBN: 978-3-031-05933-9
eBook Packages: Computer Science; Computer Science (R0)