Triplets Oversampling for Class Imbalanced Federated Datasets

Xiao, Chenguang; Wang, Shuo

doi:10.1007/978-3-031-43415-0_22

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14170)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases


Abstract

Class imbalance is a pervasive problem in machine learning, leading to poor performance in the minority class that is inadequately represented. Federated learning, which trains a shared model collaboratively among multiple clients with their data locally for privacy protection, is also susceptible to class imbalance. The distributed structure and privacy rules in federated learning introduce extra complexities to the challenge of isolated, small, and highly skewed datasets. While sampling and ensemble learning are state-of-the-art techniques for mitigating class imbalance from the data and algorithm perspectives, they face limitations in the context of federated learning. To address this challenge, we propose a novel oversampling algorithm called "Triplets" that generates synthetic samples for both minority and majority classes based on their shared classification boundary. The proposed algorithm captures new minority samples by leveraging three triplets around the boundary, where two come from the majority class and one from the minority class. This approach offers several advantages over existing oversampling techniques on federated datasets. We evaluate the effectiveness of our proposed algorithm through extensive experiments using various real-world datasets and different models in both centralized and federated learning environments. Our results demonstrate the effectiveness of our proposed algorithm, which outperforms existing oversampling techniques. In conclusion, our proposed algorithm offers a promising solution to the class imbalance problem in federated learning. The source code is released at github.com/Xiao-Chenguang/Triplets-Oversampling.
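To make the triplet idea in the abstract more concrete, the following is a minimal sketch of one plausible reading, not the authors' implementation (their released source code is the authoritative version). The abstract only states that each triplet sits near the class boundary and contains two majority samples and one minority sample. Everything else here is an assumption made for illustration: the function names `nearest` and `triplet_oversample`, the choice of nearest neighbours to form the triplet, and the rule that shifts the minority sample parallel to the edge joining the two majority samples.

```python
import math
import random


def nearest(point, candidates, skip=None):
    """Return the index of the candidate closest to `point` (Euclidean distance)."""
    best_idx, best_dist = None, math.inf
    for idx, cand in enumerate(candidates):
        if idx == skip:
            continue
        dist = math.dist(point, cand)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def triplet_oversample(minority, majority, n_new, seed=None):
    """Generate `n_new` synthetic minority points from (minority, majority, majority) triplets.

    Assumed rule (hypothetical, not taken from the paper): move the minority
    anchor parallel to the edge between its two nearby majority samples.
    """
    if len(majority) < 2:
        raise ValueError("need at least two majority samples to form a triplet")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        anchor = rng.choice(minority)               # the one minority member
        j = nearest(anchor, majority)               # majority sample nearest the anchor
        k = nearest(majority[j], majority, skip=j)  # that sample's nearest majority neighbour
        step = rng.random()                         # random fraction of the majority edge
        synthetic.append(
            tuple(a + step * (m2 - m1)
                  for a, m1, m2 in zip(anchor, majority[j], majority[k]))
        )
    return synthetic


# Toy usage: two minority points facing a small majority cluster.
minority = [(0.0, 0.0), (0.0, 1.0)]
majority = [(2.0, 0.0), (2.0, 1.0), (3.0, 0.5), (5.0, 5.0)]
new_points = triplet_oversample(minority, majority, n_new=4, seed=0)
```

Because the sketch only reads samples held by a single client, it could in principle run locally without sharing raw data, which is the constraint the abstract highlights for federated datasets.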



Acknowledgments

This work was supported by the Royal Academy of Engineering Leverhulme Trust Research Fellowship [LTRF2122-18-106] and the National Natural Science Foundation for Young Scientists of China [62206239]. The computations described in this research were performed using the Baskerville Tier 2 HPC service (https://www.baskerville.ac.uk/). Baskerville was funded by the EPSRC and UKRI through the World Class Labs scheme (EP/T022221/1) and the Digital Research Infrastructure programme (EP/W032244/1) and is operated by Advanced Research Computing at the University of Birmingham. Chenguang Xiao is partially supported by the China Scholarship Council.

Author information

Authors and Affiliations

School of Computer Science, University of Birmingham, Birmingham, B15 2TT, UK
Chenguang Xiao & Shuo Wang

Authors

Chenguang Xiao
Shuo Wang

Corresponding author

Correspondence to Shuo Wang.

Editor information

Editors and Affiliations

University of Michigan, Ann Arbor, MI, USA
Danai Koutra
University of Vienna, Vienna, Austria
Claudia Plant
Max Planck Institute for Software Systems, Kaiserslautern, Germany
Manuel Gomez Rodriguez
Politecnico di Torino, Turin, Italy
Elena Baralis
CENTAI, Turin, Italy
Francesco Bonchi

About this paper

Cite this paper

Xiao, C., Wang, S. (2023). Triplets Oversampling for Class Imbalanced Federated Datasets. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14170. Springer, Cham. https://doi.org/10.1007/978-3-031-43415-0_22

DOI: https://doi.org/10.1007/978-3-031-43415-0_22
Published: 17 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43414-3
Online ISBN: 978-3-031-43415-0
eBook Packages: Computer Science, Computer Science (R0)
