Exploring Resampling with Neighborhood Bias on Imbalanced Regression Problems

  • Conference paper
Progress in Artificial Intelligence (EPIA 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10423)

Abstract

Imbalanced domains are an important problem that arises in predictive tasks, causing a loss in performance on the cases that are most relevant to the user. This problem has been intensively studied for classification. More recently, it has been recognized that imbalanced domains occur in several other contexts and across diverse types of tasks. This paper focuses on imbalanced regression tasks. Resampling strategies are among the most successful approaches to imbalanced domains. In this work, we propose variants of existing resampling strategies that take into account information about the neighborhood of the examples. Instead of sampling uniformly, our proposals bias the strategies to reinforce some regions of the data sets. An extensive set of experiments provides evidence of the advantage of introducing a neighborhood bias in the resampling strategies.
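To make the idea of a neighborhood bias concrete, the following is a minimal sketch of random under-sampling in which normal cases are drawn with non-uniform weights. It is an illustration under stated assumptions, not the algorithms proposed in the paper: the function names, the fixed `rare_threshold` that separates rare from normal target values, and the choice to favor normal cases whose nearest neighbors are rare are all hypothetical, and the paper's variants may reinforce different regions.

```python
import math
import random


def knn_indices(X, i, k):
    """Indices of the k nearest neighbours of X[i] (Euclidean), excluding i."""
    dists = [(math.dist(X[i], X[j]), j) for j in range(len(X)) if j != i]
    dists.sort()
    return [j for _, j in dists[:k]]


def neighborhood_weights(X, is_rare, k=3, eps=0.05):
    """Weight each normal case by the fraction of rare cases among its k neighbours.

    Assumption: cases near rare ones get larger weights. eps keeps every
    weight strictly positive so that any normal case can still be drawn.
    """
    weights = {}
    for i in range(len(X)):
        if is_rare[i]:
            continue
        neigh = knn_indices(X, i, k)
        frac_rare = sum(is_rare[j] for j in neigh) / len(neigh)
        weights[i] = frac_rare + eps
    return weights


def biased_undersample(X, y, rare_threshold, n_keep, k=3, seed=0):
    """Keep all rare cases plus n_keep normal cases drawn with biased weights."""
    rng = random.Random(seed)
    # Hypothetical relevance rule: a case is rare if its target is large.
    is_rare = [v >= rare_threshold for v in y]
    weights = neighborhood_weights(X, is_rare, k=k)
    # Weighted sampling without replacement: each case gets the key
    # u ** (1 / w) with u uniform in [0, 1); the largest keys are kept.
    keyed = sorted(
        weights,
        key=lambda i: rng.random() ** (1.0 / weights[i]),
        reverse=True,
    )
    kept_normal = keyed[:n_keep]
    rare = [i for i, r in enumerate(is_rare) if r]
    return sorted(kept_normal + rare)


if __name__ == "__main__":
    X = [(0.0,), (0.1,), (0.2,), (0.3,), (5.0,), (5.1,), (5.2,)]
    y = [1, 1, 2, 1, 10, 11, 2]
    print(biased_undersample(X, y, rare_threshold=9, n_keep=2, k=2, seed=1))
```

Setting every weight to the same value recovers uniform under-sampling, so the bias is confined to `neighborhood_weights`; swapping that function changes which regions are reinforced without touching the sampling step.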

Notes

  1. Further details regarding the SmoteR algorithm can be obtained in [17].

  2. Further details are available in [12].

  3. Available at http://www.dcc.fc.up.pt/~rpribeiro/uba/.

References

  1. Branco, P.: Re-sampling approaches for regression tasks under imbalanced domains. Master’s thesis, Department of Computer Science, Faculty of Sciences - University of Porto (2014)

  2. Branco, P., Ribeiro, R.P., Torgo, L.: UBL: an R package for utility-based learning. arXiv preprint arXiv:1604.08079 (2016)

  3. Branco, P., Torgo, L., Ribeiro, R.P.: A survey of predictive modeling on imbalanced domains. ACM Comput. Surv. (CSUR) 49(2), 31 (2016)

  4. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. JAIR 16, 321–357 (2002)

  5. Dimitriadou, E., Hornik, K., Leisch, F., Meyer, D., Weingessel, A.: e1071: Misc Functions of the Department of Statistics (e1071), TU Wien (2011)

  6. He, H., Bai, Y., Garcia, E.A., Li, S.: ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: IEEE International Joint Conference on Neural Networks, pp. 1322–1328. IEEE (2008)

  7. He, H., Garcia, E.A.: Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 21(9), 1263–1284 (2009)

  8. Krawczyk, B.: Learning from imbalanced data: open challenges and future directions. Prog. Artif. Intell. 5, 1–12 (2016)

  9. Liaw, A., Wiener, M.: Classification and regression by randomForest. R News 2(3), 18–22 (2002)

  10. López, V., Fernández, A., García, S., Palade, V., Herrera, F.: An insight into classification with imbalanced data: empirical results and current trends on using data intrinsic characteristics. Inf. Sci. 250, 113–141 (2013)

  11. Milborrow, S.: earth: Multivariate Adaptive Regression Spline Models. Derived from mda:mars by Trevor Hastie and Rob Tibshirani (2012)

  12. Ribeiro, R.P.: Utility-based regression. Ph.D. thesis, Department Computer Science, Faculty of Sciences, University of Porto (2011)

  13. Torgo, L.: An infra-structure for performance estimation and experimental comparison of predictive models in R. CoRR abs/1412.0436 (2014)

  14. Torgo, L., Ribeiro, R.P.: Precision and recall for regression. In: Gama, J., Costa, V.S., Jorge, A.M., Brazdil, P.B. (eds.) DS 2009. LNCS, vol. 5808, pp. 332–346. Springer, Heidelberg (2009). doi:10.1007/978-3-642-04747-3_26

  15. Torgo, L., Branco, P., Ribeiro, R.P., Pfahringer, B.: Resampling strategies for regression. Expert Syst. 32(3), 465–476 (2015)

  16. Torgo, L., Ribeiro, R.P.: Utility-based regression. In: Kok, J.N., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenič, D., Skowron, A. (eds.) PKDD 2007. LNCS, vol. 4702, pp. 597–604. Springer, Heidelberg (2007). doi:10.1007/978-3-540-74976-9_63

  17. Torgo, L., Ribeiro, R.P., Pfahringer, B., Branco, P.: SMOTE for regression. In: Correia, L., Reis, L.P., Cascalho, J. (eds.) EPIA 2013. LNCS, vol. 8154, pp. 378–389. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40669-0_33

Acknowledgments

This work is financed by the European Regional Development Fund (ERDF) through the Operational Programme for Competitiveness and Internationalisation (COMPETE 2020 Programme) within project POCI-01-0145-FEDER-006961, and by National Funds through FCT, Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology), as part of project UID/EEA/50014/2013. P. Branco is supported by a Ph.D. scholarship from FCT (PD/BD/105788/2014). Prof. L. Torgo would also like to acknowledge the support of Projects NORTE-01-0145-FEDER-000036 and UTAP-ICDT/CTM-NAN/0025/2014.

Corresponding author

Correspondence to Paula Branco.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Branco, P., Torgo, L., Ribeiro, R.P. (2017). Exploring Resampling with Neighborhood Bias on Imbalanced Regression Problems. In: Oliveira, E., Gama, J., Vale, Z., Lopes Cardoso, H. (eds) Progress in Artificial Intelligence. EPIA 2017. Lecture Notes in Computer Science, vol 10423. Springer, Cham. https://doi.org/10.1007/978-3-319-65340-2_42

  • DOI: https://doi.org/10.1007/978-3-319-65340-2_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65339-6

  • Online ISBN: 978-3-319-65340-2

  • eBook Packages: Computer Science (R0)
