Advanced Algorithm for Parameters Estimation of Negative Binomial Distribution with High Dimensional Sparse Group Structure

Journal of Systems Science and Complexity

Abstract

Negative binomial regression is a powerful technique for modeling count data, particularly in the presence of overdispersion. However, estimating the parameters of high-dimensional sparse models is challenging because the mean and dispersion parameters of the negative binomial distribution must both be optimized. To address this issue, the authors propose a novel approach that employs two iterations of the majorize-minimize (MM) algorithm, one for estimating the dispersion parameter and the other for estimating the mean parameters. This two-part scheme improves the convergence speed and stability of the algorithm. The authors also use a group penalty for variable selection, which enhances the accuracy and efficiency of the algorithm. The proposed method provides an explicit solution, simplifies the iteration process, and maintains good stability while ensuring convergence. Furthermore, the authors apply the proposed algorithm to the zero-inflated model and demonstrate promising predictive performance on specific data sets. The research has important implications for count data modeling and analysis in fields such as data mining, machine learning, and bioinformatics.
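To make the idea of an MM update with an explicit solution concrete, here is a minimal Python sketch of one generic way to combine a quadratic majorizer with a group penalty for the mean parameters. It is not the authors' algorithm, whose surrogate functions are not shown on this page. The assumptions are: a log link, the NB2 parameterization with variance mu + mu^2 / r, a dispersion r held fixed (the paper updates it in a separate MM step), a group-lasso penalty weighted by the square root of the group size, and no unpenalized intercept. All function names are hypothetical.

```python
import numpy as np


def nb_negloglik(beta, X, y, r):
    """Negative NB log-likelihood in beta, dropping terms constant in beta."""
    eta = X @ beta
    # log(r + exp(eta)) evaluated stably via logaddexp
    return np.sum((y + r) * np.logaddexp(np.log(r), eta) - y * eta)


def nb_grad(beta, X, y, r):
    """Gradient of nb_negloglik; equals X^T [r (mu - y) / (r + mu)]."""
    mu = np.exp(X @ beta)
    return X.T @ (r * (1.0 - (y + r) / (r + mu)))


def group_soft_threshold(z, t):
    """Closed-form minimizer of 0.5 * ||b - z||^2 + t * ||b||_2."""
    norm = np.linalg.norm(z)
    if norm <= t:
        return np.zeros_like(z)
    return (1.0 - t / norm) * z


def penalty(beta, groups, lam):
    """Group-lasso penalty with sqrt(group size) weights (an assumption)."""
    return lam * sum(np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups)


def fit_group_nb(X, y, groups, lam, r, n_iter=200):
    """MM iterations for the mean parameters with the dispersion r fixed.

    The second derivative of the loss in eta_i is r*mu*(r + y_i)/(r + mu)^2,
    which is at most (r + y_i)/4, so L below bounds the Hessian globally and
    the quadratic surrogate majorizes the loss at every iterate.
    """
    scaled = X * np.sqrt((r + y) / 4.0)[:, None]
    L = np.linalg.norm(scaled, 2) ** 2  # squared largest singular value
    beta = np.zeros(X.shape[1])
    history = [nb_negloglik(beta, X, y, r) + penalty(beta, groups, lam)]
    for _ in range(n_iter):
        z = beta - nb_grad(beta, X, y, r) / L  # minimizer of the smooth surrogate
        for g in groups:
            # explicit group-wise update of the penalized surrogate
            beta[g] = group_soft_threshold(z[g], lam * np.sqrt(len(g)) / L)
        history.append(nb_negloglik(beta, X, y, r) + penalty(beta, groups, lam))
    return beta, history
```

Because the surrogate lies above the penalized objective and touches it at the current iterate, each pass cannot increase the objective, which is the usual argument for the stability of MM schemes. A global curvature bound of this kind is conservative, so the sketch may need many iterations; it is meant to show the structure of the update, not to be fast.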



Author information

Corresponding author

Correspondence to Meiqi Li.

Ethics declarations

JIN Baisuo is an editorial member for Journal of Systems Science & Complexity and was not involved in the editorial review or the decision to publish this article. All authors declare that there are no competing interests.

Additional information

This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 72111530199, 12231017 and 72293573, and in part by the Natural Science Foundation of Anhui Province of China under Grant No. 2108085J02.

This paper was recommended for publication by Editor TANG Liansheng.

About this article

Cite this article

Li, M., Jin, B. Advanced Algorithm for Parameters Estimation of Negative Binomial Distribution with High Dimensional Sparse Group Structure. J Syst Sci Complex 37, 2173–2195 (2024). https://doi.org/10.1007/s11424-024-3202-4

  • DOI: https://doi.org/10.1007/s11424-024-3202-4
