Abstract
Spatial analysis tools and synthesis of results are key to identifying the best solutions in biodiversity conservation. Automating the process increases efficiency and performance both in the data pre-processing phase and in the post-analysis of the results generated by modeling packages and programs. The Model-R framework was developed with the main objective of unifying pre-existing ecological niche modeling tools into a common framework and building a web interface that automates steps of the modeling process and occurrence data retrieval. The web interface includes RJabot, a functionality for searching and retrieving occurrence data from Jabot, the main reference botanical collections management system in Brazil. RJabot returns data in a format suitable for consumption by other components of the framework. The tools are currently multi-projection, so they can be applied to different sets of temporal and spatial data. Model-R is also multi-algorithm, with seven algorithms available for modeling: BIOCLIM, Mahalanobis distance, Maxent, GLM, RandomForest, SVM, and DOMAIN. The algorithms, as well as the entire modeling process, may be parametrized using command-line tools or through the web interface. We hope that the use of this application, by modeling specialists and policy makers alike, will be a significant contribution to the continuous development of biodiversity conservation analysis. The Model-R web interface can be installed locally or on a server. A software container is provided to automate the installation.
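To make the "multi-projection" idea concrete, the following is a minimal sketch, in Python rather than the R used by Model-R, of a simplified BIOCLIM-style envelope that is fitted once from occurrence data and then projected onto more than one set of environmental data. It is not Model-R's code and not the exact algorithm of any package. The function names (`fit_envelope`, `suitability`, `project`), the variable names (`temp`, `precip`), and all data values are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left, bisect_right


def fit_envelope(occurrences):
    """Store the sorted values of each environmental variable at occurrence sites."""
    variables = occurrences[0].keys()
    return {v: sorted(site[v] for site in occurrences) for v in variables}


def percentile(sorted_values, x):
    """Fraction of training values below x, using the mid-rank for ties."""
    below = bisect_left(sorted_values, x)
    at_or_below = bisect_right(sorted_values, x)
    return (below + at_or_below) / (2 * len(sorted_values))


def suitability(envelope, cell):
    """Simplified BIOCLIM-style score in [0, 1]; 1 means median of every variable."""
    scores = []
    for var, values in envelope.items():
        if cell[var] < values[0] or cell[var] > values[-1]:
            return 0.0  # outside the observed range of at least one variable
        p = percentile(values, cell[var])
        scores.append(2 * min(p, 1 - p))  # fold the upper tail onto the lower tail
    return min(scores)  # the most limiting variable decides


def project(envelope, layers):
    """Apply one fitted envelope to several named sets of environmental cells."""
    return {
        name: [suitability(envelope, cell) for cell in cells]
        for name, cells in layers.items()
    }


# Hypothetical occurrence records: environmental values at known presence sites.
occurrences = [
    {"temp": 20.0, "precip": 1200.0},
    {"temp": 22.0, "precip": 1400.0},
    {"temp": 24.0, "precip": 1600.0},
    {"temp": 26.0, "precip": 1800.0},
    {"temp": 28.0, "precip": 2000.0},
]

# Two hypothetical projections: the same model applied to different data sets.
layers = {
    "present": [{"temp": 24.0, "precip": 1600.0}, {"temp": 35.0, "precip": 1600.0}],
    "future": [{"temp": 27.0, "precip": 1500.0}],
}

envelope = fit_envelope(occurrences)
for name, scores in project(envelope, layers).items():
    print(name, scores)
```

The model is fitted once and reused for every projection, which is the property the abstract calls multi-projection. Supporting another algorithm in this sketch would mean replacing `fit_envelope` and `suitability` while keeping `project` unchanged.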
Acknowledgments
This work has been supported by CNPq (Grants 461572/2014-1 SiBBr - SEPED/MCTIC and 441929/2016-8 Edital MCTI/CNPQ/Universal).
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Sánchez-Tapia, A. et al. (2018). Model-R: A Framework for Scalable and Reproducible Ecological Niche Modeling. In: Mocskos, E., Nesmachnow, S. (eds) High Performance Computing. CARLA 2017. Communications in Computer and Information Science, vol 796. Springer, Cham. https://doi.org/10.1007/978-3-319-73353-1_15
DOI: https://doi.org/10.1007/978-3-319-73353-1_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73352-4
Online ISBN: 978-3-319-73353-1
eBook Packages: Computer Science, Computer Science (R0)