Abstract
We analyze the problem of processing very large datasets on parallel systems and find that the natural approaches to parallelization fail for two reasons: long-range correlations between data, and the nonscalar nature of the data. To overcome these difficulties, a new data processing paradigm is proposed, based on statistical simulation of the datasets. Depending on the type of data, it is realized through one of three approaches: decomposition of the statistical ensemble, decomposition based on the principle of mixing, and decomposition over the indexing variable. Examples of the proposed approach show that it scales very effectively.
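The abstract does not describe an implementation, so the following is only a generic illustration of what decomposing a statistical ensemble can look like, not the authors' algorithm. It assumes that each ensemble member is an independent realization of a correlated series (here a hypothetical AR(1) process with coefficient `phi`), that every member gets its own random stream, and that only small partial sums are merged at the end. All names (`simulate_member`, `merge`, `ensemble_estimate`, `map_fn`) are invented for this sketch.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def simulate_member(seed, n_steps, phi=0.9):
    """Simulate one ensemble member and return its partial sums.

    The member is an AR(1) series x[t] = phi * x[t-1] + noise, chosen
    only as a simple stand-in for correlated data (an assumption).
    """
    rng = random.Random(seed)  # separate random stream per member
    x = 0.0
    total = 0.0
    total_sq = 0.0
    for _ in range(n_steps):
        x = phi * x + rng.gauss(0.0, 1.0)
        total += x
        total_sq += x * x
    return n_steps, total, total_sq


def merge(partials):
    """Combine per-member partial sums into an ensemble mean and variance."""
    count = 0
    total = 0.0
    total_sq = 0.0
    for c, t, q in partials:
        count += c
        total += t
        total_sq += q
    mean = total / count
    variance = total_sq / count - mean * mean
    return mean, variance


def ensemble_estimate(n_members, n_steps, base_seed=0, map_fn=map):
    """Estimate statistics over the ensemble.

    map_fn decides where members run: the built-in map runs them
    serially, an executor's map distributes them over workers.
    Seeding members with base_seed + k is a simplification.
    """
    seeds = [base_seed + k for k in range(n_members)]
    worker = partial(simulate_member, n_steps=n_steps)
    return merge(map_fn(worker, seeds))
```

Because members never exchange data while they are being simulated, the correlations inside each series do not cross worker boundaries. Passing `pool.map` from a `ThreadPoolExecutor` (or a `ProcessPoolExecutor` for CPU-bound work) as `map_fn` distributes the members without changing the result.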
Keywords
- Significant Wave Height
- Statistical Ensemble
- Parallel Computer System
- Intrinsic Parallelization
- Random Value
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bogdanov, A.V., Boukhanovsky, A.V. (2004). Advanced High Performance Algorithms for Data Processing. In: Bubak, M., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds) Computational Science - ICCS 2004. ICCS 2004. Lecture Notes in Computer Science, vol 3036. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24685-5_30
DOI: https://doi.org/10.1007/978-3-540-24685-5_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22114-2
Online ISBN: 978-3-540-24685-5
eBook Packages: Springer Book Archive