Dynamic Detection of Uniform and Affine Vectors in GPGPU Computations

Collange, Caroline; Defour, David; Zhang, Yao

doi:10.1007/978-3-642-14122-5_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6043)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

We present a hardware mechanism that dynamically detects uniform and affine vectors used in SPMD architectures such as Graphics Processing Units, to minimize pressure on the register file and reduce power consumption with minimal architectural modifications. A preliminary experimental analysis conducted with the Barra simulator shows that this optimization can benefit up to 34% of register file reads and 22% of the computations in common GPGPU applications.
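
To make the two vector classes concrete, here is a minimal software sketch, not the hardware mechanism of the paper. It assumes the usual definitions, which the abstract itself does not spell out: a vector is uniform when every SPMD lane holds the same value, and affine when lane i holds base + i * stride. The function name classify_vector and its return convention are hypothetical, chosen only for illustration.

```python
from typing import List, Optional, Tuple


def classify_vector(lanes: List[int]) -> Tuple[str, Optional[int], Optional[int]]:
    """Classify per-lane values as 'uniform', 'affine', or 'generic'.

    Returns (kind, base, stride). For 'generic' vectors, base and stride
    are None because no compact representation applies.
    Assumed definitions (not taken from the abstract):
      uniform: lanes[i] == base for all i
      affine:  lanes[i] == base + i * stride for all i
    """
    if not lanes:
        raise ValueError("need at least one lane")

    base = lanes[0]

    # Uniform case: one scalar describes the whole vector.
    if all(v == base for v in lanes):
        return ("uniform", base, 0)

    # Affine case: a (base, stride) pair describes the whole vector.
    # lanes[1] exists here, since a one-lane vector is always uniform.
    stride = lanes[1] - base
    if all(v == base + i * stride for i, v in enumerate(lanes)):
        return ("affine", base, stride)

    # Anything else needs one value per lane.
    return ("generic", None, None)
```

For example, a value such as a kernel parameter read by every thread would classify as uniform, while a per-thread address of the form 100 + 4 * thread_index would classify as affine with base 100 and stride 4. In both cases one or two scalars stand in for the full vector, which is the kind of redundancy the abstract proposes to exploit in the register file and in the computation units.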

Author information

Authors and Affiliations

ELIAUS, Université de Perpignan, 66860, Perpignan, France
Caroline Collange & David Defour
ECE Department, University of California, Davis, USA
Yao Zhang

Editor information

Editors and Affiliations

Institute for Applied Mathematics, Delft University of Technology, 2628, Delft, The Netherlands
Hai-Xiang Lin
Scaledinfra technologies GmbH, Köllnerhofgasse 3/15A, 1010, Vienna, Austria
Michael Alexander
VTT, Kaitovayla 1, 90570, Oulu, Finland
Martti Forsell
Technische Universität Dresden, 01069, Dresden, Germany
Andreas Knüpfer
Institute for Computer Science, Technical University of Innsbruck, 6020, Innsbruck, Austria
Radu Prodan
Instituto Superior Técnico/INESC-ID, Rua Alves Redol 9, 1000-029, Lisbon, Portugal
Leonel Sousa
Jülich Supercomputing Centre, 52425, Jülich, Germany
Achim Streit

About this paper

Cite this paper

Collange, C., Defour, D., Zhang, Y. (2010). Dynamic Detection of Uniform and Affine Vectors in GPGPU Computations. In: Lin, HX., et al. Euro-Par 2009 – Parallel Processing Workshops. Euro-Par 2009. Lecture Notes in Computer Science, vol 6043. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14122-5_8

DOI: https://doi.org/10.1007/978-3-642-14122-5_8
Published: 17 June 2010
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14121-8
Online ISBN: 978-3-642-14122-5
eBook Packages: Computer Science, Computer Science (R0)
