Christian Eh An Sen
Scalable Parallel Programming with CUDA
John Nickolls, Ian Buck, Michael Garland and Kevin Skadron
Outline
● John Nickolls
– Director of Architecture at NVIDIA
– MS and PhD degrees in electrical engineering from Stanford University
– Previously at Broadcom and Sun Microsystems
● Ian Buck
– GPU-Compute Software Manager at NVIDIA
– PhD in computer science from Stanford
– Previously worked on the Brook stream-programming language
Hardware Platform
● CUDA is a minimal extension to C and C++ (like Cilk, but not quite as easy)
● A serial program calls parallel kernels; a kernel may be a simple function or a full program
● Function type qualifiers
– __device__, __global__, __host__
● Variable type qualifiers
– __device__, __constant__, __shared__ (usage sketched below)
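A minimal sketch of where each qualifier goes; the kernel, variable names and sizes here are illustrative, not taken from the article:

    // Sketch of the function and variable type qualifiers (hypothetical example).
    __constant__ float scale;                    // constant memory, readable by every thread

    __device__ float square(float v)             // device function, callable only from GPU code
    {
        return v * v;
    }

    __global__ void scale_squares(float *data, int n)   // kernel, launched from the host
    {
        __shared__ float tile[256];              // per-block shared memory (used here only to show the qualifier)
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            tile[threadIdx.x] = data[i];
        __syncthreads();                         // barrier: every thread in the block reaches this point
        if (i < n)
            data[i] = scale * square(tile[threadIdx.x]);
    }

    __host__ void launch(float *d_data, int n)   // ordinary CPU-side function
    {
        scale_squares<<<(n + 255) / 256, 256>>>(d_data, n);   // scale would be set beforehand with cudaMemcpyToSymbol
    }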
What is CUDA?
● Kernels execute over a set of parallel threads
● Threads are organized in a hierarchy of grids of thread blocks
● Blocks can have up to 3 dimensions and contain up to 512 threads
– Threads within a block can communicate through shared memory and barrier synchronization
● Grids can have 1 or 2 dimensions and up to 65,535 blocks per dimension (indexing sketched below)
– No direct communication between blocks
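A sketch of how the grid/block hierarchy shows up in code, using a hypothetical 2-D image kernel; the names and block size are illustrative only:

    // Hypothetical example: one thread per pixel of a width x height greyscale image.
    __global__ void invert(unsigned char *img, int width, int height)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;   // column index within the whole grid
        int y = blockIdx.y * blockDim.y + threadIdx.y;   // row index within the whole grid
        if (x < width && y < height)
            img[y * width + x] = 255 - img[y * width + x];
    }

    void launch_invert(unsigned char *d_img, int width, int height)
    {
        dim3 block(16, 16);                              // 256 threads per block (limit: 512)
        dim3 grid((width  + block.x - 1) / block.x,      // enough blocks to cover every pixel
                  (height + block.y - 1) / block.y);
        invert<<<grid, block>>>(d_img, width, height);
    }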
What is CUDA?
● Computing y ← ax + y with a serial loop
● Computing y ← ax + y in parallel using CUDA (code sketched below)
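In outline, the code behind these two captions is the SAXPY pair from the article: a plain C loop versus a CUDA kernel in which each thread computes one element. This is reconstructed from memory, so details may differ slightly from the printed figure:

    // Serial version: a single CPU loop computes y = a*x + y.
    void saxpy_serial(int n, float a, float *x, float *y)
    {
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }
    // Host call: saxpy_serial(n, 2.0f, x, y);

    // Parallel version: each CUDA thread handles one element of y.
    __global__ void saxpy_parallel(int n, float a, float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            y[i] = a * x[i] + y[i];
    }
    // Host call, with 256 threads per block:
    //   int nblocks = (n + 255) / 256;
    //   saxpy_parallel<<<nblocks, 256>>>(n, 2.0f, x, y);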
Programming in CUDA
● Many different example applications on nvidia.com
– Examples include image analysis (e.g. facial recognition), MRI mapping, ray tracing, neural networks, and molecular-dynamics simulation
– Reported speed-ups range from 1.3x (numerical weather prediction) to 250x (a graphics-card cluster for astrophysics simulations)
Other Applications
N-Body Simulation
● OpenCL
● CTM (AMD's Close To Metal)
● RapidMind
GPGPU/MC Approaches
● Extremely high (and cheap) processing power
– 8800 GTS: 640 GFLOP/s
– Core 2 Duo 2.66 GHz: 17 GFLOP/s
– Core 2 Quad 3 GHz (3,500 kr): 43 GFLOP/s
– 2 × 8800 GT (2,000 kr): 1 TFLOP/s
– 8600M GT: 30 GFLOP/s
Conclusion
● Is GPGPU taking over from multi-core CPUs?
– No (not yet, anyway)
● GPGPU programming has some problems
– Only applicable to large applications (or so it seems)
– When is it worth doing the work on the GPU?
– Possible problems with optimization
– Most programmers are not used to working with GPUs
● Many rumors in the press about a unified CPU and GPU in the future, but nothing confirmed yet
Conclusion
● Nice article, well written
● Gives good insight into what CUDA is, but the hardware description is lacking
● Reads like a sales pitch; does not mention possible problems with CUDA
Presenter's Opinion
● Thank you
All Done