
EUROGRAPHICS 2004 / M.-P. Cani and M. Slater (Guest Editors)
Volume 23 (2004), Number 3

GPU-Based Nonlinear Ray Tracing

Daniel Weiskopf, Tobias Schafhitzel and Thomas Ertl

Institute of Visualization and Interactive Systems, University of Stuttgart

Abstract
In this paper, we present a mapping of nonlinear ray tracing to the GPU which avoids any data transfer back to main memory. The rendering process consists of the following parts: ray setup according to the camera parameters, ray integration, ray–object intersection, and local illumination. Bent rays are approximated by polygonal lines that are represented by textures. Ray integration is based on an iterative numerical solution of ordinary differential equations whose initial values are determined during ray setup. To improve the rendering performance, we propose acceleration techniques such as early ray termination and adaptive ray integration. Finally, we discuss a variety of applications that range from the visualization of dynamical systems to the general relativistic visualization in astrophysics and the rendering of the continuous refraction in media with varying density.

Categories and Subject Descriptors (according to ACM CCS): I.3.3 [Computer Graphics]: Picture/Image Generation; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism

1. Introduction

Ray tracing is a versatile technique for the computation of global illumination and, therefore, is widely used in computer graphics. It is based on geometric optics and is determined by two major components: the propagation of light between scene objects and the interaction between light and matter. The underlying mathematical framework can be formulated in the form of the rendering equation [Kaj86]. In traditional ray tracing, the interaction is restricted to reflection and transmission points on the objects' surfaces, and the light propagation is assumed to be linear between these intersection points. Nonlinear ray tracing generalizes the light-propagation step by including curved light rays, while keeping the traditional reflection and transmission computations. This model is an appropriate description for a number of physical scenarios, such as gravitational lensing by strong gravitational sources (like galaxy clusters or neutron stars) or light propagation within a medium with a space-variant index of refraction (like mirages or in the vicinity of explosions).

Nonlinear ray tracing builds upon linear ray tracing and essentially extends the representation of light rays: bent rays are approximated by polygonal lines. Unfortunately, this polygonal representation leads to a significant increase in the number of ray–object intersections that have to be computed. Typically, nonlinear ray tracing is slower by more than one or two orders of magnitude, compared to linear ray tracing for the same scene. The goal of this paper is to provide a fast GPU implementation of nonlinear ray tracing that exploits the strengths of current GPUs by realizing all ray-tracing steps without any data transfer back to main memory. In addition to this direct GPU mapping, we introduce a number of acceleration techniques to further improve the rendering performance, e.g., early ray termination and adaptive ray integration. Finally, we discuss examples from the visualization of dynamical systems, general relativistic visualization in astrophysics, and the rendering of the continuous refraction in media with varying density.

2. Previous Work

Related previous work can be separated into two different categories that have not yet been investigated together: GPU-based methods and nonlinear ray tracing. With respect to GPU techniques, the implementations of linear ray tracing [PBMH02] and the computation of intersections between rays and triangles [CHH02] are most relevant for this paper. Recently, this line of research has been extended to photon mapping [PDC*03], and radiosity and subsurface scattering [CHH03] on GPUs. In general, there is a prevailing trend towards using GPUs for a variety of general purpose computations [GPG04], e.g., for matrix and optimization operations [BFGS03, HMG03, KW03], or the simulation of cloud dynamics [HBSL03].


In the second category, Gröller [Grö95] presents a generic approach for CPU-based nonlinear ray tracing and discusses a number of applications for visualizing mathematical and physical systems. A specific application of nonlinear ray tracing is the visualization of gravitational light bending within general relativity [HW01, KWR02, Wei00]. In the physics literature, the light deflection by neutron stars and black holes is of special interest [NRHK89, Nem93]. Finally, nonlinear ray tracing can be used to simulate the continuous refraction in a medium that exhibits a space-variant index of refraction, e.g., in the vicinity of explosions [YOH00] or for the refraction aspects found in mirages [BTL90].

Figure 1: Architecture for nonlinear ray tracing. (The diagram shows the loop over the n_segs ray segments: camera-driven ray setup, ODE-based ray integration, intersection with the scene objects, and local illumination using normals, material, and texture data.)

3. Basic Architecture for Nonlinear Ray Tracing

Our architecture for nonlinear ray tracing is built upon the structure of GPU-based linear ray tracing [CHH02, PBMH02]. Two categories of entities have to be represented: scene objects and light rays. Neglecting possible acceleration data structures (like octrees), a key element of linear ray tracing is the intersection between n_rays rays and n_objs objects. Following [CHH02], this n_rays × n_objs problem can be regarded as having a "crossbar" structure. On a GPU, the scene objects can be represented by vertex-based geometry, and the rays can be represented by textures. Fragment operations in programmable pixel shaders allow us to combine textures and geometry in such a "crossbar" fashion.

Nonlinear ray tracing introduces an additional dimension: each ray consists of a number of ray segments, n_segs. Stated differently, a three-dimensional "crossbar" structure would be required for this n_rays × n_objs × n_segs problem. We separate the three-dimensional structure into two levels. In an inner loop, n_rays × n_objs intersection computations are performed, one for a single segment of each ray. This part is analogous to the implementation of linear ray tracing. The outer loop iterates over all n_segs segments.

We restrict ourselves to a ray casting model, traversing only primary eye rays. Shadow rays are problematic in nonlinear ray tracing because a linear projection from light sources to scene objects is generally not possible (see the discussion in [Grö95]). Therefore, secondary effects are neglected. Figure 1 illustrates the complete architecture for this basic nonlinear ray tracing. Ray segments are stored in 2D textures that have a one-to-one mapping to the corresponding pixels on the image plane. Scene objects are represented by vertex-based geometry. In the first step, the initial values for the first segment of each ray are set according to the camera parameters. The subsequent three steps are iterated over all n_segs segments. This loop begins with the computation of the current ray segment, guided by the ordinary differential equation (ODE) of an underlying model for the curved rays. Then, the n_rays × n_objs intersection computations are performed for the current segment index. Finally, pixels are shaded via a local illumination model when an intersection is found. Since the number of ray segments is fixed and the same for all rays, the loop over segments can be conceptually unrolled. Therefore, the ray-tracing architecture is still compatible with the streaming model on GPUs [PBMH02]. In the following subsections, the previously mentioned steps are discussed in more detail.

3.1. Ray Setup

Ray setup computes the initial values for the primary rays. In general, the state of a point on a ray is described by its position x ∈ R³ and its direction v ∈ R³. We denote the combination p ≡ (x, v) as an element in ray phase space P = R⁶. The ray phase space elements are stored in 2D textures whose texels correspond to the associated pixels on the image plane. The six components of p are distributed over two textures: one texture for positions, the other texture for directions. In our implementation, the components of these textures are stored with floating-point precision for maximum accuracy and flexibility. During ray setup, the initial values for x are set to the camera position, and the values for v are determined by the direction towards the corresponding pixel. The texture for v is filled by rendering a quadrilateral that covers the image plane, using a pixel shader program for the computation of the initial directions. The texture for x is initialized by writing the (constant) camera position.

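As a rough illustration of this setup step, the following CPU-side sketch in Python/NumPy (not the authors' Pixel Shader 2.0 code) fills two arrays that stand in for the position and direction textures; the pinhole-camera parameters used here (eye position, view direction, up vector, field of view) are hypothetical placeholders rather than values from the paper.

```python
import numpy as np

def ray_setup(width, height, eye, forward, up, fov_y_deg):
    """Minimal CPU sketch of ray setup: one position and one unit direction
    per pixel, mirroring the two floating-point textures of Section 3.1."""
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up); right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)

    aspect = width / height
    half_h = np.tan(np.radians(fov_y_deg) / 2.0)
    half_w = aspect * half_h

    # Pixel centers in normalized image coordinates [-1, 1].
    u = (np.arange(width) + 0.5) / width * 2.0 - 1.0
    v = (np.arange(height) + 0.5) / height * 2.0 - 1.0
    uu, vv = np.meshgrid(u * half_w, v * half_h)

    # Direction "texture": towards each pixel on the image plane.
    dirs = (forward[None, None, :]
            + uu[..., None] * right[None, None, :]
            + vv[..., None] * true_up[None, None, :])
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Position "texture": constant camera position for every primary ray.
    pos = np.broadcast_to(eye, dirs.shape).copy()
    return pos, dirs

pos, dirs = ray_setup(800, 600, eye=np.array([0.0, 0.0, 10.0]),
                      forward=np.array([0.0, 0.0, -1.0]),
                      up=np.array([0.0, 1.0, 0.0]), fov_y_deg=45.0)
```
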


3.2. Ray Integration

In a generic approach to nonlinear ray tracing, the bending of rays is determined by an equation of motion in the form of a system of ODEs,

    dx(τ)/dτ = v(τ) ,
    dv(τ)/dτ = f(x(τ), v(τ), ...) ,      (1)

where τ describes the parameterization along the rays. The properties of the equation of motion are determined by the function f. The dots indicate that there could be additional parameters that affect the propagation of light. The initial value problem for these differential equations could be solved by any explicit numerical integration scheme. For simplicity we consider first-order Euler integration,

    x_{i+1} = h v_i + x_i ,
    v_{i+1} = h f(x_i, v_i, ...) + v_i ,      (2)

with the stepsize h and the index i for the points along the polygonal approximation of the light rays. At each integration step, a read access to the textures with index i and a write access to the textures with index i+1 is required. To save memory, only these two copies of textures are held on the GPU. After each integration step, the two copies are alternatingly exchanged in a ping-pong rendering scheme. The numerical operations for Eq. (2) are implemented in a pixel shader program that outputs its results to multiple render targets, namely the x and v textures. Once again, the fragments are generated by rendering a single viewport-filling quadrilateral.

We allow the user to choose the integration stepsize h independently from the ray segment length. Typically, a large number of integration steps is required to achieve an appropriate numerical accuracy. In contrast, the number of ray segments could be smaller without introducing inaccuracy artifacts in the ray–object intersection. A user-specified parameter n_integration describes the number of internal integration steps before a single ray segment is output to the intersection process. A typical value for n_integration is 10.

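The following sketch shows the per-ray arithmetic of Eq. (2) together with the swap of the two phase-space copies; it is a CPU analogue in Python/NumPy rather than the authors' shader code, and the right-hand side f is left as a user-supplied callable.

```python
import numpy as np

def euler_substeps(pos, dirs, f, h, n_integration=10):
    """First-order Euler integration of Eq. (2), applied to all rays at once.
    pos and dirs are (H, W, 3) arrays standing in for the x and v textures;
    f(x, v) returns the bending term of the ODE. The two copies of the state
    (index i and i+1) are exchanged after each step, mimicking the
    ping-pong rendering scheme."""
    x_i, v_i = pos, dirs
    for _ in range(n_integration):
        x_next = h * v_i + x_i                 # x_{i+1} = h v_i + x_i
        v_next = h * f(x_i, v_i) + v_i         # v_{i+1} = h f(x_i, v_i) + v_i
        x_i, v_i = x_next, v_next              # "ping-pong" swap of the two copies
    # The segment between the positions before and after these internal steps
    # would then be handed to the intersection stage.
    return x_i, v_i

# Example: straight rays (f = 0) reduce to linear ray marching.
x1, v1 = euler_substeps(np.zeros((2, 2, 3)), np.ones((2, 2, 3)),
                        f=lambda x, v: np.zeros_like(v), h=0.05)
```
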
3.3. Ray–Object Intersection

A ray segment is defined by the line between x_i and x_{i+1}, the two latest points in ray phase space. Ray–object intersections are computed directly after the ray integration step. In this way, previous, older segments are no longer needed and can be discarded; this is one reason why only two copies of the ray phase space textures have to be stored simultaneously.

Our implementation supports triangles and spheres as primitive objects. For triangle–ray intersection, we have adopted the approach by Carr et al. [CHH02] and added a check for the finite extent of a segment. The computation of sphere–ray intersection is based on [Gla93] and also takes into account a finite segment length. Following [CHH02], a single scene object is represented by a vertex-based primitive. The object parameters are transferred to the pixel shader program in the form of attached texture coordinates. The intersection between an object and all rays is triggered by rendering a viewport-filling quad. The pixel shader program checks whether an intersection takes place and, if that is the case, computes the intersection point.

Visibility is determined by including a test against the depth value that is stored in a depth texture. If the current intersection point is closer, the depth texture will be updated and the corresponding pixel will be drawn (as described in the following subsection on local illumination). Note that the z value (in eye or clip coordinates) is not appropriate to describe a depth ordering for nonlinear ray tracing. Instead, the monotonically increasing index i serves as a measure for the distance between the camera and a point on a ray. For the location of a point within a segment, we use a local coordinate that has value 0 at the beginning of the segment and value 1 at the end. The summation of the integer index i and the fractional part from the local coordinate results in an accurate depth ordering. The depth buffer is implemented by a floating-point texture. The simultaneous read and write access for this depth buffer is realized by ping-pong rendering.

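A minimal sketch of the per-segment sphere test, including the segment-index depth ordering described above, might look as follows (CPU-side Python for a single ray segment; the quadratic-equation form of the sphere test follows standard ray-tracing practice rather than the exact shader of the paper).

```python
import numpy as np

def segment_sphere_hit(x0, x1, center, radius, seg_index):
    """Intersect the finite segment x0 -> x1 with a sphere.
    Returns (depth_key, hit_point) or None. The depth key is the integer
    segment index plus the fractional local coordinate t in [0, 1], which
    gives the monotone depth ordering used instead of a z value."""
    d = x1 - x0                       # segment direction (not normalized)
    oc = x0 - center
    a = np.dot(d, d)
    b = 2.0 * np.dot(oc, d)
    c = np.dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0 or a == 0.0:
        return None                   # no intersection with the infinite line
    sqrt_disc = np.sqrt(disc)
    for t in ((-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)):
        if 0.0 <= t <= 1.0:           # check the finite extent of the segment
            return seg_index + t, x0 + t * d
    return None

hit = segment_sphere_hit(np.array([0.0, 0.0, 2.0]), np.array([0.0, 0.0, 0.0]),
                         center=np.zeros(3), radius=1.5, seg_index=7)
# -> depth key 7.25 and hit point (0, 0, 1.5)
```
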
3.4. Local Illumination

The last step is the local illumination at the ray–object intersection points. The current implementation supports ambient lighting and light-emitting objects. Textures can be used to modulate the material properties of objects. In addition, "fake" Blinn-Phong illumination is implemented, based on straight connections between a hit point on the surface and the light sources. This model is mainly used for testing purposes.

Based on the position of the intersection, texture coordinates, normal vectors, and material colors are computed by interpolation and used to evaluate the local illumination model. Finally, the value is updated in the image buffer, which is a texture that holds the intermediate pixel colors. Since local illumination requires information about the intersection point, its implementation is combined with the above intersection computation in a single pixel shader program.

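As a rough illustration of the "fake" Blinn-Phong shading (straight, non-bent connections from the hit point to a light source), a CPU-side sketch could read as follows; the material and light parameters are made-up examples, not values from the paper.

```python
import numpy as np

def fake_blinn_phong(hit, normal, eye, light_pos, base_color,
                     ambient=0.1, specular=0.4, shininess=32.0):
    """Blinn-Phong evaluated along a straight hit-to-light connection,
    ignoring that light would actually travel on a bent path."""
    n = normal / np.linalg.norm(normal)
    l = light_pos - hit; l /= np.linalg.norm(l)        # straight light direction
    v = eye - hit;       v /= np.linalg.norm(v)        # straight view direction
    h = l + v;           h /= np.linalg.norm(h)        # half vector
    diffuse = max(float(np.dot(n, l)), 0.0)
    spec = max(float(np.dot(n, h)), 0.0) ** shininess
    return np.clip((ambient + diffuse) * base_color + specular * spec, 0.0, 1.0)

color = fake_blinn_phong(hit=np.array([0.0, 0.0, 1.5]),
                         normal=np.array([0.0, 0.0, 1.0]),
                         eye=np.array([0.0, 0.0, 10.0]),
                         light_pos=np.array([5.0, 5.0, 5.0]),
                         base_color=np.array([0.8, 0.3, 0.2]))
```
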
4. Acceleration Techniques and Extensions

4.1. Early Ray Termination

Nonlinear ray tracing builds upon a large number of ray segments of finite length. Therefore, the total length of rays directly determines the rendering time. The goal of early ray termination is to reduce the computational steps by pruning rays. We consider two different conditions under which a ray can be stopped without introducing any errors. The first pruning criterion exploits the fact that only the first hit is required for opaque scene objects: once (at least) one intersection is found within a ray segment, no further segments need to be generated and traversed.


The second criterion relies on cutting rays that have left the scene and propagate to infinity without any chance of intersecting objects. Here, a bounding geometry is laid around the scene, and rays that intersect this geometry are discarded. Our implementation uses a bounding sphere that contains all scene objects and is placed in almost flat space, in which rays cannot "turn around" and propagate back into the scene.

Early ray termination essentially leads to a conditional break in the loop over ray segments. Since breaks are not compatible with the streaming model of GPUs, we propose the following approach. As shown in Figure 2, a complete loop over all n_segs segments is performed for all rays. However, the expensive computations for ray integration, intersections, and local illumination are conceptually skipped for terminated rays. The early z test is effective in aborting pixel operations early and therefore cuts down computation times significantly. It allows us to essentially skip the pixel shader operations for the integration, intersection, and illumination of terminated rays, even though the corresponding fragments are still generated during rasterization.

The early z test is implicitly enabled on modern GPUs, but works only under certain conditions. Most importantly, the depth value of a fragment must not be modified by a pixel shader program. In the architecture from Figure 2, the pixel shader for the "Test for Ray Termination" checks the aforementioned termination criteria. It outputs a depth value of 1 (i.e., the distance of the far clipping plane) if the ray is not terminated, and a value of 0 (i.e., the near clipping plane) if the ray is terminated. This step does not yet use the early z test. The subsequent steps, however, are realized by rendering quadrilaterals with a constant z value of 0.5 and can therefore exploit the early z test to discard pixel operations for terminated rays.

Figure 2: Data flow with early ray termination. (The diagram extends Figure 1 by a test for ray termination before each ray integration step; terminated rays bypass integration, intersection, and local illumination within the loop over the n_segs segments.)

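On the CPU side, the two pruning criteria can be sketched as a simple per-ray flag; the depth-buffer trick of the paper (writing z = 0 or 1 and testing quads at z = 0.5) is only emulated here by a boolean mask, and the bounding-sphere radius is an assumed scene parameter.

```python
import numpy as np

def update_termination(terminated, has_hit, pos, scene_radius):
    """Emulation of the ray-termination test: a ray is pruned once it has
    found its first hit or once it has left the bounding sphere around the
    scene. 'terminated' plays the role of the depth flag (z = 0) that the
    early z test would use to skip further pixel shader work."""
    left_scene = np.linalg.norm(pos, axis=-1) > scene_radius
    return terminated | has_hit | left_scene

# Per-pixel masks for a 2x2 image: one ray already hit, one escaped the scene.
terminated = np.zeros((2, 2), dtype=bool)
has_hit = np.array([[True, False], [False, False]])
pos = np.array([[[0, 0, 1], [0, 0, 2]], [[0, 0, 50], [1, 1, 1]]], dtype=float)
terminated = update_termination(terminated, has_hit, pos, scene_radius=20.0)
# -> [[True, False], [True, False]]
```
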
tion process from Section 3.2. The current stepsize, which
may differ from ray to ray, is stored in the previously unused
fourth component of the v texture. The “Compute New Step-
cutting rays that have left the scene and propagate to infinity size” pixel shader compares both results and determines the
without any chance of intersecting objects. Here, a bound- stepsize for the subsequent integration step.
ing geometry is laid around the scene, and rays that inter-
sect this geometry are discarded. Our implementation uses a
Previous Point in ODE Integration
bounding sphere that contains all scene objects and is placed Ray Phase Space With Step h
in almost flat space in which rays cannot “turn around” and
propagate back into the scene.
ODE Integration
Early ray termination essentially leads to a conditional
With Step h/2
break in the loop over ray segments. Since breaks are not
compatible with the streaming model of GPUs, we propose
the following approach. As shown in Figure 2, a complete ODE Integration
loop over all segments nsegs is performed for all rays. How- With Step h/2
ever, the expensive computations for ray integration, inter-
sections, and local illumination are conceptually skipped for
terminated rays. The early z test is effective in aborting pixel Compute New
operations early and therefore cuts down computation times Stepsize
significantly. The early z test allows us to essentially skip the
pixel shader operations for the integration, computation, and
illumination of terminated rays, even though the correspond- Figure 3: Adaptive ray integration.
ing fragments are still generated during rasterization.
The early z test is implicitly enabled on modern GPUs,
but works only under some conditions. Most importantly, Stepsize control not only improves the speed and quality
the depth value of a fragment must not be modified by a of numerical integration, but, at the same time, can reduce
pixel shader program. In the architecture from Figure 2, the the number of ray segments. For example, regions with only
pixel shader for the “Test for Ray Termination” checks the weakly curved rays are covered with large segments and,
aforementioned termination criteria. It outputs a depth value therefore, only few intersection computations are required.
of 1 (i.e., the distance of the far clipping plane) if the ray is Of course, this speed-up is only effective in combination
not terminated, and a value of 0 (i.e., near clipping plane) with early ray termination at scene boundaries and objects
if the ray is terminated. This step does not yet use the early because otherwise integration and intersection computations
z test. The subsequent steps, however, are realized by ren- would be performed for a constant, maximum number of
dering quadrilaterals with a constant z value of 0.5 and can segments.


4.3. Environment Mapping for Asymptotically Flat Sky

Light rays that leave the scene boundary usually result in the background color. As an alternative, we use an environment texture that represents light-emitting objects at infinity. A cube texture implements such a "sky box", where the direction v of a ray at the boundary serves as texture coordinates. This model is valid for scenarios in which the boundary geometry already lies in (almost) flat regions, where rays are not or only slightly bent.

5. Applications

5.1. Visualization of Nonlinear Dynamics

One field of application for nonlinear ray tracing is the visualization of dynamical systems. Nonlinear dynamics and chaotic behavior can be investigated by examining paths in phase space that describe the temporal evolution of a dynamical system [ASY96]. We follow [Grö95] in discussing two examples of chaotic systems: the Lorenz and the Rössler systems. The Lorenz system [Lor63] is governed by

    f(x) = ( σ(y − x), ρx − y − xz, xy − βz ) ,

with x = (x, y, z). Figure 4 (b) shows an example of ray tracing with the parameters β = 8/3, σ = 10, and ρ = 28. The Rössler system [Rös76] is described by

    f(x) = ( −(x + z), x + αy, β + z(x − γ) ) .

A possible choice of parameters is α = 3/8, β = 2, and γ = 4.

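The two right-hand sides translate directly into the function f of Eq. (1); a plain Python/NumPy sketch, with the parameter values quoted above as defaults, could look like this (the direction argument v is kept only to match the signature used in the earlier integration sketch):

```python
import numpy as np

def lorenz_f(x, v, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system; the ray direction v is unused
    because the field depends on position only."""
    return np.array([sigma * (x[1] - x[0]),
                     rho * x[0] - x[1] - x[0] * x[2],
                     x[0] * x[1] - beta * x[2]])

def roessler_f(x, v, alpha=3.0 / 8.0, beta=2.0, gamma=4.0):
    """Right-hand side of the Roessler system."""
    return np.array([-(x[0] + x[2]),
                     x[0] + alpha * x[1],
                     beta + x[2] * (x[0] - gamma)])

# Either function can be plugged into the Euler integration sketch of Sec. 3.2.
a = lorenz_f(np.array([1.0, 1.0, 1.0]), None)
```
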

5.2. Motion in a Potential Field

Another closely related example of nonlinear ray tracing uses the motion in a potential field to model curved paths. The equation of motion for a particle within a radial force field that is centered around the point x_c is

    f(x) = −ξ(r) (x − x_c) / r ,

where r = ||x − x_c|| is the distance to the center point and the function ξ(r) describes the radial behavior. This force field subsumes the Yukawa potential,

    V(x) = ζ e^(−µr) / r ,

which represents the effective potential for a large class of fundamental physical particle–particle interactions [Gro93]. The corresponding force is computed by determining the gradient of the potential, i.e., f(x) = −∇V. The parameter µ reflects the rest mass of the particles that mediate the interaction; ζ is a constant scaling factor. For example, electromagnetic interaction is mediated by massless photons (the gauge bosons of the Maxwell field) and therefore has µ = 0. Similarly, Newton's law of gravitation also has vanishing µ. On the other hand, the strong interaction between nucleons (such as protons or neutrons) is mediated by heavy particles, which reduces the range of interaction and is described by a non-vanishing positive value of µ. Adopting a generalized radial field, ξ(r) can represent any continuous function. The example in Figure 4 (c) uses

    ξ(r) = 2 r³/R³ − 3 r²/R² + 1 ,

for 0 ≤ r ≤ R [Grö95].

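A small sketch of this radial force field with the cubic ξ(r) used for Figure 4 (c); the cutoff radius R and the field center are illustrative values, and the field is assumed to vanish outside R (the paper only defines ξ(r) for 0 ≤ r ≤ R).

```python
import numpy as np

def xi_cubic(r, R=5.0):
    """Radial profile used for Figure 4 (c): falls smoothly from 1 at r = 0
    to 0 at r = R; assumed to be 0 beyond the cutoff radius."""
    if r >= R:
        return 0.0
    return 2.0 * r**3 / R**3 - 3.0 * r**2 / R**2 + 1.0

def radial_field_f(x, v, center=np.zeros(3)):
    """f(x) = -xi(r) * (x - x_c) / r, the attractive radial force of Sec. 5.2."""
    offset = x - center
    r = np.linalg.norm(offset)
    if r == 0.0:
        return np.zeros(3)            # avoid the singular direction at the center
    return -xi_cubic(r) * offset / r

print(radial_field_f(np.array([1.0, 2.0, 2.0]), None))   # points towards the center
```
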
5.3. Air with Continuously Varying Index of Refraction

Within a medium with a space-variant index of refraction, light is subject to continuous refraction. Typically, a varying index of refraction is caused by a non-constant density of air, for example for mirages [BTL90] or in the vicinity of explosions [YOH00]. The model of [YOH00] allows for a discretization of continuous refraction: the index of refraction is updated along the light ray, and when the index of refraction changes by more than a threshold, the new direction of the ray is computed according to Snell's law, using the gradient of the refraction index as the surface normal.

Figure 4 (d) shows an example of light propagation in a medium with varying index of refraction. The index of refraction resembles the explosion model from [YOH00]. However, the spatial distribution of the index of refraction is not computed by a numerical simulation but by a noise-based procedural model.

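A hedged sketch of this discretized refraction step (one possible reading of the [YOH00]-style model, not the authors' shader): the refractive-index field and its gradient are supplied as callables, and the standard vector form of Snell's law is applied with the normalized gradient as the surface normal.

```python
import numpy as np

def refract_step(pos, direction, n_field, grad_n, n_prev, threshold=0.01):
    """If the refractive index at 'pos' differs from the value at the last
    refraction event by more than 'threshold', bend the ray with Snell's law,
    using the normalized index gradient as the surface normal. Returns the
    (possibly unchanged) direction and the index value carried along the ray."""
    n_here = n_field(pos)
    if abs(n_here - n_prev) <= threshold:
        return direction, n_prev                   # keep marching, no refraction yet

    normal = grad_n(pos)
    normal = normal / np.linalg.norm(normal)
    if np.dot(normal, direction) > 0.0:            # orient the normal against the ray
        normal = -normal

    eta = n_prev / n_here
    cos_i = -np.dot(normal, direction)
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:                               # total internal reflection
        new_dir = direction + 2.0 * cos_i * normal
    else:
        new_dir = eta * direction + (eta * cos_i - np.sqrt(1.0 - sin2_t)) * normal
    return new_dir / np.linalg.norm(new_dir), n_here

# Toy index field that increases along +x; the ray bends towards larger n.
n_field = lambda p: 1.0 + 0.05 * p[0]
grad_n = lambda p: np.array([0.05, 0.0, 0.0])
d, n0 = refract_step(np.array([1.0, 0.0, 0.0]),
                     np.array([1.0, 0.0, -1.0]) / np.sqrt(2.0),
                     n_field, grad_n, n_prev=1.0)
```
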
5.4. General Relativistic Visualization

Light is bent by gravitational sources, and therefore nonlinear ray tracing is ideally suited for the visualization of the effects of general relativity on light propagation. Light rays are identical to null geodesics within the curved spacetimes of general relativity (see Appendix A). The underlying geodesic equation is a second-order ODE that can be transformed into the structure of Eq. (1). Figure 4 (e) illustrates light bending around a non-rotating black hole whose spacetime is described by the Schwarzschild metric. For comparison, the test scene from Figure 4 (a) is used. The astrophysical scenario in Figure 4 (f) shows a neutron star (blue) and a much lighter companion star (yellow) in front of a background star field. Checkerboard textures are attached to the two stars to visualize the distortions on the surfaces. The background is represented by an environment texture that is mapped onto a "sky box" as described in Section 4.3.


Figure 4: Images generated by GPU-based nonlinear ray tracing: (a) undistorted image of the test scene; (b) rays governed by the Lorenz system, with the same test scene as in (a); (c) rays in a radial potential field; (d) rays in a medium with varying index of refraction; (e) general relativistic ray tracing with a black hole (Schwarzschild spacetime). Image (f) visualizes an astrophysical scenario with a neutron star (blue) and a much lighter companion star (yellow) in front of a background star field. Checkerboard textures are attached to the two stars to reveal the distortions on the surfaces.


Relativistic visualization supports scientists in understanding numerical or analytical results from gravitational research because it provides a compact representation that is independent of the coordinate system [Wei00]. In addition, visualization also serves as a tool in teaching physics courses and explaining important aspects of relativity to the public, e.g., in popular-science films or exhibitions.

6. Implementation and Results

Our implementation is based on DirectX 9.0, and all fragment operations are formulated in the assembler-level Pixel Shader 2.0 language. We use 32 bit floating-point textures to represent the depth values and the positions x and directions v along rays. Tests with 16 bit floating-point textures led to a significantly degraded image quality and thus showed that the accuracy of integration and intersection computations was heavily affected. All images in Figure 4 were generated by our ray-tracing system on a Windows XP PC with an ATI Radeon 9700 (128 MB) GPU and a Pentium 4 (2.8 GHz) CPU. The following measurements were also performed with this hardware configuration.

Figure 5 shows the performance characteristics for a varying number of spherical scene objects. The other parameters are fixed: the viewport has a size of 800 × 600 pixels; 500 integration steps are computed within a Schwarzschild spacetime (Section 5.4), leading to 50 ray segments. The upper (slower) curve in Figure 5 represents the original, non-optimized implementation from Section 3, while the lower (faster) curve displays the rendering performance for ray tracing with early ray termination and adaptive ray integration from Section 4. The vertical offset of the two curves for n_objs = 1 indicates how much time is spent to solve the ODE and construct the curved rays. Under the present test conditions, the acceleration techniques reduce the computation times by some forty percent. More importantly, the slope of the lower curve is much smaller than the slope of the upper curve, i.e., the acceleration methods improve the intersection computations by pruning the rays. Both curves are almost straight lines, which shows that the intersection and shading steps depend linearly on the number of scene objects.

Figure 5: Comparison of rendering times for accelerated and non-accelerated nonlinear ray tracing. The number of spheres in the scene changes along the horizontal axis. (Plot "Rendering Times for the Schwarzschild Model": horizontal axis, number of objects from 0 to 20; vertical axis, time in seconds from 0 to 20; two curves, without and with acceleration.)

Table 1 documents rendering times for different models of curved rays. We compare results for the Lorenz system, a potential field, a medium with varying index of refraction, and the Schwarzschild metric. The viewport has a size of 800 × 600 pixels, the scene consists of 10 spherical objects, and 300 integration steps were performed to build 30 ray segments each. The first column shows the rendering times for the non-optimized implementation. The Schwarzschild solver is slower than the solvers for the other models because its evaluation of the function f involves a larger number of numerical instructions. The second column reflects the rendering times with the acceleration methods. Rendering is faster, although the speed-up heavily depends on the underlying ODE system. In our example for the potential field, only a weak attractive force is applied, leading to rather straight light rays. Therefore, a significant increase in speed can be achieved by adaptive integration.

Table 1: Rendering times in seconds on an 800 × 600 viewport, with 10 scene objects, 30 ray segments, and 300 integration steps.

    Model                          No Acceleration Methods    Acceleration Methods
    Lorenz System                  3.87                       3.13
    Potential Field                4.22                       1.16
    Varying Index of Refraction    3.35                       2.42
    Schwarzschild                  4.91                       3.79

7. Conclusion and Future Work

We have presented a fast GPU implementation of nonlinear ray tracing that avoids any data transfer back to main memory. Curved light rays are represented by polygonal lines that are constructed via an iterative numerical ODE solver. The rendering process can be mapped to the streaming model of current GPUs by subsequently executing ray setup, ray integration, ray–object intersection, and local illumination. We have proposed two acceleration techniques to improve the rendering performance: early ray termination and adaptive ray integration. In particular, we have investigated ways to introduce such acceleration techniques into the streaming architecture on GPUs.

In future work, some of the bottlenecks of the current implementation could be addressed. In particular, the inner loop for the internal ODE integration steps could be unrolled within a longer pixel shader program. In this way, much communication via the floating-point textures for ray positions and directions could be avoided. Furthermore, space partitioning strategies that are known from linear ray tracing could be incorporated to achieve better scalability with respect to the number of scene objects.

Finally, deferred shading could be used to accelerate the illumination computation. By deferring the shading process to the very end of the ray-tracing algorithm, lighting would only be evaluated for actually hit points.

Acknowledgments

We would like to thank the anonymous reviewers for helpful remarks to improve the paper. Special thanks to Bettina Salzer for proof-reading, and to Joachim Vollrath for his help with the video.

Appendix A: Geodesics in Spacetime

Here, a brief discussion of the mathematical background of general relativity and, in particular, the propagation of light is given. For a comprehensive presentation we refer to the textbooks [MTW73, Wei72]. The concept of curved spacetime is the geometric basis for general relativity. Spacetime is a pseudo-Riemannian manifold and can be characterized by the infinitesimal distance ds,

    ds² = Σ_{µ,ν=0..3} g_µν(x) dx^µ dx^ν ,

where the g_µν(x) are entries of a 4 × 4 matrix, representing the metric tensor at the point x in spacetime. The quantities dx^µ describe an infinitesimal distance in the µ direction of the coordinate system. Trajectories of freely falling objects are identical to geodesics. Geodesics are the generalization of the idea of straightest lines to curved manifolds and are solutions to the geodesic equations, a set of second-order ODEs:

    d²x^µ(τ)/dτ² + Σ_{ν,ρ=0..3} Γ^µ_νρ(x) (dx^ν(τ)/dτ) (dx^ρ(τ)/dτ) = 0 ,

where τ is an affine parameter along the geodesic. The Christoffel symbols Γ^µ_νρ are determined by the metric according to

    Γ^µ_νρ(x) = (1/2) Σ_{α=0..3} g^µα(x) ( ∂g_αν(x)/∂x^ρ + ∂g_αρ(x)/∂x^ν − ∂g_νρ(x)/∂x^α ) ,

where g^µα(x) is the inverse of g_µα(x). Light rays are a special class of geodesics: lightlike or null geodesics, which fulfill the null condition,

    Σ_{µ,ν=0..3} g_µν(x) (dx^µ(τ)/dτ) (dx^ν(τ)/dτ) = 0 .

During ray setup, the initial position in spacetime and the initial spatial direction of the light ray are determined as in the other models from Section 5. The temporal component of the initial direction is then computed according to the null condition.

An important class of spacetimes is described by the Schwarzschild metric,

    ds² = (1 − 2M/r) dt² − dr²/(1 − 2M/r) − r² (dθ² + sin²θ dφ²) .

This metric represents a vacuum solution for Einstein's general relativistic field equations and describes the spacetime around a non-rotating, non-charged, spherically symmetric distribution of matter and energy. It applies to many compact astrophysical objects, for example, to regular stars, neutron stars, or black holes. We choose units in which the speed of light and the gravitational constant are 1. The parameter M denotes the mass of the source of gravitation. In asymptotically flat outer regions of spacetime, the spherical Schwarzschild coordinates r, θ, and φ are identical to the standard spherical coordinates of flat space. In our implementation, the spherical Schwarzschild coordinates are transformed into pseudo-Cartesian Schwarzschild coordinates. In this way, the x, y, and z components in the ray setup can be directly used as input to the Schwarzschild metric. Finally, t denotes time. The temporal component of the positions x^µ(τ) can be neglected in stationary scenes.

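To connect the geodesic equation to the generic form of Eq. (1), the following sketch evaluates the right-hand side f(x, v) numerically for an arbitrary metric supplied as a Python function; the finite-difference step and the diagonal flat-space test metric are illustrative assumptions and not part of the paper's shader implementation.

```python
import numpy as np

def christoffel(metric, x, eps=1e-6):
    """Numerically evaluate the Christoffel symbols Gamma^mu_{nu rho} at the
    spacetime point x (4-vector) from a metric function g(x) -> 4x4 matrix,
    using central finite differences for the partial derivatives."""
    g_inv = np.linalg.inv(metric(x))
    dg = np.zeros((4, 4, 4))                      # dg[a, m, n] = d g_{m n} / d x^a
    for a in range(4):
        dx = np.zeros(4); dx[a] = eps
        dg[a] = (metric(x + dx) - metric(x - dx)) / (2.0 * eps)
    gamma = np.zeros((4, 4, 4))                   # gamma[mu, nu, rho]
    for mu in range(4):
        for nu in range(4):
            for rho in range(4):
                gamma[mu, nu, rho] = 0.5 * np.sum(
                    g_inv[mu, :] * (dg[rho, :, nu] + dg[nu, :, rho] - dg[:, nu, rho]))
    return gamma

def geodesic_rhs(metric, x, v):
    """Right-hand side f(x, v) of Eq. (1) for geodesic motion:
    dv^mu/dtau = -Gamma^mu_{nu rho} v^nu v^rho."""
    gamma = christoffel(metric, x)
    return -np.einsum('mnr,n,r->m', gamma, v, v)

# Sanity check with the flat Minkowski metric: no bending, f(x, v) = 0.
minkowski = lambda x: np.diag([1.0, -1.0, -1.0, -1.0])
print(geodesic_rhs(minkowski, np.zeros(4), np.array([1.0, 0.3, 0.0, 0.0])))
```
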
References

[ASY96] ALLIGOOD K. T., SAUER T. D., YORKE J. A.: Chaos: An Introduction to Dynamical Systems. Springer, New York, 1996.

[BFGS03] BOLZ J., FARMER I., GRINSPUN E., SCHRÖDER P.: Sparse matrix solvers on the GPU: Conjugate gradients and multigrid. ACM Transactions on Graphics 22, 3 (2003), 917–924.

[BTL90] BERGER M., TROUT T., LEVIT N.: Ray tracing mirages. IEEE Computer Graphics and Applications 10, 3 (1990), 36–41.

[CHH02] CARR N. A., HALL J. D., HART J. C.: The ray engine. In Proceedings of the Eurographics/SIGGRAPH Workshop on Graphics Hardware (2002), pp. 37–46.

[CHH03] CARR N. A., HALL J. D., HART J. C.: GPU algorithms for radiosity and subsurface scattering. In Proceedings of the SIGGRAPH/Eurographics Workshop on Graphics Hardware (2003), pp. 51–59.

[Gla93] GLASSNER A. S. (Ed.): An Introduction to Ray Tracing, 4th ed. Academic Press, London, 1993.

[GPG04] GPGPU: General-Purpose Computation on GPUs. Web page: http://www.gpgpu.org, 2004.

[Gro93] GROSS F.: Relativistic Quantum Mechanics and Field Theory. John Wiley & Sons, New York, 1993.
condition. York, 1993.


[Grö95] GRÖLLER E.: Nonlinear ray tracing: Visualizing strange worlds. The Visual Computer 11, 5 (1995), 263–276.

[HBSL03] HARRIS M. J., BAXTER W., SCHEUERMANN T., LASTRA A.: Simulation of cloud dynamics on graphics hardware. In Proceedings of the SIGGRAPH/Eurographics Workshop on Graphics Hardware (2003), pp. 92–101.

[HMG03] HILLESLAND K. E., MOLINOV S., GRZESZCZUK R.: Nonlinear optimization framework for image-based modeling on programmable graphics hardware. ACM Transactions on Graphics 22, 3 (2003), 925–934.

[HW01] HANSON A. J., WEISKOPF D.: Visualizing relativity. SIGGRAPH 2001 Course #15 Notes, 2001.

[Kaj86] KAJIYA J. T.: The rendering equation. Computer Graphics (SIGGRAPH '86 Proceedings) 20, 4 (1986), 143–150.

[KW03] KRÜGER J., WESTERMANN R.: Linear algebra operators for GPU implementation of numerical algorithms. ACM Transactions on Graphics 22, 3 (2003), 908–916.

[KWR02] KOBRAS D., WEISKOPF D., RUDER H.: General relativistic image-based rendering. The Visual Computer 18, 4 (2002), 250–258.

[Lor63] LORENZ E. N.: Deterministic nonperiodic flow. Journal of the Atmospheric Sciences 20 (1963), 130–141.

[MTW73] MISNER C. W., THORNE K. S., WHEELER J. A.: Gravitation. Freeman, New York, 1973.

[Nem93] NEMIROFF R. J.: Visual distortions near a neutron star and black hole. American Journal of Physics 61, 7 (July 1993), 619–632.

[NRHK89] NOLLERT H.-P., RUDER H., HEROLD H., KRAUS U.: The relativistic "looks" of a neutron star. Astronomy and Astrophysics 208 (1989), 153.

[PBMH02] PURCELL T. J., BUCK I., MARK W. R., HANRAHAN P.: Ray tracing on programmable graphics hardware. ACM Transactions on Graphics 21, 3 (2002), 703–712.

[PDC*03] PURCELL T. J., DONNER C., CAMMARANO M., JENSEN H. W., HANRAHAN P.: Photon mapping on programmable graphics hardware. In Proceedings of the SIGGRAPH/Eurographics Workshop on Graphics Hardware (2003), pp. 41–50.

[Rös76] RÖSSLER O.: An equation for continuous chaos. Physics Letters A 57, 5 (1976), 397–398.

[Wei72] WEINBERG S.: Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity. John Wiley & Sons, New York, 1972.

[Wei00] WEISKOPF D.: Four-dimensional non-linear ray tracing as a visualization tool for gravitational physics. In Proceedings of IEEE Visualization (2000), pp. 445–448.

[YOH00] YNGVE G. D., O'BRIEN J. F., HODGINS J. K.: Animating explosions. In Proceedings of SIGGRAPH 2000 Conference (2000), pp. 29–36.

