TB 04631 001 - v01
List of Tables
Table 1. Graphics Boards with 10 and 12-Bit Grayscale Support
Table 2. Multi-GPU Compatibility
Table 3. Characteristics for 10 MP Setup
Table 4. Characteristics for the 20 MP Setup
Introduction
Advances in sensor technology and image acquisition techniques in the field of
radiology are producing high bit depth grayscale images in the range of 12 to 16-bit
per pixel. At the same time, the adoption of displays with native support for 10 and
12-bit grayscale is growing. These affordable displays are DICOM[1] conformant to
preserve image quality and consistency. Furthermore, tiling together multiple such
displays enables side-by-side digital study comparisons driven by a single system.
Standard graphics workstations, however, are limited to 8-bit grayscale, which
provides only 256 possible shades of gray for each pixel, sometimes obscuring subtle
contrasts in high density images. Radiologists often use window-leveling techniques
to identify the region of interest, which can quickly become a cumbersome and
time-consuming user interaction process.
NVIDIA’s 10-bit and 12-bit grayscale technology allows these high quality displays
to be driven by standard NVIDIA® Quadro® graphics boards while preserving the full
grayscale range. By using “pixel packing”, the 10-bit or 12-bit grayscale data is
transmitted from the Quadro® graphics board to a high grayscale density display
using a standard DVI cable. Instead of the standard three 8-bit color components
per pixel, pixel packing allows two 10 or 12-bit pixels to be transmitted,
providing higher spatial resolution and grayscale pixel depth than an 8-bit
system.
As specialty hardware is not required, NVIDIA’s 10-bit grayscale technology is
readily available for use with other radiology functions and easy to support across
a wide range of grayscale panels from various manufacturers. In a preliminary study
of 10 radiologists using Dome E5 10-bit vs. E5 8-bit displays in conjunction with
the Three Palms 10-bit, OpenGL accelerated WorkstationOne mammography application,
the radiologists’ performance improvement on the 10-bit enabled display systems was
statistically significant, with some reading studies up to three times faster.
This technical brief describes the NVIDIA grayscale technology, the system
requirements and setup. It also guides users through common pitfalls that arise
when extending to the multi-display and multi-GPU (graphics processing unit)
environments routinely used in diagnostic imaging, and recommends best practices.
Figure 1 shows the latest technology in digital diagnostic display systems, a Quadro
card driving a 10 mega-pixel, 10-bit grayscale display. Figure 2 shows a 10-bit
enabled mammography application displaying multiple modalities on multiple
displays.
Figure 2. Application Enhanced Using Multiple Displays (see note 2)
1 Image courtesy of NDS Surgical Imaging, DOME Z10.
2 Image courtesy of Threepalms, Inc.
April 17, 2009 | TB-04631-001_v01
10 and 12-Bit Grayscale Technology
System Specific Information
10 and 12-bit grayscale currently requires Windows XP.
Windows Vista support for 10-bit grayscale over DVI is being worked on.
Grayscale is only supported for OpenGL based applications.
Supported Graphics Boards
10-bit grayscale is supported on Quadro FX graphics boards shown in Table 1. The
graphics boards are G80 and higher. The graphics boards are NVIDIA CUDA™
enabled.
Table 1. Graphics Boards with 10 and 12-Bit Grayscale Support
Quadro FX 3800         Mid-range card with 1 GB of graphics memory.
                       Recommended if the primary usage is to display 2D
                       grayscale images and some 3D data.
Quadro FX 4800
Quadro FX 5800
Quadro Plex 2200 D2    Dedicated deskside visual computing system composed of
                       2 Quadro FX 5800 graphics boards with a total of 8 GB of
                       graphics memory. Recommended for advanced
                       visualization and large scale projection and display use
                       cases.
Supported Monitors
The monitor should be capable of 10 and 12-bit outputs. We currently support the
following displays.
NDS Surgical Imaging Dome E5 5MP and Z10 10MP displays [2]
Eizo Radiforce GS520 5MP display [3] – currently in beta, to be released in
the R190 driver.
Supported Connectors
Single or Dual-link DVI
Although single-link DVI is only capable of transmitting up to HD (1920 ×
1200), our grayscale pixel packing mechanism allows 5 MP (2560 × 2048)
images to be sent over single-link DVI.
DisplayPort
This applies to the Quadro FX 4800 and the Quadro FX 5800, which have
DisplayPort outputs. As grayscale monitors currently only support DVI, a
DisplayPort-to-DVI adaptor (single or dual-link) is needed at the GPU end. The
Bizlink dongle (P/N 030-0223-0000) shown in Figure 3 has been tested and is
recommended.
Grayscale Monitor Settings
When a grayscale compatible monitor is connected to a suitable NVIDIA board, the
NVIDIA driver automatically detects it and immediately switches to packed pixel
mode. Therefore, there are no control panel settings to enable and disable 10-bit
grayscale. The only setting required is to enable the grayscale monitor to display at a
maximum resolution of 2560 × 2048. Follow these simple steps.
1. Open the Display Properties.
Grayscale Implementation
Driver Layer
On grayscale enabled Quadro boards, the driver implements a pixel packing
mechanism that is transparent to the desktop and to the application. The 24-bit
RGB desktop is first converted to 12-bit grayscale using the NTSC color conversion
formula and then two 12-bit gray values are packed into 1 RGB DVI pixel and
finally shipped to the monitor. This pixel packing allows displaying of 5 MP gray
values just using a single-link DVI (that is normally limited to HD resolution).
Figure 5. Driver Converts and Packs Desktop from 24-Bit Color to 12-Bit Gray
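The two driver steps described above can be sketched in a few lines of C++. This is only an illustration: the brief does not give the driver's rounding or the bit layout used on the DVI link, so the scaling to 12 bits, the layout (first gray value in the upper 12 bits, second in the lower 12 bits) and the names rgbToGray12, packPair and unpackPair are all assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Convert an 8-bit RGB triple to a 12-bit gray value using the NTSC luma
// weights. The scaling and round-to-nearest step are assumptions.
inline std::uint16_t rgbToGray12(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    const double luma8 = 0.299 * r + 0.587 * g + 0.114 * b;  // 0 .. 255
    const double gray = luma8 * 4095.0 / 255.0 + 0.5;        // +0.5 rounds to nearest
    return static_cast<std::uint16_t>(gray);
}

// Pack two 12-bit gray values into one 24-bit RGB word.
// ASSUMED layout: bits 23..12 hold the first value, bits 11..0 the second.
inline std::uint32_t packPair(std::uint16_t first, std::uint16_t second) {
    assert(first <= 0x0FFF && second <= 0x0FFF);
    return (static_cast<std::uint32_t>(first) << 12) | second;
}

// Inverse of packPair, as a monitor-side decoder would have to do.
inline void unpackPair(std::uint32_t packed, std::uint16_t& first, std::uint16_t& second) {
    first = static_cast<std::uint16_t>((packed >> 12) & 0x0FFF);
    second = static_cast<std::uint16_t>(packed & 0x0FFF);
}
```

With this layout a 2560 × 2048 gray image occupies 1280 × 2048 RGB pixels on the link, which is the halving of the horizontal pixel count that the packing relies on.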
Application Layer
The 10 and 12-bit grayscale image viewing application is responsible for outputting
24-bit RGB pixels, which the driver then converts to 12-bit grayscale values for
scanout as described in the previous section.
The application uses a shader that takes in the 12-bit grayscale value from the image
and translates it into a 24-bit RGB pixel using a lookup table. For each grayscale
value in the input image, the lookup table holds the best matching RGB pixel, chosen
so that the R, G and B values differ as little as possible (R=G=B is preferred).
In essence, this process is the inverse of the driver conversion from RGB to
grayscale. The end result is that the grayscale image on the desktop looks like a
grayscale image on a color monitor.
The integer texture extension, EXT_texture_integer [4], available with Shader Model 4,
is used to store the incoming grayscale image as a 16-bit unsigned integer without
converting it to a floating point representation, which halves the memory footprint.
glPixelStorei(GL_UNPACK_ALIGNMENT, 2);
glTexImage2D(GL_TEXTURE_2D, 0, GL_ALPHA16UI_EXT, width, height, 0,
GL_ALPHA_INTEGER_EXT , GL_UNSIGNED_SHORT, TextureStorage);
The lookup table mapping the grayscale image to 24-bit RGB values is stored as a 1D
texture. The lookup table dimensions should exactly match the bit depth of the
grayscale values expected in the incoming image so that no filtering or interpolation
is performed, thus preserving image precision and fidelity. Changes
to contrast, brightness and window level of the image are easily made by changing
the lookup table, which results in a 1D texture download without any change to the
source image.
glBindTexture(GL_TEXTURE_1D, lutTexId);
glTexImage1D(GL_TEXTURE_1D, 0, 4, lutWidth, 0, GL_RGBA,
             GL_UNSIGNED_BYTE, Table);
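As a rough sketch of how such a table could be generated on the CPU, the code below builds a 4096-entry RGBA table for a given window center and width. For every displayed 12-bit level it searches a small neighborhood of RGB triples for the one whose modelled gray value is closest, preferring triples with the smallest channel spread. The conversion model (NTSC luma scaled to 12 bits), the linear window ramp, the search radius and the names modelGray12, bestRgbFor and buildLut are assumptions for illustration and are not taken from GrayScaleDemo.cpp.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// ASSUMED model of the driver conversion: NTSC luma scaled to 12 bits.
inline double modelGray12(int r, int g, int b) {
    return (0.299 * r + 0.587 * g + 0.114 * b) * 4095.0 / 255.0;
}

// Find an 8-bit RGB triple whose modelled gray value is closest to target.
// Among equally close triples, the one with the smallest channel spread wins.
inline std::array<std::uint8_t, 3> bestRgbFor(int target) {
    assert(target >= 0 && target <= 4095);
    const int base = target * 255 / 4095;   // R=G=B level at or just below target
    const double eps = 1e-9;
    std::array<std::uint8_t, 3> best = {0, 0, 0};
    double bestErr = 1e9;
    int bestSpread = 1 << 30;
    for (int r = std::max(0, base - 1); r <= std::min(255, base + 2); ++r)
        for (int g = std::max(0, base - 1); g <= std::min(255, base + 2); ++g)
            for (int b = std::max(0, base - 1); b <= std::min(255, base + 2); ++b) {
                const double err = std::fabs(modelGray12(r, g, b) - target);
                const int spread = std::max({r, g, b}) - std::min({r, g, b});
                if (err < bestErr - eps ||
                    (err <= bestErr + eps && spread < bestSpread)) {
                    best = {static_cast<std::uint8_t>(r),
                            static_cast<std::uint8_t>(g),
                            static_cast<std::uint8_t>(b)};
                    bestErr = err;
                    bestSpread = spread;
                }
            }
    return best;
}

// Build a 4096-entry RGBA table for a linear window (center, width), both
// given in 12-bit input units. Values below the window map to black and
// values above it map to white.
inline std::vector<std::uint8_t> buildLut(int center, int width) {
    assert(width > 0);
    std::vector<std::uint8_t> lut(4096 * 4);
    const double lo = center - width / 2.0;
    for (int v = 0; v < 4096; ++v) {
        const double t = std::min(1.0, std::max(0.0, (v - lo) / width));
        const int shown = static_cast<int>(t * 4095.0 + 0.5);
        const std::array<std::uint8_t, 3> rgb = bestRgbFor(shown);
        lut[4 * v + 0] = rgb[0];
        lut[4 * v + 1] = rgb[1];
        lut[4 * v + 2] = rgb[2];
        lut[4 * v + 3] = 255;
    }
    return lut;
}
```

The resulting vector has the layout expected by the glTexImage1D call above (GL_RGBA, GL_UNSIGNED_BYTE, width 4096), so a window-level change amounts to rebuilding the vector and uploading it again.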
At run time, the application draws a quad that is texture mapped with the grayscale
image. In the rasterization stage, the fragment shader is invoked for each gray value
and does a dependent texture fetch into the 1D LUT texture. The complete
source is found in GrayScaleDemo.cpp.
#extension GL_EXT_gpu_shader4 : enable  // for unsigned int support
uniform usampler2D texUnit0;  // gray image is in texture unit 0
uniform sampler1D texUnit1;   // lookup table texture is in texture unit 1
void main(void)
{
    vec2 TexCoord = vec2(gl_TexCoord[0]);
    // texture fetch of unsigned ints placed in the alpha channel
    uvec4 GrayIndex = uvec4(texture2D(texUnit0, TexCoord));
    // normalize to [0, 1); assumes the stored value fits in 12 bits
    float GrayFloat = float(GrayIndex.a) / 4096.0;
    // fetch the matching RGB value out of the table
    vec4 Gray = vec4(texture1D(texUnit1, GrayFloat));
    // write data to the framebuffer
    gl_FragColor = Gray.rgba;
}
Figure 6. Application Level Texture Setup for 10 and 12-Bit Grayscale Display
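To see why the lookup table width should match the bit depth, the following sketch mirrors the index arithmetic of the shader on the CPU. It assumes nearest-neighbor sampling of the 1D texture (the filter setup is not shown in the excerpts above), in which case the texel selected for a normalized coordinate s is floor(s × width). The function names nearestTexel and lutEntryFor are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Texel selected by nearest-neighbor sampling of a 1D texture with lutWidth
// texels at normalized coordinate s in [0, 1).
inline int nearestTexel(float s, int lutWidth) {
    assert(lutWidth > 0 && s >= 0.0f && s < 1.0f);
    return static_cast<int>(std::floor(s * static_cast<float>(lutWidth)));
}

// Mirrors the shader: the 12-bit gray value is divided by 4096.0 and the
// result is used as the coordinate of the lookup table fetch.
inline int lutEntryFor(std::uint16_t gray12, int lutWidth) {
    assert(gray12 < 4096);
    const float s = static_cast<float>(gray12) / 4096.0f;
    return nearestTexel(s, lutWidth);
}
```

With a 4096-entry table every 12-bit value selects its own entry, for example lutEntryFor(1003, 4096) returns 1003. With a 1024-entry table, four neighboring gray values share one entry (1000 through 1003 all select entry 250), so two bits of precision are lost before the image reaches the display.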
Multi-Display Configurations
Diagnostic imaging commonly requires multiple displays for side by side modality comparisons.
Multi-display configurations are becoming more practical with systems capable of supporting
multiple graphics boards that in turn drive multiple displays. A single Quadro board can drive a
maximum of 2 displays. Depending on the available PCI slots within a system, multiple cards can be
used to drive several displays. These multiple displays can be a mix of regular color LCD panels and
specialty grayscale monitors. This section explains the issues that arise from such a heterogeneous
configuration and offers programming pointers to address them. The full source code for the examples is
found in the accompanying Grayscale10-bit SDK.
Multi-GPU Compatibility
Grayscale capable Quadro boards can be mixed with other Quadro boards that
drive one or more side displays, as shown in Table 2. These “Side Display GPUs”
may not yield the grayscale effect, but the system will be compatible. Mixing of
GPUs is only guaranteed to work if the GPUs are G80 and later.
Note: The mixing of older cards (pre G80) is not supported in grayscale configurations.
Table 2. Multi-GPU Compatibility
                     Grayscale GPU
Side Display GPU     Quadro FX 3800   Quadro FX 4800   Quadro FX 5800   Quadro Plex
Quadro FX 4800       Yes              Yes              No               No
Quadro FX 5800       No               No               No               Yes
Note: These are theoretical compatibilities. In practice, the physical system attributes such as availability of PCI
slots and their placements will determine the final working set of cards from Table 2. The Quadro FX 5800
requires the full 2 auxiliary power inputs and therefore is only used with lower-end Quadro cards that do not
have any auxiliary power requirements.
class CDisplayWin {
public:                        // members are accessed directly by the examples
    HWND hWin;                 // handle to display window
    HDC winDC;                 // DC of display window
    RECT rect;                 // rectangle limits of display
    bool primary;              // is this the primary display
    char displayName[128];     // name of this display
    char gpuName[128];         // name of associated GPU
    bool grayScale;            // is this a grayscale display
    bool spans(RECT r);        // true if incoming rect r spans this display
};
#define MAX_NUM_GPUS 4
int displayCount = 0;          // number of active displays
// list of displays, each GPU can attach to max 2 displays
CDisplayWin displayWinList[MAX_NUM_GPUS*2];
The following simple example uses the Windows GDI to enumerate the attached
displays, get their extents and check whether each display is set as primary. The
code can be easily modified to include unattached displays.
DISPLAY_DEVICE dispDevice;
DWORD displayCount = 0;
memset((void *)&dispDevice, 0, sizeof(DISPLAY_DEVICE));
dispDevice.cb = sizeof(DISPLAY_DEVICE);
// loop through the displays and print out state
while (EnumDisplayDevices(NULL, displayCount, &dispDevice, 0)) {
    if (dispDevice.StateFlags & DISPLAY_DEVICE_ATTACHED_TO_DESKTOP) {
        printf("DeviceName = %s\n", dispDevice.DeviceName);
        printf("DeviceString = %s\n", dispDevice.DeviceString);
        if (dispDevice.StateFlags & DISPLAY_DEVICE_PRIMARY_DEVICE)
            printf("\tPRIMARY DISPLAY\n");
        DEVMODE devMode;
        memset((void *)&devMode, 0, sizeof(devMode));
        devMode.dmSize = sizeof(devMode);
        EnumDisplaySettings(dispDevice.DeviceName, ENUM_CURRENT_SETTINGS,
                            &devMode);
        printf("\tPosition/Size = (%d, %d), %dx%d\n",
               devMode.dmPosition.x, devMode.dmPosition.y,
               devMode.dmPelsWidth, devMode.dmPelsHeight);
        HWND hWin =
            createWindow(GetModuleHandle(NULL), devMode.dmPosition.x+50,
                         devMode.dmPosition.y+50, devMode.dmPelsWidth-50,
                         devMode.dmPelsHeight-50);
        if (hWin) { // got a window
            HDC winDC = GetDC(hWin);
            // TODO - set pixel format, create OpenGL context
        }
        else
            printf("Error creating window \n");
    } // if attached to desktop
    displayCount++;
} // while(enumdisplay);
Running this enumeration code on our 3 display example (shown in Figure 7) prints
out the following.
DeviceName = \\.\DISPLAY1
DeviceString = NVIDIA Quadro FX 1800
        PRIMARY DISPLAY
        Position/Size = (0, 0), 1280x1024
DeviceName = \\.\DISPLAY2
DeviceString = NVIDIA Quadro FX 4800
        Position/Size = (1280, 0), 2560x2048
DeviceName = \\.\DISPLAY3
DeviceString = NVIDIA Quadro FX 4800
        Position/Size = (3840, 0), 1600x1200
Note: The enumeration shown in this section abstracts away special hardware capabilities
of the displays, such as grayscale or color capability. For such physical display details,
we need access to the Extended Display Identification Data (EDID), the data
structure provided by the computer display to the graphics card. This is described
in the next section.
LRESULT CALLBACK winProc(HWND hWin, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
    switch (uMsg) {
    case WM_SIZE:
    case WM_MOVE: {  // the same check applies to both messages
        RECT rect;
        // window rectangle in screen coordinates, comparable to display extents
        GetWindowRect(hWin, &rect);
        for (int i = 0; i < displayCount; i++) {
            // check if the window spans this display
            if (displayWinList[i].spans(rect)) {
                // now check this is a grayscale compatible display
                if (!displayWinList[i].grayScale) {
                    // do something, e.g. prevent spanning
                }
            }
        } // end of for
        break;
    }
    }
    return DefWindowProc(hWin, uMsg, wParam, lParam);
}
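The spans() test used by the message handler is not listed in this brief. A minimal sketch of one possible implementation follows, assuming that "spans" means "overlaps with non-zero area" and that all rectangles are in desktop coordinates. To keep the sketch compilable without the Windows headers, Rect stands in for the Win32 RECT and DisplayWinSketch is a hypothetical reduced version of CDisplayWin; touchesColorDisplay is likewise an illustrative helper.

```cpp
#include <cassert>

// Stand-in for the Win32 RECT so the sketch compiles without <windows.h>.
struct Rect { long left, top, right, bottom; };

// True when the two rectangles share a region of non-zero area.
inline bool overlaps(const Rect& a, const Rect& b) {
    return a.left < b.right && b.left < a.right &&
           a.top < b.bottom && b.top < a.bottom;
}

// Hypothetical reduced CDisplayWin holding only what spans() needs.
struct DisplayWinSketch {
    Rect rect;       // display extents in desktop coordinates
    bool grayScale;  // true if this is a grayscale display

    bool spans(const Rect& r) const {
        assert(r.left <= r.right && r.top <= r.bottom);
        return overlaps(rect, r);
    }
};

// True if the window rectangle r touches any display that is not grayscale.
inline bool touchesColorDisplay(const DisplayWinSketch* list, int count, const Rect& r) {
    for (int i = 0; i < count; ++i)
        if (list[i].spans(r) && !list[i].grayScale)
            return true;
    return false;
}
```

For the three-display layout printed earlier (1280x1024 at x=0, 2560x2048 at x=1280, 1600x1200 at x=3840), a window fully inside the 2560x2048 display touches no color display, while a window starting at x=1000 and extending to x=2000 does.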
Targeting Specific GPUs for Rendering
The default behavior is for OpenGL commands to be sent to all GPUs. While this
works for many applications, it makes runtime graphics capability checking and
handling more complicated. Therefore, it is desirable to limit grayscale rendering to
the GPUs that are capable of grayscale output. In this case, when the window
moves to a display connected to a GPU where grayscale is not enabled, no screen
refresh or drawing happens. In fact, some applications prevent window movement
altogether, minimizing user interaction to increase efficiency. To target specific GPUs
for rendering, we use the WGL_NV_gpu_affinity extension [6] available for Windows on
Quadro professional cards.
The GPU affinity for a window is defined by an affinity mask that contains a list of
GPUs responsible for the window drawing. This extension also introduces the
concept of an affinity DC, which is simply a device context embedded with the
affinity mask. When an OpenGL context is created from this DC, it inherits the
DC’s affinity mask, which is immutable. For on-screen drawing, when this affinity
context is associated with a window DC, any OpenGL calls made with this context
current will be sent to the GPUs specified in the affinity mask.
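Before an affinity DC can be created, the application has to decide which GPUs belong in the mask. The sketch below shows one way to derive that set from the enumerated displays: keep every GPU that drives at least one grayscale display. The DisplayInfo record and the function grayscaleGpuIndices are hypothetical helpers for illustration, not part of the SDK.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-display record: index of the driving GPU (in enumeration
// order) and whether the display is grayscale capable.
struct DisplayInfo {
    int gpuIndex;
    bool grayScale;
};

// Returns the sorted, duplicate-free indices of all GPUs that drive at least
// one grayscale display. These are the candidates for the affinity mask.
inline std::vector<int> grayscaleGpuIndices(const std::vector<DisplayInfo>& displays) {
    std::vector<int> gpus;
    for (const DisplayInfo& d : displays) {
        assert(d.gpuIndex >= 0);
        if (d.grayScale)
            gpus.push_back(d.gpuIndex);
    }
    std::sort(gpus.begin(), gpus.end());
    gpus.erase(std::unique(gpus.begin(), gpus.end()), gpus.end());
    return gpus;
}
```

In a real application the GPU handles for these indices would be copied into a NULL-terminated HGPUNV array and passed to wglCreateAffinityDCNV, which is how the extension specification [6] expects the mask to be supplied.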
We introduce another class, CAffGPU, to encapsulate all the attributes of an affinity
GPU, and affGPUList, which is a collection of CAffGPU objects.
class CAffGPU {
public:
    HDC affinityDC;        // device context of the affinity GPU
    HGLRC affinityGLRC;    // OpenGL rendering context created from affinityDC
    void init(HGPUNV* pGPU, int num);  // list of GPU handles in the mask
    ~CAffGPU();
};
unsigned int gpuCount = 0;
CAffGPU affGPUList[MAX_NUM_GPUS];
class CDisplayWin {
    ...
    CAffGPU* pAffinityGPU;  // the affinity GPU responsible for rendering this window
    ...
};
Handles for all the system GPUs are enumerated by the following wglEnumGpusNV
call. This example also shows another way of enumerating the display devices, using
wglEnumGpuDevicesNV and the GPU_DEVICE structure, which closely resemble the
Windows GDI EnumDisplayDevices and DISPLAY_DEVICE introduced earlier.
HGPUNV curNVGPU;
// get a list of GPUs
while ((gpuCount < MAX_NUM_GPUS) && wglEnumGpusNV(gpuCount, &curNVGPU)) {
    unsigned int curDisplay = 0;  // displays per current GPU
    GPU_DEVICE gpuDevice;
    gpuDevice.cb = sizeof(gpuDevice);
    affGPUList[gpuCount].init(&curNVGPU, 1);
    // loop through the display devices for this GPU
    while (wglEnumGpuDevicesNV(curNVGPU, curDisplay, &gpuDevice)) {
        displayWinList[displayCount].setGPUName(gpuDevice.DeviceString);
        displayWinList[displayCount].setDisplayName(gpuDevice.DeviceName);
        // GPU_DEVICE reports the display extents in rcVirtualScreen
        displayWinList[displayCount].setRect(gpuDevice.rcVirtualScreen);
        if (gpuDevice.Flags & DISPLAY_DEVICE_PRIMARY_DEVICE)
            displayWinList[displayCount].primary = true;
        curDisplay++;
        displayCount++;
    } // end of enumerating displays
    gpuCount++;
} // end of enumerating GPUs
At run time, the affinity OpenGL context must be made current to the window DC
before any OpenGL calls are made. This way, rendering only happens to the
subrectangles of the window that overlap the parts of the desktop displayed
by the GPUs in the affinity mask of the context.
case WM_PAINT:
    // use the affinity context for this window
    wglMakeCurrent(winDC, pAffinityGPU->affinityGLRC);
    glGetString(GL_RENDERER);
    // drawing code goes here
    SwapBuffers(winDC);
    break;
At application shutdown, the affinity context and the affinity DC must be deleted.
CAffGPU::~CAffGPU() {
    if (affinityGLRC)
        wglDeleteContext(affinityGLRC);
    if (affinityDC)
        wglDeleteDCNV(affinityDC);
}
Note: When the affinity GL context is used, it is not recommended to create another
OpenGL context from the Windows DC. Doing so may lead to unexpected
behavior when querying OpenGL attributes using glGetString.
Typical Multi-Display Configurations
We examine the commonly used multi-display setups that mix grayscale monitors
and color panels and their underlying GPU configuration.
Table 3. Characteristics for 10 MP Setup
Total Resolution         10 MP              5120 x 2048 (landscape) or 4096 x 2560 (portrait)
Side Display (Primary)   Quadro NVS 290     1 PCI 1x slot; good for system with only 2 PCI x16 slots
                         Quadro FX 1800,    1 PCI 16x; recommended for systems with 3 PCI 16x slots
                         Quadro FX 3800
                         Quadro FX 4800     2 PCI 16x; high-end systems with 4 PCI 16x slots
Grayscale Display        Grayscale GPUs     2 PCI 16x slot
                         (Table 2)
Table 4. Characteristics for the 20 MP Setup
Total Resolution            20 MP              10,240 x 2048 (landscape) or 8192 x 2560 (portrait)
Side Display GPU            Quadro NVS 290     1 PCIE x1 slot
(Primary Display)
Grayscale Display GPU 1     Grayscale GPUs     2 PCIE x16 slot
                            (Table 2)
Grayscale Display GPU 2     Grayscale GPUs     2 PCIE x16 slot
                            (Table 2)
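The total resolutions quoted for the 10 MP and 20 MP setups follow from tiling 5 MP panels side by side. The short sketch below reproduces that arithmetic; it assumes that every grayscale panel is 2560 × 2048 in landscape orientation and that the panels are arranged in a single row, which is consistent with the table values but not stated explicitly. Extent, tileHorizontally, rotated and megapixels are illustrative names.

```cpp
#include <cassert>

// Width and height of a display or desktop region in pixels.
struct Extent { int width, height; };

// Side-by-side tiling: widths add up, the height stays the same.
inline Extent tileHorizontally(Extent panel, int count) {
    assert(count > 0);
    return Extent{panel.width * count, panel.height};
}

// Portrait orientation swaps width and height.
inline Extent rotated(Extent e) { return Extent{e.height, e.width}; }

// Pixel count in millions.
inline double megapixels(Extent e) {
    return static_cast<double>(e.width) * e.height / 1.0e6;
}
```

Two landscape panels give 5120 × 2048 and two portrait panels give 4096 × 2560 (about 10.5 million pixels each); four panels give 10,240 × 2048 or 8192 × 2560 (about 21 million pixels).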
References
[1] Digital Imaging and Communications in Medicine (DICOM), Part 14: Grayscale
Standard Display Function. http://medical.nema.org
[2] NDS Dome E5 Display.
http://www.ndssi.com/products/dome/ex-grayscale/e5.html
[3] Eizo Radiforce GS520 Display.
http://www.radiforce.com/en/products/mono-gs520-dm.html
[4] Integer Texture Extension.
http://www.opengl.org/registry/specs/EXT/texture_integer.txt
[5] NVIDIA NVAPI. www.nvapi.com
[6] GPU Affinity Specification.
http://developer.download.nvidia.com/opengl/specs/WGL_nv_gpu_affinity.txt
[7] Ian Williams, HD is now 8MP & HDR, Slides from NVISION 2008.
http://www.nvidia.com/content/nvision2008/tech_presentations/Professional_Visualization/NVISION08-8MP_HDR.pdf
Implementation Details
The following source code is divided into 3 separate projects. The intent is for these
components to be mixed and matched according to the user application
requirements.
GrayscaleDemo.sln
- GrayscaleDemo.[cpp|h] – An example demo application that does the
various texture setups and allows the user to choose a grayscale image for
display.
CheckGrayscale.sln
- CDisplayWin.[cpp|h] – Class CDisplayWin that encapsulates all attributes
of an attached display, such as name, extents, driving GPU, etc.
- CheckGrayscale.cpp – Main program that enumerates all attached GPUs
and displays using the Win GDI API and uses NVIDIA NVAPI to check which
displays are grayscale compatible.
MultiGPUAffinity.sln
- CAffGPU.[cpp|h] – Class CAffGPU that encapsulates an affinity GPU
with its attributes, such as the DC, OpenGL context, etc.
- CAffDisplayWin.[cpp|h] – Class CAffDisplayWin that extends
CDisplayWin to include affinity specific information.
- MultiGPUAffinity.cpp – Main program that enumerates all GPUs, creates
the affinity data structures and does the event handling.
Trademarks
NVIDIA, the NVIDIA logo, CUDA and Quadro are trademarks or registered trademarks of NVIDIA Corporation
in the United States and other countries. Other company and product names may be trademarks of the
respective companies with which they are associated.
Copyright
© 2009 NVIDIA Corporation. All rights reserved.