TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0 CUDA_ERROR_INVALID_HANDLE

### Issue type

Bug

### Have you reproduced the bug with TensorFlow Nightly?

Yes

### Source

binary

### TensorFlow version

tf-nightly-2.21.0.dev20250722

### Custom code

No

### OS platform and distribution

Ubuntu 20.04

### Mobile device

no

### Python version

3.11

### Bazel version

_No response_

### GCC/compiler version

_No response_

### CUDA/cuDNN version

12.8.1/9.8

### GPU model and memory

RTX5080 16gb

### Current behavior?


WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1753232013.685876   24341 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.

CUDA_ERROR_INVALID_HANDLE



### Standalone code to reproduce the issue

```shell
import tensorflow as tf
import numpy as np
from tensorflow.python.client import device_lib
import keras

print("Keras version: ", keras.__version__)

print(device_lib.list_local_devices())

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
tensor = tf.convert_to_tensor(x)

print("Tensor: ", tensor)

# ========================================== define model ======================================
input_data = keras.Input(shape = (8,1))

# Data Encoder
dx = keras.layers.Dense(16, activation='relu')(input_data)

print("dx", dx.shape)
```

### Relevant log output

```shell
Keras version:  3.10.0.dev2025072204
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1753232219.958168   25258 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1753232220.029881   25258 gpu_device.cc:2020] Created device /device:GPU:0 with 11546 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0
W0000 00:00:1753232220.033014   25258 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
W0000 00:00:1753232220.035555   25258 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1753232220.037159   25258 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11546 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13363326776403279234
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 12107055104
locality {
  bus_id: 1
  links {
  }
}
incarnation: 7207726466463925696
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0"
xla_global_id: 416903419
]
Tensor:  tf.Tensor(
[[1 2 3]
 [4 5 6]
 [7 8 9]], shape=(3, 3), dtype=int64)
2025-07-23 10:57:00.115747: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_PTX'

2025-07-23 10:57:00.115757: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleGetFunction(&function, module, kernel_name)' failed with 'CUDA_ERROR_INVALID_HANDLE'

2025-07-23 10:57:00.115761: W tensorflow/core/framework/op_kernel.cc:1842] INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
2025-07-23 10:57:00.115766: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
Traceback (most recent call last):
  File "/home/mike/catkin_ws2/src/mypy311/scripts/tftest.py", line 19, in <module>
    dx = keras.layers.Dense(16, activation='relu')(input_data)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mike/PycharmProjects/py311/.venv/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/mike/PycharmProjects/py311/.venv/lib/python3.11/site-packages/keras/src/backend/tensorflow/core.py", line 152, in convert_to_tensor
    return tf.cast(x, dtype)
           ^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InternalError: {{function_node __wrapped__Cast_device_/job:localhost/replica:0/task:0/device:GPU:0}} 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE' [Op:Cast] name: 

Process finished with exit code 1
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0 CUDA_ERROR_INVALID_HANDLE #97387

Issue type

Have you reproduced the bug with TensorFlow Nightly?

Source

TensorFlow version

Custom code

OS platform and distribution

Mobile device

Python version

Bazel version

GCC/compiler version

CUDA/cuDNN version

GPU model and memory

Current behavior?

Standalone code to reproduce the issue

Relevant log output

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0 CUDA_ERROR_INVALID_HANDLE #97387

Description

Issue type

Have you reproduced the bug with TensorFlow Nightly?

Source

TensorFlow version

Custom code

OS platform and distribution

Mobile device

Python version

Bazel version

GCC/compiler version

CUDA/cuDNN version

GPU model and memory

Current behavior?

Standalone code to reproduce the issue

Relevant log output

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.