TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0 CUDA_ERROR_INVALID_HANDLE #97387

@MikeHallettUK

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

binary

TensorFlow version

tf-nightly-2.21.0.dev20250722

Custom code

No

OS platform and distribution

Ubuntu 20.04

Mobile device

no

Python version

3.11

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

12.8.1/9.8

GPU model and memory

NVIDIA GeForce RTX 5080, 16 GB

Current behavior?

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1753232013.685876 24341 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.

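After this warning, loading the JIT-compiled kernel fails with CUDA_ERROR_INVALID_PTX, and the first GPU kernel launch (a Cast op) aborts with CUDA_ERROR_INVALID_HANDLE.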

Standalone code to reproduce the issue

import tensorflow as tf
import numpy as np
from tensorflow.python.client import device_lib
import keras

print("Keras version: ", keras.__version__)

print(device_lib.list_local_devices())

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
tensor = tf.convert_to_tensor(x)

print("Tensor: ", tensor)

# ========================================== define model ======================================
input_data = keras.Input(shape = (8,1))

# Data Encoder
dx = keras.layers.Dense(16, activation='relu')(input_data)

print("dx", dx.shape)

Relevant log output

Keras version:  3.10.0.dev2025072204
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1753232219.958168   25258 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1753232220.029881   25258 gpu_device.cc:2020] Created device /device:GPU:0 with 11546 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0
W0000 00:00:1753232220.033014   25258 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
W0000 00:00:1753232220.035555   25258 gpu_device.cc:2431] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1753232220.037159   25258 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11546 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13363326776403279234
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 12107055104
locality {
  bus_id: 1
  links {
  }
}
incarnation: 7207726466463925696
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0"
xla_global_id: 416903419
]
Tensor:  tf.Tensor(
[[1 2 3]
 [4 5 6]
 [7 8 9]], shape=(3, 3), dtype=int64)
2025-07-23 10:57:00.115747: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_PTX'

2025-07-23 10:57:00.115757: W tensorflow/compiler/mlir/tools/kernel_gen/tf_gpu_runtime_wrappers.cc:40] 'cuModuleGetFunction(&function, module, kernel_name)' failed with 'CUDA_ERROR_INVALID_HANDLE'

2025-07-23 10:57:00.115761: W tensorflow/core/framework/op_kernel.cc:1842] INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
2025-07-23 10:57:00.115766: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: INTERNAL: 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE'
Traceback (most recent call last):
  File "/home/mike/catkin_ws2/src/mypy311/scripts/tftest.py", line 19, in <module>
    dx = keras.layers.Dense(16, activation='relu')(input_data)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mike/PycharmProjects/py311/.venv/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/mike/PycharmProjects/py311/.venv/lib/python3.11/site-packages/keras/src/backend/tensorflow/core.py", line 152, in convert_to_tensor
    return tf.cast(x, dtype)
           ^^^^^^^^^^^^^^^^^
tensorflow.python.framework.errors_impl.InternalError: {{function_node __wrapped__Cast_device_/job:localhost/replica:0/task:0/device:GPU:0}} 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE' [Op:Cast] name: 

Process finished with exit code 1
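
The warning in the log says the installed build ships no kernel binaries for compute capability 12.0 and falls back to PTX. As a rough diagnostic, the sketch below compares the capabilities a build was compiled for against the device's capability. The helper names (`has_native_kernels`, `describe`, `query_tensorflow`) are hypothetical, and the exact-match rule on `sm_XY` entries is an assumption about how the check behaves, not TensorFlow's actual implementation. The example capability list is illustrative, not taken from this build.

```python
def has_native_kernels(build_caps, device_cc):
    """Return True if build_caps contains an 'sm_XY' entry for device_cc.

    build_caps: list of strings such as 'sm_80' or 'compute_90'.
    device_cc:  (major, minor) tuple, e.g. (12, 0) for the RTX 5080 above.
    Assumption: only an exact 'sm_' match counts as a native binary.
    """
    major, minor = device_cc
    return f"sm_{major}{minor}" in build_caps


def describe(build_caps, device_cc):
    """Summarise whether the device would rely on PTX JIT compilation."""
    major, minor = device_cc
    if has_native_kernels(build_caps, device_cc):
        return f"native kernels available for compute capability {major}.{minor}"
    return f"no native kernels for compute capability {major}.{minor}; PTX JIT fallback"


def query_tensorflow():
    """Collect the same inputs from an installed TensorFlow (not run automatically)."""
    import tensorflow as tf

    # Key may be absent on CPU-only builds, hence the default.
    build_caps = list(tf.sysconfig.get_build_info().get("cuda_compute_capabilities", []))
    results = []
    for gpu in tf.config.list_physical_devices("GPU"):
        details = tf.config.experimental.get_device_details(gpu)
        cc = details.get("compute_capability")
        if cc is not None:
            results.append((details.get("device_name"), describe(build_caps, cc)))
    return results


# Illustrative capability list (hypothetical values).
example_caps = ["sm_60", "sm_70", "sm_80", "sm_89", "compute_90"]
print(describe(example_caps, (12, 0)))
```

Calling `query_tensorflow()` in the affected environment should show which capabilities the nightly wheel was built for, which may help confirm whether `sm_120` is missing from the binary.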
