TypeError: cannot pickle '_thread.lock' object in TensorFlow 2.4 #46556
Comments
Was able to reproduce the issue with TF v2.4 and TF-nightly. Please check the attached screenshot for reference. Whereas with TF v2.3 the error is Similar to issue #45918 Thanks!
Hi @rpasricha, to narrow down the issue here I just tried running the sample provided in the Multi-worker training with Estimator tutorial. I seem to be getting the same
I am having the same problem in TF2.4.0 and TF2.4.1 with the same stacktrace. I tried both python3.6/python3.7 with
Also facing the issue. Is there a workaround for this issue?
The problem is this I found a workaround by explicitly overwriting
In your code, just use the above two classes for
Could you try disabling eager execution via tf.compat.v1.disable_eager_execution()? It needs to be called at the beginning of the program.
Found another workaround. Add the following code to explicitly disable the health check:
I was able to get around the issue by disabling eager execution, thanks. @nikitamaia The tutorial only runs the code on a single node; the issue only arises when doing distributed training with Estimator + MultiWorkerMirroredStrategy.
@rpasricha ah yes, I meant to say to run the code from the tutorial but in your multi-node environment (not in Colab). Regardless, it seems we have a workaround for now.
Closing this issue since it is a duplicate of #45918. For further updates please refer to the other thread so we can track this in one place.
Fixes tensorflow/tensorflow#46556 PiperOrigin-RevId: 364780060
System information
You can collect some of this information using our environment capture script.
You can also obtain the TensorFlow version with:
TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
Describe the current behavior
Running a simple training process with MultiWorkerMirroredStrategy fails with
TypeError: can't pickle _thread.lock objects.
Describe the expected behavior
The training should proceed without errors.
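For context, the TypeError reported above is the generic Python error raised when something tries to pickle or deep-copy an object that holds a threading lock. Below is a minimal sketch of that failure mode, independent of TensorFlow. The Holder class is hypothetical and only stands in for any object that keeps a lock as an attribute; it is not taken from the TensorFlow source, and this sketch does not claim to show where TensorFlow performs the copy.

```python
import copy
import threading


class Holder:
    """Hypothetical stand-in for any object that stores a lock."""

    def __init__(self):
        self.lock = threading.Lock()


def try_deepcopy(obj):
    """Return the TypeError message raised by deepcopy, or None on success."""
    try:
        copy.deepcopy(obj)
    except TypeError as exc:
        return str(exc)
    return None


# deepcopy falls back to the pickle protocol for the lock attribute and fails.
message = try_deepcopy(Holder())
print(message)
```

The exact wording depends on the Python version: Python 3.6 and 3.7 report "can't pickle _thread.lock objects", while Python 3.8 and later report "cannot pickle '_thread.lock' object", which would explain why the issue title and the description quote slightly different messages.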
Standalone code to reproduce the issue
The example needs to run in a distributed environment to reproduce the issue, so save the script in a file and run it in 3 different terminals.
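Each of the three terminals needs its own TF_CONFIG environment variable so the worker processes can find each other. The original script is not preserved here, so the following is only a sketch of that setup: the localhost addresses, the port numbers, and the helper name make_tf_config are assumptions for illustration.

```python
import json
import os

# Assumed addresses for three workers on one machine; the ports are arbitrary.
WORKERS = ["localhost:12345", "localhost:12346", "localhost:12347"]


def make_tf_config(task_index):
    """Build the TF_CONFIG JSON string for the worker with the given index."""
    if not 0 <= task_index < len(WORKERS):
        raise ValueError("task_index must be between 0 and %d" % (len(WORKERS) - 1))
    return json.dumps({
        "cluster": {"worker": WORKERS},
        "task": {"type": "worker", "index": task_index},
    })


# In the first terminal; set this before TensorFlow creates the strategy.
os.environ["TF_CONFIG"] = make_tf_config(0)
```

The second and third terminals would pass 1 and 2 instead of 0; the cluster description stays identical across all three.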
Other info / logs
Full logs: