Description
Ray often fails to round-trip an exception through pickle because some part of the user program is not serializable. As a result, users see an unactionable error message such as:
File "/usr/local/lib/python3.11/dist-packages/ray/exceptions.py", line 45, in from_bytes
return RayError.from_ray_exception(ray_exception)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/ray/exceptions.py", line 54, in from_ray_exception
raise RuntimeError(msg) from e
RuntimeError: Failed to unpickle serialized exception
(from #50138)
or
(TaskRunner pid=530440) Failed to unpickle serialized exception
(TaskRunner pid=530440) Traceback (most recent call last):
(TaskRunner pid=530440) File "/usr/local/lib/python3.12/site-packages/ray/exceptions.py", line 51, in from_ray_exception
(TaskRunner pid=530440) return pickle.loads(ray_exception.serialized_exception)
(TaskRunner pid=530440) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=530440) TypeError: BackendCompilerFailed.__init__() missing 1 required positional argument: 'inner_exception'
(TaskRunner pid=530440)
(TaskRunner pid=530440) The above exception was the direct cause of the following exception:
(TaskRunner pid=530440)
(TaskRunner pid=530440) Traceback (most recent call last):
(TaskRunner pid=530440) File "/usr/local/lib/python3.12/site-packages/ray/_private/serialization.py", line 460, in deserialize_objects
(TaskRunner pid=530440) obj = self._deserialize_object(data, metadata, object_ref)
(TaskRunner pid=530440) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=530440) File "/usr/local/lib/python3.12/site-packages/ray/_private/serialization.py", line 342, in _deserialize_object
(TaskRunner pid=530440) return RayError.from_bytes(obj)
(TaskRunner pid=530440) ^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=530440) File "/usr/local/lib/python3.12/site-packages/ray/exceptions.py", line 45, in from_bytes
(TaskRunner pid=530440) return RayError.from_ray_exception(ray_exception)
(TaskRunner pid=530440) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(TaskRunner pid=530440) File "/usr/local/lib/python3.12/site-packages/ray/exceptions.py", line 54, in from_ray_exception
(TaskRunner pid=530440) raise RuntimeError(msg) from e
(TaskRunner pid=530440) RuntimeError: Failed to unpickle serialized exception
(from #54341)
A workaround is for users to wrap the failing block in a try/except themselves and re-raise a different, serializable exception. Often, however, the exception is raised inside third-party library code, where the user has no easy way to intercept it.
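When the task function itself is under the user's control, the workaround above can be sketched as a decorator. This is a minimal sketch, not part of Ray's API: `SerializableTaskError` and `reraise_as_serializable` are hypothetical names, and the round-trip check uses the standard `pickle` module as an approximation of what happens when an exception is shipped between processes.

```python
import functools
import pickle
import traceback


class SerializableTaskError(RuntimeError):
    """Hypothetical wrapper that carries only a string, so it always pickles."""


def reraise_as_serializable(fn):
    """Re-raise exceptions that fail a pickle round trip as SerializableTaskError."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            # Capture the original traceback before attempting the round trip.
            details = traceback.format_exc()
            try:
                pickle.loads(pickle.dumps(exc))
            except Exception:
                # The exception would not survive transport: replace it with a
                # plain-string error that preserves the type, message, and trace.
                raise SerializableTaskError(
                    f"{type(exc).__name__}: {exc}\n{details}"
                ) from None
            # The exception pickles fine, so let it propagate unchanged.
            raise

    return wrapper
```

The decorated function can then be passed to `ray.remote(...)` in the same way as the function in the script below. Exceptions that already survive pickling pass through untouched, so callers can still catch them by their original type.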
Ray can continue to raise the RuntimeError, but it should also include a string representation of the original exception and stack trace so that users can understand and act on the failure.
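One way to picture this proposal is the sketch below. It is an illustration only and does not reflect how Ray actually stores exceptions: `ExceptionPayload`, `serialize_exception`, and `deserialize_exception` are hypothetical names. The idea is to capture the formatted traceback as a plain string next to the pickled bytes, and to fall back to that string when unpickling fails.

```python
import pickle
import traceback
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExceptionPayload:
    """Hypothetical transport format: pickled bytes plus a plain-text copy."""

    pickled: Optional[bytes]  # None if the exception could not be pickled at all
    formatted: str  # output of traceback.format_exception, captured at the source


def serialize_exception(exc: BaseException) -> ExceptionPayload:
    # The string form is captured first, since it cannot fail to deserialize.
    formatted = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    try:
        pickled = pickle.dumps(exc)
    except Exception:
        pickled = None
    return ExceptionPayload(pickled=pickled, formatted=formatted)


def deserialize_exception(payload: ExceptionPayload) -> BaseException:
    msg = (
        "Failed to unpickle serialized exception. Original error:\n"
        + payload.formatted
    )
    if payload.pickled is None:
        raise RuntimeError(msg)
    try:
        return pickle.loads(payload.pickled)
    except Exception as e:
        # Same RuntimeError as today, but now carrying the original trace.
        raise RuntimeError(msg) from e
```

With this shape, the caller still sees a `RuntimeError`, so existing error handling keeps working, while the message itself names the original exception type and where it was raised.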
Reproducible Script
import openai
import ray
from openai import AuthenticationError

def call_openai_and_error_out():
    client = openai.OpenAI(
        base_url="https://api.endpoints.anyscale.com/v1",
        api_key="test",
    )
    try:
        client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a chatbot."},
                {"role": "user", "content": "What is the capital of France?"},
            ],
        )
    except AuthenticationError as e:
        print("Errored as expected given API key is invalid.")
        raise e

remote_fn = ray.remote(call_openai_and_error_out)
ray.get(remote_fn.remote())
This produces an unactionable stack trace such as:
2025-08-02 14:19:36,726 ERROR serialization.py:462 -- Failed to unpickle serialized exception
Traceback (most recent call last):
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/exceptions.py", line 51, in from_ray_exception
return pickle.loads(ray_exception.serialized_exception)
TypeError: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/_private/serialization.py", line 460, in deserialize_objects
obj = self._deserialize_object(data, metadata, object_ref)
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/_private/serialization.py", line 342, in _deserialize_object
return RayError.from_bytes(obj)
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/exceptions.py", line 45, in from_bytes
return RayError.from_ray_exception(ray_exception)
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/exceptions.py", line 54, in from_ray_exception
raise RuntimeError(msg) from e
RuntimeError: Failed to unpickle serialized exception
Traceback (most recent call last):
File "/Users/rliaw/dev/proteins/_test.py", line 31, in <module>
ray.get(remote_fn.remote())
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
return fn(*args, **kwargs)
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
return func(*args, **kwargs)
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/_private/worker.py", line 2782, in get
values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/_private/worker.py", line 931, in get_objects
raise value
ray.exceptions.RaySystemError: System error: Failed to unpickle serialized exception
traceback: Traceback (most recent call last):
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/exceptions.py", line 51, in from_ray_exception
return pickle.loads(ray_exception.serialized_exception)
TypeError: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/_private/serialization.py", line 460, in deserialize_objects
obj = self._deserialize_object(data, metadata, object_ref)
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/_private/serialization.py", line 342, in _deserialize_object
return RayError.from_bytes(obj)
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/exceptions.py", line 45, in from_bytes
return RayError.from_ray_exception(ray_exception)
File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/ray/exceptions.py", line 54, in from_ray_exception
raise RuntimeError(msg) from e
RuntimeError: Failed to unpickle serialized exception
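The `TypeError` at the root of this trace can be reproduced without Ray or OpenAI. The sketch below uses a hypothetical stand-in class, `StatusError`, which is not the OpenAI class but has the same shape: its `__init__` requires keyword-only arguments and forwards only the message to `Exception.__init__`. Pickle records the class and `self.args`, then rebuilds the object by calling the class with those positional arguments alone, so the keyword-only arguments are missing on load.

```python
import pickle


class StatusError(Exception):
    """Stand-in for an exception whose __init__ needs keyword-only arguments."""

    def __init__(self, message, *, response, body):
        super().__init__(message)  # only `message` ends up in self.args
        self.response = response
        self.body = body


def roundtrip_error(exc):
    """Return the exception raised by a pickle round trip, or None on success."""
    try:
        pickle.loads(pickle.dumps(exc))
    except Exception as err:
        return err
    return None


if __name__ == "__main__":
    # Pickling succeeds, but unpickling calls StatusError("boom") and fails
    # with a TypeError about the missing keyword-only arguments.
    print(repr(roundtrip_error(StatusError("boom", response=None, body=None))))
    # A built-in exception round-trips cleanly, so this prints None.
    print(repr(roundtrip_error(ValueError("boom"))))
```

Serialization succeeds on the worker and the failure only appears on the receiving side, which is why the error surfaces in `from_ray_exception` rather than at the point where the task raised.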
The original stack trace and message are still available on the worker, though. Wrapping the function body in a try/except prints them:
try:
    ...
except Exception as e:
    import traceback
    # `code` exists on the OpenAI error classes; fall back to None for other exceptions.
    print(type(e), getattr(e, "code", None))
    print(traceback.format_exc())
    raise
which produces much more useful output:
(call_openai_and_error_out pid=32523) <class 'openai.NotFoundError'> None
(call_openai_and_error_out pid=32523) Traceback (most recent call last):
(call_openai_and_error_out pid=32523) File "/Users/rliaw/dev/proteins/_test.py", line 13, in call_openai_and_error_out
(call_openai_and_error_out pid=32523) client.chat.completions.create(
(call_openai_and_error_out pid=32523) File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/openai/_utils/_utils.py", line 275, in wrapper
(call_openai_and_error_out pid=32523) return func(*args, **kwargs)
(call_openai_and_error_out pid=32523) File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/openai/resources/chat/completions.py", line 667, in create
(call_openai_and_error_out pid=32523) return self._post(
(call_openai_and_error_out pid=32523) File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/openai/_base_client.py", line 1208, in post
(call_openai_and_error_out pid=32523) return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
(call_openai_and_error_out pid=32523) File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/openai/_base_client.py", line 897, in request
(call_openai_and_error_out pid=32523) return self._request(
(call_openai_and_error_out pid=32523) File "/Users/rliaw/miniconda3/lib/python3.10/site-packages/openai/_base_client.py", line 988, in _request
(call_openai_and_error_out pid=32523) raise self._make_status_error_from_response(err.response) from None
(call_openai_and_error_out pid=32523) openai.NotFoundError
All of these issues are related:
- [Core] ray raises a "Failed to unpickle serialized exception" error when an OpenAI Authentication Error is raised in task #43428
- [Core] Deserialize tensorflow MultilineMessageKeyError #50138
- [Core] Please provide better message where 'RuntimeError: Failed to unpickle serialized exception' #49885
- [Python] ERROR serialization.py:462 -- Failed to unpickle serialized exception #49970
- RuntimeError: Failed to unpickle serialized exception | how can i solve it? #54341