gradio 4 critical bug -- all return messages truncated to 65k #6601

Closed · 1 task done
pseudotensor opened this issue Nov 28, 2023 · 14 comments · Fixed by #6693
Labels: API (Related to one of the client libraries or usage of Gradio via API) · bug (Something isn't working) · Regression (Bugs did not exist in previous versions of Gradio)

Comments

@pseudotensor (Contributor)

Describe the bug

In gradio 3 there were never any issues with returning large amounts of text or data. However, in gradio 4 this is totally broken.

This is a super critical bug, given how broadly it applies to all API calls of any return type (text, audio, etc.).

Related: #6319

That is, even once the heartbeat bug is fixed, long output still hits the JSON error mentioned there.

Have you searched existing issues? 🔎

  • I have searched and found no existing issues

Reproduction

server:

import gradio as gr
import random
import time

with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox()
    clear = gr.Button("Clear")

    def user(user_message, history):
        return "", history + [[user_message, None]]

    def bot(history):
        # ~200k characters (100000 'a's joined by spaces = 199999 chars),
        # well past the 65k point where truncation appears
        bot_message = ' '.join(['a'] * 100000)
        history[-1][1] = bot_message
        yield history

    msg.submit(user, [msg, chatbot], [msg, chatbot], queue=False).then(
        fn=bot, inputs=chatbot, outputs=chatbot, api_name='bot',
    )
    clear.click(lambda: None, None, chatbot, api_name='clear')

demo.queue()
demo.launch()

client:

import time
from gradio_client import Client

client = Client('http://localhost:7860', serialize=False)

args = [[['Who are you?', None]]]
res = client.predict(*tuple(args), api_name='/bot')
print(res)

Error:

(/data/conda/h2ogpt) jon@pseudotensor:~/h2ogpt$ python testchat_nostream_client.py 
Loaded as API: http://localhost:7860/ ✔
Traceback (most recent call last):
  File "/home/jon/h2ogpt/testchat_nostream_client.py", line 7, in <module>
    res = client.predict(*tuple(args), api_name='/bot')
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 305, in predict
    return self.submit(*args, api_name=api_name, fn_index=fn_index).result()
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 1456, in result
    return super().result(timeout=timeout)
  File "/data/conda/h2ogpt/lib/python3.10/concurrent/futures/_base.py", line 445, in result
    return self.__get_result()
  File "/data/conda/h2ogpt/lib/python3.10/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/data/conda/h2ogpt/lib/python3.10/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 869, in _inner
    predictions = _predict(*data)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 894, in _predict
    result = utils.synchronize_async(self._sse_fn, data, hash_data, helper)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/utils.py", line 665, in synchronize_async
    return fsspec.asyn.sync(fsspec.asyn.get_loop(), func, *args, **kwargs)  # type: ignore
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 1075, in _sse_fn
    return await utils.get_pred_from_sse(
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/utils.py", line 342, in get_pred_from_sse
    return task.result()
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/utils.py", line 374, in stream_sse
    resp = json.loads(line[5:])
  File "/data/conda/h2ogpt/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/data/conda/h2ogpt/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/data/conda/h2ogpt/lib/python3.10/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 70 (char 69)

Checking the details, the issue is that all messages of any kind are truncated to no more than 65k bytes. This was never a problem with gradio 3, where I can see large messages work perfectly fine.

For audio and video, this makes gradio 4 and its API dead on arrival.
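
To see why the client fails with that JSONDecodeError, here is a minimal standalone illustration (the payload shape is an assumption, not the exact server message): cutting a JSON string off at a fixed byte boundary leaves an unterminated string literal, exactly as in the traceback above.

import json

# Hypothetical payload shaped like an SSE event; truncating it mid-string
# reproduces the "Unterminated string" error from the traceback.
event = 'data: {"msg": "process_generating", "output": "' + 'a' * 100000 + '"}'
chunk = event[:65536]  # simulate the cut-off at a ~65k transport chunk boundary

try:
    json.loads(chunk[len('data: '):])
except json.JSONDecodeError as e:
    print(e)  # Unterminated string starting at: line 1 column ... (char ...)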

Screenshot

No response

Logs

No response

System Info

gradio==4.7.1
gradio_client==0.7.0

Severity

Blocking usage of gradio

@pseudotensor pseudotensor added the bug Something isn't working label Nov 28, 2023
@pseudotensor (Contributor, Author)

pseudotensor commented Nov 28, 2023

If I try to debug the client against the above running server, and step through stream_sse(), eventually, for no obvious reason, it hits this on the client:

Traceback (most recent call last):
  File "/data/conda/h2ogpt/lib/python3.10/concurrent/futures/_base.py", line 445, in result
    return self.__get_result()
  File "/data/conda/h2ogpt/lib/python3.10/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/data/conda/h2ogpt/lib/python3.10/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 869, in _inner
    predictions = _predict(*data)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 894, in _predict
    result = utils.synchronize_async(self._sse_fn, data, hash_data, helper)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/utils.py", line 665, in synchronize_async
    return fsspec.asyn.sync(fsspec.asyn.get_loop(), func, *args, **kwargs)  # type: ignore
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 1075, in _sse_fn
    return await utils.get_pred_from_sse(
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/utils.py", line 342, in get_pred_from_sse
    return task.result()
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/utils.py", line 407, in stream_sse
    req.raise_for_status()
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/httpx/_models.py", line 758, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://localhost:7860/queue/data'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500

and on server:

ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 426, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fastapi/applications.py", line 1106, in __call__
    await super().__call__(scope, receive, send)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fastapi/routing.py", line 274, in app
    raw_response = await run_endpoint_function(
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio/routes.py", line 668, in queue_data
    blocks._queue.attach_data(body)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio/queueing.py", line 161, in attach_data
    raise ValueError("Event not found", event_id)
ValueError: ('Event not found', 'c4d06de1de954425bc4fe30e64eee7fe')

It's as if the client-server connection is sensitive to timing in how the async stream is handled. That's not good.

@pseudotensor (Contributor, Author)

pseudotensor commented Nov 28, 2023

My guess is this is the offending changes: #6069

FYI @aliabid94

It seems to be both unstable (timing dependent) and wrong (truncation).

@pseudotensor (Contributor, Author)

Actually, I can't tell where things changed. Maybe the make_predict changes by @pngwn.

@pseudotensor (Contributor, Author)

pseudotensor commented Nov 28, 2023

If I start messing with the stream_sse() code, e.g. just adding a print, I see random changes in behavior: sometimes the print shows the full correct output, sometimes not. All a mess.

i.e. just this debug:

            async for line in response.aiter_text():
                print(len(line), flush=True)
                if line.startswith("data:"):

Then the last length printed before failure is all over the place: sometimes 32761, sometimes 65529; both fail.

But if put instead:

            async for line in response.aiter_text():
                if len(line) > 65000:
                    print(line, flush=True)
                    continue
                if line.startswith("data:"):

Even though this doesn't make it work (nor is it intended to), sometimes I see the full message printed with size 65536.

But of course the "continue" is not valid and leads to other issues, and without the continue I never see the right length.

@pseudotensor (Contributor, Author)

This seems to work:

            async for line in response.aiter_lines():
                print(len(line), flush=True)
                if len(line) == 0:
                    continue
                if line.startswith("data:"):

i.e. aiter_lines() instead of aiter_text(), and skipping zero-length lines.
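
For reference, a minimal standalone sketch of that approach (not the actual gradio_client code; the URL, method, and event handling are assumptions): aiter_text() yields raw transport chunks, which is why lengths like 32761 and 65529 show up, while aiter_lines() reassembles complete lines regardless of chunk boundaries.

import json
import httpx

async def read_sse_events(url: str):
    # Sketch of an SSE reader built on aiter_lines() instead of aiter_text().
    # aiter_lines() buffers across transport chunks, so a long "data:" line
    # arrives whole even when it spans several 32-64k chunks.
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream("GET", url) as response:
            async for line in response.aiter_lines():
                if not line:  # blank lines separate SSE events
                    continue
                if line.startswith("data:"):
                    yield json.loads(line[len("data:"):].strip())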

Related: encode/httpx#2310

pseudotensor added a commit to h2oai/h2ogpt that referenced this issue Nov 28, 2023
@pseudotensor (Contributor, Author)

But it's not a perfect fix. I sometimes see:


Traceback (most recent call last):
  File "/home/jon/h2ogpt/gradio_utils/grclient.py", line 264, in submit
    self.refresh_client_if_should()
  File "/home/jon/h2ogpt/gradio_utils/grclient.py", line 210, in refresh_client_if_should
    server_hash = self.get_server_hash()
  File "/home/jon/h2ogpt/gradio_utils/grclient.py", line 203, in get_server_hash
    return super().submit(api_name="/system_hash").result()
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 1456, in result
    return super().result(timeout=timeout)
  File "/data/conda/h2ogpt/lib/python3.10/concurrent/futures/_base.py", line 445, in result
    return self.__get_result()
  File "/data/conda/h2ogpt/lib/python3.10/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/data/conda/h2ogpt/lib/python3.10/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 869, in _inner
    predictions = _predict(*data)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 894, in _predict
    result = utils.synchronize_async(self._sse_fn, data, hash_data, helper)
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/utils.py", line 667, in synchronize_async
    return fsspec.asyn.sync(fsspec.asyn.get_loop(), func, *args, **kwargs)  # type: ignore
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/client.py", line 1075, in _sse_fn
    return await utils.get_pred_from_sse(
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/utils.py", line 342, in get_pred_from_sse
    return task.result()
  File "/data/conda/h2ogpt/lib/python3.10/site-packages/gradio_client/utils.py", line 413, in stream_sse
    raise ValueError(f"Unexpected message: {line}")
ValueError: Unexpected message: {"detail":"Not Found"}

pseudotensor added a commit to h2oai/h2ogpt that referenced this issue Nov 29, 2023
@abidlabs (Member)

Thanks for the detailed report @pseudotensor. Is this only an issue when you are using the Client to make a prediction, or also when you use the Gradio app via the UI?

@abidlabs abidlabs added the Regression Bugs did not exist in previous versions of Gradio label Nov 29, 2023
@pseudotensor (Contributor, Author)

So far I've only seen it in API use, and the only workaround changes so far are on the client side.

@pseudotensor (Contributor, Author)

Even with the workarounds I've made so far, I still hit this and get hangs:

Traceback (most recent call last):
  File "/home/jon/h2ogpt0/gradio_utils/grclient.py", line 268, in submit
    self.refresh_client_if_should()
  File "/home/jon/h2ogpt0/gradio_utils/grclient.py", line 214, in refresh_client_if_should
    server_hash = self.get_server_hash()
  File "/home/jon/h2ogpt0/gradio_utils/grclient.py", line 207, in get_server_hash
    return super().submit(api_name="/system_hash").result()
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/site-packages/gradio_client/client.py", line 1456, in result
    return super().result(timeout=timeout)
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/site-packages/gradio_client/client.py", line 869, in _inner
    predictions = _predict(*data)
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/site-packages/gradio_client/client.py", line 894, in _predict
    result = utils.synchronize_async(self._sse_fn, data, hash_data, helper)
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/site-packages/gradio_client/utils.py", line 670, in synchronize_async
    return fsspec.asyn.sync(fsspec.asyn.get_loop(), func, *args, **kwargs)  # type: ignore
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/site-packages/gradio_client/client.py", line 1075, in _sse_fn
    return await utils.get_pred_from_sse(
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/site-packages/gradio_client/utils.py", line 343, in get_pred_from_sse
    return task.result()
  File "/home/jon/miniconda3/envs/h2ogpt0/lib/python3.10/site-packages/gradio_client/utils.py", line 417, in stream_sse
    raise ValueError("Did not receive process_completed message.")
ValueError: Did not receive process_completed message.

@pseudotensor (Contributor, Author)

I'll probably have to revert back to gradio 3. It's just too unstable.

@abidlabs (Member)

We're working through a few issues related to the Client, e.g. #6602

Btw, I was just looking at your Client code and noticed that you have:

import time
from gradio_client import Client

client = Client('http://localhost:7860', serialize=False)

In Gradio 4.x, you don't need to set serialize=False, since the Chatbot component now returns a list of tuples by default. You can just do:

import time
from gradio_client import Client

client = Client('http://localhost:7860')

Will work through the other issues you mentioned here soon.

@pseudotensor (Contributor, Author)

pseudotensor commented Nov 30, 2023

I need serialize=False in general for many other reasons. E.g., if I push an http link into a textbox through the API, it gets converted into a {'path': filename}-like dict, with the filename pointing at a temp file that gets created (not sure whether by the client or the server).

I don't want any of those conversions done. And since the client is global to all API calls, I must disable serialization entirely and get strictly unprocessed inputs from client -> server and server -> client.

@abidlabs (Member)

Got it, okay. We can explore those issues later; let's get this unblocked first.

@freddyaboulton freddyaboulton self-assigned this Dec 4, 2023
@freddyaboulton freddyaboulton added the API Related to the one of the client libraries or usage of Gradio via API label Dec 4, 2023
@freddyaboulton (Collaborator)

Hi @pseudotensor! Thanks for all of the helpful comments you left on this thread as you investigated the issue. I think I have a fix in #6693. I would appreciate it if you could test it out. You can install the client from that PR with:

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@6887fe4e080647f9892b478c35a625b056eddb31#subdirectory=client/python"

That PR only targets the 65k response length issue and the heartbeat issue (#6319). There are still some other issues with the client that we are fixing, and #6556 should hopefully fix a lot of them.
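
Assuming the reproduction above, a quick sanity check of the fix could look like this (the indexing assumes /bot returns the chatbot history as in the repro; adjust as needed):

from gradio_client import Client

client = Client('http://localhost:7860', serialize=False)
res = client.predict([['Who are you?', None]], api_name='/bot')
# ' '.join(['a'] * 100000) is 199999 characters; anything shorter means truncation
print(len(res[-1][1]))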
