This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Synapse can fail to move large media uploads from temporary storage to disk in time before the client disconnects #13009

Open
@reivilibre

Description

This issue is a distilled version of #12937.

When a file is uploaded to Synapse's media store, Twisted buffers it into a temporary file and gives Synapse an IO handle to it once it's been fully received.
The issue is that, with sufficiently large files and slow storage, copying the bytes out of that temporary file into the correct location in the media store can take more than 30 seconds.
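
To put a rough number on "sufficiently large files and slow storage", the following back-of-the-envelope sketch computes the sustained copy rate needed to finish inside a 30-second window. The 2 GiB size is an assumption loosely based on the `filename=2.0gb` in the log further down, and `min_copy_throughput` is a hypothetical helper, not part of Synapse.

```python
def min_copy_throughput(size_bytes: int, timeout_seconds: float = 30.0) -> float:
    """Return the bytes per second needed to copy size_bytes within the timeout."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return size_bytes / timeout_seconds


# Assumed upload size: 2 GiB (illustrative only).
two_gib = 2 * 1024**3
required = min_copy_throughput(two_gib)
print(f"{required / 1024**2:.1f} MiB/s")  # prints "68.3 MiB/s"
```

Note that this is the rate for the whole copy, which both reads from the temporary file and writes to the media store, so storage shared between the two (or a network filesystem) could plausibly fall short of it.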

Clients such as Element Web will time out 30 seconds after having sent the last byte. (It also wouldn't seem unusual for a reverse proxy to do something similar.)
When this happens, the disconnection causes Twisted to close the temporary file and so Synapse's copy is interrupted.
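
The failure mode can be imitated outside Synapse. The sketch below is only an illustration under assumptions: `DisconnectingSource` is a hypothetical stand-in that closes the temporary file after a fixed number of reads, which is roughly what the disconnection does to the real copy, and it is not how Twisted is implemented. It shows `shutil.copyfileobj` raising `ValueError` partway through, leaving a partial destination file.

```python
import shutil
import tempfile


class DisconnectingSource:
    """Hypothetical wrapper: closes the underlying file after N successful reads."""

    def __init__(self, fileobj, close_after_reads):
        self._fileobj = fileobj
        self._remaining = close_after_reads

    def read(self, size=-1):
        if self._remaining == 0:
            # Simulate the temporary file being closed when the client goes away.
            self._fileobj.close()
        self._remaining -= 1
        return self._fileobj.read(size)


def copy_with_simulated_disconnect():
    """Copy 64 KiB in 16 KiB chunks; the source is closed before the third read."""
    source = tempfile.TemporaryFile()
    try:
        source.write(b"x" * 64 * 1024)
        source.seek(0)
        with tempfile.TemporaryFile() as dest:
            try:
                shutil.copyfileobj(DisconnectingSource(source, 2), dest, 16 * 1024)
            except ValueError:
                # Reading from a closed file raises ValueError, as in the traceback.
                return dest.tell(), True
            return dest.tell(), False
    finally:
        source.close()


copied, failed = copy_with_simulated_disconnect()
print(copied, failed)
```

Only the first two chunks reach the destination before the copy fails, which mirrors the interrupted copy described above.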

The error produced in the logs looks as follows:

synapse.http.site - 362 - INFO - POST-8959 - Connection from client lost before response was sent
synapse.http.server - 183 - ERROR - POST-8959 - Failed handle request via 'UploadResource': <XForwardedForRequest at 0x7f5bada94c40 method='POST' uri='/_matrix/media/r0/upload?filename=2.0gb' clientproto='HTTP/1.1' site='8008'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/synapse/http/server.py", line 366, in _async_render_wrapper
    callback_return = await self._async_render(request)
  File "/usr/local/lib/python3.9/site-packages/synapse/http/server.py", line 396, in _async_render
    callback_return = await raw_callback_return
  File "/usr/local/lib/python3.9/site-packages/synapse/rest/media/v1/upload_resource.py", line 96, in _async_render_POST
    content_uri = await self.media_repo.create_content(
  File "/usr/local/lib/python3.9/site-packages/synapse/rest/media/v1/media_repository.py", line 178, in create_content
    fname = await self.media_storage.store_file(content, file_info)
  File "/usr/local/lib/python3.9/site-packages/synapse/rest/media/v1/media_storage.py", line 94, in store_file
    await self.write_to_file(source, f)
  File "/usr/local/lib/python3.9/site-packages/synapse/rest/media/v1/media_storage.py", line 101, in write_to_file
    await defer_to_thread(self.reactor, _write_file_synchronously, source, output)
  File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 244, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 260, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
  File "/usr/local/lib/python3.9/site-packages/synapse/logging/context.py", line 970, in g
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/synapse/rest/media/v1/media_storage.py", line 312, in _write_file_synchronously
    shutil.copyfileobj(source, dest)
  File "/usr/local/lib/python3.9/shutil.py", line 205, in copyfileobj
    buf = fsrc_read(length)
ValueError: I/O operation on closed file
synapse.http.server - 795 - WARNING - POST-8959 - Not sending response to request <XForwardedForRequest at 0x7f5bada94c40 method='POST' uri='/_matrix/media/r0/upload?filename=2.0gb' clientproto='HTTP/1.1' site='8008'>, already disconnected.

We would benefit overall from writing the bytes straight into a file on the correct filesystem as they're received from the client (either to the correct location, or to a temporary location with a rename operation upon completion). The copy would then no longer have to finish inside the client's timeout window, and the client wouldn't have to 'wait around' for it, since the write has been happening during the upload (and in most cases, disk I/O is probably faster than the client's upload speed anyway!). In short: it would avoid this failure and reduce upload latency a bit for large files.
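
A minimal sketch of the "temporary location with a rename upon completion" variant, using only the Python standard library. It assumes the upload is available as an iterable of byte chunks; how to obtain such chunks from Twisted is exactly the open question here and is not shown. `store_streamed_upload` and the `.upload-` prefix are hypothetical names, not Synapse APIs.

```python
import os
import tempfile
from typing import Iterable


def store_streamed_upload(chunks: Iterable[bytes], final_path: str) -> int:
    """Write chunks to a temp file beside final_path, then rename it into place."""
    dest_dir = os.path.dirname(os.path.abspath(final_path))
    os.makedirs(dest_dir, exist_ok=True)
    # Creating the temp file in the destination directory keeps it on the same
    # filesystem, so the final rename does not need to copy any data.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".upload-")
    written = 0
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
                written += len(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, final_path)
    except BaseException:
        # Remove the partial file if the upload or the rename fails.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    return written
```

Because the work left after the last byte arrives is a flush and a rename rather than a full copy, it should be far less likely to outlast a 30-second client timeout, although a slow `fsync` could still take noticeable time on some storage.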

(An uglier workaround would be to prevent the file copy from blocking the response, returning the MXC URL before it's even ready and finishing the copy in the background. I don't think it's great to give out an MXC URL that's not ready, since it could lead to races.)

Either solution probably involves some amount of spelunking into how Twisted manages these temporary files.

Metadata

Assignees

No one assigned

    Labels

    A-Media-Repository: Uploading, downloading images and video, thumbnailing
    O-Occasional: Affects or can be seen by some users regularly or most users rarely
    S-Major: Major functionality / product severely impaired, no satisfactory workaround.
    T-Defect: Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
