Concurrent downloads of remote media can lead to orphaned files in storage providers. #8692

Open
@matrixbot

This issue has been migrated from #8692.


When running multiple media workers, it is possible for more than one of them to fetch and persist the same remote media at the same time. matrix-org/synapse#8682 fixed this so that it a) no longer throws an error to the client and b) deletes the duplicated media from disk. However, by that point the duplicated media is already queued up to be uploaded to S3, so it ends up orphaned in the storage provider.

We could change it to upload to storage providers only after persisting to the DB. However, that runs the risk of the DB referencing media that never reached a storage provider, e.g. if the worker dies halfway through the upload.

I think the "fix" here is to add a `delete_file` function to the storage provider interface, which can be called to delete a file (or to cancel its upload if it hasn't finished uploading).
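To make the proposal concrete, here is a rough sketch of what `delete_file` could look like. It is not Synapse's actual storage provider interface: `InMemoryStorageProvider`, `queue_upload`, `_upload` and `demo` are hypothetical names, and an in-memory dict stands in for S3. Only the `delete_file` name comes from the suggestion above.

```python
import asyncio
from typing import Dict


class InMemoryStorageProvider:
    """Hypothetical stand-in for a remote storage provider such as S3."""

    def __init__(self) -> None:
        self.stored: Dict[str, bytes] = {}
        self._uploads: Dict[str, asyncio.Task] = {}

    async def _upload(self, path: str, data: bytes) -> None:
        await asyncio.sleep(0.05)  # simulate a slow background upload
        self.stored[path] = data

    def queue_upload(self, path: str, data: bytes) -> asyncio.Task:
        # Must be called from within a running event loop.
        task = asyncio.create_task(self._upload(path, data))
        self._uploads[path] = task
        return task

    async def delete_file(self, path: str) -> None:
        """Cancel a pending upload for `path`, then remove any stored copy."""
        task = self._uploads.pop(path, None)
        if task is not None and not task.done():
            task.cancel()
            try:
                await task
            except asyncio.CancelledError:
                pass  # the upload was cancelled, as intended
        # Also covers the case where the upload had already finished.
        self.stored.pop(path, None)


async def demo() -> Dict[str, bytes]:
    provider = InMemoryStorageProvider()
    # Two uploads get queued for what turns out to be the same remote media.
    provider.queue_upload("remote/copy_a", b"media")
    provider.queue_upload("remote/copy_b", b"media")
    # Assume copy_b lost the race to persist to the DB, so clean it up.
    await provider.delete_file("remote/copy_b")
    await asyncio.sleep(0.1)  # let the remaining upload finish
    return provider.stored
```

In this sketch, the worker that loses the race to persist the media would call `delete_file` for its duplicate at the same point where it currently deletes the duplicate from local disk, so no orphaned copy is left in the provider.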
