This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Race condition with replication means that publishing room aliases lacks read-after-write consistency between workers #14210

Open
@DMRobertson


Consider the following sequence of events:

  1. Alice creates a room without any aliases.
  2. Alice lists aliases for that room.
  3. Alice sets an alias for that room.
  4. Alice lists aliases for that room.

If the alias writes occur on a separate worker to the reads, this is vulnerable to a classic worker cache invalidation race:

  • (2) succeeds because the reader has no cached alias information for the room. It queries the database (which has already been written to by the time (1) completes) and caches the result, an empty list of aliases.
  • (3) succeeds on the writer, which fires off a message telling readers to invalidate their caches.
  • ⚠️ If request (4) arrives before the reader has received and processed the invalidation, the reader will return the (now stale) data in its cache. This means Alice has failed to read her own write.
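
The race above can be reproduced deterministically with a toy model. This is only a sketch: `Database`, `ReaderWorker`, `WriterWorker` and `replication_inbox` are hypothetical names invented for illustration and are not Synapse's actual classes or replication API. The reader only drops its cached entry when it gets round to processing its inbox, so a read that arrives in between returns stale data.

```python
from collections import deque


class Database:
    """Shared store mapping room_id -> list of aliases (hypothetical)."""

    def __init__(self):
        self.aliases = {}

    def get_aliases(self, room_id):
        return list(self.aliases.get(room_id, []))

    def add_alias(self, room_id, alias):
        self.aliases.setdefault(room_id, []).append(alias)


class ReaderWorker:
    """Serves reads from a local cache, falling back to the database."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        # Invalidation messages that have been sent but not yet processed.
        self.replication_inbox = deque()

    def list_aliases(self, room_id):
        if room_id not in self.cache:
            self.cache[room_id] = self.db.get_aliases(room_id)
        return list(self.cache[room_id])

    def process_replication(self):
        # Drop the cached entry for every room named in a pending message.
        while self.replication_inbox:
            self.cache.pop(self.replication_inbox.popleft(), None)


class WriterWorker:
    """Writes to the database, then tells readers to invalidate."""

    def __init__(self, db, readers):
        self.db = db
        self.readers = readers

    def set_alias(self, room_id, alias):
        self.db.add_alias(room_id, alias)
        for reader in self.readers:
            reader.replication_inbox.append(room_id)


def run_scenario():
    db = Database()
    reader = ReaderWorker(db)
    writer = WriterWorker(db, [reader])
    room = "!room:example.org"

    db.aliases[room] = []                          # (1) room created, no aliases
    before = reader.list_aliases(room)             # (2) cache miss, caches []
    writer.set_alias(room, "#alias:example.org")   # (3) write + invalidation sent
    stale = reader.list_aliases(room)              # (4) served from stale cache
    reader.process_replication()                   # invalidation finally handled
    fresh = reader.list_aliases(room)              # a later read sees the alias
    return before, stale, fresh


if __name__ == "__main__":
    print(run_scenario())
```

In this model the stale read at (4) disappears if `process_replication()` runs before it, which is exactly the ordering the real system cannot guarantee.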

I don't think actual humans edit and then immediately list aliases that often, so I suggest we don't worry about fixing this (i.e. I think this only manifests as test flakes). But I wanted to write this up as a reference: it would be nice to have a catalogue of known races like this.

History:

See issues labeled with Z-Read-After-Write (a lack of read-after-write consistency, usually due to cache invalidation races with workers).

And previous related history specifically around aliases:

Metadata

Assignees

No one assigned

    Labels

    • A-Testing: Issues related to testing in complement, synapse, etc
    • A-Workers: Problems related to running Synapse in Worker Mode (or replication)
    • O-Uncommon: Most users are unlikely to come across this or unexpected workflow
    • S-Tolerable: Minor significance, cosmetic issues, low or no impact to users.
    • T-Defect: Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
    • Z-Read-After-Write: A lack of read-after-write consistency, usually due to cache invalidation races with workers
