This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Commit fa8616e

Fix MSC3030 /timestamp_to_event returning outliers that it has no idea whether are near a gap or not (#14215)
Fix the MSC3030 `/timestamp_to_event` endpoint returning `outliers`. Synapse cannot tell whether an `outlier` is near a gap, and is therefore unable to determine whether it's actually the closest event. The reason Synapse doesn't know whether an `outlier` is next to a gap is that our gap checks rely on entries in the `event_edges`, `event_forward_extremities`, and `event_backward_extremities` tables, which [do not exist for `outliers`](https://github.com/matrix-org/synapse/blob/2c63cdcc3f1aa4625e947de3c23e0a8133c61286/docs/development/room-dag-concepts.md#outliers).

Also fixes the MSC3030 Complement `can_paginate_after_getting_remote_event_from_timestamp_to_event_endpoint` test flake. Although this showed up as a flake in Complement, if `sync_partial_state` raced and beat us before `/timestamp_to_event`, then even retrying the failing `/context` request wouldn't work until we made this Synapse change. With this PR, Synapse will never return an `outlier` event, so that test will always go and ask over federation.

Fix #13944

### Why did this fail before? Why was it flaky?

Sleuthing the server logs on the [CI failure](https://github.com/matrix-org/synapse/actions/runs/3149623842/jobs/5121449357#step:5:5805), it looks like `hs2:/timestamp_to_event` found `$NP6-oU7mIFVyhtKfGvfrEQX949hQX-T-gvuauG6eurU` as an `outlier` event locally. Then when we went and asked for it via `/context`, since it's an `outlier`, it was filtered out of the results -> `You don't have permission to access that event.`

This is reproducible when `sync_partial_state` races and persists `$NP6-oU7mIFVyhtKfGvfrEQX949hQX-T-gvuauG6eurU` as an `outlier` before we evaluate `get_event_for_timestamp(...)`. To consistently reproduce locally, just add a delay at the [start of `get_event_for_timestamp(...)`](https://github.com/matrix-org/synapse/blob/cb20b885cb4bd1648581dd043a184d86fc8c7a00/synapse/handlers/room.py#L1470-L1496) so it always runs after `sync_partial_state` completes.

```py
from twisted.internet import task as twisted_task

d = twisted_task.deferLater(self.hs.get_reactor(), 3.5)
await d
```

In a run where it passes, on `hs2`, `get_event_for_timestamp(...)` finds a different event locally which is next to a gap, and we request a closer one from `hs1`, which gets backfilled. Since the backfilled event is not an `outlier`, it's returned as expected during `/context`.
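The effect of the fix described above can be sketched outside Synapse with an in-memory SQLite database. This is only an illustration under stated assumptions: the two-table schema is a heavily simplified, hypothetical stand-in for Synapse's real `events` and `rejections` tables, and the helper names `build_demo_db` and `closest_event_id` are invented for this sketch. The query shape follows the one in this commit.

```python
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create a toy database (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE events (
            event_id TEXT PRIMARY KEY,
            room_id TEXT NOT NULL,
            origin_server_ts INTEGER NOT NULL,
            depth INTEGER NOT NULL,
            stream_ordering INTEGER NOT NULL,
            outlier BOOLEAN NOT NULL
        );
        CREATE TABLE rejections (event_id TEXT PRIMARY KEY);
        """
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
        [
            # A normal event, followed later by an outlier.
            ("$normal", "!room", 1000, 1, 1, 0),
            ("$outlier", "!room", 2000, 2, 2, 1),
        ],
    )
    return conn


def closest_event_id(
    conn: sqlite3.Connection, room_id: str, timestamp: int, direction: str
) -> Optional[str]:
    """Return the closest non-outlier, non-rejected event in `direction`."""
    if direction == "b":
        comparison_operator, order = "<=", "DESC"
    else:
        comparison_operator, order = ">=", "ASC"

    # Only the operator and sort order are interpolated, and both come from
    # the fixed pairs above. The actual values are bound as `?` parameters.
    sql = f"""
        SELECT event_id FROM events
        LEFT JOIN rejections USING (event_id)
        WHERE
            room_id = ?
            AND origin_server_ts {comparison_operator} ?
            AND NOT outlier
            AND rejections.event_id IS NULL
        ORDER BY origin_server_ts {order}, depth {order}, stream_ordering {order}
        LIMIT 1
    """
    row = conn.execute(sql, (room_id, timestamp)).fetchone()
    return row[0] if row else None
```

Asking for the closest event at or before the outlier's own timestamp skips the outlier and falls back to the earlier normal event, which mirrors what the new `test_no_outliers` test checks through the REST endpoint.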
1 parent 2a76a73 commit fa8616e

3 files changed: +104 -21 lines changed

changelog.d/14215.bugfix

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Fix [MSC3030](https://github.com/matrix-org/matrix-spec-proposals/pull/3030) `/timestamp_to_event` endpoint returning potentially inaccurate closest events with `outliers` present.

synapse/storage/databases/main/events_worker.py

Lines changed: 38 additions & 21 deletions
@@ -1971,12 +1971,17 @@ async def is_event_next_to_backward_gap(self, event: EventBase) -> bool:
 
         Args:
             room_id: room where the event lives
-            event_id: event to check
+            event: event to check (can't be an `outlier`)
 
         Returns:
             Boolean indicating whether it's an extremity
         """
 
+        assert not event.internal_metadata.is_outlier(), (
+            "is_event_next_to_backward_gap(...) can't be used with `outlier` events. "
+            "This function relies on `event_backward_extremities` which won't be filled in for `outliers`."
+        )
+
         def is_event_next_to_backward_gap_txn(txn: LoggingTransaction) -> bool:
             # If the event in question has any of its prev_events listed as a
             # backward extremity, it's next to a gap.
@@ -2026,12 +2031,17 @@ async def is_event_next_to_forward_gap(self, event: EventBase) -> bool:
 
         Args:
             room_id: room where the event lives
-            event_id: event to check
+            event: event to check (can't be an `outlier`)
 
         Returns:
             Boolean indicating whether it's an extremity
         """
 
+        assert not event.internal_metadata.is_outlier(), (
+            "is_event_next_to_forward_gap(...) can't be used with `outlier` events. "
+            "This function relies on `event_edges` and `event_forward_extremities` which won't be filled in for `outliers`."
+        )
+
         def is_event_next_to_gap_txn(txn: LoggingTransaction) -> bool:
             # If the event in question is a forward extremity, we will just
             # consider any potential forward gap as not a gap since it's one of
@@ -2112,13 +2122,33 @@ async def get_event_id_for_timestamp(
             The closest event_id otherwise None if we can't find any event in
             the given direction.
         """
+        if direction == "b":
+            # Find closest event *before* a given timestamp. We use descending
+            # (which gives values largest to smallest) because we want the
+            # largest possible timestamp *before* the given timestamp.
+            comparison_operator = "<="
+            order = "DESC"
+        else:
+            # Find closest event *after* a given timestamp. We use ascending
+            # (which gives values smallest to largest) because we want the
+            # closest possible timestamp *after* the given timestamp.
+            comparison_operator = ">="
+            order = "ASC"
 
-        sql_template = """
+        sql_template = f"""
             SELECT event_id FROM events
             LEFT JOIN rejections USING (event_id)
             WHERE
-                origin_server_ts %s ?
-                AND room_id = ?
+                room_id = ?
+                AND origin_server_ts {comparison_operator} ?
+                /**
+                 * Make sure the event isn't an `outlier` because we have no way
+                 * to later check whether it's next to a gap. `outliers` do not
+                 * have entries in the `event_edges`, `event_forward_extremeties`,
+                 * and `event_backward_extremities` tables to check against
+                 * (used by `is_event_next_to_backward_gap` and `is_event_next_to_forward_gap`).
+                 */
+                AND NOT outlier
                 /* Make sure event is not rejected */
                 AND rejections.event_id IS NULL
                 /**
@@ -2128,27 +2158,14 @@ async def get_event_id_for_timestamp(
                  * Finally, we can tie-break based on when it was received on the server
                  * (`stream_ordering`).
                  */
-            ORDER BY origin_server_ts %s, depth %s, stream_ordering %s
+            ORDER BY origin_server_ts {order}, depth {order}, stream_ordering {order}
             LIMIT 1;
         """
 
         def get_event_id_for_timestamp_txn(txn: LoggingTransaction) -> Optional[str]:
-            if direction == "b":
-                # Find closest event *before* a given timestamp. We use descending
-                # (which gives values largest to smallest) because we want the
-                # largest possible timestamp *before* the given timestamp.
-                comparison_operator = "<="
-                order = "DESC"
-            else:
-                # Find closest event *after* a given timestamp. We use ascending
-                # (which gives values smallest to largest) because we want the
-                # closest possible timestamp *after* the given timestamp.
-                comparison_operator = ">="
-                order = "ASC"
-
             txn.execute(
-                sql_template % (comparison_operator, order, order, order),
-                (timestamp, room_id),
+                sql_template,
+                (room_id, timestamp),
             )
             row = txn.fetchone()
             if row:
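The `ORDER BY origin_server_ts {order}, depth {order}, stream_ordering {order} LIMIT 1` clause in the hunk above amounts to picking the maximum (for `"b"`) or minimum (for `"f"`) of a three-part sort key among the non-outlier candidates. A rough pure-Python sketch of that selection rule follows. The `Row` dataclass and `pick_closest` function are hypothetical names used only for illustration and are not part of Synapse.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Row:
    """Hypothetical stand-in for one row of the `events` table."""

    event_id: str
    origin_server_ts: int
    depth: int
    stream_ordering: int
    outlier: bool = False


def pick_closest(rows: Iterable[Row], timestamp: int, direction: str) -> Optional[Row]:
    """Mimic the WHERE / ORDER BY / LIMIT 1 logic of the SQL query."""
    if direction == "b":
        # "<=" with DESC ordering: the largest key at or before the timestamp.
        candidates = [
            r for r in rows if not r.outlier and r.origin_server_ts <= timestamp
        ]
        choose = max
    else:
        # ">=" with ASC ordering: the smallest key at or after the timestamp.
        candidates = [
            r for r in rows if not r.outlier and r.origin_server_ts >= timestamp
        ]
        choose = min

    if not candidates:
        return None
    # Ties on timestamp fall through to depth, then to stream_ordering.
    return choose(
        candidates, key=lambda r: (r.origin_server_ts, r.depth, r.stream_ordering)
    )
```

With several events sharing one `origin_server_ts`, the backward direction prefers the deepest and most recently received event, while the forward direction prefers the shallowest one. Outliers never win, regardless of their key.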

tests/rest/client/test_rooms.py

Lines changed: 65 additions & 0 deletions
@@ -39,6 +39,8 @@
 )
 from synapse.api.errors import Codes, HttpResponseException
 from synapse.appservice import ApplicationService
+from synapse.events import EventBase
+from synapse.events.snapshot import EventContext
 from synapse.handlers.pagination import PurgeStatus
 from synapse.rest import admin
 from synapse.rest.client import account, directory, login, profile, register, room, sync
@@ -51,6 +53,7 @@
 from tests.http.server._base import make_request_with_cancellation_test
 from tests.storage.test_stream import PaginationTestCase
 from tests.test_utils import make_awaitable
+from tests.test_utils.event_injection import create_event
 
 PATH_PREFIX = b"/_matrix/client/api/v1"
 
@@ -3486,3 +3489,65 @@ def test_400_missing_param_without_id_access_token(self) -> None:
         )
         self.assertEqual(channel.code, 400)
         self.assertEqual(channel.json_body["errcode"], "M_MISSING_PARAM")
+
+
+class TimestampLookupTestCase(unittest.HomeserverTestCase):
+    servlets = [
+        admin.register_servlets,
+        room.register_servlets,
+        login.register_servlets,
+    ]
+
+    def default_config(self) -> JsonDict:
+        config = super().default_config()
+        config["experimental_features"] = {"msc3030_enabled": True}
+        return config
+
+    def prepare(self, reactor: MemoryReactor, clock: Clock, hs: HomeServer) -> None:
+        self._storage_controllers = self.hs.get_storage_controllers()
+
+        self.room_owner = self.register_user("room_owner", "test")
+        self.room_owner_tok = self.login("room_owner", "test")
+
+    def _inject_outlier(self, room_id: str) -> EventBase:
+        event, _context = self.get_success(
+            create_event(
+                self.hs,
+                room_id=room_id,
+                type="m.test",
+                sender="@test_remote_user:remote",
+            )
+        )
+
+        event.internal_metadata.outlier = True
+        self.get_success(
+            self._storage_controllers.persistence.persist_event(
+                event, EventContext.for_outlier(self._storage_controllers)
+            )
+        )
+        return event
+
+    def test_no_outliers(self) -> None:
+        """
+        Test to make sure `/timestamp_to_event` does not return `outlier` events.
+        We're unable to determine whether an `outlier` is next to a gap so we
+        don't know whether it's actually the closest event. Instead, let's just
+        ignore `outliers` with this endpoint.
+
+        This test is really seeing that we choose the non-`outlier` event behind the
+        `outlier`. Since the gap checking logic considers the latest message in the room
+        as *not* next to a gap, asking over federation does not come into play here.
+        """
+        room_id = self.helper.create_room_as(self.room_owner, tok=self.room_owner_tok)
+
+        outlier_event = self._inject_outlier(room_id)
+
+        channel = self.make_request(
+            "GET",
+            f"/_matrix/client/unstable/org.matrix.msc3030/rooms/{room_id}/timestamp_to_event?dir=b&ts={outlier_event.origin_server_ts}",
+            access_token=self.room_owner_tok,
+        )
+        self.assertEqual(HTTPStatus.OK, channel.code, msg=channel.json_body)
+
+        # Make sure the outlier event is not returned
+        self.assertNotEqual(channel.json_body["event_id"], outlier_event.event_id)
