This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Commit d0b294a

Make historical events discoverable from backfill for servers without any scrollback history (MSC2716) (#10245)
* Make historical messages available to federated servers. Part of MSC2716: matrix-org/matrix-spec-proposals#2716. Follow-up to #9247.
* Debug message not available on federation
* Add base starting insertion point when no chunk ID is provided
* Fix messages from multiple senders in historical chunk. Follow-up to #9247, part of MSC2716: matrix-org/matrix-spec-proposals#2716. Previously, Synapse would throw a 403, `Cannot force another user to join.`, because we were trying to use `?user_id` from a single virtual user, which did not match messages from other users in the chunk.
* Remove debug lines
* Experiment with selecting insertion event extremities
* Move db schema change to new version
* Add better comments
* Make a fake requester with just what we need. See #10276 (comment).
* Store insertion events in table
* Make base insertion event float off on its own. See #10250 (comment). Conflicts: synapse/rest/client/v1/room.py
* Validate that the app service can actually control the given user. See #10276 (comment). Conflicts: synapse/rest/client/v1/room.py
* Add better comments on what we're trying to check for
* Continue debugging
* Share validation logic
* Add inserted historical messages to /backfill response
* Remove debug SQL queries
* Some marker event implementation trials
* Clean up PR
* Rename insertion_event_id to just event_id
* Add better SQL comments
* More accurate description
* Add changelog
* Make it clear which MSC the change is part of
* Add more detail on which insertion event came through
* Address review and improve SQL queries
* Only use event_id as unique constraint
* Fix test case where insertion event is already in the normal DAG
* Remove debug changes
* Switch to chunk events so we can auth via power_levels. Previously, we were using `content.chunk_id` to connect one chunk to another. But these events can be from any `sender`, so we can't tell who should be able to send historical events. We know we only want the application service to do it, but these events have the sender of a real historical message, not the application service user ID, as the sender. Other federated homeservers also have no indication of which senders are application services on the originating homeserver. So we auth all of the MSC2716 events via power_levels and have them be sent by the application service with proper PL levels in the room.
* Switch to chunk events for federation
* Add unstable room version to support new historical PL
* Fix federated events being rejected for no state_groups. Add fix from #10439 until it merges.
* Only connect base insertion event to prev_event_ids. Per discussion with @erikjohnston, https://matrix.to/#/!UytJQHLQYfvYWsGrGY:jki.re/$12bTUiObDFdHLAYtT7E-BvYRp3k_xv8w0dUQHibasJk?via=jki.re&via=matrix.org
* Make it possible to get the room_version with txn
* Allow but ignore historical events in unsupported room versions. See #10245 (comment). We can't reject historical events on unsupported room versions because homeservers without knowledge of MSC2716 or the new room version don't reject historical events either. Since we can't rely on the auth check here to stop historical events on unsupported room versions, additional checks were added in the processing/persisting code (`synapse/storage/databases/main/events.py` -> `_handle_insertion_event` and `_handle_chunk_event`). This required some refactoring so there is a method to fetch the room version by `txn`.
* Move to unique index syntax. See #10245 (comment).
* Document at a high level how the insertion->chunk lookup works
* Remove create_event fallback for room_versions. See https://github.com/matrix-org/synapse/pull/10245/files#r677641879
* Use updated method name
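The insertion->chunk linkage the commit message describes can be sketched with plain dicts: an insertion event advertises a next chunk ID, and the chunk event that continues the history claims that same ID, which is what lets backfill hop from an insertion event into its historical chunk. The event shapes below are an illustrative simplification, not the exact MSC2716 wire format; the field names follow the unstable `org.matrix.msc2716.*` prefixes used in this commit.

```python
# Illustrative sketch of how MSC2716 chunks chain together: each insertion
# event names a next_chunk_id, and the chunk event that continues the
# history carries that same id as its chunk_id.
insertion_event = {
    "type": "org.matrix.msc2716.insertion",
    "content": {"org.matrix.msc2716.next_chunk_id": "chunkA"},
}
chunk_event = {
    "type": "org.matrix.msc2716.chunk",
    "content": {"org.matrix.msc2716.chunk_id": "chunkA"},
}

def chunk_connects(insertion: dict, chunk: dict) -> bool:
    # A chunk continues an insertion point when the ids match.
    return (
        insertion["content"]["org.matrix.msc2716.next_chunk_id"]
        == chunk["content"]["org.matrix.msc2716.chunk_id"]
    )

print(chunk_connects(insertion_event, chunk_event))  # True
```

Because the linkage lives in event content rather than in `prev_events`, any sender could forge it, which is exactly why the commit moves authorization to a power-levels check instead.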
1 parent 8c201c9 commit d0b294a

File tree

12 files changed: +338 additions, -26 deletions

changelog.d/10245.feature

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Make historical events discoverable from backfill for servers without any scrollback history (part of MSC2716).

synapse/api/constants.py

Lines changed: 0 additions & 3 deletions
@@ -206,9 +206,6 @@ class EventContentFields:
     MSC2716_CHUNK_ID = "org.matrix.msc2716.chunk_id"
     # For "marker" events
     MSC2716_MARKER_INSERTION = "org.matrix.msc2716.marker.insertion"
-    MSC2716_MARKER_INSERTION_PREV_EVENTS = (
-        "org.matrix.msc2716.marker.insertion_prev_events"
-    )
 
 
 class RoomTypes:

synapse/api/room_versions.py

Lines changed: 27 additions & 0 deletions
@@ -73,6 +73,9 @@ class RoomVersion:
     # MSC2403: Allows join_rules to be set to 'knock', changes auth rules to allow sending
     # m.room.membership event with membership 'knock'.
     msc2403_knocking = attr.ib(type=bool)
+    # MSC2716: Adds m.room.power_levels -> content.historical field to control
+    # whether "insertion", "chunk", "marker" events can be sent
+    msc2716_historical = attr.ib(type=bool)
 
 
 class RoomVersions:
@@ -88,6 +91,7 @@ class RoomVersions:
         msc2176_redaction_rules=False,
         msc3083_join_rules=False,
         msc2403_knocking=False,
+        msc2716_historical=False,
     )
     V2 = RoomVersion(
         "2",
@@ -101,6 +105,7 @@ class RoomVersions:
         msc2176_redaction_rules=False,
         msc3083_join_rules=False,
         msc2403_knocking=False,
+        msc2716_historical=False,
     )
     V3 = RoomVersion(
         "3",
@@ -114,6 +119,7 @@ class RoomVersions:
         msc2176_redaction_rules=False,
         msc3083_join_rules=False,
         msc2403_knocking=False,
+        msc2716_historical=False,
     )
     V4 = RoomVersion(
         "4",
@@ -127,6 +133,7 @@ class RoomVersions:
         msc2176_redaction_rules=False,
         msc3083_join_rules=False,
         msc2403_knocking=False,
+        msc2716_historical=False,
     )
     V5 = RoomVersion(
         "5",
@@ -140,6 +147,7 @@ class RoomVersions:
         msc2176_redaction_rules=False,
         msc3083_join_rules=False,
         msc2403_knocking=False,
+        msc2716_historical=False,
     )
     V6 = RoomVersion(
         "6",
@@ -153,6 +161,7 @@ class RoomVersions:
         msc2176_redaction_rules=False,
         msc3083_join_rules=False,
         msc2403_knocking=False,
+        msc2716_historical=False,
     )
     MSC2176 = RoomVersion(
         "org.matrix.msc2176",
@@ -166,6 +175,7 @@ class RoomVersions:
         msc2176_redaction_rules=True,
         msc3083_join_rules=False,
         msc2403_knocking=False,
+        msc2716_historical=False,
     )
     MSC3083 = RoomVersion(
         "org.matrix.msc3083.v2",
@@ -179,6 +189,7 @@ class RoomVersions:
         msc2176_redaction_rules=False,
         msc3083_join_rules=True,
         msc2403_knocking=False,
+        msc2716_historical=False,
     )
     V7 = RoomVersion(
         "7",
@@ -192,6 +203,21 @@ class RoomVersions:
         msc2176_redaction_rules=False,
         msc3083_join_rules=False,
         msc2403_knocking=True,
+        msc2716_historical=False,
+    )
+    MSC2716 = RoomVersion(
+        "org.matrix.msc2716",
+        RoomDisposition.STABLE,
+        EventFormatVersions.V3,
+        StateResolutionVersions.V2,
+        enforce_key_validity=True,
+        special_case_aliases_auth=False,
+        strict_canonicaljson=True,
+        limit_notifications_power_levels=True,
+        msc2176_redaction_rules=False,
+        msc3083_join_rules=False,
+        msc2403_knocking=True,
+        msc2716_historical=True,
     )
 
 
@@ -207,6 +233,7 @@ class RoomVersions:
         RoomVersions.MSC2176,
         RoomVersions.MSC3083,
         RoomVersions.V7,
+        RoomVersions.MSC2716,
     )
 }

synapse/event_auth.py

Lines changed: 38 additions & 0 deletions
@@ -205,6 +205,13 @@ def check(
     if event.type == EventTypes.Redaction:
         check_redaction(room_version_obj, event, auth_events)
 
+    if (
+        event.type == EventTypes.MSC2716_INSERTION
+        or event.type == EventTypes.MSC2716_CHUNK
+        or event.type == EventTypes.MSC2716_MARKER
+    ):
+        check_historical(room_version_obj, event, auth_events)
+
     logger.debug("Allowing! %s", event)
 
 
@@ -539,6 +546,37 @@ def check_redaction(
         raise AuthError(403, "You don't have permission to redact events")
 
 
+def check_historical(
+    room_version_obj: RoomVersion,
+    event: EventBase,
+    auth_events: StateMap[EventBase],
+) -> None:
+    """Check whether the event sender is allowed to send historical related
+    events like "insertion", "chunk", and "marker".
+
+    Returns:
+        None
+
+    Raises:
+        AuthError if the event sender is not allowed to send historical related events
+        ("insertion", "chunk", and "marker").
+    """
+    # Ignore the auth checks in room versions that do not support historical
+    # events
+    if not room_version_obj.msc2716_historical:
+        return
+
+    user_level = get_user_power_level(event.user_id, auth_events)
+
+    historical_level = get_named_level(auth_events, "historical", 100)
+
+    if user_level < historical_level:
+        raise AuthError(
+            403,
+            'You don\'t have permission to send historical related events ("insertion", "chunk", and "marker")',
+        )
+
+
 def _check_power_levels(
     room_version_obj: RoomVersion,
     event: EventBase,
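The power-level gate above is small enough to demonstrate in isolation: any sender below the room's `historical` level (defaulting to 100) is rejected with a 403. This is a standalone sketch, not Synapse's API; the plain-dict power levels and `check_historical_sketch` helper are illustrative stand-ins for the real `get_user_power_level`/`get_named_level` machinery.

```python
# Standalone sketch of the MSC2716 "historical" power-level check.
# The dict-based power levels and this helper are illustrative only.

class AuthError(Exception):
    def __init__(self, code: int, msg: str):
        super().__init__(msg)
        self.code = code

def check_historical_sketch(sender_level: int, power_levels: dict) -> None:
    # Default to 100, matching the new room-creation preset in this commit.
    historical_level = power_levels.get("historical", 100)
    if sender_level < historical_level:
        raise AuthError(403, "not allowed to send historical events")

# An app service at PL 100 passes; an ordinary user at PL 0 is rejected.
check_historical_sketch(100, {"historical": 100})
try:
    check_historical_sketch(0, {"historical": 100})
    rejected = False
except AuthError:
    rejected = True
print(rejected)  # True
```

The default of 100 means that in rooms whose power levels never mention `historical`, only users at creator-level power can send insertion, chunk, or marker events.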

synapse/events/utils.py

Lines changed: 3 additions & 0 deletions
@@ -126,6 +126,9 @@ def add_fields(*fields):
         if room_version.msc2176_redaction_rules:
             add_fields("invite")
 
+        if room_version.msc2716_historical:
+            add_fields("historical")
+
     elif event_type == EventTypes.Aliases and room_version.special_case_aliases_auth:
         add_fields("aliases")
     elif event_type == EventTypes.RoomHistoryVisibility:

synapse/handlers/federation.py

Lines changed: 4 additions & 2 deletions
@@ -2748,9 +2748,11 @@ async def _update_auth_events_and_context_for_auth(
                     event.event_id,
                     e.event_id,
                 )
-                context = await self.state_handler.compute_event_context(e)
+                missing_auth_event_context = (
+                    await self.state_handler.compute_event_context(e)
+                )
                 await self._auth_and_persist_event(
-                    origin, e, context, auth_events=auth
+                    origin, e, missing_auth_event_context, auth_events=auth
                 )
 
                 if e.event_id in event_auth_events:

synapse/handlers/room.py

Lines changed: 1 addition & 0 deletions
@@ -951,6 +951,7 @@ async def send(etype: str, content: JsonDict, **kwargs) -> int:
             "kick": 50,
             "redact": 50,
             "invite": 50,
+            "historical": 100,
         }
 
         if config["original_invitees_have_ops"]:
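The effect of this one-line change is that `historical` joins the other named levels in the default power-levels content of newly created rooms, but at 100 rather than 50. A quick sketch of the resulting content (reduced to the named levels shown in this hunk):

```python
# Sketch of the named power levels a newly created room gets after this
# change: "historical" defaults to 100, so only creator-level users (such
# as the application service sending the batch) can emit MSC2716 events.
power_levels_content = {
    "ban": 50,
    "kick": 50,
    "redact": 50,
    "invite": 50,
    "historical": 100,  # added by this commit
}

# A moderator at PL 50 clears "redact" but not "historical".
moderator_level = 50
can_redact = moderator_level >= power_levels_content["redact"]
can_send_historical = moderator_level >= power_levels_content["historical"]
print(can_redact, can_send_historical)  # True False
```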

synapse/rest/client/v1/room.py

Lines changed: 6 additions & 1 deletion
@@ -504,7 +504,6 @@ async def on_POST(self, request, room_id):
 
         events_to_create = body["events"]
 
-        prev_event_ids = prev_events_from_query
         inherited_depth = await self._inherit_depth_from_prev_ids(
             prev_events_from_query
         )
@@ -516,6 +515,10 @@ async def on_POST(self, request, room_id):
         chunk_id_to_connect_to = chunk_id_from_query
         base_insertion_event = None
         if chunk_id_from_query:
+            # All but the first base insertion event should point at a fake
+            # event, which causes the HS to ask for the state at the start of
+            # the chunk later.
+            prev_event_ids = [fake_prev_event_id]
             # TODO: Verify the chunk_id_from_query corresponds to an insertion event
             pass
         # Otherwise, create an insertion event to act as a starting point.
@@ -526,6 +529,8 @@ async def on_POST(self, request, room_id):
         # an insertion event), in which case we just create a new insertion event
         # that can then get pointed to by a "marker" event later.
         else:
+            prev_event_ids = prev_events_from_query
+
             base_insertion_event_dict = self._create_insertion_event_dict(
                 sender=requester.user.to_string(),
                 room_id=room_id,
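The branching this hunk introduces is easy to state on its own: when a chunk ID is supplied, the new chunk floats off a fake prev event (so the homeserver later asks for state at the start of the chunk), and only the first, base insertion event anchors to the real `prev_events` from the query. A minimal sketch, where `fake_prev_event_id` and `pick_prev_event_ids` are illustrative placeholders rather than Synapse names:

```python
# Illustrative sketch of the prev_event selection in the batch-send handler.
fake_prev_event_id = "$fake_prev_event"  # placeholder, not a real event ID

def pick_prev_event_ids(chunk_id_from_query, prev_events_from_query):
    if chunk_id_from_query:
        # Continuing an existing chunk: float off a fake prev event so the
        # homeserver asks for state at the start of the chunk later.
        return [fake_prev_event_id]
    # First chunk: anchor the base insertion event to the real DAG.
    return list(prev_events_from_query)

print(pick_prev_event_ids("chunkA", ["$real_event"]))  # ['$fake_prev_event']
print(pick_prev_event_ids(None, ["$real_event"]))      # ['$real_event']
```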

synapse/storage/databases/main/event_federation.py

Lines changed: 79 additions & 9 deletions
@@ -936,15 +936,46 @@ def _get_backfill_events(self, txn, room_id, event_list, limit):
         # We want to make sure that we do a breadth-first, "depth" ordered
         # search.
 
-        query = (
-            "SELECT depth, prev_event_id FROM event_edges"
-            " INNER JOIN events"
-            " ON prev_event_id = events.event_id"
-            " WHERE event_edges.event_id = ?"
-            " AND event_edges.is_state = ?"
-            " LIMIT ?"
-        )
+        # Look for the prev_event_id connected to the given event_id
+        query = """
+            SELECT depth, prev_event_id FROM event_edges
+            /* Get the depth of the prev_event_id from the events table */
+            INNER JOIN events
+            ON prev_event_id = events.event_id
+            /* Find an event which matches the given event_id */
+            WHERE event_edges.event_id = ?
+            AND event_edges.is_state = ?
+            LIMIT ?
+        """
+
+        # Look for the "insertion" events connected to the given event_id
+        connected_insertion_event_query = """
+            SELECT e.depth, i.event_id FROM insertion_event_edges AS i
+            /* Get the depth of the insertion event from the events table */
+            INNER JOIN events AS e USING (event_id)
+            /* Find an insertion event which points via prev_events to the given event_id */
+            WHERE i.insertion_prev_event_id = ?
+            LIMIT ?
+        """
+
+        # Find any chunk connections of a given insertion event
+        chunk_connection_query = """
+            SELECT e.depth, c.event_id FROM insertion_events AS i
+            /* Find the chunk that connects to the given insertion event */
+            INNER JOIN chunk_events AS c
+            ON i.next_chunk_id = c.chunk_id
+            /* Get the depth of the chunk start event from the events table */
+            INNER JOIN events AS e USING (event_id)
+            /* Find an insertion event which matches the given event_id */
+            WHERE i.event_id = ?
+            LIMIT ?
+        """
 
+        # In a PriorityQueue, the lowest valued entries are retrieved first.
+        # We're using depth as the priority in the queue.
+        # Depth is lowest at the oldest-in-time message and highest at the
+        # newest-in-time message. We add events to the queue with a negative
+        # depth so that we process the newest-in-time messages first, going
+        # backwards in time.
         queue = PriorityQueue()
 
         for event_id in event_list:
@@ -970,9 +1001,48 @@ def _get_backfill_events(self, txn, room_id, event_list, limit):
 
             event_results.add(event_id)
 
+            # Try and find any potential historical chunks of message history.
+            #
+            # First we look for an insertion event connected to the current
+            # event (by prev_event). If we find any, we need to go and try to
+            # find any chunk events connected to the insertion event (by
+            # chunk_id). If we find any, we'll add them to the queue and
+            # navigate up the DAG like normal in the next iteration of the loop.
+            txn.execute(
+                connected_insertion_event_query, (event_id, limit - len(event_results))
+            )
+            connected_insertion_event_id_results = txn.fetchall()
+            logger.debug(
+                "_get_backfill_events: connected_insertion_event_query %s",
+                connected_insertion_event_id_results,
+            )
+            for row in connected_insertion_event_id_results:
+                connected_insertion_event_depth = row[0]
+                connected_insertion_event = row[1]
+                queue.put((-connected_insertion_event_depth, connected_insertion_event))
+
+                # Find any chunk connections for the given insertion event
+                txn.execute(
+                    chunk_connection_query,
+                    (connected_insertion_event, limit - len(event_results)),
+                )
+                chunk_start_event_id_results = txn.fetchall()
+                logger.debug(
+                    "_get_backfill_events: chunk_start_event_id_results %s",
+                    chunk_start_event_id_results,
+                )
+                for row in chunk_start_event_id_results:
+                    if row[1] not in event_results:
+                        queue.put((-row[0], row[1]))
+
+            # Navigate up the DAG by prev_event
             txn.execute(query, (event_id, False, limit - len(event_results)))
+            prev_event_id_results = txn.fetchall()
+            logger.debug(
+                "_get_backfill_events: prev_event_ids %s", prev_event_id_results
+            )
 
-            for row in txn:
+            for row in prev_event_id_results:
                 if row[1] not in event_results:
                     queue.put((-row[0], row[1]))

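The negative-depth trick from the backfill walk above is easy to demonstrate on its own: `queue.PriorityQueue` retrieves the lowest-valued entry first, so negating the depth makes the newest-in-time (highest-depth) event pop first as the walk moves backwards through history. A standalone sketch with made-up event IDs:

```python
from queue import PriorityQueue

# Events paired with their DAG depth; higher depth = newer in time.
events = [("$old_event", 3), ("$newer_event", 7), ("$newest_event", 9)]

queue = PriorityQueue()
for event_id, depth in events:
    # Negate depth so the newest-in-time event has the smallest priority
    # value and is therefore retrieved first.
    queue.put((-depth, event_id))

order = []
while not queue.empty():
    _, event_id = queue.get()
    order.append(event_id)

print(order)  # ['$newest_event', '$newer_event', '$old_event']
```

This is why each `queue.put` in the diff negates the depth: the breadth-first walk always expands the newest unexplored event next.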