This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Commit d08ef6f

Make background updates controllable via a plugin (#11306)
Co-authored-by: Brendan Abolivier
1 parent 9d1971a commit d08ef6f

12 files changed, +407 -61 lines changed

changelog.d/11306.feature

+1
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Add plugin support for controlling database background updates.
@@ -0,0 +1,71 @@
1+
# Background update controller callbacks
2+
3+
Background update controller callbacks allow module developers to control (e.g. rate-limit)
4+
how database background updates are run. A database background update is an operation
5+
Synapse runs on its database in the background after it starts. It's usually used to run
6+
database operations that would take too long if they were run at the same time as schema
7+
updates (which are run on startup) and delay Synapse's startup too much: populating a
8+
table with a large amount of data, adding an index on a big table, deleting superfluous data,
9+
etc.
10+
11+
Background update controller callbacks can be registered using the module API's
12+
`register_background_update_controller_callbacks` method. Only the first module (in order
13+
of appearance in Synapse's configuration file) calling this method can register background
14+
update controller callbacks; subsequent calls are ignored.
15+
16+
The available background update controller callbacks are:
17+
18+
### `on_update`
19+
20+
_First introduced in Synapse v1.49.0_
21+
22+
```python
23+
def on_update(update_name: str, database_name: str, one_shot: bool) -> AsyncContextManager[int]
24+
```
25+
26+
Called when about to do an iteration of a background update. The module is given the name
27+
of the update, the name of the database, and a flag to indicate whether the background
28+
update will happen in one go and may take a long time (e.g. creating indices). If this last
29+
argument is set to `False`, the update will be run in batches.
30+
31+
The module must return an async context manager. It will be entered before Synapse runs a
32+
background update; entering it should return the desired duration of the iteration, in
33+
milliseconds.
34+
35+
The context manager will be exited when the iteration completes. Note that the duration
36+
returned by the context manager is a target, and an iteration may take substantially longer
37+
or shorter. If the `one_shot` flag is set to `True`, the duration returned is ignored.
38+
39+
__Note__: Unlike most module callbacks in Synapse, this one is _synchronous_. This is
40+
because asynchronous operations are expected to be run by the async context manager.
41+
42+
This callback is required when registering any other background update controller callback.
43+
44+
### `default_batch_size`
45+
46+
_First introduced in Synapse v1.49.0_
47+
48+
```python
49+
async def default_batch_size(update_name: str, database_name: str) -> int
50+
```
51+
52+
Called before the first iteration of a background update, with the name of the update and
53+
of the database. The module must return the number of elements to process in this first
54+
iteration.
55+
56+
If this callback is not defined, Synapse will use a default value of 100.
57+
58+
### `min_batch_size`
59+
60+
_First introduced in Synapse v1.49.0_
61+
62+
```python
63+
async def min_batch_size(update_name: str, database_name: str) -> int
64+
```
65+
66+
Called before running a new batch for a background update, with the name of the update and
67+
of the database. The module must return an integer representing the minimum number of
68+
elements to process in this iteration. This number must be at least 1, and is used to
69+
ensure that progress is always made.
70+
71+
If this callback is not defined, Synapse will use a default value of 100.
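
As a rough illustration of how the three callbacks documented above might fit together, here is a minimal sketch of a throttling controller. The class name `ThrottlingController`, its constructor parameters, and the numbers it returns are hypothetical. The `sleep` argument stands in for any awaitable sleep function (inside Synapse, the `sleep` method this commit adds to the module API looks like a natural candidate); it defaults to `asyncio.sleep` only so the sketch runs on its own.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncContextManager, AsyncIterator, Awaitable, Callable


class ThrottlingController:
    """Hypothetical controller that pauses before each background update iteration."""

    def __init__(
        self,
        sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
        pause_seconds: float = 1.0,
        target_duration_ms: int = 250,
    ) -> None:
        self._sleep = sleep
        self._pause_seconds = pause_seconds
        self._target_duration_ms = target_duration_ms

    def on_update(
        self, update_name: str, database_name: str, one_shot: bool
    ) -> AsyncContextManager[int]:
        # Synchronous on purpose: the async work happens inside the context manager.
        return self._throttle()

    @asynccontextmanager
    async def _throttle(self) -> AsyncIterator[int]:
        # Runs on entry, before the iteration starts.
        await self._sleep(self._pause_seconds)
        # The yielded value is the target duration of the iteration, in milliseconds.
        yield self._target_duration_ms
        # Code placed here would run on exit, once the iteration completes.

    async def default_batch_size(self, update_name: str, database_name: str) -> int:
        # Size of the first batch; an assumed value for illustration.
        return 50

    async def min_batch_size(self, update_name: str, database_name: str) -> int:
        # Must be at least 1 so that progress is always made.
        return 1
```

Constructing the controller with `pause_seconds=0` makes the context manager yield its target duration immediately, which is convenient when exercising the callbacks in a unit test.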

docs/modules/writing_a_module.md

+6-6
@@ -71,15 +71,15 @@ Modules **must** register their web resources in their `__init__` method.
7171
## Registering a callback
7272

7373
Modules can use Synapse's module API to register callbacks. Callbacks are functions that
74-
Synapse will call when performing specific actions. Callbacks must be asynchronous, and
75-
are split in categories. A single module may implement callbacks from multiple categories,
76-
and is under no obligation to implement all callbacks from the categories it registers
77-
callbacks for.
74+
Synapse will call when performing specific actions. Callbacks must be asynchronous (unless
75+
specified otherwise), and are split in categories. A single module may implement callbacks
76+
from multiple categories, and is under no obligation to implement all callbacks from the
77+
categories it registers callbacks for.
7878

7979
Modules can register callbacks using one of the module API's `register_[...]_callbacks`
8080
methods. The callback functions are passed to these methods as keyword arguments, with
81-
the callback name as the argument name and the function as its value. This is demonstrated
82-
in the example below. A `register_[...]_callbacks` method exists for each category.
81+
the callback name as the argument name and the function as its value. A
82+
`register_[...]_callbacks` method exists for each category.
8383

8484
Callbacks for each category can be found on their respective page of the
8585
[Synapse documentation website](https://matrix-org.github.io/synapse).
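
The keyword-argument registration described in this hunk can be sketched as follows. This is an illustration under stated assumptions, not Synapse's own example: `FakeModuleApi` is a stand-in that only records what it receives, `ExampleModule` is hypothetical and assumes the usual `(config, api)` module constructor, and the method signature is copied from the `register_background_update_controller_callbacks` definition added elsewhere in this commit. The module registers only two of the three callbacks in the category, which the text above permits.

```python
from contextlib import asynccontextmanager
from typing import Any, AsyncContextManager, AsyncIterator, Callable, Dict, Optional


class FakeModuleApi:
    """Stand-in for Synapse's module API so the sketch runs without Synapse."""

    def __init__(self) -> None:
        self.callbacks: Dict[str, Optional[Callable[..., Any]]] = {}

    def register_background_update_controller_callbacks(
        self,
        on_update: Callable[..., Any],
        default_batch_size: Optional[Callable[..., Any]] = None,
        min_batch_size: Optional[Callable[..., Any]] = None,
    ) -> None:
        # Only records what was passed; the real method hands the callbacks to Synapse.
        self.callbacks = {
            "on_update": on_update,
            "default_batch_size": default_batch_size,
            "min_batch_size": min_batch_size,
        }


class ExampleModule:
    """Hypothetical module implementing only part of one callback category."""

    def __init__(self, config: Dict[str, Any], api: Any) -> None:
        # Callback name as the keyword, function as the value.
        api.register_background_update_controller_callbacks(
            on_update=self.on_update,
            min_batch_size=self.min_batch_size,
        )

    def on_update(
        self, update_name: str, database_name: str, one_shot: bool
    ) -> AsyncContextManager[int]:
        # Synchronous callback returning an async context manager.
        return self._iteration()

    @asynccontextmanager
    async def _iteration(self) -> AsyncIterator[int]:
        yield 100  # target duration in milliseconds (assumed value)

    async def min_batch_size(self, update_name: str, database_name: str) -> int:
        return 10


api = FakeModuleApi()
module = ExampleModule(config={}, api=api)
```

Leaving `default_batch_size` unregistered means the stand-in stores `None` for it; in Synapse the documented default would then apply.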

setup.py

+3-1
@@ -119,7 +119,9 @@ def exec_file(path_segments):
119119
# Tests assume that all optional dependencies are installed.
120120
#
121121
# parameterized_class decorator was introduced in parameterized 0.7.0
122-
CONDITIONAL_REQUIREMENTS["test"] = ["parameterized>=0.7.0"]
122+
#
123+
# We use `mock` library as that backports `AsyncMock` to Python 3.6
124+
CONDITIONAL_REQUIREMENTS["test"] = ["parameterized>=0.7.0", "mock>=4.0.0"]
123125

124126
CONDITIONAL_REQUIREMENTS["dev"] = (
125127
CONDITIONAL_REQUIREMENTS["lint"]

synapse/module_api/__init__.py

+53-1
@@ -82,10 +82,19 @@
8282
)
8383
from synapse.http.servlet import parse_json_object_from_request
8484
from synapse.http.site import SynapseRequest
-from synapse.logging.context import make_deferred_yieldable, run_in_background
+from synapse.logging.context import (
+    defer_to_thread,
+    make_deferred_yieldable,
+    run_in_background,
+)
8690
from synapse.metrics.background_process_metrics import run_as_background_process
8791
from synapse.rest.client.login import LoginResponse
8892
from synapse.storage import DataStore
+from synapse.storage.background_updates import (
+    DEFAULT_BATCH_SIZE_CALLBACK,
+    MIN_BATCH_SIZE_CALLBACK,
+    ON_UPDATE_CALLBACK,
+)
8998
from synapse.storage.database import DatabasePool, LoggingTransaction
9099
from synapse.storage.databases.main.roommember import ProfileInfo
91100
from synapse.storage.state import StateFilter
@@ -311,6 +320,24 @@ def register_password_auth_provider_callbacks(
311320
auth_checkers=auth_checkers,
312321
)
313322

+    def register_background_update_controller_callbacks(
+        self,
+        on_update: ON_UPDATE_CALLBACK,
+        default_batch_size: Optional[DEFAULT_BATCH_SIZE_CALLBACK] = None,
+        min_batch_size: Optional[MIN_BATCH_SIZE_CALLBACK] = None,
+    ) -> None:
+        """Registers background update controller callbacks.
+
+        Added in Synapse v1.49.0.
+        """
+
+        for db in self._hs.get_datastores().databases:
+            db.updates.register_update_controller_callbacks(
+                on_update=on_update,
+                default_batch_size=default_batch_size,
+                min_batch_size=min_batch_size,
+            )
+
314341
def register_web_resource(self, path: str, resource: Resource) -> None:
315342
"""Registers a web resource to be served at the given path.
316343
@@ -995,6 +1022,11 @@ def looping_background_call(
9951022
f,
9961023
)
9971024

+    async def sleep(self, seconds: float) -> None:
+        """Sleeps for the given number of seconds."""
+
+        await self._clock.sleep(seconds)
+
9981030
async def send_mail(
9991031
self,
10001032
recipient: str,
@@ -1149,6 +1181,26 @@ async def get_room_state(
11491181

11501182
return {key: state_events[event_id] for key, event_id in state_ids.items()}
11511183

+    async def defer_to_thread(
+        self,
+        f: Callable[..., T],
+        *args: Any,
+        **kwargs: Any,
+    ) -> T:
+        """Runs the given function in a separate thread from Synapse's thread pool.
+
+        Added in Synapse v1.49.0.
+
+        Args:
+            f: The function to run.
+            args: The function's arguments.
+            kwargs: The function's keyword arguments.
+
+        Returns:
+            The return value of the function once ran in a thread.
+        """
+        return await defer_to_thread(self._hs.get_reactor(), f, *args, **kwargs)
+
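
To show how a module might call the new `defer_to_thread` helper, here is a minimal sketch that moves a blocking computation off the main thread. Everything besides the method signature is an assumption: `FakeModuleApi` is a stand-in built on asyncio's default executor so the example runs without Synapse (the diff above shows the real method delegating to `synapse.logging.context.defer_to_thread` with the homeserver's reactor), and `expensive_digest` and `digest_in_thread` are hypothetical names.

```python
import asyncio
import functools
import hashlib
from typing import Any, Callable, TypeVar

T = TypeVar("T")


class FakeModuleApi:
    """Stand-in exposing the same defer_to_thread signature as the diff above."""

    async def defer_to_thread(self, f: Callable[..., T], *args: Any, **kwargs: Any) -> T:
        # asyncio substitute for Synapse's thread pool; not how Synapse implements it.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, functools.partial(f, *args, **kwargs))


def expensive_digest(data: bytes, rounds: int = 1000) -> str:
    """Blocking, CPU-bound work that should not run on the main thread."""
    digest = data
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


async def digest_in_thread(api: Any, data: bytes) -> str:
    # Positional and keyword arguments are forwarded to the function unchanged.
    return await api.defer_to_thread(expensive_digest, data, rounds=10)
```

Calling `asyncio.run(digest_in_thread(FakeModuleApi(), b"payload"))` returns the same hex digest as calling `expensive_digest(b"payload", rounds=10)` directly; only the thread doing the work differs.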
11521204

11531205
class PublicRoomListManager:
11541206
"""Contains methods for adding to, removing from and querying whether a room
