This repository was archived by the owner on Apr 26, 2024. It is now read-only.
Support for database schema version ranges #9933
Merged
e425a1f Introduce an `attrs` to contain db schema state (richvdh)
a7da9df Support for ranges of database schemas (richvdh)
ddb7676 changelog (richvdh)
65e546c Merge remote-tracking branch 'origin/develop' into rav/database_migra… (richvdh)
1aac02f update version numbers (richvdh)
f15dc2a move `compat_version` definition into the Python (richvdh)
4eae684 Move new docs into the docs hierarchy (richvdh)
3ad9e10 Apply suggestions from code review (richvdh)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Update the database schema versioning to support gradual migration away from legacy tables. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
# Synapse database schema files | ||
|
||
Synapse's database schema is stored in the `synapse.storage.schema` module. | ||
|
||
## Logical databases | ||
|
||
Synapse supports splitting its datastore across multiple physical databases (which can | ||
be useful for large installations), and the schema files are therefore split according | ||
to the logical database they apply to. | ||
|
||
At the time of writing, the following "logical" databases are supported: | ||
|
||
* `state` - used to store Matrix room state (more specifically, `state_groups`, | ||
their relationships and contents.) | ||
|
||
* `main` - stores everything else. | ||
|
||
Additionally, the `common` directory contains schema files for tables which must be | ||
present on *all* physical databases. | ||
|
||
## Synapse schema versions | ||
|
||
Synapse manages its database schema via "schema versions". These are mainly used to | ||
help avoid confusion if the Synapse codebase is rolled back after the database is | ||
updated. They work as follows: | ||
|
||
* The Synapse codebase defines a constant `synapse.storage.schema.SCHEMA_VERSION` | ||
which represents the expectations made about the database by that version. For | ||
example, as of Synapse v1.36, this is `59`. | ||
|
||
* The database stores a "compatibility version" in | ||
`schema_compat_version.compat_version` which defines the `SCHEMA_VERSION` of the | ||
oldest version of Synapse which will work with the database. On startup, if | ||
`compat_version` is found to be newer than `SCHEMA_VERSION`, Synapse will refuse to | ||
start. | ||
|
||
Synapse automatically updates this field from | ||
`synapse.storage.schema.SCHEMA_COMPAT_VERSION`. | ||
|
||
* Whenever a backwards-incompatible change is made to the database format (normally | ||
via a `delta` file), `synapse.storage.schema.SCHEMA_COMPAT_VERSION` is also updated | ||
so that administrators cannot accidentally roll back to a version of Synapse that is too old. | ||
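The startup check described in the list above can be sketched as follows. This is a minimal illustration rather than Synapse's actual code: the function name `check_database_compat` and the exception `IncompatibleDatabaseError` are hypothetical, and the comparison mirrors the rule that a database whose `compat_version` is newer than the server's `SCHEMA_VERSION` must be refused.

```python
from typing import Optional

# Value quoted in the text for Synapse v1.36.
SCHEMA_VERSION = 59


class IncompatibleDatabaseError(Exception):
    """Hypothetical exception type, used only for this illustration."""


def check_database_compat(
    compat_version: Optional[int], schema_version: int = SCHEMA_VERSION
) -> None:
    """Raise if the database requires a newer Synapse than the one running.

    `compat_version` is None for an old database that has no
    `schema_compat_version` row yet; in that case there is nothing to compare.
    """
    if compat_version is not None and compat_version > schema_version:
        raise IncompatibleDatabaseError(
            f"database needs schema version >= {compat_version}, "
            f"but this server only understands {schema_version}"
        )


# A database last touched by a compatible release is accepted...
check_database_compat(59)
# ...whereas one marked as needing schema 60 is refused by a v1.36 server.
try:
    check_database_compat(60)
except IncompatibleDatabaseError as e:
    print("refusing to start:", e)
```

Passing `None` models a database created before the compatibility table existed, which is allowed through so that it can be upgraded.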
|
||
Generally, the goal is to maintain compatibility with at least one or two previous | ||
releases of Synapse, so any substantial change tends to require multiple releases and a | ||
bit of forward-planning to get right. | ||
|
||
As a worked example: we want to remove the `room_stats_historical` table. Here is how it | ||
might pan out. | ||
|
||
1. Replace any code that *reads* from `room_stats_historical` with alternative | ||
implementations, but keep writing to it in case of rollback to an earlier version. | ||
Also, increase `synapse.storage.schema.SCHEMA_VERSION`. In this | ||
instance, there is no existing code which reads from `room_stats_historical`, so | ||
our starting point is: | ||
|
||
v1.36.0: `SCHEMA_VERSION=59`, `SCHEMA_COMPAT_VERSION=59` | ||
|
||
2. Next (say in Synapse v1.37.0): remove the code that *writes* to | ||
`room_stats_historical`, but don’t yet remove the table in case of rollback to | ||
v1.36.0. Again, we increase `synapse.storage.schema.SCHEMA_VERSION`, but | ||
because we have not broken compatibility with v1.36, we do not yet update | ||
`SCHEMA_COMPAT_VERSION`. We now have: | ||
|
||
v1.37.0: `SCHEMA_VERSION=60`, `SCHEMA_COMPAT_VERSION=59`. | ||
|
||
3. Later (say in Synapse v1.38.0): we can remove the table altogether. This will | ||
break compatibility with v1.36.0, so we must update `SCHEMA_COMPAT_VERSION` accordingly. | ||
There is no need to update `synapse.storage.schema.SCHEMA_VERSION`, since there is no | ||
change to the Synapse codebase here. So we end up with: | ||
|
||
v1.38.0: `SCHEMA_VERSION=60`, `SCHEMA_COMPAT_VERSION=60`. | ||
|
||
If in doubt about whether to update `SCHEMA_VERSION` or not, it is generally best to | ||
lean towards doing so. | ||
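To make the rollback consequences of the worked example concrete, here is a small sketch that replays it. The version numbers come from the steps above; the `RELEASES` table and the helper names `compat_after_running` and `can_start` are assumptions made for illustration and do not exist in Synapse.

```python
from typing import Dict, Optional, Tuple

# release -> (SCHEMA_VERSION, SCHEMA_COMPAT_VERSION), as given in the worked example
RELEASES: Dict[str, Tuple[int, int]] = {
    "v1.36.0": (59, 59),
    "v1.37.0": (60, 59),
    "v1.38.0": (60, 60),
}


def compat_after_running(release: str, db_compat: Optional[int] = None) -> int:
    """Return the database's compat_version after `release` has started.

    The stored value is only ever raised, never lowered.
    """
    _, release_compat = RELEASES[release]
    if db_compat is None or db_compat < release_compat:
        return release_compat
    return db_compat


def can_start(release: str, db_compat: int) -> bool:
    """A release may start only if the database does not require a newer schema."""
    schema_version, _ = RELEASES[release]
    return db_compat <= schema_version


# After upgrading to v1.37.0, rolling back to v1.36.0 still works.
compat = compat_after_running("v1.37.0", db_compat=59)
print(compat, can_start("v1.36.0", compat))

# After upgrading to v1.38.0, v1.36.0 is locked out but v1.37.0 is not.
compat = compat_after_running("v1.38.0", db_compat=compat)
print(compat, can_start("v1.36.0", compat), can_start("v1.37.0", compat))
```

The sketch shows why step 3 only needs to bump `SCHEMA_COMPAT_VERSION`: v1.37.0 already has `SCHEMA_VERSION=60`, so it remains a valid rollback target once the table has been dropped.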
|
||
## Full schema dumps | ||
|
||
In the `full_schemas` directories, only the most recently-numbered snapshot is used | ||
(`54` at the time of writing). Older snapshots (eg, `16`) are present for historical | ||
reference only. | ||
|
||
### Building full schema dumps | ||
|
||
If you want to recreate these schemas, they need to be made from a database that | ||
has had all background updates run. | ||
|
||
To do so, use `scripts-dev/make_full_schema.sh`. This will produce new | ||
`full.sql.postgres` and `full.sql.sqlite` files. | ||
|
||
Ensure postgres is installed, then run: | ||
|
||
./scripts-dev/make_full_schema.sh -p postgres_username -o output_dir/ | ||
|
||
NB at the time of writing, this script predates the split into separate `state`/`main` | ||
databases so will require updates to handle that correctly. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,4 @@ | ||
# Copyright 2014 - 2016 OpenMarket Ltd | ||
# Copyright 2018 New Vector Ltd | ||
# Copyright 2014 - 2021 The Matrix.org Foundation C.I.C. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
|
@@ -26,7 +25,7 @@ | |
from synapse.storage.database import LoggingDatabaseConnection | ||
from synapse.storage.engines import BaseDatabaseEngine | ||
from synapse.storage.engines.postgres import PostgresEngine | ||
from synapse.storage.schema import SCHEMA_VERSION | ||
from synapse.storage.schema import SCHEMA_COMPAT_VERSION, SCHEMA_VERSION | ||
from synapse.storage.types import Cursor | ||
|
||
logger = logging.getLogger(__name__) | ||
|
@@ -59,6 +58,28 @@ class UpgradeDatabaseException(PrepareDatabaseException): | |
) | ||
|
||
|
||
@attr.s | ||
class _SchemaState: | ||
current_version: int = attr.ib() | ||
"""The current schema version of the database""" | ||
|
||
compat_version: Optional[int] = attr.ib() | ||
"""The SCHEMA_VERSION of the oldest version of Synapse for this database | ||
|
||
If this is None, we have an old version of the database without the necessary | ||
table. | ||
""" | ||
|
||
applied_deltas: Collection[str] = attr.ib(factory=tuple) | ||
"""Any delta files for `current_version` which have already been applied""" | ||
|
||
upgraded: bool = attr.ib(default=False) | ||
(Fwiw you could remove all the
yeah, I think I didn't know about
||
"""Whether the current state was reached by applying deltas. | ||
|
||
If False, we have run the full schema for `current_version`, and have applied no | ||
deltas since. If True, we have run some deltas since the original creation.""" | ||
|
||
|
||
def prepare_database( | ||
db_conn: LoggingDatabaseConnection, | ||
database_engine: BaseDatabaseEngine, | ||
|
@@ -96,12 +117,11 @@ def prepare_database( | |
version_info = _get_or_create_schema_state(cur, database_engine) | ||
|
||
if version_info: | ||
user_version, delta_files, upgraded = version_info | ||
logger.info( | ||
"%r: Existing schema is %i (+%i deltas)", | ||
databases, | ||
user_version, | ||
len(delta_files), | ||
version_info.current_version, | ||
len(version_info.applied_deltas), | ||
) | ||
|
||
# config should only be None when we are preparing an in-memory SQLite db, | ||
|
@@ -113,16 +133,18 @@ def prepare_database( | |
|
||
# if it's a worker app, refuse to upgrade the database, to avoid multiple | ||
# workers doing it at once. | ||
if config.worker_app is not None and user_version != SCHEMA_VERSION: | ||
if ( | ||
config.worker_app is not None | ||
and version_info.current_version != SCHEMA_VERSION | ||
): | ||
raise UpgradeDatabaseException( | ||
OUTDATED_SCHEMA_ON_WORKER_ERROR % (SCHEMA_VERSION, user_version) | ||
OUTDATED_SCHEMA_ON_WORKER_ERROR | ||
% (SCHEMA_VERSION, version_info.current_version) | ||
) | ||
|
||
_upgrade_existing_database( | ||
cur, | ||
user_version, | ||
delta_files, | ||
upgraded, | ||
version_info, | ||
database_engine, | ||
config, | ||
databases=databases, | ||
|
@@ -261,9 +283,7 @@ def _setup_new_database( | |
|
||
_upgrade_existing_database( | ||
cur, | ||
current_version=max_current_ver, | ||
applied_delta_files=[], | ||
upgraded=False, | ||
_SchemaState(current_version=max_current_ver, compat_version=None), | ||
database_engine=database_engine, | ||
config=None, | ||
databases=databases, | ||
|
@@ -273,9 +293,7 @@ def _setup_new_database( | |
|
||
def _upgrade_existing_database( | ||
cur: Cursor, | ||
current_version: int, | ||
applied_delta_files: List[str], | ||
upgraded: bool, | ||
current_schema_state: _SchemaState, | ||
database_engine: BaseDatabaseEngine, | ||
config: Optional[HomeServerConfig], | ||
databases: Collection[str], | ||
|
@@ -321,12 +339,8 @@ def _upgrade_existing_database( | |
|
||
Args: | ||
cur | ||
current_version: The current version of the schema. | ||
applied_delta_files: A list of deltas that have already been applied. | ||
upgraded: Whether the current version was generated by having | ||
applied deltas or from full schema file. If `True` the function | ||
will never apply delta files for the given `current_version`, since | ||
the current_version wasn't generated by applying those delta files. | ||
current_schema_state: The current version of the schema, as | ||
returned by _get_or_create_schema_state | ||
database_engine | ||
config: | ||
None if we are initialising a blank database, otherwise the application | ||
|
@@ -337,13 +351,16 @@ def _upgrade_existing_database( | |
upgrade portions of the delta scripts. | ||
""" | ||
if is_empty: | ||
assert not applied_delta_files | ||
assert not current_schema_state.applied_deltas | ||
else: | ||
assert config | ||
|
||
is_worker = config and config.worker_app is not None | ||
|
||
if current_version > SCHEMA_VERSION: | ||
if ( | ||
current_schema_state.compat_version is not None | ||
and current_schema_state.compat_version > SCHEMA_VERSION | ||
): | ||
raise ValueError( | ||
"Cannot use this database as it is too " | ||
+ "new for the server to understand" | ||
|
@@ -357,14 +374,26 @@ def _upgrade_existing_database( | |
assert config is not None | ||
check_database_before_upgrade(cur, database_engine, config) | ||
|
||
start_ver = current_version | ||
# update schema_compat_version before we run any upgrades, so that if synapse | ||
# gets downgraded again, it won't try to run against the upgraded database. | ||
if ( | ||
current_schema_state.compat_version is None | ||
or current_schema_state.compat_version < SCHEMA_COMPAT_VERSION | ||
): | ||
cur.execute("DELETE FROM schema_compat_version") | ||
cur.execute( | ||
"INSERT INTO schema_compat_version(compat_version) VALUES (?)", | ||
(SCHEMA_COMPAT_VERSION,), | ||
) | ||
|
||
start_ver = current_schema_state.current_version | ||
|
||
# if we got to this schema version by running a full_schema rather than a series | ||
# of deltas, we should not run the deltas for this version. | ||
if not upgraded: | ||
if not current_schema_state.upgraded: | ||
start_ver += 1 | ||
|
||
logger.debug("applied_delta_files: %s", applied_delta_files) | ||
logger.debug("applied_delta_files: %s", current_schema_state.applied_deltas) | ||
|
||
if isinstance(database_engine, PostgresEngine): | ||
specific_engine_extension = ".postgres" | ||
|
@@ -440,7 +469,7 @@ def _upgrade_existing_database( | |
absolute_path = entry.absolute_path | ||
|
||
logger.debug("Found file: %s (%s)", relative_path, absolute_path) | ||
if relative_path in applied_delta_files: | ||
if relative_path in current_schema_state.applied_deltas: | ||
continue | ||
|
||
root_name, ext = os.path.splitext(file_name) | ||
|
@@ -621,25 +650,39 @@ def execute_statements_from_stream(cur: Cursor, f: TextIO) -> None: | |
|
||
def _get_or_create_schema_state( | ||
txn: Cursor, database_engine: BaseDatabaseEngine | ||
) -> Optional[Tuple[int, List[str], bool]]: | ||
) -> Optional[_SchemaState]: | ||
# Bluntly try creating the schema_version tables. | ||
sql_path = os.path.join(schema_path, "common", "schema_version.sql") | ||
executescript(txn, sql_path) | ||
|
||
txn.execute("SELECT version, upgraded FROM schema_version") | ||
row = txn.fetchone() | ||
|
||
if row is None: | ||
# new database | ||
return None | ||
|
||
current_version = int(row[0]) | ||
upgraded = bool(row[1]) | ||
|
||
compat_version: Optional[int] = None | ||
txn.execute("SELECT compat_version FROM schema_compat_version") | ||
row = txn.fetchone() | ||
if row is not None: | ||
current_version = int(row[0]) | ||
txn.execute( | ||
"SELECT file FROM applied_schema_deltas WHERE version >= ?", | ||
(current_version,), | ||
) | ||
applied_deltas = [d for d, in txn] | ||
upgraded = bool(row[1]) | ||
return current_version, applied_deltas, upgraded | ||
compat_version = int(row[0]) | ||
|
||
txn.execute( | ||
"SELECT file FROM applied_schema_deltas WHERE version >= ?", | ||
(current_version,), | ||
) | ||
applied_deltas = tuple(d for d, in txn) | ||
|
||
return None | ||
return _SchemaState( | ||
current_version=current_version, | ||
compat_version=compat_version, | ||
applied_deltas=applied_deltas, | ||
upgraded=upgraded, | ||
) | ||
|
||
|
||
@attr.s(slots=True) | ||
|
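The read-then-bump handling of `schema_compat_version` in the diff above can be tried out against an in-memory SQLite database. This is a sketch under stated assumptions: the one-column table definition is a simplification chosen for the example (the real `schema_version.sql` is not shown in this diff), and the helper names `read_compat_version` and `bump_compat_version` are hypothetical. The SQL statements themselves are the ones that appear in the diff.

```python
import sqlite3
from typing import Optional


def read_compat_version(cur: sqlite3.Cursor) -> Optional[int]:
    """Return the stored compat_version, or None if no row exists yet."""
    cur.execute("SELECT compat_version FROM schema_compat_version")
    row = cur.fetchone()
    return int(row[0]) if row is not None else None


def bump_compat_version(cur: sqlite3.Cursor, new_compat: int) -> None:
    """Raise the stored compat_version to `new_compat`; never lower it."""
    current = read_compat_version(cur)
    if current is None or current < new_compat:
        cur.execute("DELETE FROM schema_compat_version")
        cur.execute(
            "INSERT INTO schema_compat_version(compat_version) VALUES (?)",
            (new_compat,),
        )


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Assumed, simplified table definition for illustration only.
cur.execute("CREATE TABLE schema_compat_version (compat_version INTEGER NOT NULL)")

print(read_compat_version(cur))  # no row yet on an old or new database
bump_compat_version(cur, 59)
bump_compat_version(cur, 58)  # ignored: would lower the recorded version
print(read_compat_version(cur))
```

Writing the new value before any deltas run, as the diff's comment explains, means a later downgrade sees the raised `compat_version` even if the upgrade itself was interrupted.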
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,37 +1,4 @@ | ||
# Synapse Database Schemas | ||
|
||
This directory contains the schema files used to build Synapse databases. | ||
|
||
Synapse supports splitting its datastore across multiple physical databases (which can | ||
be useful for large installations), and the schema files are therefore split according | ||
to the logical database they are apply to. | ||
|
||
At the time of writing, the following "logical" databases are supported: | ||
|
||
* `state` - used to store Matrix room state (more specifically, `state_groups`, | ||
their relationships and contents.) | ||
* `main` - stores everything else. | ||
|
||
Addionally, the `common` directory contains schema files for tables which must be | ||
present on *all* physical databases. | ||
|
||
## Full schema dumps | ||
|
||
In the `full_schemas` directories, only the most recently-numbered snapshot is useful | ||
(`54` at the time of writing). Older snapshots (eg, `16`) are present for historical | ||
reference only. | ||
|
||
## Building full schema dumps | ||
|
||
If you want to recreate these schemas, they need to be made from a database that | ||
has had all background updates run. | ||
|
||
To do so, use `scripts-dev/make_full_schema.sh`. This will produce new | ||
`full.sql.postgres` and `full.sql.sqlite` files. | ||
|
||
Ensure postgres is installed, then run: | ||
|
||
./scripts-dev/make_full_schema.sh -p postgres_username -o output_dir/ | ||
|
||
NB at the time of writing, this script predates the split into separate `state`/`main` | ||
databases so will require updates to handle that correctly. | ||
This directory contains the schema files used to build Synapse databases. For more | ||
information, see /docs/development/database_schema.md. |