Add materialization invalidations API #8027


Merged: 1 commit merged into timescale:main on May 6, 2025

Conversation


@mkindahl mkindahl commented Apr 29, 2025

Add functions `add_materialization_invalidation` and `get_materialization_invalidations` to work with the materialization invalidation log.

Function `add_materialization_invalidations` will allow you to add new invalidation ranges to a continuous aggregate. These ranges will then be considered invalid and in need of a refresh.

Function `get_materialization_invalidations` will get invalidations from the materialization invalidation log for a particular continuous aggregate and in a particular range.
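As a rough illustration of how these functions might be called (this is a sketch, not the authoritative signatures: the schema location follows the `_timescaledb_functions` convention discussed below, and the argument types, the cagg name `daily_summary`, and the range literals are assumptions based on the description above):

```sql
-- Sketch: mark a time range of a continuous aggregate as invalid,
-- then read back the pending invalidations for a wider window.
-- Assumes a cagg named daily_summary over a timestamptz time column.
SELECT _timescaledb_functions.add_materialization_invalidation(
    'daily_summary'::regclass,
    '["2025-04-01","2025-04-02")'::tstzrange);

SELECT * FROM _timescaledb_functions.get_materialization_invalidations(
    'daily_summary'::regclass,
    '["2025-04-01","2025-05-01")'::tstzrange);
```

Ranges returned by the second call would then be candidates for a refresh.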


codecov bot commented Apr 29, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.12%. Comparing base (59f50f2) to head (52b20fe).
Report is 1009 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8027      +/-   ##
==========================================
+ Coverage   80.06%   82.12%   +2.05%     
==========================================
  Files         190      253      +63     
  Lines       37181    47209   +10028     
  Branches     9450    11894    +2444     
==========================================
+ Hits        29770    38771    +9001     
- Misses       2997     3730     +733     
- Partials     4414     4708     +294     


@mkindahl mkindahl force-pushed the mat-inval-api branch 9 times, most recently from dbe3885 to 6ccd570 Compare April 30, 2025 10:04
@mkindahl mkindahl marked this pull request as ready for review April 30, 2025 11:47
@mkindahl mkindahl requested a review from erimatnor April 30, 2025 11:48
@github-actions github-actions bot requested a review from svenklemm April 30, 2025 11:48

@svenklemm, @erimatnor: please review this pull request.


@mkindahl mkindahl requested review from melihmutlu and fabriziomello and removed request for svenklemm April 30, 2025 11:48
@mkindahl mkindahl force-pushed the mat-inval-api branch 2 times, most recently from 12f52e8 to 22b5e5e Compare April 30, 2025 12:23
@philkra philkra added this to the v2.20.0 milestone May 1, 2025

@erimatnor erimatnor left a comment


The API seems OK for what it is doing, but it is not clear to me how it will be used and whether it makes sense.

I think it would be good if all the functions part of the same cagg API used a common prefix or schema name to make clear they are part of the same API. Otherwise we just have a number of free-floating functions in the _timescaledb_functions schema.

If this is an "official" API, maybe we should not have the _timescaledb_ schema prefix? The leading underscore often means "internal", which I guess this might not be as a user-facing developer API?

Perhaps, tsdb_cagg.get_materialization_info or similar makes sense?

Finally, I am unsure why we need the snapshot/token functionality of the accept function. I added more questions about this inline.


mkindahl commented May 5, 2025

> The API seems OK for what it is doing, but it is not clear to me how it will be used and whether it makes sense.
>
> I think it would be good if all the functions part of the same cagg API used a common prefix or schema name to make clear they are part of the same API. Otherwise we just have a number of free-floating functions in the _timescaledb_functions schema.
>
> If this is an "official" API, maybe we should not have the _timescaledb_ schema prefix? The leading underscore often means "internal", which I guess this might not be as a user-facing developer API?

This might be interesting long-term, but for now this is an internal API, free to use, but it might change based on our experiences with the "external scheduler" work.

> Perhaps, tsdb_cagg.get_materialization_info or similar makes sense?
>
> Finally, I am unsure why we need the snapshot/token functionality of the accept function. I added more questions about this inline.

As mentioned in Slack, this was material for discussion. Based on the discussions, we decided to rely on the refresh function to update the invalidation log after a refresh is done, to get transactional behavior.

@mkindahl mkindahl requested a review from erimatnor May 5, 2025 11:29
@mkindahl mkindahl force-pushed the mat-inval-api branch 2 times, most recently from 431e742 to f6a12c6 Compare May 5, 2025 12:42
@mkindahl mkindahl force-pushed the mat-inval-api branch 2 times, most recently from f0ea7cc to fdd9079 Compare May 5, 2025 13:20
@mkindahl mkindahl force-pushed the mat-inval-api branch 3 times, most recently from c3b2681 to b59df57 Compare May 5, 2025 15:24

@fabriziomello fabriziomello left a comment


Left some comments, mostly about tests, but approved anyway.

@mkindahl mkindahl force-pushed the mat-inval-api branch 2 times, most recently from 6abf38b to efda6f0 Compare May 6, 2025 05:50
Add functions `add_materialization_invalidation` and
`get_materialization_invalidations` to work with the materialization
invalidation log.

Function `add_materialization_invalidations` will allow you to add new
invalidation ranges to a continuous aggregate. These ranges will then
be considered invalid and in need of a refresh.

Function `get_materialization_invalidations` will get invalidations
from the materialization invalidation log for a particular continuous
aggregate and in a particular range.
@mkindahl mkindahl enabled auto-merge (rebase) May 6, 2025 06:22
@mkindahl mkindahl merged commit 8ca6848 into timescale:main May 6, 2025
44 checks passed
@mkindahl mkindahl deleted the mat-inval-api branch May 6, 2025 06:54
This was referenced May 7, 2025
philkra added a commit that referenced this pull request May 15, 2025
## 2.20.0 (2025-05-15)

This release contains performance improvements and bug fixes since the
2.19.3 release. We recommend that you upgrade at the next available
opportunity.

**Highlighted features in TimescaleDB v2.20.0**
* The columnstore now leverages *bloom filters* to deliver up to 6x
faster point queries on columns with high cardinality values, such as
UUIDs.
* Major *improvements to the columnstores' backfill process* enable
`UPSERTS` with strict constraints to execute up to 10x faster.
* *SkipScan is now supported in the columnstore*, including for DISTINCT
queries. This enhancement leads to dramatic query performance
improvements of 2000x to 2500x, especially for selective queries.
* SIMD vectorization for the bool data type is now enabled by default.
This change results in a 30–45% increase in performance for analytical
queries with bool clauses on the columnstore.
* *Continuous aggregates* now include experimental support for *window
functions and non-immutable functions*, extending the analytics use
cases they can solve.
* Several quality-of-life improvements have been introduced: job names
for continuous aggregates are now more descriptive, you can assign
custom names to them, and it is now possible to add unique constraints
along with `ADD COLUMN` operations in the columnstore.
* Improved management and optimization of chunks with the ability to
split large uncompressed chunks at a specified point in time using the
`split_chunk` function. This new function complements the existing
`merge_chunk` function that can be used to merge two small chunks into
one larger chunk.
* Enhancements to the default behavior of the columnstore now provide
better *automatic assessments* of `segment by` and `order by` columns,
reducing the need for manual configuration and simplifying initial
setup.
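The chunk-management highlight above might be used roughly as follows (a sketch under stated assumptions: the procedure name `split_chunk` comes from the notes, but the invocation style, the `split_at` parameter name, and the chunk name are illustrative guesses, not confirmed signatures):

```sql
-- Sketch: split one large uncompressed chunk at a point in time.
-- The chunk name below is a placeholder; real chunk names can be
-- listed with show_chunks('<hypertable>').
CALL split_chunk('_timescaledb_internal._hyper_1_1_chunk',
                 split_at => '2025-01-01 00:00:00+00');
```

This complements the existing chunk-merging function mentioned above, which goes in the opposite direction by combining two small chunks into one.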

**PostgreSQL 14 support removal announcement**

Following the deprecation announcement for PostgreSQL 14 in TimescaleDB
v2.19.0, PostgreSQL 14 is no longer supported in TimescaleDB v2.20.0.
The currently supported PostgreSQL major versions are 15, 16, and 17.

**Features**
* [#7638](#7638) Bloom
filter sparse indexes for compressed columns. Can be disabled with the
GUC `timescaledb.enable_sparse_index_bloom`
* [#7756](#7756) Add
warning for poor compression ratio
* [#7762](#7762) Speed up
the queries that use minmax sparse indexes on compressed tables by
changing the index TOAST storage type to `MAIN`. This applies to newly
compressed chunks
* [#7785](#7785) Do
`DELETE` instead of `TRUNCATE` when locks aren't acquired
* [#7852](#7852) Allow
creating foreign key constraints on compressed tables
* [#7854](#7854) Remove
support for PG14
* [#7864](#7864) Allow
adding CHECK constraints to compressed chunks
* [#7868](#7868) Allow
adding columns with `CHECK` constraints to compressed chunks
* [#7874](#7874) Support
for SkipScan for distinct aggregates over the same column
* [#7877](#7877) Remove
blocker for unique constraints with `ADD COLUMN`
* [#7878](#7878) Don't
block non-immutable functions in continuous aggregates
* [#7880](#7880) Add
experimental support for window functions in continuous aggregates
* [#7899](#7899) Vectorized
decompression and filtering for boolean columns
* [#7915](#7915) New option
`refresh_newest_first` to continuous aggregate refresh policy API
* [#7917](#7917) Remove
`_timescaledb_functions.create_chunk_table` function
* [#7929](#7929) Add
`CREATE TABLE ... WITH` API for creating hypertables
* [#7946](#7946) Add
support for splitting a chunk
* [#7958](#7958) Allow
custom names for jobs
* [#7972](#7972) Add
vectorized filtering for constraint checking while backfilling into
compressed chunks
* [#7976](#7976) Include
continuous aggregate name in jobs informational view
* [#7977](#7977) Replace
references to compression with columnstore
* [#7981](#7981) Add
columnstore as alias for `enable_columnstore` in `ALTER TABLE`
* [#7983](#7983) Support
for SkipScan over compressed data
* [#7991](#7991) Improve
default `segmentby` options
* [#7992](#7992) Add API
into hypertable invalidation log
* [#8000](#8000) Add
primary dimension info to information schema
* [#8005](#8005) Support
`ALTER TABLE SET (timescaledb.chunk_time_interval='1 day')`
* [#8012](#8012) Add event
triggers support on chunk creation
* [#8014](#8014) Enable
bool compression by default by setting
`timescaledb.enable_bool_compression=true`. Note: for downgrading to
`2.18` or earlier version, use [this downgrade
script](https://github.com/timescale/timescaledb-extras/blob/master/utils/2.19.0-downgrade_new_compression_algorithms.sql)
* [#8018](#8018) Add
spin-lock during recompression on unique constraints
* [#8026](#8026) Allow
`WHERE` conditions that use nonvolatile functions to be pushed down to
the compressed scan level. For example, conditions like `time > now()`,
where `time` is a columnstore `orderby` column, will evaluate `now()`
and use the sparse index on `time` to filter out the entire compressed
batches that cannot contain matching rows.
* [#8027](#8027) Add
materialization invalidations API
* [#8047](#8047) Support
SkipScan for `SELECT DISTINCT` with multiple distincts when all but one
distinct is pinned
* [#8115](#8115) Add batch
size limiting during compression

**Bugfixes**
* [#7862](#7862) Release
cache pin when checking for `NOT NULL`
* [#7909](#7909) Update
compression stats when merging chunks
* [#7928](#7928) Don't
create a hypertable for implicitly published tables
* [#7982](#7982) Fix crash
in batch sort merge over eligible expressions
* [#8008](#8008) Fix
compression policy error message that shows number of successes
* [#8031](#8031) Fix
reporting of deleted tuples for direct batch delete
* [#8033](#8033) Skip
default `segmentby` if `orderby` is explicitly set
* [#8061](#8061) Ensure
settings for a compressed relation are found
* [#7515](#7515) Add
missing lock to Constraint-aware append
* [#8067](#8067) Make sure
hypercore TAM parent is vacuumed
* [#8074](#8074) Fix memory
leak in row compressor flush
* [#8099](#8099) Block
chunk merging on multi-dimensional hypertables
* [#8106](#8106) Fix
segfault when adding unique compression indexes to compressed chunks
* [#8127](#8127) Read
bit-packed version of booleans

**GUCs**
* `timescaledb.enable_sparse_index_bloom`: Enable creation of the bloom1
sparse index on compressed chunks; Default: `ON`
* `timescaledb.compress_truncate_behaviour`: Defines how truncate
behaves at the end of compression; Default: `truncate_only`
* `timescaledb.enable_compression_ratio_warnings`: Enable warnings for
poor compression ratio; Default: `ON`
* `timescaledb.enable_event_triggers`: Enable event triggers for chunks
creation; Default: `OFF`
* `timescaledb.enable_cagg_window_functions`: Enable window functions in
continuous aggregates; Default: `OFF`
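These GUCs can be toggled like any other PostgreSQL setting, for example per session (a minimal sketch using one of the GUCs listed above; `SET`/`SHOW` are standard PostgreSQL commands):

```sql
-- Enable experimental window functions in continuous aggregates
-- for the current session only (default is OFF).
SET timescaledb.enable_cagg_window_functions = 'on';

-- Confirm the current value of the setting.
SHOW timescaledb.enable_cagg_window_functions;
```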

**Thanks**
* @arajkumar for reporting that implicitly published tables were still
able to create hypertables
* @thotokraa for reporting an issue with unique expression indexes on
compressed chunks

---------

Signed-off-by: Philip Krauss <[email protected]>
Signed-off-by: Ramon Guiu <[email protected]>
Co-authored-by: Anastasiia Tovpeko <[email protected]>
Co-authored-by: Ramon Guiu <[email protected]>
@akuzm akuzm added the released-2.20.0 Released in 2.20.0 label May 16, 2025
5 participants