Description
What happened?
See here for context on why this table exists and how it is used. Records are added or updated in this table whenever jobs are added to the database or after an attempt for a job completes. Records are currently removed only when they belong to a cancelled job group. If a job group runs to completion, we end up with many rows that no longer serve any purpose and that, if you sum over the `token` column, have 0s for all the job columns. This does not affect correctness, but it wastes a lot of space in the database. This leads to two changes that together would save a lot of space (I've not quantified how much, but `select count(*)` on this table takes longer than I've been willing to wait):
1. Rows in this table with the same key `(batch_id, update_id, job_group_id, inst_coll)` but different `token` values can be "compacted" into one row with key `(batch_id, update_id, job_group_id, inst_coll, 0)` (token 0) where all the other columns are summed. This is most useful for cold rows.
2. Rows whose `n_*_jobs` and `*_cancellable_cores_mcpu` columns are all 0 can be deleted.
We already do 1 for the aggregated billing tables: tokens are used for parallelism on hot rows, and records are then compacted so that records from before the current day always end up using only 1 row.
Implementing 1 should be a big win for the size of this table. Following that up with 2 would eliminate what I presume to be the vast majority of data in this table.
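To make the two steps concrete, here is a minimal sketch using an in-memory SQLite database. It is an illustration only, not the actual schema or migration: the table name `resources` and the two counter columns `n_ready_jobs` and `ready_cancellable_cores_mcpu` are hypothetical stand-ins for the real `n_*_jobs` and `*_cancellable_cores_mcpu` columns, and it compacts every key rather than only cold rows.

```python
import sqlite3

# Hypothetical stand-ins for the real key and counter columns.
KEY = "batch_id, update_id, job_group_id, inst_coll"
COUNTERS = ["n_ready_jobs", "ready_cancellable_cores_mcpu"]


def compact_and_prune(conn):
    """Sum all token rows per key into token 0 and drop all-zero results."""
    sums = ", ".join(f"SUM({c})" for c in COUNTERS)
    with conn:  # one transaction: readers never see a half-compacted table
        totals = conn.execute(
            f"SELECT {KEY}, {sums} FROM resources GROUP BY {KEY}"
        ).fetchall()
        conn.execute("DELETE FROM resources")
        # Step 1: one row per key with token 0.
        # Step 2: skip keys whose counters all sum to 0.
        keep = [row[:4] + (0,) + row[4:] for row in totals if any(row[4:])]
        conn.executemany(
            "INSERT INTO resources VALUES (?, ?, ?, ?, ?, ?, ?)", keep
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE resources (
        batch_id INTEGER, update_id INTEGER, job_group_id INTEGER,
        inst_coll TEXT, token INTEGER,
        n_ready_jobs INTEGER, ready_cancellable_cores_mcpu INTEGER,
        PRIMARY KEY (batch_id, update_id, job_group_id, inst_coll, token)
    )
    """
)
conn.executemany(
    "INSERT INTO resources VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 0, "standard", 3, 2, 2000),    # completed job group:
        (1, 1, 0, "standard", 7, -2, -2000),  # its rows sum to 0
        (2, 1, 0, "standard", 1, 5, 5000),    # still-active job group
        (2, 1, 0, "standard", 4, -1, -1000),
    ],
)

compact_and_prune(conn)
result = conn.execute("SELECT * FROM resources ORDER BY batch_id").fetchall()
print(result)
```

In this sketch the four input rows collapse to a single row for batch 2 under token 0, and batch 1 disappears entirely because its counters sum to 0. A real implementation would presumably restrict compaction to cold rows and work in bounded chunks so that it does not contend with writers on hot rows.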
Version
0.2.132
Relevant log output
No response