
feat(watermark): topNExec state cleaning #8106


Closed
Tracked by #6042
yuhao-su opened this issue Feb 21, 2023 · 6 comments
Comments

@yuhao-su
Contributor

Plain top-n:

Ascending

We need to range delete records above the watermark, while currently, we can only delete those below the watermark.

Descending

This is trickier. When watermark w arrives, we need to locate the smallest record whose first column value is larger than w; call it the n-th record. We can then clean all records whose watermark column is smaller than the watermark column value of the (n+offset+limit)-th record.

In this case, two functions from the storage layer might be needed,

  • locate the first record smaller than x, return the position.
  • get the value at position pos
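The descending case above can be sketched as follows. This is only a minimal in-memory model, not the StateTable API: the state is assumed to be a `Vec<i64>` holding just the watermark column, sorted in descending order, and the names `locate_first_smaller`, `value_at` and `clean_descending` are hypothetical stand-ins for the two storage functions and the cleaning step. Records equal to w are assumed to be kept on the "larger" side.

```rust
/// Hypothetical helper: 0-based position of the first record smaller than `x`.
/// `state` must be sorted in descending order.
fn locate_first_smaller(state: &[i64], x: i64) -> usize {
    // `v >= x` holds for a prefix of a descending slice, so binary search applies.
    state.partition_point(|&v| v >= x)
}

/// Hypothetical helper: value stored at 0-based position `pos`, if it exists.
fn value_at(state: &[i64], pos: usize) -> Option<i64> {
    state.get(pos).copied()
}

/// Cleans a descending top-n state when watermark `w` arrives.
/// Returns the number of removed records.
fn clean_descending(state: &mut Vec<i64>, w: i64, offset: usize, limit: usize) -> usize {
    // n = number of records not below the watermark (assumption: ties with `w` are kept).
    let n = locate_first_smaller(state.as_slice(), w);
    // 1-based rank of the record whose value becomes the cleaning cutoff.
    let rank = n + offset + limit;
    if rank == 0 {
        return 0;
    }
    let cutoff = match value_at(state.as_slice(), rank - 1) {
        Some(v) => v,
        // Fewer than `rank` records are stored, so nothing can be cleaned yet.
        None => return 0,
    };
    // Drop every record strictly smaller than the cutoff value.
    let keep = locate_first_smaller(state.as_slice(), cutoff);
    let removed = state.len() - keep;
    state.truncate(keep);
    removed
}
```

For example, with the state `[100, 90, 80, 70, 60, 50, 40]`, w = 75, offset = 0 and limit = 2, n is 3, the cutoff is the value of the 5th record (60), and the records 50 and 40 are removed.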

Group top-n:

For group top-n with group key a, order by b, cleaning the state requires a range delete on b. (Currently we can only range delete with a given watermark on the first column.)
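To illustrate why this differs from a first-column range delete, here is a rough sketch under stated assumptions: the state is modelled as a `BTreeSet<(i64, i64)>` keyed by (a, b), the watermark is on b, and cleaning means dropping records with b below the watermark. The function names are hypothetical and do not correspond to anything in StateTable.

```rust
use std::collections::BTreeSet;

/// Hypothetical per-group range delete: removes every record of group `a`
/// whose order column `b` is below the watermark `w`.
fn range_delete_in_group(state: &mut BTreeSet<(i64, i64)>, a: i64, w: i64) -> usize {
    // Within one group the keys (a, b) are contiguous and ordered by b.
    let doomed: Vec<(i64, i64)> = state.range((a, i64::MIN)..(a, w)).copied().collect();
    for key in &doomed {
        state.remove(key);
    }
    doomed.len()
}

/// Cleaning the whole state needs one range delete per group, because the
/// watermark column is the second key column, not the first.
fn clean_all_groups(state: &mut BTreeSet<(i64, i64)>, w: i64) -> usize {
    let groups: BTreeSet<i64> = state.iter().map(|&(a, _)| a).collect();
    let mut removed = 0;
    for a in groups {
        removed += range_delete_in_group(state, a, w);
    }
    removed
}
```

With the records (1, 5), (1, 20), (2, 3), (2, 30) and a watermark of 10 on b, this removes (1, 5) and (2, 3), using two separate ranges rather than a single one.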

@yuhao-su yuhao-su added the type/feature Type: New feature. label Feb 21, 2023
@github-actions github-actions bot added this to the release-0.1.18 milestone Feb 21, 2023
@wcy-fdu
Contributor

wcy-fdu commented Feb 22, 2023

Operators generally access storage through StateTable. Could you please define these two interfaces in terms of StateTable or relational semantics?

@BugenZhao
Member

May I ask about the use cases for ordering by the watermark (usually time) column in ascending manner? 🤔

@yuhao-su
Contributor Author

yuhao-su commented Feb 22, 2023

> May I ask about the use cases for ordering by the watermark (usually time) column in ascending manner? 🤔

I'm not sure about the exact use case. But it is possible for users to write such queries. cc @TennyZhuang @fuyufjh @st1page

@st1page
Contributor

st1page commented Feb 22, 2023

Emmm... I can't think of any use case for now. Let's keep this issue open and see whether it is required.

@yuhao-su
Contributor Author

> Could you please define these two interfaces in terms of StateTable or relational semantics?

Let's wait to see the use case before deciding on a general interface.

@st1page st1page changed the title from "more storage function needed for cleaning top-n state" to "feat(watermark): topNExec state cleaning" Mar 22, 2023
@fuyufjh fuyufjh modified the milestones: release-0.18, release-0.19 Mar 22, 2023
@xxchan xxchan mentioned this issue Apr 21, 2023
22 tasks
@fuyufjh fuyufjh modified the milestones: release-1.1, release-1.2 Aug 8, 2023
@st1page st1page modified the milestones: release-1.6, release-1.7 Jan 9, 2024
@st1page
Contributor

st1page commented Aug 19, 2024

Let's wait for some real-world requirements for top-n state cleaning.

@st1page st1page closed this as not planned Aug 19, 2024
5 participants