-
Notifications
You must be signed in to change notification settings - Fork 641
feat(optimizer): index accelerating TopN
#7726
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
TopN
TopN
|
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
Some planner tests which seem quite unrelated fail 🤨 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a planner test and a e2e test? Rest LGTM!
// For Optimizing TopN | ||
pub fn get_child(&self) -> PlanRef { | ||
self.core.input.clone() | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just use existing method input()
.
impl Rule for AggOnIndexRule { | ||
fn apply(&self, plan: PlanRef) -> Option<PlanRef> { | ||
let logical_topn: &LogicalTopN = plan.as_logical_top_n()?; | ||
let logical_scan: LogicalScan = logical_topn.get_child().as_logical_scan()?.to_owned(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This rule name is inconsistent with its match pattern. Maybe TopNOnIndexRule
is more appropriate.
Signed-off-by: Eurekaaw <[email protected]>
Co-authored-by: lmatz <[email protected]>
.map(|idx_item| FieldOrder { | ||
index: idx_item.index, | ||
direct: Asc, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether we've supported it, but we can specify DESC
order when creating index in postgres (which is often used along with NULLS FIRST/LAST
). So maybe we should not hard-code an Asc
here, but use index_table.pk
instead.
cc @chenzl25
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Use index_table.pk
instead of hard-code Asc
even if we do not support creating indexes with descending ordering currently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since index_items
and index_table.pk.items()
are not quite matched, I'm using index_table.pk.items()
instead and it seems won't affect the result. Actually this part of code is derived from
let index = self.indexes().iter().find(|idx| { |
Please also update the PR description as it will be used as the commit message.
The batch scan will output at least one chunk to downstream, so we still have to scan 1024(default chunk size) * parallelism records. To achieve a better effect, we may also need to tell the scan to reduce the chunk size. 🤔 |
Are we going to support it in next PRs? |
Signed-off-by: Eurekaaw <[email protected]>
Are you talking about
point_get works. Can we leverage this?
|
Signed-off-by: Eurekaaw <[email protected]>
Exactly. I guess this is not related to |
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #7726 +/- ##
========================================
Coverage 71.70% 71.71%
========================================
Files 1096 1097 +1
Lines 174608 174724 +116
========================================
+ Hits 125208 125303 +95
- Misses 49400 49421 +21
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
@@ -86,6 +86,7 @@ impl LogicalScan { | |||
table_desc, | |||
indexes, | |||
predicate, | |||
chunk_size: 1024, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about using an Option
here? If it's None
, then the batch executor will follow the value from the config, so that we don't need to hard code a 1024
here.
TIPS: wrap a primitive into a new message will make it optional
in proto3 like this:
risingwave/proto/stream_plan.proto
Lines 632 to 634 in 9e1ff93
message Parallelism { | |
uint64 parallelism = 1; | |
} |
direct: if op.order_type == OrderType::Ascending { | ||
Direction::Asc | ||
} else { | ||
Direction::Desc | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😄 I guess @richardchien is working on unifying these types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:doge:
BTW there's no conflict for now. I'll unify these types after finishing distinct aggregator, which may need some days.
let index = logical_scan.indexes().iter().find(|idx| { | ||
Order { | ||
field_order: idx | ||
.index_table | ||
.pk() | ||
.iter() | ||
.map(|idx_item| FieldOrder { | ||
index: idx_item.index, | ||
direct: idx_item.direct, | ||
}) | ||
.collect(), | ||
} | ||
.satisfies(order) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't compare index order directly with the topn order, because they refer to different columns. We need a map between them. The following case can reproduce this bug.
create table t (a int primary key, b int);
create index idx on t(b);
explain select * from t order by b limit 1;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then should we test if the index order is a superset of the topn order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need it and actually satisfies
method has already handled the superset problem, but considering the above case, my point is we need to compare index order with topn order at the same level, that is, both refer to primary table column or index column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, I'll try to solve it in the next commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems solved 😁
dev=> create table t (v1 int, v2 int, v3 int);
CREATE_TABLE
dev=> create index idx on t (v2) include (v2, v3);
CREATE_INDEX
dev=> explain select v2, v3 from t order by v2, v3 limit 1;
QUERY PLAN
----------------------------------------------------------------------
BatchTopN { order: "[t.v2 ASC, t.v3 ASC]", limit: 1, offset: 0 }
└─BatchExchange { order: [], dist: Single }
└─BatchTopN { order: "[t.v2 ASC, t.v3 ASC]", limit: 1, offset: 0 }
└─BatchScan { table: t, columns: [v2, v3] }
(4 rows)
dev=> explain select v2, v3 from t order by v2 limit 1;
QUERY PLAN
-------------------------------------------------------
BatchLimit { limit: 1, offset: 0 }
└─BatchExchange { order: [idx.v2 ASC], dist: Single }
└─BatchLimit { limit: 1, offset: 0 }
└─BatchScan { table: idx, columns: [v2, v3] }
(4 rows)
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
Signed-off-by: Eurekaaw <[email protected]>
Have no idea of the misc-check
|
Signed-off-by: Eurekaaw <[email protected]>
One more case needs to be handled. When a table scan has a filter which could be a primary lookup followed by a topn. It seems we should ensure we still choose the primary table scan instead of the index. create table t (a int primary key, b int);
create index idx on t(b);
explain select * from t where a = 1 order by b limit 1; |
Could you explain a bit about the reason of doing primary lookup? |
proto/batch_plan.proto
Outdated
@@ -24,6 +24,12 @@ message RowSeqScanNode { | |||
common.Buffer vnode_bitmap = 4; | |||
// Whether the order on output columns should be preserved. | |||
bool ordered = 5; | |||
|
|||
message ChunkSize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trailing whitespace here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch!
Because primary point lookup is more efficient than index range scan in this case. Maybe we can only apply this rule when there are no predicates in the table scan. Actually this decision should be done by a cost based optimizer, but we don't have now, so let's optimize those cases can be improved definitely. |
Signed-off-by: Eurekaaw <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Great work, thank you.
uint32 chunk_size = 1; | ||
} | ||
// If along with `batch_limit`, `chunk_size` will be set. | ||
ChunkSize chunk_size = 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not optional uint32 chunk_size
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember long ago we forbid optional
in proto3 for some reason, but it seems to work now. 😂 Both LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose we leave it for a future task that refractors all those optional workarounds 😂
Signed-off-by: Eurekaaw <[email protected]>
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Optimizing applicable
LogicalTopN-LogicalScan
toLogicalLimit-LogicalScan
by a new rule namedtop_n_on_index
that arbitrarily match the above structure.Please explain IN DETAIL what the changes are in this PR and why they are needed:
LogicalTopN
will be replaced withLogicalLimit
if there is applicable index or primary key.On index:
On primary key:
Checklist
- [ ] I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features)../risedev check
(or alias,./risedev c
)Refer to a related PR or issue link (optional)
#7656