-
Notifications
You must be signed in to change notification settings - Fork 158
Fix execution errors caused by plan gap #3350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix execution errors caused by plan gap #3350
Conversation
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
…ine' into feature/calcite-engine-sort-fix
*/ | ||
RelCollation collation = osPlan.getTraitSet().getCollation(); | ||
if (!(osPlan instanceof Sort) && collation != RelCollations.EMPTY) { | ||
calcitePlan = LogicalSort.create(osPlan, collation, null, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for adding this NotNull annotation? Would be better to add this?
if (calcitePlan = null) return osPlan;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing special, it's auto-generated by IDEA. We can remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we expect that the collation can be preserved through the pipes over PPL
could u elberate more on this? i do not think PPL command should preserved order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qianheng-aws @LantaoJin
I seem, some PPL command should preserved order, for instance, fields
, take
, but others not, stats
.
In ANSI SQL, select a from tbl order by b
, order is perserved becuase of logically processing order of select statement is acutall from -> select -> order
.
CalciteOpenSearchIndexScan newScan = scan.pushDownFilter(filter); | ||
if (newScan != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newScan
won't be used after this check. the name newScan
confused me.
could you change it to
if (scan.pushDownFilter(filter) != null) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, it's typo. It happens the current logic works as well.
call.transformTo(scan);
->
call.transformTo(newScan);
@@ -56,24 +59,31 @@ public class CalciteOpenSearchIndexScan extends OpenSearchTableScan { | |||
*/ | |||
public CalciteOpenSearchIndexScan( | |||
RelOptCluster cluster, RelOptTable table, OpenSearchIndex index) { | |||
this(cluster, table, index, index.createRequestBuilder(), table.getRowType()); | |||
this(cluster, table, index, table.getRowType(), null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this(cluster, table, index, table.getRowType(), new PushDownContext());
and you can remove L74
@@ -118,17 +138,18 @@ public Enumerator<Object> enumerator() { | |||
}; | |||
} | |||
|
|||
public boolean pushDownFilter(Filter filter) { | |||
public CalciteOpenSearchIndexScan pushDownFilter(Filter filter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change the return type? seems the return value is never in use besides the null-checker in OpenSearchFilterIndexScanRule
return newScan; | ||
} | ||
|
||
static class PushDownContext extends ArrayDeque<PushDownAction> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some comments about this context. such as purpose and usage
if (fields.size() == 1) { | ||
return current.tupleValue().get(fields.getFirst()).valueForCalcite(); | ||
} | ||
return fields.stream().map(k -> current.tupleValue().get(k).valueForCalcite()).toArray(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The number of fields
here has been reduced to the actual number of outputs, right? So get(k)
won't return null
anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed the workaround code to dev branch mistakenly. could you merge with the latest branch code to resolve conflicts @qianheng-aws
*/ | ||
RelCollation collation = osPlan.getTraitSet().getCollation(); | ||
if (!(osPlan instanceof Sort) && collation != RelCollations.EMPTY) { | ||
calcitePlan = LogicalSort.create(osPlan, collation, null, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we expect that the collation can be preserved through the pipes over PPL
could u elberate more on this? i do not think PPL command should preserved order.
@@ -85,8 +95,10 @@ public RelNode copy(RelTraitSet traitSet, List<RelNode> inputs) { | |||
@Override | |||
public void register(RelOptPlanner planner) { | |||
super.register(planner); | |||
for (RelOptRule rule : OpenSearchIndexRules.OPEN_SEARCH_INDEX_SCAN_RULES) { | |||
planner.addRule(rule); | |||
if (osIndex.getSettings().getSettingValue(Settings.Key.CALCITE_PUSHDOWN_ENABLED)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it for test purpose, right? I do not think end-user has any reasons to disable optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- For example, PPL
source=table | sort a | field a
will have planscan -> sort -> project
, we expect the final result should have collation on columna
.
However, calcite won't ensure that. If we translate the above PPL or plan into SQL, it should beselect a from (select * from table order by a)
. In Calcite, it doesn't ensure collation inner a subquery and will have several optimization to remove the sort operator.
- [remove sort in subquery when converting to RelNode] https://github.com/apache/calcite/blob/a94927e9b80f9f5bf639e31c2636536cb6aebc1a/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L6454C36-L6454C58
- [set root collation to be empty if top operator is not sort, then the optimizer won't choose a plan having sort operator since it will cost more while we don't have collation requirement for root] https://github.com/apache/calcite/blob/a94927e9b80f9f5bf639e31c2636536cb6aebc1a/core/src/main/java/org/apache/calcite/prepare/CalcitePrepareImpl.java#L1055
If we do want to ensure collation of the final result, the PPL should be source=table | field a | sort a
, which can be translated to SQL select a from table order by a
. That's why we need add sort operator to ensure collation and rely on optimizer to eliminate the redundant sort for the consideration of simplicity.
- Yeah, I think so. It's per @LantaoJin's suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- And it allows us to disable the push down feature quickly without rollback if there are some unexpected issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@penghuo Adding this configuration is by my suggestion. This is an advanced configuration aimed at developers. Similar to how databases typically have switches for optimization rules. Moreover, this configuration is currently very useful for our development - in some scenarios, enabling push-down might cause issues, and having this switch helps us determine whether problems are caused by push-down or other factors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The optimal solution would be to provide a method that allows free selection of optimization rules. However, considering there aren't many custom optimization rules in the short term, adding a push-down config seems to be the most cost-effective approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, make sense, if Settings.Key.CALCITE_PUSHDOWN_ENABLED default value is enabled, should be fine. let's consider remove it when we are confident.
Signed-off-by: Heng Qian <[email protected]>
…ine' into feature/calcite-engine-sort-fix # Conflicts: # opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexEnumerator.java
|
2971eae
into
opensearch-project:feature/calcite-engine
* Transform to calcite plan before executing Signed-off-by: Heng Qian <[email protected]> * Fix bug for single column row Signed-off-by: Heng Qian <[email protected]> * Add settings for calcite pushdown Signed-off-by: Heng Qian <[email protected]> * Lazily construct OpenSearchRequestBuilder and do push down Signed-off-by: Heng Qian <[email protected]> * Address comments and disable push down Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]>
* Transform to calcite plan before executing Signed-off-by: Heng Qian <[email protected]> * Fix bug for single column row Signed-off-by: Heng Qian <[email protected]> * Add settings for calcite pushdown Signed-off-by: Heng Qian <[email protected]> * Lazily construct OpenSearchRequestBuilder and do push down Signed-off-by: Heng Qian <[email protected]> * Address comments and disable push down Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]> Signed-off-by: xinyual <[email protected]>
Description
This PR fixes several errors caused by Calcite optimization. These issues arise because there are several gaps between our OpenSearch Plan and the Calcite Plan, and the Calcite optimization process is specific to SQL and its own semantic requirements.
This PR includes changes:
Add OpenSearch settings for calcite plan push down and make it disabled by default. Will set default to enabled once all tests works with push down.
Fix error for single column row. Calcite has optimization that transform such row into scalar object and every enumerable operators all follow that convention. In order to combine our scan operators with them, we should adjust to follow it as well.
Fix error for sort or collation behavior of the final results. Calcite doesn't ensure collation of a query if its plan doesn't end with sort operator. For example, it has optimization that will remove the sort inner a subquery since it doesn't ensure the collation outside of that subquery. While for the semantic of our PPL, we expect to ensure collation even a query has other operators(i.e. project) after sort command.
However, we don't plan to hack the prepare process of Calcite, the only thing we can do is transform our plan to conform to Calcite conventions. For this case, before optimization, we should add a sort operator as the root of our plan if the original root has collation for its output data. And in optimization, we rely on calcite optimizer to remove redundant sort operators.
Fix error for pushdown by lazily construct OpenSearchRequestBuilder. The previous implementation has a problem in optimization process that scan operators in different equivalent plans may share the same objects in OpenSearchRequestBuilder, since OpenSearchRequestBuilder cannot do deep copy. It has issue that transformation on one plan may affect other plans while it's never allowed.
To fix this, we should lazily construct OpenSearchRequestBuilder and maintain these push down actions in a queue, until the final implementation stage. Since queue supports deep copy and each push down actions are all enclosure, we can avoid the above issue.
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.