Skip to content

Calcite enable pushdown aggregation #3389

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

qianheng-aws
Copy link
Collaborator

Description

This PR include changes:

  1. Migrate pushdown process from physical index scan operator to logical index scan operator. The reason is that, in Calcite, some functions(i.e. AVG) only have logical definition while without physical definition or implementation, so we cannot push down such functions for a physical plan.
  2. Enable aggregation pushdown for the general cases
  3. Add VARSAMP and VARPOP implementation.

Related Issues

Resolves #3332

./gradlew :integ-test:integTest --tests '*Calcite*IT' succeed locally.

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Heng Qian <[email protected]>
@LantaoJin
Copy link
Member

Please update the latest code and rerun all ITs and UTs locally. PR added and refactored many ITs and UTs.

Copy link
Member

@LantaoJin LantaoJin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need UT for pushdown features, it shouldn't be blocked by explain feature.
We can add a CalcitePushdownTest with following code for checking.

RelNode finalPlan = Frameworks.getPlanner(getConfig().build()).transform(0, RelTraitSet.createEmpty(), root);
String finalPlanString = RelOptUtil.dumpPlan("", finalPlan, SqlExplainFormat.TEXT, SqlExplainLevel.ALL_ATTRIBUTES);

import org.opensearch.sql.opensearch.storage.OpenSearchIndex;

@Getter
public class CalciteLogicalOSIndexScan extends CalciteOSIndexScan {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: "OS" is not very convenient for searching classes. (It's fine to use it as a variable name). I think we can remove the "OS" in class name, so does in its parent class. We don't have other type of Index. Or keeping the full name "OpenSearch" also acceptable.

import org.opensearch.sql.opensearch.storage.OpenSearchIndex;

/** Relational expression representing a scan of an OpenSearchIndex type. */
public abstract class CalciteOSIndexScan extends TableScan {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto. rename to CalciteOpenSearchIndexScan or CalciteIndexScan


@Override
public boolean add(PushDownAction pushDownAction) {
// Defense check. I never do push down to this context after aggregate push-down.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove "I"

PushDownContext pushDownContext) {
super(cluster, traitSet, hints, table);
this.osIndex = requireNonNull(osIndex, "OpenSearch index");
;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove it? format checker does not detect it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems "./gradlew spotlessApply" doesn't fix it.

return newScan;
} catch (Exception e) {
if (LOG.isDebugEnabled()) {
LOG.debug("Cannot pushdown the aggregate {}", aggregate, e);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reasons only add stacktrace in debug log?

Copy link
Collaborator Author

@qianheng-aws qianheng-aws Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are too many cases where there is unsupported pushdown aggreagte/filter and it will report such issue many times since Calcite will hit this rule many times in its optimizing processes.

Hence, we need to avoid log explosion by removing trackstrace in no debugging mode.

@penghuo penghuo added the calcite calcite migration releated label Mar 7, 2025
Signed-off-by: Heng Qian <[email protected]>
…ine' into feature/calcite-engine-pushdown-agg
@qianheng-aws
Copy link
Collaborator Author

I think we still need UT for pushdown features, it shouldn't be blocked by explain feature. We can add a CalcitePushdownTest with following code for checking.

RelNode finalPlan = Frameworks.getPlanner(getConfig().build()).transform(0, RelTraitSet.createEmpty(), root);
String finalPlanString = RelOptUtil.dumpPlan("", finalPlan, SqlExplainFormat.TEXT, SqlExplainLevel.ALL_ATTRIBUTES);

Frameworks.getPlanner is not suitable for this purpose. It has state inner and is designed for planning a query from query string. We cannot directly use its transform API.

May need further investigation or implementation on getting the final optimized plan.

…ine' into feature/calcite-engine-pushdown-agg
@LantaoJin LantaoJin merged commit dcf2057 into opensearch-project:feature/calcite-engine Mar 7, 2025
4 of 13 checks passed
penghuo pushed a commit to penghuo/os-sql that referenced this pull request Mar 12, 2025
* Change push down to logical index scan

Signed-off-by: Heng Qian <[email protected]>

* Support Aggregate Push Down

Signed-off-by: Heng Qian <[email protected]>

* Rebase and resolve conflict

Signed-off-by: Heng Qian <[email protected]>

* Add TODO

Signed-off-by: Heng Qian <[email protected]>

* Address comments

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
penghuo pushed a commit that referenced this pull request Jun 16, 2025
* Change push down to logical index scan

Signed-off-by: Heng Qian <[email protected]>

* Support Aggregate Push Down

Signed-off-by: Heng Qian <[email protected]>

* Rebase and resolve conflict

Signed-off-by: Heng Qian <[email protected]>

* Add TODO

Signed-off-by: Heng Qian <[email protected]>

* Address comments

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: xinyual <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
calcite calcite migration releated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants