-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Add --substrait-round-trip
option in sqllogictests
#16183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --substrait-round-trip
option in sqllogictests
#16183
Conversation
dd41d5b
to
2cc9244
Compare
2cc9244
to
e481a15
Compare
Code looks fine to me but I'm not familiar with the sqllogictests code so will leave it to Andrew to review properly. I really like this additional testing for Substrait, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I suggest adding a CI test that exercises this option with a passing .slt file (or create a minimal one). Otherwise this LGTM.
Something like this 1a05ba2? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great @gabotechs -- thank you
I also tried it locally with tests I know fail and got an error as expected:
cargo test --test sqllogictests -- --substrait-round-trip agg
...
10. query failed: DataFusion error: This feature is not implemented: Producing a row from empty relation is unsupported
[SQL] with data as (
select 1 as f, 4 as b
union all
select null as f, 99 as b
union all
select 2 as f, 5 as b
union all
select 98 as f, null as b
union all
select 3 as f, 6 as b
union all
select null as f, null as b
)
select corr(f, b)
from data
at /Users/andrewlamb/Software/datafusion2/datafusion/sqllogictest/test_files/aggregate.slt:539
@@ -476,6 +476,28 @@ jobs: | |||
POSTGRES_HOST: postgres | |||
POSTGRES_PORT: ${{ job.services.postgres.ports[5432] }} | |||
|
|||
sqllogictest-substrait: | |||
name: "Run sqllogictest in Substrait round-trip mode" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Thank you for the review @2010YOUY01 |
* Add substrait roundtrip option in sqllogictests * Fix doc link and missing license header * Add README.md entry for the Substrait round-trip mode * Link tracking issue in README.md * Use clap's `conflicts_with` instead of manually checking flag compatibility * Add sqllogictest-substrait job to the CI * Revert committed formatting changes to README.md
Which issue does this PR close?
It probably does not fully closes it but it partially addresses:
Rationale for this change
Being able to execute the full sqllogictest suite adding a Substrait roundtrip. This means:
What changes are included in this PR?
Adds a new
DataFusionSubstraitRoundTrip
sqllogictest runner implementation that will do a Substrait roundtrip where it makes sense, and a new--substrait-round-trip
option to the sqllogictest CLI.Example:
Runs the full aggregate.slt file doing a Substrait roundtrip on each individual statement
cargo test --test sqllogictests -- --substrait-round-trip aggregate
Runs an individual failing test reproducing one specific issue with the current Substrait conversions
cargo test --test sqllogictests -- --substrait-round-trip aggregate:1522
This is still not enforced in the CI, and I would say the expectations are that it's just another tool to catch existing bugs in the current Substrait conversion code and better communicate them so that contributors have a clear path forward for contributing fixes.
Are these changes tested?
N/A
Are there any user-facing changes?
No