Skip to content

fix: make column.unnest().topk() work #11259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

NickCrews
Copy link
Contributor

Also simplifies the implementation of Column.topk() and
makes it re-use the implementation of Table.topk()

This does NOT fix column.unnest().topk(by=<anything besides None>). This is a rare case so I decided to just punt on implementing it

This also factors out the common marks for when arrays, structs, maps, json are not supported by a backend. I wanted to reuse that mark, and I bet there are many existing tests that could benefit from this utilility.

@github-actions github-actions bot added the tests Issues or PRs related to tests label May 28, 2025
@NickCrews
Copy link
Contributor Author

I split out the test refactoring into #11315, maybe we merge that and then I revade this to only include the behavior changes

@NickCrews
Copy link
Contributor Author

@cpcloud This is now a lot simpler now that the test refactor is merged separately, and hopefully should be a quick review.

Also simplifies the implementation of Column.topk() and
makes it re-use the implementation of Table.topk()

This does NOT fix `column.unnest().topk(by=<anything besides None>)`. This is a rare case so I decided to just punt on implementing it
@cpcloud
Copy link
Member

cpcloud commented Jun 10, 2025

Seems to be failing more than half of CI checks.

@cpcloud
Copy link
Member

cpcloud commented Jun 11, 2025

Given that grouping by UNNEST works in postgres, I'd much rather get this implemented upstream in DuckDB than hack around it here.

@cpcloud
Copy link
Member

cpcloud commented Jun 11, 2025

Opened a feature request in DuckDB for adding support for grouping by unnests: duckdb/duckdb#17884

@cpcloud
Copy link
Member

cpcloud commented Jun 12, 2025

I wonder if there's some DuckDB specific thing we can do here that would take any query with SELECT UNNEST(x) FROM t and rewrite it as SELECT * REPLACE (UNNEST(x) AS x) FROM t.

@cpcloud
Copy link
Member

cpcloud commented Jun 12, 2025

The hairy part is of course that UNNEST can be buried in an arbitrarily deep expression such as 1 * 2 * LOG(UNNEST(x))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests Issues or PRs related to tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants