The `count` function needs `s"*"` for the argument in current dev version #2717

eitsupi · 2023-06-04T01:54:01Z

What happened?

From #2710 (comment), related to #2713

We are currently using workarounds in the following locations:

prql/README.md

Line 34 in 69571f1

count s"*", # Trailing commas are allowed

prql/web/website/content/_index.md

Line 26 in 69571f1

ct = count s"*",

prql/prql-compiler/examples/compile-files/queries/variables.prql

Line 16 in 69571f1

ct = count s"*",

PRQL input

from a
derive {cnt = count}

SQL output

NA

Expected SQL output

SELECT
  *,
  COUNT(*) AS cnt
FROM
  a

MVCE confirmation

Minimal example
New issue

Anything else?

No response

aljazerzen · 2023-06-12T06:19:29Z

While working on the compiler, I had to change the rule which triggers the OVER clause for window functions to not look at the result type of the function, but instead look at the parameter type. This works well, but it has a problem: some window functions don't have any parameters:

- count (there was non_null:, but that's a named param)
- rank
- row_number
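
To make the problem concrete, here is a toy model of the two rules. It is only a sketch: `FuncSig`, the type strings, and both rule functions are hypothetical and are not taken from the PRQL compiler. It shows how a rule based on parameter types cannot detect a window function that has no parameters, and how adding a column parameter restores detection.

```python
from dataclasses import dataclass, field


# Hypothetical model of a function signature; not the compiler's real data structure.
@dataclass
class FuncSig:
    name: str
    param_types: list = field(default_factory=list)  # positional parameter types
    return_type: str = "scalar"


def is_window_by_return(sig: FuncSig) -> bool:
    """Earlier rule (as described above): decide from the result type."""
    return sig.return_type == "column"


def is_window_by_params(sig: FuncSig) -> bool:
    """New rule (as described above): decide from the parameter types."""
    return any(t == "column" for t in sig.param_types)


# Assumed signatures, for illustration only.
row_number_no_param = FuncSig("row_number", [], "column")
row_number_with_param = FuncSig("row_number", ["column"], "column")

print(is_window_by_return(row_number_no_param))    # result-type rule still sees it
print(is_window_by_params(row_number_no_param))    # parameter rule misses it
print(is_window_by_params(row_number_with_param))  # adding a parameter fixes detection
```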

For the sake of language design this should not matter, but it led me to the following question:

How does the count function work if it receives no parameters? What's the input of its computation? Does it just "pull from the relation"? How do rank and row_number know how many rows they should output?

This was my justification for making them have a parameter. count now has a positional non_null, and rank and row_number have a column param that is not used in compilation.

This is obviously a bit inconvenient, so how could we design the language to work around this? That is, how can we keep the arguments as they were before, without relying on the "pull from the relation" bullshit justification?

I had hoped to find a quick fix, but then I found the similar issue #2723 and wanted to clear that up first, so the two do not get mixed up.

max-sixty · 2023-06-12T21:23:48Z

> How does the count function work if it receives no parameters? What's the input of its computation? Does it just "pull from the relation"? How do rank and row_number know how many rows they should output?

Yes, I think we can think about it as one of:

- it looks at the whole relation, rather than just one column.
- it looks at any column (and at least in the case of count, also counts nulls, unlike when a column is selected, as you describe)
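
The null-handling difference between the two readings can be checked directly in SQL. The following is a minimal sketch using Python's built-in `sqlite3` module; the table `a` and its columns `id` and `name` are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table; one row has a NULL name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(1, "x"), (2, None), (3, "z")],
)

# COUNT(*) counts rows of the whole relation; COUNT(name) skips NULLs in that column.
whole_relation, one_column = conn.execute(
    "SELECT COUNT(*), COUNT(name) FROM a"
).fetchone()
conn.close()

print(whole_relation, one_column)
```

Here `COUNT(*)` sees all three rows, while `COUNT(name)` only sees the two rows where `name` is not null.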

Is this a big problem, though? Is your point that it introduces a completely new type for just a couple of functions? For the types, I was thinking the type is int rather than T?

aljazerzen · 2023-06-13T07:02:46Z

> it looks at the whole relation, rather than just one column.

You mean like this?

from a = albums
aggregate {count a}

The problem is that this does not make any sense:

let x = row_number

... while this is completely fine:

let x = sum [1, 2, -6, 45]

So my question is: how do we want to describe the difference between these two functions?

My solution was to say "there is no difference, rank requires a column too".

let x = row_number [1, -2, 5]

assert x == [1, 2, 3]

max-sixty · 2023-06-13T22:53:57Z

> You mean like this?
>
> from a = albums
> aggregate {count a}

Yes!

> The problem is that this does not make any sense:
> let x = row_number
> ... while this is completely fine:
> let x = sum [1, 2, -6, 45]
> So my question is: how do we want to describe the difference between these two functions?

My mental model is that they have the relation as a parameter:

let x = row_number [{a=1, b='a'}, {a=2, b='a'}]

...so indeed let x = row_number isn't correct.
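
This mental model can be sketched outside PRQL. In the Python below, a relation is assumed to be a list of row dictionaries, and `row_number` and `count` are hypothetical stand-ins, not the PRQL standard library functions. Both take the relation as an explicit argument, so calling them with no argument is an error.

```python
def row_number(relation):
    """Return one 1-based index per row of the given relation."""
    return [i for i, _ in enumerate(relation, start=1)]


def count(relation):
    """Count the rows of the given relation, regardless of column values."""
    return len(relation)


# The relation from the example above, written as a list of row dicts.
relation = [{"a": 1, "b": "a"}, {"a": 2, "b": "a"}]

numbers = row_number(relation)
total = count(relation)
print(numbers, total)

# With no relation there is nothing to compute from, mirroring `let x = row_number`.
try:
    row_number()
    missing_argument_rejected = False
except TypeError:
    missing_argument_rejected = True
```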

Does that make sense?

aljazerzen · 2023-06-14T08:31:40Z

Makes sense, I fully agree.

Now, we just need a way to reference the current relation in all contexts (without using _frame) and a solution for #2723.

eitsupi · 2023-06-18T03:16:40Z

Since this seems to be related to the larger issue #2723, can we note it as a known issue in the Changelog (and similar places), remove it from the 0.9 milestone, and go ahead with a 0.9 release?

I believe that the syntax changes between the last release and the development version are currently causing confusion for new users, so I wonder if it might be worthwhile to expedite the 0.9 release, even if there are some flaws. (Of course, if we could solve these issues right away, that would be great.)

eitsupi added the bug (Invalid compiler output or panic) label Jun 4, 2023

eitsupi changed the title ~~The count function needs s"*" for the argument~~ The count function needs s"*" for the argument in current dev version Jun 4, 2023

eitsupi added this to the 0.9 milestone Jun 4, 2023

aljazerzen added the language-design (Changes to PRQL-the-language) and compiler labels and removed the bug (Invalid compiler output or panic) label Jun 6, 2023

This was referenced Jun 11, 2023

Release plan for 0.9.0 #2689

Closed

tweak: Fix Readme alignment #2796

Merged

eitsupi mentioned this issue Jun 18, 2023

Stack overflow on Windows #2857

Open

aljazerzen mentioned this issue Jul 3, 2023

feat: replace count s"*" with count this #2936

Merged

aljazerzen closed this as completed in #2936 Jul 3, 2023
