Skip to content

Improve performance of constant aggregate window expression #16234

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 4, 2025

Conversation

suibianwanwank
Copy link
Contributor

Which issue does this PR close?

  • Closes #.

Rationale for this change

For unbounded aggregate window functions, the result is the same for all rows within a partition. Therefore, we can avoid repeatedly computing the window bounds and evaluating the function for each row, which can improve performance.

Related Work in DuckDB:https://github.com/duckdb/duckdb/blob/main/src/function/window/window_constant_aggregator.cpp

Benchmark results on h2o_small_window (run locally) show performance improvements for Q1 and Q3, with no noticeable changes in other queries.

Before Optimization

Q1: -- Basic Window
SELECT
    id1,
    id2,
    id3,
    v2,
    sum(v2) OVER () AS window_basic
FROM large;
Query 1 iteration 1 took 1297.6 ms and returned 10000000 rows
Query 1 iteration 2 took 804.3 ms and returned 10000000 rows
Query 1 iteration 3 took 582.2 ms and returned 10000000 rows
Query 1 avg time: 894.72 ms

Q3: -- PARTITION BY
SELECT
    id1,
    id2,
    id3,
    v2,
    sum(v2) OVER (PARTITION BY id1) AS sum_by_id1,
    sum(v2) OVER (PARTITION BY id2) AS sum_by_id2,
    sum(v2) OVER (PARTITION BY id3) AS sum_by_id3
FROM large;
Query 3 iteration 1 took 1532.6 ms and returned 10000000 rows
Query 3 iteration 2 took 1755.2 ms and returned 10000000 rows
Query 3 iteration 3 took 1821.4 ms and returned 10000000 rows
Query 3 avg time: 1703.09 ms

After Optimization

Q1: -- Basic Window
SELECT
    id1,
    id2,
    id3,
    v2,
    sum(v2) OVER () AS window_basic
FROM large;
Query 1 iteration 1 took 253.6 ms and returned 10000000 rows
Query 1 iteration 2 took 190.3 ms and returned 10000000 rows
Query 1 iteration 3 took 88.7 ms and returned 10000000 rows
Query 1 avg time: 177.54 ms

Q3: -- PARTITION BY
SELECT
    id1,
    id2,
    id3,
    v2,
    sum(v2) OVER (PARTITION BY id1) AS sum_by_id1,
    sum(v2) OVER (PARTITION BY id2) AS sum_by_id2,
    sum(v2) OVER (PARTITION BY id3) AS sum_by_id3
FROM large;
Query 3 iteration 1 took 1534.7 ms and returned 10000000 rows
Query 3 iteration 2 took 1541.5 ms and returned 10000000 rows
Query 3 iteration 3 took 1320.6 ms and returned 10000000 rows
Query 3 avg time: 1465.59 ms

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the physical-expr Changes to the physical-expr crates label Jun 2, 2025
@suibianwanwank
Copy link
Contributor Author

FYI @alamb @Dandandan

Copy link
Contributor

@jonathanc-n jonathanc-n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! small nit

@jonathanc-n
Copy link
Contributor

jonathanc-n commented Jun 3, 2025

I ran some of my own benchmarks and it looks significantly faster 🚀 :

--------------------
Benchmark h2o_window.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      test ┃ constant_agg_window ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  652.58ms │            231.97ms │ +2.81x faster │
│ QQuery 2     │ 1484.23ms │           1474.34ms │     no change │
│ QQuery 3     │ 3722.48ms │           3032.43ms │ +1.23x faster │
│ QQuery 4     │  902.19ms │            742.78ms │ +1.21x faster │
│ QQuery 5     │ 1183.35ms │           1032.99ms │ +1.15x faster │
│ QQuery 6     │ 1143.23ms │           1108.38ms │     no change │
│ QQuery 7     │ 1138.37ms │           1134.46ms │     no change │
│ QQuery 8     │ 4860.75ms │           4273.91ms │ +1.14x faster │
│ QQuery 9     │  780.54ms │            678.87ms │ +1.15x faster │
│ QQuery 10    │  927.86ms │            664.33ms │ +1.40x faster │
│ QQuery 11    │  804.48ms │            676.44ms │ +1.19x faster │
│ QQuery 12    │ 1365.07ms │           1229.44ms │ +1.11x faster │
└──────────────┴───────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (test)                  │ 18965.13ms │
│ Total Time (constant_agg_window)   │ 16280.34ms │
│ Average Time (test)                │  1580.43ms │
│ Average Time (constant_agg_window) │  1356.70ms │
│ Queries Faster                     │          9 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │          3 │
└────────────────────────────────────┴────────────┘

@alamb
Copy link
Contributor

alamb commented Jun 3, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing constant_agg_window (cf3d1b0) to 7248259 diff
Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb
Copy link
Contributor

alamb commented Jun 3, 2025

🤖: Benchmark completed

Details

Comparing HEAD and constant_agg_window
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ constant_agg_window ┃    Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0     │  1960.38ms │           2006.73ms │ no change │
│ QQuery 1     │   739.60ms │            721.00ms │ no change │
│ QQuery 2     │  1487.69ms │           1438.68ms │ no change │
│ QQuery 3     │   697.82ms │            728.93ms │ no change │
│ QQuery 4     │  1469.14ms │           1499.34ms │ no change │
│ QQuery 5     │ 15473.54ms │          15742.51ms │ no change │
│ QQuery 6     │  2107.60ms │           2063.04ms │ no change │
│ QQuery 7     │  2153.83ms │           2176.00ms │ no change │
│ QQuery 8     │   869.68ms │            881.17ms │ no change │
└──────────────┴────────────┴─────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 26959.29ms │
│ Total Time (constant_agg_window)   │ 27257.42ms │
│ Average Time (HEAD)                │  2995.48ms │
│ Average Time (constant_agg_window) │  3028.60ms │
│ Queries Faster                     │          0 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │          9 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ constant_agg_window ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │    14.46ms │             15.32ms │ 1.06x slower │
│ QQuery 1     │    32.98ms │             32.33ms │    no change │
│ QQuery 2     │    82.86ms │             79.83ms │    no change │
│ QQuery 3     │    96.46ms │             99.41ms │    no change │
│ QQuery 4     │   585.72ms │            588.16ms │    no change │
│ QQuery 5     │   871.68ms │            856.93ms │    no change │
│ QQuery 6     │    22.57ms │             24.38ms │ 1.08x slower │
│ QQuery 7     │    37.98ms │             37.25ms │    no change │
│ QQuery 8     │   909.98ms │            938.44ms │    no change │
│ QQuery 9     │  1199.01ms │           1217.32ms │    no change │
│ QQuery 10    │   274.47ms │            266.67ms │    no change │
│ QQuery 11    │   305.91ms │            297.36ms │    no change │
│ QQuery 12    │   915.18ms │            934.90ms │    no change │
│ QQuery 13    │  1275.03ms │           1354.71ms │ 1.06x slower │
│ QQuery 14    │   869.25ms │            867.06ms │    no change │
│ QQuery 15    │   846.20ms │            865.31ms │    no change │
│ QQuery 16    │  1763.37ms │           1780.27ms │    no change │
│ QQuery 17    │  1636.18ms │           1642.84ms │    no change │
│ QQuery 18    │  3127.73ms │           3124.27ms │    no change │
│ QQuery 19    │    87.17ms │             89.95ms │    no change │
│ QQuery 20    │  1147.52ms │           1150.63ms │    no change │
│ QQuery 21    │  1352.27ms │           1350.53ms │    no change │
│ QQuery 22    │  2300.77ms │           2247.50ms │    no change │
│ QQuery 23    │  8317.70ms │           8163.20ms │    no change │
│ QQuery 24    │   491.55ms │            467.96ms │    no change │
│ QQuery 25    │   415.86ms │            396.35ms │    no change │
│ QQuery 26    │   549.42ms │            544.60ms │    no change │
│ QQuery 27    │  1649.42ms │           1617.72ms │    no change │
│ QQuery 28    │ 12706.23ms │          13416.50ms │ 1.06x slower │
│ QQuery 29    │   528.52ms │            528.06ms │    no change │
│ QQuery 30    │   829.55ms │            811.62ms │    no change │
│ QQuery 31    │   870.55ms │            847.78ms │    no change │
│ QQuery 32    │  2706.05ms │           2642.34ms │    no change │
│ QQuery 33    │  3363.50ms │           3384.95ms │    no change │
│ QQuery 34    │  3411.12ms │           3390.22ms │    no change │
│ QQuery 35    │  1298.50ms │           1322.64ms │    no change │
│ QQuery 36    │   127.33ms │            135.10ms │ 1.06x slower │
│ QQuery 37    │    55.66ms │             55.95ms │    no change │
│ QQuery 38    │   123.29ms │            125.11ms │    no change │
│ QQuery 39    │   202.71ms │            198.19ms │    no change │
│ QQuery 40    │    48.95ms │             48.29ms │    no change │
│ QQuery 41    │    43.96ms │             44.27ms │    no change │
│ QQuery 42    │    37.96ms │             38.06ms │    no change │
└──────────────┴────────────┴─────────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 57532.57ms │
│ Total Time (constant_agg_window)   │ 58040.30ms │
│ Average Time (HEAD)                │  1337.97ms │
│ Average Time (constant_agg_window) │  1349.77ms │
│ Queries Faster                     │          0 │
│ Queries Slower                     │          5 │
│ Queries with No Change             │         38 │
└────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     HEAD ┃ constant_agg_window ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 116.20ms │            119.80ms │     no change │
│ QQuery 2     │  21.89ms │             21.33ms │     no change │
│ QQuery 3     │  34.88ms │             35.14ms │     no change │
│ QQuery 4     │  19.97ms │             20.03ms │     no change │
│ QQuery 5     │  52.52ms │             53.93ms │     no change │
│ QQuery 6     │  11.89ms │             11.83ms │     no change │
│ QQuery 7     │  96.92ms │             98.26ms │     no change │
│ QQuery 8     │  25.43ms │             25.58ms │     no change │
│ QQuery 9     │  59.28ms │             60.08ms │     no change │
│ QQuery 10    │  57.36ms │             57.96ms │     no change │
│ QQuery 11    │  11.32ms │             11.34ms │     no change │
│ QQuery 12    │  42.56ms │             41.51ms │     no change │
│ QQuery 13    │  27.44ms │             27.29ms │     no change │
│ QQuery 14    │   9.73ms │              9.67ms │     no change │
│ QQuery 15    │  22.41ms │             22.12ms │     no change │
│ QQuery 16    │  21.10ms │             21.64ms │     no change │
│ QQuery 17    │  95.24ms │             95.86ms │     no change │
│ QQuery 18    │ 207.86ms │            216.85ms │     no change │
│ QQuery 19    │  26.86ms │             25.40ms │ +1.06x faster │
│ QQuery 20    │  35.35ms │             34.34ms │     no change │
│ QQuery 21    │ 158.00ms │            161.87ms │     no change │
│ QQuery 22    │  16.94ms │             16.75ms │     no change │
└──────────────┴──────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 1171.17ms │
│ Total Time (constant_agg_window)   │ 1188.59ms │
│ Average Time (HEAD)                │   53.24ms │
│ Average Time (constant_agg_window) │   54.03ms │
│ Queries Faster                     │         1 │
│ Queries Slower                     │         0 │
│ Queries with No Change             │        21 │
└────────────────────────────────────┴───────────┘

@alamb
Copy link
Contributor

alamb commented Jun 3, 2025

🤖: Benchmark completed

ran the wrong benchmark 😢 -- will fix

@alamb
Copy link
Contributor

alamb commented Jun 3, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing constant_agg_window (cf3d1b0) to 7248259 diff
Benchmarks: h2o_small_window
Results will be posted here when complete

@alamb
Copy link
Contributor

alamb commented Jun 3, 2025

🤖: Benchmark completed

Details

Comparing HEAD and constant_agg_window
--------------------
Benchmark h2o_window.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ constant_agg_window ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  1873.46ms │            335.27ms │ +5.59x faster │
│ QQuery 2     │  3470.24ms │           3437.67ms │     no change │
│ QQuery 3     │  3803.69ms │           3677.12ms │     no change │
│ QQuery 4     │   997.60ms │            958.41ms │     no change │
│ QQuery 5     │  2284.09ms │           2326.52ms │     no change │
│ QQuery 6     │  3003.38ms │           3024.87ms │     no change │
│ QQuery 7     │  2920.74ms │           2988.71ms │     no change │
│ QQuery 8     │ 11458.37ms │          11838.48ms │     no change │
│ QQuery 9     │   863.68ms │            852.94ms │     no change │
│ QQuery 10    │   914.87ms │            922.66ms │     no change │
│ QQuery 11    │   942.20ms │            926.73ms │     no change │
│ QQuery 12    │  1790.66ms │           1804.70ms │     no change │
└──────────────┴────────────┴─────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                  ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                  │ 34322.98ms │
│ Total Time (constant_agg_window)   │ 33094.07ms │
│ Average Time (HEAD)                │  2860.25ms │
│ Average Time (constant_agg_window) │  2757.84ms │
│ Queries Faster                     │          1 │
│ Queries Slower                     │          0 │
│ Queries with No Change             │         11 │
└────────────────────────────────────┴────────────┘

@alamb
Copy link
Contributor

alamb commented Jun 3, 2025

│ QQuery 1 │ 1873.46ms │ 335.27ms │ +5.59x faster │

Quite nice!

@jonathanc-n
Copy link
Contributor

Those benchmarks look nice, seems to have been skewed on my computer for the other queries.

@alamb
Copy link
Contributor

alamb commented Jun 3, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing constant_agg_window (cf3d1b0) to 7248259 diff
Benchmarks: h2o_medium_window
Results will be posted here when complete

Copy link
Contributor

@2010YOUY01 2010YOUY01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Nice optimization!

@@ -186,6 +186,10 @@ pub trait AggregateWindowExpr: WindowExpr {
accumulator: &mut Box<dyn Accumulator>,
) -> Result<ScalarValue>;

/// Indicates whether this window function always produces the same result
/// for all rows in the partition.
fn is_constant(&self) -> bool;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn is_constant(&self) -> bool;
fn is_constant_in_partition(&self) -> bool;

nit: It would be great to be more specific.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll address it a bit later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@2010YOUY01 2010YOUY01 requested a review from Copilot June 4, 2025 02:54
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves the performance of constant aggregate window expressions by bypassing repeated accumulator evaluations for unbounded window functions.

  • Introduces the is_constant method in the AggregateWindowExpr trait
  • Implements constant window optimization in window_expr.rs and adds a corresponding field and evaluation logic in aggregate.rs
  • Updates the SlidingAggregateWindowExpr to reflect non-constant behavior

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
datafusion/physical-expr/src/window/window_expr.rs Adds is_constant method and shortcut evaluation for constant windows
datafusion/physical-expr/src/window/sliding_aggregate.rs Implements is_constant as false
datafusion/physical-expr/src/window/aggregate.rs Adds is_window_constant field and logic to determine constant windows
Comments suppressed due to low confidence (1)

datafusion/physical-expr/src/window/window_expr.rs:280

  • Please add unit tests to explicitly verify that the constant window optimization behaves correctly for both constant and non-constant window cases.
if self.is_constant() {

Copy link
Contributor

@jonathanc-n jonathanc-n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @suibianwanwank, this looks good to me! @alamb Should be good for merge.

@alamb alamb merged commit 0c30374 into apache:main Jun 4, 2025
27 checks passed
@alamb
Copy link
Contributor

alamb commented Jun 4, 2025

🚀 thank you

@alamb
Copy link
Contributor

alamb commented Jun 6, 2025

We found a bug in this change during DataFusion 48 testing

if we can't fix it shortly I think we should back out this PR and we can include it in DataFusion 49

@suibianwanwank
Copy link
Contributor Author

@alamb I'm sorry for the issue that occurred. After a preliminary review, I suspect the cause might be:

diff --git a/datafusion/physical-expr/src/window/window_expr.rs b/datafusion/physical-expr/src/window/window_expr.rs
index 70a73c44a..8c9cb90b9 100644
--- a/datafusion/physical-expr/src/window/window_expr.rs
+++ b/datafusion/physical-expr/src/window/window_expr.rs
@@ -276,11 +276,12 @@ pub trait AggregateWindowExpr: WindowExpr {
         not_end: bool,
     ) -> Result<ArrayRef> {
         let values = self.evaluate_args(record_batch)?;
+        let length = values[0].len();

         if self.is_constant_in_partition() {
             accumulator.update_batch(&values)?;
             let value = accumulator.evaluate()?;
-            return value.to_array_of_size(record_batch.num_rows());
+            return value.to_array_of_size(length - idx);
         }
         let order_bys = get_orderby_values(self.order_by_columns(record_batch)?);
         let most_recent_row_order_bys = most_recent_row
@@ -289,7 +290,6 @@ pub trait AggregateWindowExpr: WindowExpr {
             .map(get_orderby_values);

         // We iterate on each row to perform a running calculation.
-        let length = values[0].len();
         let mut row_wise_results: Vec<ScalarValue> = vec![];
         let is_causal = self.get_window_frame().is_causal();
         while idx < length {

@andygrove Could you please help me confirm this? Thanks!

@andygrove
Copy link
Member

@alamb I'm sorry for the issue that occurred. After a preliminary review, I suspect the cause might be:

diff --git a/datafusion/physical-expr/src/window/window_expr.rs b/datafusion/physical-expr/src/window/window_expr.rs
index 70a73c44a..8c9cb90b9 100644
--- a/datafusion/physical-expr/src/window/window_expr.rs
+++ b/datafusion/physical-expr/src/window/window_expr.rs
@@ -276,11 +276,12 @@ pub trait AggregateWindowExpr: WindowExpr {
         not_end: bool,
     ) -> Result<ArrayRef> {
         let values = self.evaluate_args(record_batch)?;
+        let length = values[0].len();

         if self.is_constant_in_partition() {
             accumulator.update_batch(&values)?;
             let value = accumulator.evaluate()?;
-            return value.to_array_of_size(record_batch.num_rows());
+            return value.to_array_of_size(length - idx);
         }
         let order_bys = get_orderby_values(self.order_by_columns(record_batch)?);
         let most_recent_row_order_bys = most_recent_row
@@ -289,7 +290,6 @@ pub trait AggregateWindowExpr: WindowExpr {
             .map(get_orderby_values);

         // We iterate on each row to perform a running calculation.
-        let length = values[0].len();
         let mut row_wise_results: Vec<ScalarValue> = vec![];
         let is_causal = self.get_window_frame().is_causal();
         while idx < length {

@andygrove Could you please help me confirm this? Thanks!

Sure, trying this now!

@andygrove
Copy link
Member

I still see test failures:

2025-06-06T18:02:24.9752586Z - aggregate window function for all types *** FAILED *** (260 milliseconds)
2025-06-06T18:02:24.9755112Z   64 did not equal 1027 [64] did not equal [1027] (CometTestBase.scala:130)

and

2025-06-06T18:02:25.3200520Z - Windows support *** FAILED *** (342 milliseconds)
2025-06-06T18:02:25.3201792Z   Results do not match for query:

2025-06-06T18:02:25.3254822Z   == Results ==
2025-06-06T18:02:25.3255257Z   !== Correct Answer - 10 ==                                                                  == Spark Answer - 10 ==
2025-06-06T18:02:25.3256419Z    struct<count(_1) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING):bigint>   struct<count(_1) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING):bigint>
2025-06-06T18:02:25.3257476Z   ![10]                                                                                       [12]
2025-06-06T18:02:25.3257914Z   ![10]                                                                                       [12]
2025-06-06T18:02:25.3258358Z   ![10]                                                                                       [20]
2025-06-06T18:02:25.3258817Z   ![10]                                                                                       [20]
2025-06-06T18:02:25.3259269Z   ![10]                                                                                       [2]
2025-06-06T18:02:25.3259712Z   ![10]                                                                                       [2]
2025-06-06T18:02:25.3260149Z   ![10]                                                                                       [30]
2025-06-06T18:02:25.3260763Z   ![10]                                                                                       [30]
2025-06-06T18:02:25.3261200Z   ![10]                                                                                       [6]
2025-06-06T18:02:25.3261669Z   ![10]                                                                                       [6]

I suggest we revert this PR for now and then add more tests based on the failing tests in Spark/Comet so that we can have more confidence when the PR is updated.

alamb pushed a commit that referenced this pull request Jun 6, 2025
…ndow expression" (#16307)

* Revert "Improve performance of constant aggregate window expression (#16234)"

This reverts commit 0c30374.

* update changelog

* update changelog
@alamb
Copy link
Contributor

alamb commented Jun 6, 2025

I suggest we revert this PR for now and then add more tests based on the failing tests in Spark/Comet so that we can have more confidence when the PR is updated.

Update:@andygrove reverted the change on the branch-48 line:

Let's continue the discussion and debugging on

kosiew pushed a commit to kosiew/datafusion that referenced this pull request Jun 10, 2025
…6234)

* Improve performance of constant aggregate window expression

* Update datafusion/physical-expr/src/window/aggregate.rs

Co-authored-by: Jonathan Chen <[email protected]>

* fmt

* Update datafusion/physical-expr/src/window/aggregate.rs

Co-authored-by: Yongting You <[email protected]>

* Rename

* fmt

---------

Co-authored-by: Jonathan Chen <[email protected]>
Co-authored-by: Yongting You <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
physical-expr Changes to the physical-expr crates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants