## Summary
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Refactor**
  - Improved handling of tables without partition columns to ensure smoother data loading.
  - The system now gracefully loads unpartitioned tables instead of raising errors.
- **New Features**
  - Added new data sources and group-by configurations for enhanced purchase data aggregation.
  - Introduced environment-specific upload and deletion of additional BigQuery tables to support new group-by views.
- **Bug Fixes**
  - Resolved issues where missing partition columns would previously cause exceptions, enhancing reliability for various table types.
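
To make the unpartitioned-table behavior above concrete, here is a minimal sketch of the idea, not the code from this change: when a table has no partition column, fall back to reading the whole table instead of raising. The function name `build_load_query` and its parameters are hypothetical and exist only for illustration.

```python
from typing import Optional


def build_load_query(table: str, partition_column: Optional[str],
                     start: str, end: str) -> str:
    """Hypothetical helper: build a load query for a possibly unpartitioned table."""
    if partition_column is None:
        # Assumed fallback: no partition column, so read the full table
        # rather than raising an exception.
        return f"SELECT * FROM {table}"
    # Partitioned table: restrict the read to the requested partition range.
    return (f"SELECT * FROM {table} "
            f"WHERE {partition_column} BETWEEN '{start}' AND '{end}'")
```

Calling it with `partition_column=None` returns an unfiltered query, which is the "graceful load" path described in the summary.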
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
---------
Co-authored-by: thomaschow <[email protected]>
api/python/test/canary/group_bys/gcp/purchases.py
63 additions & 0 deletions
@@ -16,8 +16,71 @@
         time_column="ts") # The event time
     ))
 
+view_source = Source(
+    events=EventSource(
+        table="data.purchases_native_view", # This points to the log table in the warehouse with historical purchase events, updated in batch daily
+        topic=None, # See the 'returns' GroupBy for an example that has a streaming source configured. In this case, this would be the streaming source topic that can be listened to for realtime events
+        query=Query(
+            selects=selects("user_id","purchase_price"), # Select the fields we care about
+            time_column="ts") # The event time
+    ))
+
 window_sizes = [Window(length=day, time_unit=TimeUnit.DAYS) for day in [3, 14, 30]] # Define some window sizes to use below
 
+v1_view_dev = GroupBy(
+    backfill_start_date="2023-11-01",
+    sources=[view_source],
+    keys=["user_id"], # We are aggregating by user
+    online=True,
+    aggregations=[Aggregation(
+            input_column="purchase_price",
+            operation=Operation.SUM,
+            windows=window_sizes
+        ), # The sum of purchases prices in various windows
+        Aggregation(
+            input_column="purchase_price",
+            operation=Operation.COUNT,
+            windows=window_sizes
+        ), # The count of purchases in various windows
+        Aggregation(
+            input_column="purchase_price",
+            operation=Operation.AVERAGE,
+            windows=window_sizes
+        ), # The average purchases by user in various windows
+        Aggregation(
+            input_column="purchase_price",
+            operation=Operation.LAST_K(10),
+        ),
+    ],
+)
+
+v1_view_test = GroupBy(
+    backfill_start_date="2023-11-01",
+    sources=[view_source],
+    keys=["user_id"], # We are aggregating by user
+    online=True,
+    aggregations=[Aggregation(
+            input_column="purchase_price",
+            operation=Operation.SUM,
+            windows=window_sizes
+        ), # The sum of purchases prices in various windows
+        Aggregation(
+            input_column="purchase_price",
+            operation=Operation.COUNT,
+            windows=window_sizes
+        ), # The count of purchases in various windows
+        Aggregation(
+            input_column="purchase_price",
+            operation=Operation.AVERAGE,
+            windows=window_sizes
+        ), # The average purchases by user in various windows
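
For readers unfamiliar with the aggregations configured above, the following is a rough plain-Python sketch of what SUM, COUNT and AVERAGE over 3, 14 and 30-day windows, plus an unwindowed LAST_K(10), compute per `user_id`. It does not use the group-by API from the diff. The function name `windowed_features`, the output key names, the window boundary handling and the oldest-first ordering of the last-k list are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def windowed_features(events, user_id, as_of, window_days=(3, 14, 30), last_k=10):
    """events: iterable of (user_id, purchase_price, ts) tuples, ts as datetime."""
    # Keep this user's events up to the query time, oldest first.
    rows = sorted(
        (e for e in events if e[0] == user_id and e[2] <= as_of),
        key=lambda e: e[2],
    )
    features = {}
    for days in window_days:
        start = as_of - timedelta(days=days)
        # Assumed boundary: events strictly after `start` fall in the window.
        prices = [price for _, price, ts in rows if ts > start]
        features[f"purchase_price_sum_{days}d"] = sum(prices)
        features[f"purchase_price_count_{days}d"] = len(prices)
        features[f"purchase_price_average_{days}d"] = (
            sum(prices) / len(prices) if prices else None
        )
    # LAST_K has no window in the diff, so it ranges over all retained events.
    features[f"purchase_price_last{last_k}"] = [p for _, p, _ in rows[-last_k:]]
    return features
```

Given a handful of purchase events for one user, the function returns one dictionary of features as of the chosen timestamp, which mirrors the shape of a keyed lookup on `user_id`.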