-
Notifications
You must be signed in to change notification settings - Fork 81
feat(ibis): pushdown the limit of the query request into SQL #1082
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis update introduces enhanced error handling around the Changes
Suggested reviewers
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
wren-core-py/src/context.rs (3)
194-229
: Improve error handling in the new pushdown_limit methodThe implementation looks solid overall for enforcing a maximum limit on queries. However, there's an
.unwrap()
call on line 214 that could panic if the parsed number isn't a validusize
. Consider replacing it with proper error handling.- if n.parse::<usize>().unwrap() > pushdown { + if let Ok(limit_value) = n.parse::<usize>() { + if limit_value > pushdown { + q.limit = Some(Expr::Value(Value::Number( + pushdown.to_string(), + is.clone(), + ))); + } + } else { + return Err(CoreError::new("Failed to parse limit as usize").into()); + }
210-227
: Consider clarifying behavioral edge cases with commentsThe implementation correctly maintains existing limits that are smaller than the pushdown limit, which is good. Consider adding a comment to clarify this behavior for future maintainers.
visit_statements_mut(&mut statements, |stmt| { if let Statement::Query(q) = stmt { if let Some(limit) = &q.limit { + // If statement already has a limit, only replace it if it's larger than our pushdown limit if let Expr::Value(Value::Number(n, is)) = limit { if n.parse::<usize>().unwrap() > pushdown { q.limit = Some(Expr::Value(Value::Number( pushdown.to_string(), is.clone(), ))); } + // If existing limit is smaller, preserve it } } else { + // If no limit exists, add one with the pushdown value q.limit = Some(Expr::Value(Value::Number(pushdown.to_string(), false))); } } ControlFlow::<()>::Continue(()) });
198-199
: Enhance the method signature with more descriptive parameter namesThe method signature could be clearer about what the limit parameter actually represents.
-#[pyo3(signature = (sql, limit=None))] -pub fn pushdown_limit(&self, sql: &str, limit: Option<usize>) -> PyResult<String> { +#[pyo3(signature = (sql, max_limit=None))] +pub fn pushdown_limit(&self, sql: &str, max_limit: Option<usize>) -> PyResult<String> {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
ibis-server/app/routers/v2/connector.py
(2 hunks)ibis-server/app/routers/v3/connector.py
(2 hunks)ibis-server/app/util.py
(2 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)ibis-server/tests/routers/v3/connector/postgres/test_query.py
(1 hunks)wren-core-py/src/context.rs
(2 hunks)wren-core-py/src/errors.rs
(2 hunks)wren-core-py/tests/test_modeling_core.py
(1 hunks)wren-core/core/src/lib.rs
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: cargo test (win64)
- GitHub Check: cargo test (macos)
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: test
- GitHub Check: ci
🔇 Additional comments (14)
wren-core/core/src/lib.rs (1)
7-7
: Looks good - this export will support the SQL limit pushdown featureThe addition of
pub use datafusion::sql::sqlparser::*;
makes the SQL parsing capabilities available to other modules, which is necessary for implementing the limit pushdown functionality.ibis-server/tests/routers/v3/connector/postgres/test_query.py (1)
343-355
: Good test coverage for limit pushdown functionalityThis test effectively verifies that the limit parameter (10) correctly overrides the SQL query's own limit (100). The test confirms the response contains exactly 10 rows of data, validating the pushdown_limit functionality works as expected.
wren-core-py/src/errors.rs (3)
1-1
: Properly added import for ParseIntErrorGood addition of the import needed for the new error conversion implementation.
58-62
: Appropriate error handling for parser errorsThis implementation provides proper conversion from parser errors to CoreError, maintaining consistency with the existing error handling pattern in the codebase.
64-68
: Good addition of ParseIntError conversionThis error conversion is likely needed for properly handling integer parsing errors that might occur during limit value processing, following the established error handling pattern.
ibis-server/app/routers/v2/connector.py (2)
22-22
: Import added for the new pushdown_limit functionThe import has been properly updated to include the new utility function.
47-52
:✅ Verification successful
Correctly implemented limit pushdown in query function
The SQL query is now processed through the pushdown_limit function before being rewritten, which ensures that the limit parameter is properly applied to the SQL query. This implementation aligns with the PR objective.
I recommend adding a similar test for the v2 router as you did for v3, to ensure consistent behavior across both API versions.
🏁 Script executed:
#!/bin/bash # Check if there's a test for limit pushdown in the v2 router tests fd -t f "test.*\.py" ibis-server/tests/routers/v2 | xargs grep -l "test_limit_pushdown\|pushdown_limit"Length of output: 161
Approved: Limit Pushdown Verified in v2 Router
The SQL query is correctly processed through thepushdown_limit
function before being rewritten. Additionally, tests for the v2 router already include coverage for this functionality (seeibis-server/tests/routers/v2/connector/test_postgres.py
), ensuring consistent behavior with the v3 endpoint.ibis-server/app/util.py (2)
6-7
: Good addition of the necessary import.The addition of the
wren_core
import is appropriate for the new functionality being introduced.
104-107
: Clean implementation of the limit pushdown functionality.This function is well-structured, follows the established pattern of using the OpenTelemetry tracer, and properly delegates the actual work to the
SessionContext
class. The type annotations clearly indicate that it takes a SQL string and an optional limit integer, returning a modified SQL string.ibis-server/app/routers/v3/connector.py (2)
21-21
: Good import update to include the new functionality.The import has been correctly updated to include the
pushdown_limit
function along with the existing imports.
41-44
: Effective integration of the limit pushdown feature.The implementation correctly applies the
pushdown_limit
function to the SQL query before passing it to the Rewriter, ensuring the limit is properly enforced at the SQL level. This approach is more efficient as it potentially reduces the amount of data processed by the database.ibis-server/tests/routers/v2/connector/test_postgres.py (1)
288-301
: Good test case for the limit pushdown functionality.This test effectively verifies that the limit parameter (10) overrides the limit in the SQL query (100), confirming that the
pushdown_limit
function is working correctly. The test is well-structured and makes clear assertions.wren-core-py/tests/test_modeling_core.py (1)
210-240
: Comprehensive test coverage for the limit pushdown feature.Excellent test coverage for the
pushdown_limit
method, covering multiple scenarios:
- Adding a limit to a query without one
- Overriding a higher limit with a lower one
- Preserving a lower limit when a higher one is requested
- Maintaining the offset when applying a limit
These tests thoroughly validate the behavior of the limit pushdown functionality across various query patterns.
wren-core-py/src/context.rs (1)
206-209
:❓ Verification inconclusive
Consider handling multiple statements more flexibly
Currently, an error is returned if there are multiple statements. Consider if a more flexible approach might be beneficial, such as applying the limit to all query statements or just the last one.
🏁 Script executed:
#!/bin/bash # Check if this limitation is appropriate by examining how this function is used rg "pushdown_limit" --type rust --type python -A 3 -B 3Length of output: 86
Below is a new verification script to help us better understand how the “Only one statement is allowed” limitation is used across the repository. This search should reveal whether similar error messages (or patterns) are being used elsewhere, which can indicate if the current strict limitation is intentional or if a more flexible approach might indeed be beneficial:
Action Required: Verify Statement Limitation Handling
The previous command with the--type python
flag did not produce the expected output due to an unrecognized file type. Please run the above script (or confirm its output) so we can determine whether the “one statement only” rule is consistently enforced across the codebase. Once we have that context, we can confidently rewrite the review comment.
🏁 Script executed:
#!/bin/bash # Search for occurrences of the error message related to the limitation. rg "Only one statement is allowed" -A 3 -B 3Length of output: 710
Action: Reconsider SQL statement processing flexibility
The current implementation in
wren-core-py/src/context.rs
(lines 206–209) enforces that exactly one statement is present by returning an error with the message "Only one statement is allowed." There’s no indication elsewhere in the repository that this behavior is adjusted based on context. If future use cases require supporting multiple statements, you might consider one of the following approaches:
- Process only the last query statement if multiple valid statements are provided.
- Apply the statement limit specifically to query statements while allowing other valid non-query statements.
This suggestion is to evaluate whether a more flexible handling meets current or anticipated requirements.
fc4fe84
to
78eb956
Compare
78eb956
to
c4a8ef7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
wren-core-py/src/context.rs
(2 hunks)wren-core-py/src/errors.rs
(2 hunks)wren-core-py/tests/test_modeling_core.py
(1 hunks)wren-core/core/src/lib.rs
(1 hunks)ibis-server/app/routers/v2/connector.py
(2 hunks)ibis-server/app/routers/v3/connector.py
(2 hunks)ibis-server/app/util.py
(2 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)ibis-server/tests/routers/v3/connector/postgres/test_query.py
(1 hunks)ibis-server/app/util.py
(1 hunks)ibis-server/app/util.py
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
- wren-core/core/src/lib.rs
- ibis-server/app/routers/v2/connector.py
- ibis-server/app/util.py
- ibis-server/tests/routers/v3/connector/postgres/test_query.py
- ibis-server/tests/routers/v2/connector/test_postgres.py
- wren-core-py/src/errors.rs
- ibis-server/app/util.py
- ibis-server/app/util.py
- ibis-server/app/routers/v3/connector.py
🔇 Additional comments (2)
wren-core-py/tests/test_modeling_core.py (1)
210-240
: Comprehensive test coverage for the new pushdown_limit feature!The test covers all important scenarios:
- Adding a limit to a query without one
- Reducing an existing limit when it's higher than requested
- Maintaining a lower existing limit when requested limit is higher
- Preserving offset when a limit exists
- Maintaining offset when reducing a limit
One minor observation: you're instantiating
SessionContext
without parameters on line 211, while other test functions provide a manifest string. This appears intentional sincepushdown_limit
doesn't depend on the manifest, but it might be worth adding a comment to clarify this if the behavior differs from other methods.wren-core-py/src/context.rs (1)
26-26
: Good additions to the import listThese imports are necessary for the new
pushdown_limit
functionality.Also applies to: 28-29
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
ibis-server/app/routers/v2/connector.py (1)
48-52
: Good defensive programming with graceful fallback.This try-except block provides a robust error handling mechanism for the limit pushdown operation. If the operation fails, the system properly logs a warning and gracefully falls back to using the original SQL query, ensuring the service remains functional even when the enhancement cannot be applied.
I'd recommend including more specific error details in the log message to assist in future debugging:
- logger.warning("Failed to pushdown limit. Using original SQL: {}", e) + logger.warning("Failed to pushdown limit for query. Using original SQL. Error: {}", e)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
ibis-server/app/routers/v2/connector.py
(2 hunks)ibis-server/app/routers/v3/connector.py
(2 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)ibis-server/app/mdl/rewriter.py
(1 hunks)ibis-server/app/routers/v2/connector.py
(2 hunks)ibis-server/app/routers/v3/connector.py
(2 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- ibis-server/tests/routers/v2/connector/test_postgres.py
- ibis-server/app/routers/v2/connector.py
- ibis-server/tests/routers/v2/connector/test_postgres.py
- ibis-server/app/routers/v3/connector.py
- ibis-server/tests/routers/v2/connector/test_postgres.py
- ibis-server/app/routers/v3/connector.py
🔇 Additional comments (2)
ibis-server/app/routers/v2/connector.py (1)
5-5
: Added logger for improved error handling.The addition of the logger import is necessary for the new error handling mechanism in the
query
function.ibis-server/app/mdl/rewriter.py (1)
108-108
:❓ Verification inconclusive
Changed log level from error to warning.
The logging level has been downgraded from error to warning for manifest extraction issues. This is appropriate if these exceptions are expected in normal operation and don't necessarily represent system failures.
However, ensure this aligns with your team's logging policies - if manifest extraction failures should trigger alerts or are considered critical issues, maintaining the error level might be more appropriate.
Can you verify that this change in log level is intentional and consistent with how other non-critical exceptions are handled in the codebase?
🏁 Script executed:
#!/bin/bash # Check how other similar exceptions are logged in the codebase rg "logger\.(error|warning|info).*exception" --type pyLength of output: 56
Log Level Adjustment for Manifest Extraction
It appears that the downgrade from error to warning for manifest extraction issues is intended for non-critical exceptions. However, our automated search for similar logger calls (using the pattern with "exception") didn’t yield any results. This means we couldn’t conclusively verify that the changed log level is consistent with how other non-critical exceptions are handled across the codebase.
Next Steps:
- Manually verify that similar non-critical exception scenarios (if any) are logged at the warning level.
- Confirm that manifest extraction failures are indeed expected and not indicative of critical issues per your team’s logging policies.
Please ensure that this handling aligns with your overall logging strategy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @goldmedal ! LGTM
Description
This PR introduces the way to push down the limit parameter into the SQL. If the SQL has the limit clause, the smaller one of them will be used in the SQL.
Summary by CodeRabbit
New Features
Bug Fixes
Tests