Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(query): improve update handling about nondeterministic_update and add nullable columns #17586

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

loloxwg
Copy link

@loloxwg loloxwg commented Mar 11, 2025

Added replace_column_datatype_to_nullable method to support updates involving nullable column types. Refactored rewrite_nondeterministic_update logic to use matcher-based pattern matching for enhanced clarity and maintainability. Updated tests to include scenarios with CTEs and nullable columns for improved test coverage.

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

This change is Reviewable

@github-actions github-actions bot added the pr-bugfix this PR patches a bug in codebase label Mar 11, 2025
@loloxwg loloxwg force-pushed the fix-17573 branch 2 times, most recently from b9db1db to 82386a4 Compare March 12, 2025 08:58
@BohuTANG
Copy link
Member

Thanks for contributing!
The PR is failing tests - please check CI logs, and let us know if need assistance.

@BohuTANG BohuTANG requested a review from forsaken628 March 18, 2025 00:15
@forsaken628
Copy link
Collaborator

Please provide some test cases to reproduce the bug you wish to fix

@loloxwg
Copy link
Author

loloxwg commented Mar 18, 2025

@forsaken628 #17573

@loloxwg
Copy link
Author

loloxwg commented Mar 20, 2025

Thanks for contributing! The PR is failing tests - please check CI logs, and let us know if need assistance.

Should the default field type of a temporary table created with a WITH statement be set to NULL? I believe doing so would resolve the issue. @BohuTANG @forsaken628

@forsaken628
Copy link
Collaborator

@loloxwg

-- This sql is working fine.
with tbb as (select col0::string null col1,col1::string null col2,col2::string null col3 from (values ('1', 'add', '11'), ('4', 'add', '44')))
update test_merge tba set tba.col1 =tbb.col1, tba.col2 = 'update', tba.col3 = tbb.col3 from tbb where tba.col1 = tbb.col1;


-- But this one no work
with tbb("col1", "col2", "col3") as (values ('1', 'add', '11'), ('4', 'add', '44'))
update test_merge tba set tba.col1 =tbb.col1::string null, tba.col2 = 'update', tba.col3 = tbb.col3::string null from tbb where tba.col1 = tbb.col1::string null;

My guess is that type_check is not handling the type correctly, so you can check it out.

@loloxwg
Copy link
Author

loloxwg commented Mar 27, 2025

@loloxwg

-- This sql is working fine.
with tbb as (select col0::string null col1,col1::string null col2,col2::string null col3 from (values ('1', 'add', '11'), ('4', 'add', '44')))
update test_merge tba set tba.col1 =tbb.col1, tba.col2 = 'update', tba.col3 = tbb.col3 from tbb where tba.col1 = tbb.col1;


-- But this one no work
with tbb("col1", "col2", "col3") as (values ('1', 'add', '11'), ('4', 'add', '44'))
update test_merge tba set tba.col1 =tbb.col1::string null, tba.col2 = 'update', tba.col3 = tbb.col3::string null from tbb where tba.col1 = tbb.col1::string null;

My guess is that type_check is not handling the type correctly, so you can check it out.

CleanShot 2025-03-27 at 09 37 40@2x
I fixed update stmt above by this way, but i import new issue, cant insert literal string value
insert into test_merge values(2,'abc',2),(3,'abc',3),(4,'abc',4);

@forsaken628
Copy link
Collaborator

forsaken628 commented Mar 27, 2025

let used_columns = mutation
.matched_evaluators
.iter()
.flat_map(|eval| {
eval.update.iter().flat_map(|update| {
update
.values()
.flat_map(|expr| expr.used_columns().into_iter())
})
})
.chain(mutation.required_columns.iter().copied())
.collect::<HashSet<_>>();

If expr is a not null type, cast to a nullable type, because the following aggregation will converts the column to nullable

@loloxwg
Copy link
Author

loloxwg commented Apr 3, 2025

let used_columns = mutation
.matched_evaluators
.iter()
.flat_map(|eval| {
eval.update.iter().flat_map(|update| {
update
.values()
.flat_map(|expr| expr.used_columns().into_iter())
})
})
.chain(mutation.required_columns.iter().copied())
.collect::<HashSet<_>>();

If expr is a not null type, cast to a nullable type, because the following aggregation will converts the column to nullable

image how can i change the block datatype

Copy link
Member

@drmingdrmer drmingdrmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nesting level is too deep and is hard to understand. Please extract inner logic to some small function or use intermediate variable to improve the readability.

@forsaken628
Copy link
Collaborator

forsaken628 commented Apr 4, 2025

how can i change the block datatype

here 0638e0f

Next you need to change the PR title and add some tests.
Should have a look if there are some other tests which will fail if error_on_nondeterministic_update = 1.

@loloxwg
Copy link
Author

loloxwg commented Apr 4, 2025

how can i change the block datatype

Add a RelOperator::EvalScalar between RelOperator::Mutation and RelOperator::Aggregate then cast nullable to not null.

image If I haven't misunderstood the meaning, should it be like this? And how can i fill the EvaScalar' s items?

@loloxwg loloxwg changed the title fix(query): refactor type handling in physical mutation planning fix(query): improve update handling about nondeterministic_update and add nullable columns Apr 7, 2025
@loloxwg
Copy link
Author

loloxwg commented Apr 7, 2025

image image why ci test failed? image

@loloxwg loloxwg force-pushed the fix-17573 branch 4 times, most recently from 45a65d8 to 3c02631 Compare April 8, 2025 03:18
@sundy-li
Copy link
Member

sundy-li commented Apr 8, 2025

maybe also need change to:

let mut aggr_func = ScalarExpr::AggregateFunction(AggregateFunction {
                    span: None,
                    func_name: "any".to_string(),
                    distinct: false,
                    params: vec![],
                    args: vec![ScalarExpr::BoundColumnRef(BoundColumnRef {
                        span: None,
                        column: binding.clone(),
                    })],
                    return_type: Box::new(binding.data_type.clone().wrap_nullable()),
                    sort_descs: vec![],
                    display_name: display_name.clone(),
                });

🐳 root@default:)  with tbb("col1", "col2", "col3") as (values ('1', 'add', '11'), ('4', 'add', '44'))
update test_merge tba set tba.col1 =tbb.col1, tba.col2 = 'update', tba.col3 = tbb.col3 from tbb where tba.col1 = tbb.col1;


╭────────────────────────╮
│ number of rows updated │
│         UInt64         │
├────────────────────────┤
│                      1 │
╰────────────────────────╯
1 row read in 0.343 sec. Processed 3 row, 133 B (8.75 rows/s, 387 B/s)

… add nullable columns

Added `replace_column_datatype_to_nullable` method to support updates involving nullable column types. Refactored `rewrite_nondeterministic_update` logic to use matcher-based pattern matching for enhanced clarity and maintainability. Updated tests to include scenarios with CTEs and nullable columns for improved test coverage.

Handle non-nullable expressions in update bindings

This update ensures non-nullable expressions are unified with nullable data types where required, maintaining consistency in type handling. Additionally, it includes error logging for cases when data type retrieval fails, improving debugging visibility.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-bugfix this PR patches a bug in codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants