[fix](iceberg) fix the iceberg timestamp_ntz write schema and values bug. #51384
@@ -70,7 +70,9 @@ Status ArrowSchemaUtil::convert_to(const iceberg::NestedField& field,
        break;

    case iceberg::TypeID::TIMESTAMP: {
        arrow_type = std::make_shared<arrow::TimestampType>(arrow::TimeUnit::MICRO, timezone);
        iceberg::TimestampType* t_type = static_cast<iceberg::TimestampType*>(field.field_type());
Please add regression test cases
Integration test case has been added
What problem does this PR solve?
This PR fixes a time zone bug in how Doris writes to Iceberg tables in data lake scenarios. Iceberg has two timestamp types: timestamp_ltz (with time zone) and timestamp_ntz (without time zone). Doris currently does not distinguish between them when writing, so data that Doris writes to a timestamp_ntz column returns different results when queried from Spark. This PR makes the write path handle the two types separately.
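To make the distinction concrete, here is a minimal sketch of the kind of branching the description implies. It is not the code from this PR. `TimestampKind`, `arrow_timezone_for`, and `micros_to_write` are hypothetical names. The sketch also assumes that a zone-less column should carry an empty Arrow time zone and unshifted wall-clock values, while a zoned column carries the session time zone and UTC-adjusted values.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the two Iceberg timestamp flavors.
struct TimestampKind {
    bool adjust_to_utc;  // true: timestamp_ltz, false: timestamp_ntz
};

// Time zone string to attach to the Arrow timestamp type.
// Assumption: an empty string marks the column as zone-less.
std::string arrow_timezone_for(const TimestampKind& kind, const std::string& session_tz) {
    return kind.adjust_to_utc ? session_tz : std::string();
}

// Microsecond value to write for one row.
// wall_clock_micros: the wall-clock reading, encoded as microseconds since
//                    1970-01-01 00:00:00 as if that reading were in UTC.
// utc_offset_seconds: session time zone offset at that instant (+08:00 is 28800).
int64_t micros_to_write(const TimestampKind& kind, int64_t wall_clock_micros,
                        int64_t utc_offset_seconds) {
    if (!kind.adjust_to_utc) {
        // timestamp_ntz: keep the wall-clock reading untouched.
        return wall_clock_micros;
    }
    // timestamp_ltz: shift the wall-clock reading to the UTC instant.
    return wall_clock_micros - utc_offset_seconds * 1000000LL;
}
```

Under these assumptions, a reader such as Spark sees the same wall-clock reading for a timestamp_ntz column regardless of its own session time zone, because neither the schema nor the stored value carries a zone adjustment.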
Check List (For Author)
Manual test. The test report is as follows:
Doris-iceberg-timestampntz-test-report.pdf
Behavior changed:
No.
Does this need documentation?
No.