Add rowRestriction when fetching BQ schema #2273

Open
wants to merge 11 commits into
base: main

Conversation


@joy91227 joy91227 commented Mar 26, 2025

Issue: When running the BigQuery to Parquet flex template against a partitioned BigQuery table, we encounter error [2].

Analysis:

  • The stack trace [3] suggests the failure occurs when we create a read session.
  • It seems we never set a row restriction (setRowRestriction) on the table read options [1], so the read session is created without a partition filter.

[1] https://cloud.google.com/java/docs/reference/google-cloud-bigquerystorage/3.11.4/com.google.cloud.bigquery.storage.v1.ReadSession.TableReadOptions.Builder#com_google_cloud_bigquery_storage_v1_ReadSession_TableReadOptions_Builder_setRowRestriction_java_lang_String_

[2] INVALID_ARGUMENT: request failed: Query error: Cannot query over table 'xxxx' without a filter over column(s) 'xxxx' that can be used for partition elimination

[3]
... at com.google.cloud.bigquery.storage.v1beta1.BigQueryStorageClient.createReadSession(BigQueryStorageClient.java:239)
.... at com.google.cloud.teleport.v2.templates.BigQueryToParquet.run(BigQueryToParquet.java:253)
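
For illustration, the row restriction discussed above is a plain SQL-style filter string, similar to a WHERE clause. The following is a minimal sketch of building one for a DATE partition column, not the change made in this PR. The class name, method name, and the `event_date` column are hypothetical; only `setRowRestriction` comes from [1], and it appears in a comment because the client library is not on the classpath of this standalone example.

```java
import java.time.LocalDate;

/** Hypothetical helper (not part of the template): builds a row restriction string. */
class RowRestrictionSketch {

  /** Returns a filter covering [start, endExclusive) on the given DATE column. */
  public static String forDateRange(
      String partitionColumn, LocalDate start, LocalDate endExclusive) {
    if (!endExclusive.isAfter(start)) {
      throw new IllegalArgumentException("endExclusive must be after start");
    }
    // LocalDate.toString() yields ISO-8601 (yyyy-MM-dd), which can be cast to DATE.
    return String.format(
        "%s >= CAST('%s' AS DATE) AND %s < CAST('%s' AS DATE)",
        partitionColumn, start, partitionColumn, endExclusive);
  }

  public static void main(String[] args) {
    // "event_date" is an assumed partition column name.
    String restriction =
        forDateRange("event_date", LocalDate.of(2025, 3, 1), LocalDate.of(2025, 4, 1));
    System.out.println(restriction);
    // With the Storage API client available, the string would be passed roughly like:
    //   TableReadOptions.newBuilder().setRowRestriction(restriction).build();
  }
}
```

Using a half-open range avoids double-counting rows on the boundary date when adjacent ranges are exported separately.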

@shunping
Contributor

Thanks for the fix, @joy91227. It looks promising to me.

Could you also add a test to verify that it works?

codecov bot commented Mar 26, 2025

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Project coverage is 48.97%. Comparing base (42c951e) to head (c35d9fd).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...cloud/teleport/v2/templates/BigQueryToParquet.java | 0.00% | 2 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (0.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #2273   +/-   ##
=========================================
  Coverage     48.97%   48.97%           
  Complexity     4515     4515           
=========================================
  Files           911      911           
  Lines         55232    55234    +2     
  Branches       5892     5893    +1     
=========================================
+ Hits          27051    27053    +2     
+ Misses        26263    26262    -1     
- Partials       1918     1919    +1     
| Components | Coverage Δ |
|---|---|
| spanner-templates | 70.05% <ø> (-0.01%) ⬇️ |
| spanner-import-export | 68.04% <ø> (-0.02%) ⬇️ |
| spanner-live-forward-migration | 78.42% <ø> (ø) |
| spanner-live-reverse-replication | 80.31% <ø> (ø) |
| spanner-bulk-migration | 88.65% <ø> (ø) |

| Files with missing lines | Coverage Δ |
|---|---|
| ...cloud/teleport/v2/templates/BigQueryToParquet.java | 1.58% <0.00%> (-0.06%) ⬇️ |

... and 2 files with indirect coverage changes


@joy91227
Author

The integration test needs to create a time-partitioned table. Currently BigQueryResourceManager, the Beam testing utility for BigQuery tables, does not support this.

I opened a PR to add this feature: apache/beam#34471
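
For illustration only, the kind of fixture such a test needs can be pictured as a DDL statement for a table partitioned on a DATE column with partition filters enforced (the `require_partition_filter` table option, which is the setting that leads to errors like [2]). This sketch is an assumption about the test setup and is not how BigQueryResourceManager or the linked Beam PR creates tables. The class, method, dataset, table, and column names are all hypothetical.

```java
/** Hypothetical helper: renders BigQuery DDL for a partitioned test table. */
class PartitionedTableDdlSketch {

  /** Builds a CREATE TABLE statement partitioned on a DATE column with a required filter. */
  public static String createTableDdl(String dataset, String table, String partitionColumn) {
    return String.format(
        "CREATE TABLE `%s.%s` (id INT64, %s DATE) PARTITION BY %s"
            + " OPTIONS (require_partition_filter = TRUE)",
        dataset, table, partitionColumn, partitionColumn);
  }

  public static void main(String[] args) {
    // Names below are placeholders for a throwaway test dataset.
    System.out.println(createTableDdl("my_dataset", "events", "event_date"));
  }
}
```

A read session against a table created this way would need a row restriction on the partition column, which is the behavior the requested test would exercise.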

@pull-request-size pull-request-size bot added size/M and removed size/XS labels Apr 4, 2025
@pull-request-size pull-request-size bot added size/L and removed size/M labels Apr 9, 2025

2 participants