Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support setting max_stream_count when fetching query result #2051

Merged
merged 7 commits into from
Nov 22, 2024

Conversation

kien-truong
Copy link
Contributor

Allow user to set max_stream_count when fetching result using BigQuery Storage API with incremental methods:

  • to_arrow_iterable
  • to_dataframe_iterable

Fixes #2030 🦕

@kien-truong kien-truong requested review from a team as code owners November 3, 2024 03:29
@product-auto-label product-auto-label bot added size: s Pull request size is small. api: bigquery Issues related to the googleapis/python-bigquery API. labels Nov 3, 2024
@alvarowolfx alvarowolfx requested review from chalmerlowe and removed request for alvarowolfx November 4, 2024 14:08
@Linchin Linchin added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 11, 2024
@Linchin Linchin assigned Linchin and unassigned alvarowolfx Nov 11, 2024
@@ -1836,6 +1837,17 @@ def to_arrow_iterable(
created by the server. If ``max_queue_size`` is :data:`None`, the queue
size is infinite.

max_stream_count (Optional[int]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be more consistent if we use the same docstring as here. It also mentions the effect of preserve_order (in this case self._preserve_order), which I think we should make clear here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, _preserve_order is automatically set by parsing the queries, and not a user-facing API. I'll update the docstring to mention that effect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@@ -2008,6 +2022,17 @@ def to_dataframe_iterable(

.. versionadded:: 2.14.0

max_stream_count (Optional[int]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Linchin
Copy link
Contributor

Linchin commented Nov 11, 2024

Thank you @kien-truong for adding further support for max_stream_count! I left some comments, and I wonder if you could add some unit tests too (we typically require 100% unit test coverage). You could find similar tests in PR #2039. If you need any help with it, feel free to let me know. I'd be more than glad to help!

@Linchin Linchin added the kokoro:run Add this label to force Kokoro to re-run the tests. label Nov 11, 2024
@yoshi-kokoro yoshi-kokoro removed kokoro:run Add this label to force Kokoro to re-run the tests. kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Nov 11, 2024
@kien-truong
Copy link
Contributor Author

Hi, the default code path with the default arguments is already covered by the current tests.
I'll see if I can add tests for when the user manually set the argument when I have time at weekend.

@Linchin
Copy link
Contributor

Linchin commented Nov 11, 2024

@kien-truong sounds good. The mypy test is also failing, could you fix it too? You can run it by running nox -s mypy in the repo. (You can install nox by pip install nox)

@product-auto-label product-auto-label bot added size: m Pull request size is medium. and removed size: s Pull request size is small. labels Nov 16, 2024
@kien-truong kien-truong force-pushed the max-stream-count-api branch 3 times, most recently from 31662ad to 0e52722 Compare November 16, 2024 03:51
@kien-truong
Copy link
Contributor Author

I have added tests to cover user-provided max_stream_count flow and fixed the mypy test as well.

@chalmerlowe chalmerlowe added the kokoro:run Add this label to force Kokoro to re-run the tests. label Nov 18, 2024
@yoshi-kokoro yoshi-kokoro removed the kokoro:run Add this label to force Kokoro to re-run the tests. label Nov 18, 2024
@Linchin Linchin added the kokoro:run Add this label to force Kokoro to re-run the tests. label Nov 18, 2024
@yoshi-kokoro yoshi-kokoro removed the kokoro:run Add this label to force Kokoro to re-run the tests. label Nov 18, 2024
@Linchin
Copy link
Contributor

Linchin commented Nov 19, 2024

Thanks @kien-truong, the PR mostly looks good. I just have some small question regarding ignored coverage for the tests.

@Linchin Linchin added the kokoro:run Add this label to force Kokoro to re-run the tests. label Nov 22, 2024
@yoshi-kokoro yoshi-kokoro removed the kokoro:run Add this label to force Kokoro to re-run the tests. label Nov 22, 2024
@Linchin Linchin self-requested a review November 22, 2024 21:34
Copy link
Contributor

@Linchin Linchin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you!

@Linchin Linchin merged commit d461297 into googleapis:main Nov 22, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. size: m Pull request size is medium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make max_stream_count configurable when using Bigquery Storage API
5 participants