low-code: Yield records from generators instead of keeping them in in-memory lists #36406
…m:airbytehq/airbyte into alex/replace_last_records_from_paginators
@@ -219,7 +223,7 @@ def get_request_body_data(
         stream_state: Optional[StreamState] = None,
         stream_slice: Optional[StreamSlice] = None,
         next_page_token: Optional[Mapping[str, Any]] = None,
-    ) -> Optional[Union[Mapping[str, Any], str]]:
+    ) -> Union[Mapping[str, Any], str]:
this never returns None
@@ -228,7 +232,7 @@ def get_request_body_json(
         stream_state: Optional[StreamState] = None,
         stream_slice: Optional[StreamSlice] = None,
         next_page_token: Optional[Mapping[str, Any]] = None,
-    ) -> Optional[Mapping[str, Any]]:
+    ) -> Mapping[str, Any]:
this never returns None
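A minimal sketch of why the narrower return type holds, assuming the method falls back to an empty mapping when there is nothing to send. The function name `build_request_body_json` and its single `options` parameter are hypothetical and are not the CDK's actual signature.

```python
from typing import Any, Mapping, Optional


def build_request_body_json(options: Optional[Mapping[str, Any]] = None) -> Mapping[str, Any]:
    """Hypothetical stand-in: always returns a mapping, never None."""
    # Fall back to an empty mapping so callers need no None check
    # and the return annotation does not have to be Optional.
    return dict(options) if options else {}
```

With the fallback in place, a type checker can treat the result as a plain `Mapping` at every call site.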
LGTM!
This reverts commit c7b94f9.
changes look good to me, but i suspect there will be some merge conflicts since it looks like you are fixing some mypy errors on certain files that i also fixed in one of my PRs that I merged
         if self.page_size_option and self.pagination_strategy.get_page_size() and self.page_size_option.inject_into == option_type:
-            options[self.page_size_option.field_name.eval(config=self.config)] = self.pagination_strategy.get_page_size()
+            options[self.page_size_option.field_name.eval(config=self.config)] = self.pagination_strategy.get_page_size()  # type: ignore # field_name is known to be an interpolated string
sigh i hate these so much
@@ -64,7 +64,8 @@ class SimpleRetriever(Retriever):
     def __post_init__(self, parameters: Mapping[str, Any]) -> None:
         self._paginator = self.paginator or NoPagination(parameters=parameters)
         self._last_response: Optional[requests.Response] = None
-        self._records_from_last_response: List[Record] = []
+        self._last_page_size: int = 0
Discussed a bit w/ Alex over Slack: since we yield from the previous set of records before calculating the next_page_token, we should have the up-to-date count of records when we pass self._last_page_size to the paginator.
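The ordering described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the CDK's code: `CountingRetriever`, `_yield_page`, `_next_page_token`, and the offset-based token are all hypothetical, and `Record` is simplified to a plain dict.

```python
from typing import Any, Dict, Iterable, Iterator, List, Optional

Record = Dict[str, Any]  # simplified stand-in for the CDK's Record type


class CountingRetriever:
    """Hypothetical, stripped-down retriever that keeps a count, not a list."""

    def __init__(self, page_size: int) -> None:
        self._page_size = page_size
        self._last_page_size = 0

    def _yield_page(self, records: Iterable[Record]) -> Iterator[Record]:
        # Reset the counter, then count each record as it is handed downstream.
        self._last_page_size = 0
        for record in records:
            self._last_page_size += 1
            yield record

    def _next_page_token(self, offset: int) -> Optional[int]:
        # A short (or empty) page means there is nothing left to fetch.
        if self._last_page_size < self._page_size:
            return None
        return offset + self._last_page_size

    def read(self, source: List[Record]) -> Iterator[Record]:
        offset: Optional[int] = 0
        while offset is not None:
            page = iter(source[offset : offset + self._page_size])
            # The page is fully consumed before the token is computed,
            # so the counter is up to date at that point.
            yield from self._yield_page(page)
            offset = self._next_page_token(offset)
```

Because `yield from` only returns once the page generator is exhausted, the count seen by `_next_page_token` always reflects the complete page.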
What
Improve memory usage by yielding records from generators instead of returning lists of objects
This PR addresses a part of https://github.com/airbytehq/airbyte-internal-issues/issues/6554
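As a rough illustration of the change in shape (not the actual extractor code), the two functions below filter the same input, one by building a list and one by yielding. The names `extract_as_list` and `extract_as_generator` and the `active` field are made up for this sketch.

```python
from typing import Any, Dict, Iterable, Iterator, List

Record = Dict[str, Any]  # simplified stand-in for a record


def extract_as_list(raw: Iterable[Record]) -> List[Record]:
    # Materializes every matching record before the caller sees the first one.
    return [record for record in raw if record.get("active")]


def extract_as_generator(raw: Iterable[Record]) -> Iterator[Record]:
    # Yields one matching record at a time; nothing is buffered here.
    for record in raw:
        if record.get("active"):
            yield record
```

Both produce the same records in the same order; the generator version only holds one record at a time, which is where the memory saving would come from when pages are large.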
How
Reading order
airbyte-cdk/python/airbyte_cdk/sources/declarative/extractors/record_extractor.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/extractors/http_selector.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/extractors/dpath_extractor.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/extractors/record_selector.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/extractors/record_filter.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/paginator.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/no_pagination.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/default_paginator.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/strategies/pagination_strategy.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/strategies/offset_increment.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/strategies/page_increment.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/strategies/cursor_pagination_strategy.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/paginators/strategies/stop_condition.py
airbyte-cdk/python/airbyte_cdk/sources/declarative/retrievers/simple_retriever.py