-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(python-sources): add unit integration testing utilities for simplification #43338
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
from airbyte_cdk.test.entrypoint_wrapper import EntrypointOutput, read | ||
|
||
|
||
def catalog(stream_name: str, sync_mode: SyncMode) -> ConfiguredAirbyteCatalog: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes a ton of sense to me to bring these two methods in the CDK since they're reimplemented by all connectors!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am personally skeptic about this one because of the "brittleness" I mentioned here. I feel like the builder is as readable as this method and I don't know what is the benefit of this method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be fine with a builder. what I like about the current approach is we get to remove the duplicated _read
static methods which is maybe not related to this specific function, so my bad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions are what many tests use, I just moved them. These are provided in a common location for people to use them when they don't need the full builder or an __init__()
with default arguments. It's difficult to generalize everything, but these little functions come handy in many cases, if they are not handy, people can fall back to the builder.
airbyte-integrations/connectors/source-woocommerce/unit_tests/integration/test_basic_read.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks clean. Nice work.
Do you think you'll add State assertions, too?
airbyte-integrations/connectors/source-woocommerce/unit_tests/integration/utils.py
Outdated
Show resolved
Hide resolved
# Conflicts: # airbyte-integrations/connectors/source-woocommerce/poetry.lock # airbyte-integrations/connectors/source-woocommerce/pyproject.toml # airbyte-integrations/connectors/source-woocommerce/unit_tests/integration/test_coupons.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a couple of comments. What I like most about this PR is pushing more validation logic in the CDK
return CatalogBuilder().with_stream(stream_name, sync_mode).build() | ||
|
||
|
||
def read_records( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this interface brittle. If at some point we need more than just the stream_name, sync_mode
to describe the catalog, we will have to update this interface. Why not pass the catalog directly? We have a builder that should ease the process of instantiating it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as the other comment above in reading.py
"stream_name, num_records, http_calls", | ||
[("orders", 2, orders_empty_last_page())] | ||
) | ||
def test_read_with_multiple_pages_with_empty_last_page_successfully(stream_name, num_records, http_calls) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What ensures that we trigger pagination here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The freeze date that goes beyond the first page of one month, and creates second http call. Do you think that is not reliable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the benefit from the HttpMocker was that it validated that all the HTTP requests that were mocked were called (at least when it is used as an annotation but we have also started to use it through the unittest.TestCase so we should probably add the same logic to the exit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. However, the intention of this modeling of tests is not to provide all the features but provide a simpler/flexible alternative. For now I added a call-count assertion. Is it OK if we add such features in subsequent improvements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the HttpMocker if it's already implemented there? Which makes me think that I don't see from airbyte_cdk.test.utils.http_mocking import register_mock_responses
available on this branch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry about the missing file. I'm not using that mocker because it requires the specific Request
type objects and I wanted to make it simpler. The vanilla http_mocking.Mocker can use simple requests and regexes that make it easier to match things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well... the consequence of using private interfaces!
@@ -11,7 +11,7 @@ | |||
from airbyte_cdk.test.catalog_builder import CatalogBuilder | |||
from airbyte_cdk.test.entrypoint_wrapper import EntrypointOutput, read | |||
from airbyte_cdk.test.mock_http import HttpMocker | |||
from airbyte_cdk.test.mock_http.response_builder import _get_unit_test_folder | |||
from airbyte_cdk.test.mock_http.response_builder import get_unit_test_folder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, this is unfortunate! We will need to update the CDK version before doing that. It uses CDK version 0.90.0 which means a lot of breaking changes.
The options I see are:
- Pay the debt of importing private methods now by updating the CDK (this seems to increase the scope way too much)
- Keep
_get_unit_test_folder
and have it just callget_unit_test_folder
internally with a big red comment in the CDK andsource-amazon-seller-partner
that says why we have this and when we can clean this. In that case, if you want to moveget_unit_test_folder
todata
, that works too
I prefer the second option personally but will let you choose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, going with the second option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for tackling the comments diligently
What
This PR tackles a few challenges we have when writing integration tests:
Here is provided a new, more generalized way of creating integration tests that reduces size of tests by up to 75%, enhances thoroughness, makes mocking simpler and powerful, and makes matching requests much easier. The suite of woocommerce tests was migrated to show this in action.
How
Review guide
test_read.py
- implements 22 test cases covering basic reads and a multiple reads test.conftest.py
- implements specific test configurations.common.py
- implements common helpers for a source.utils.py
- implements reusable test utilities.assertion.py
- implements reusable, thorough assertion functions.data.py
- implements utility function to load test JSON data files.User Impact
None, just easier testing improves probability of quality.
Can this PR be safely reverted and rolled back?