
Regression testing version comparison #35534


Closed

Conversation

clnoll
Contributor

@clnoll clnoll commented Feb 22, 2024

What

Introduces the first phase of regression testing, which allows us to compare two arbitrary versions of a connector against each other.

The regression test run can be invoked with the following:

airbyte-ci connectors --name=<connector_name> regression_test
                     [--control-version=<version>]  # where version is a docker image name & tag, e.g. airbyte/source-faker:0.1.1
                     [--target-version=<version>]
                     [--fail-fast]
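
For example, comparing a published source-faker image against a local dev build might look like this (the tags below are illustrative, not taken from this PR):

    airbyte-ci connectors --name=source-faker regression_test \
                          --control-version=airbyte/source-faker:0.1.1 \
                          --target-version=airbyte/source-faker:dev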

Implementation

At a high level, we are using the airbyte-ci framework to do 2 CAT runs:

  1. A "control" run, which runs CATs against an input version of the container (intended to be a trusted version). During this run we write the records to a file and skip the expected record check.
  2. A "target" run, which runs CATs against an input version of the container (the version to be tested), whose records are compared against the stored records from 1.

To make this work, we pass the control CAT run a flag instructing it to store records at a specific location; when run via airbyte-ci/dagger, the records are then exported to the host machine. After the control CATs are complete, we prepare the target run by writing a new acceptance test config file that points at the new location for expected records; when run via airbyte-ci/dagger, the new config file and the control run's records are mounted into the target CAT container.
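
As a rough sketch of that config rewrite (not the PR's actual code; the directory, file name, and the expect_records layout are assumptions about the acceptance-test-config format):

    import yaml
    from pathlib import Path

    # Assumed location where the control run's records were exported to the host.
    CONTROL_RECORDS_DIR = Path("/tmp/regression/control_records")

    def write_target_config(original_config: Path, target_config: Path) -> None:
        """Copy acceptance-test-config.yml, pointing expected records at the control run's output."""
        config = yaml.safe_load(original_config.read_text())
        for test in config.get("acceptance_tests", {}).get("basic_read", {}).get("tests", []):
            test["expect_records"] = {"path": str(CONTROL_RECORDS_DIR / "expected_records.jsonl")}
        target_config.write_text(yaml.safe_dump(config))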

High-level overview (source): [diagram image]

CAT modifications:

  • --acceptance-test-config-filepath: Path to an acceptance test config yml, to override the acceptance-test-config.yml in the default location.
  • --connector-version: The image tag for the version of the connector to test. If not set, we use the connector version in acceptance-test-config.yml.
  • --store-expected-records: Folder to store expected records, e.g. for inspection or use in a subsequent test. If not set, records will not be stored.
  • --container-id-filepath: Path to the container ID, when one is present and differs from the default (/tmp/container_id.txt).
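
Purely as an illustration of how these options fit together, a control run could invoke CAT programmatically along these lines (the plugin entrypoint, image tag, and record path are assumptions; only the option names come from the list above):

    import pytest

    # Hypothetical control run: pin the trusted image and store its records
    # so that a later target run can be compared against them.
    exit_code = pytest.main([
        "-p", "connector_acceptance_test.plugin",
        "--acceptance-test-config-filepath", "acceptance-test-config.yml",
        "--connector-version", "airbyte/source-faker:0.1.1",
        "--store-expected-records", "/tmp/regression/control_records",
    ])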

airbyte-ci regression_test subcommand:

  • --control-version: Control version of the connector to be tested. If not set, defaults to the latest version.
  • --target-version: Target version of the connector being tested. If not set, defaults to a dev version.
  • --fail-fast: When enabled, tests will fail fast.

Closes https://github.com/airbytehq/airbyte-internal-issues/issues/2855.

@clnoll clnoll requested a review from a team February 22, 2024 02:35
@clnoll clnoll marked this pull request as draft February 22, 2024 02:35

@clnoll clnoll closed this Feb 22, 2024
@clnoll clnoll deleted the regression-testing-version-comparison branch February 22, 2024 02:36
@clnoll clnoll restored the regression-testing-version-comparison branch February 22, 2024 02:36
@clnoll clnoll deleted the regression-testing-version-comparison branch February 22, 2024 02:38
@clnoll clnoll restored the regression-testing-version-comparison branch February 22, 2024 02:38
@clnoll clnoll reopened this Feb 22, 2024
@clnoll clnoll force-pushed the regression-testing-version-comparison branch 2 times, most recently from 1289099 to cc7dd6c Compare February 22, 2024 05:35
@clnoll clnoll force-pushed the regression-testing-version-comparison branch from 808fcd9 to 299a543 Compare February 25, 2024 02:31
@clnoll clnoll marked this pull request as ready for review February 25, 2024 02:31
@clnoll clnoll requested a review from alafanechere February 25, 2024 02:31
Contributor

@alafanechere alafanechere left a comment


Before raising my concerns I really want to thank you for the care you've put into crafting this PR!

I'm sorry this feedback comes after an implementation. But our recent discussions on "blowing up your test suites" shed some light on how clearly defined test suite scopes could help our connector teams.

I have the following concern with the design you're introducing:

Regression testing, as a tool, is totally coupled to two components:

  • airbyte-ci for interface declaration and CAT orchestration
  • CAT to run the expected record tests
    It means we'll have to make the regression testing framework evolve with changes to these projects, and vice versa. It also means regression tests can't easily be run independently, which is something we might want if we ever use regression testing in blue-green deployments.

Conceptually, your change is not making regression tests a new test suite; it's a new CAT orchestration mode. I think that makes it hard to reason about what regression tests are, and to reason about CAT too, as it's being made more complex.

In light of the "blowing up our test suites" discussions we had last week, I'd love to push for introducing a separate and independent regression test suite (which could be run independently or via airbyte-ci). This would allow us to:

  • Not add more things to CAT, whose scope is not clearly identified: it currently does too many things.
  • Keep airbyte-ci as an orchestrator with no "business logic" awareness.

In practice I think this could look like this:

  • We create a new Python package (in airbyte-ci/connectors/regression_testing?)
  • This package uses dagger and reuses the ConnectorRunner to run commands against connectors
  • This package has a CLI entry point (using click) - we can reuse the one you've defined: regression-test --control-version=0.1.0 --target-version=1.0.0 (see the sketch after this list)
  • Actual tests are declared inside this package (we could programmatically call pytest if we want to declare tests with it)
  • If we want to manually run regression tests we could directly call its CLI.
  • If we want to automatically run regression tests in CI we would add a new Step in the airbyte-ci connectors test pipeline.
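
A minimal sketch of what that click entry point could look like (module layout and option handling are hypothetical; only the command name and flags come from the proposal above):

    import click

    @click.command(name="regression-test")
    @click.option("--control-version", required=True, help="Trusted connector image, e.g. airbyte/source-faker:0.1.0")
    @click.option("--target-version", required=True, help="Candidate connector image, e.g. airbyte/source-faker:1.0.0")
    def regression_test(control_version: str, target_version: str) -> None:
        """Run both connector versions (e.g. via dagger + ConnectorRunner) and compare their outputs."""
        click.echo(f"Comparing control={control_version} against target={target_version}")

    if __name__ == "__main__":
        regression_test()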

Let me know how this sounds to you. I'd happily jump in to help you make these changes!
I think it's the kind of move I took for connectors_qa: a new Python package with clearly scoped checks, orchestrated by airbyte-ci connector tests.

@clnoll
Contributor Author

clnoll commented Feb 26, 2024

Hey @alafanechere thanks for taking a look. The tldr is that I think I agree with you.

A few thoughts -

  • If the point of regression testing is to identify differences in the behavior of two connector runs, we can see that CATs already do regression testing - they just do it against a hardcoded "control" version instead of a dynamically generated one. With that in mind, one of the primary reasons I tied this to CATs to begin with was that I felt most of the basic tests we'd want to run in a regression test suite were already being handled by CATs. I still do think that's true, but don't mind having some overlap between what's covered by which suite.
  • If people like more separation of concerns, I suppose we could get rid of regression-style tests in CATs, and only have them ensure that connectors are conforming to the Airbyte protocol. I think that could work, as long as in other suites we're not too strict about what kind of test goes where. I think too much strictness could put us at greater risk of missing coverage if it causes people to think in terms of test suites instead of what overall tests there should be for a given change.
  • I agree with the drawbacks to coupling them too closely that you brought up, and I do think there's value in letting the regression tests evolve on their own. So, I'm going to close this PR for now.

I played around with your suggestion and like the result - I think we'll put something very useful together quickly. It's very WIP and needs some cleanup, but the basic structure is there and it runs (writing output to different directories for each version of the connector; I haven't implemented a comparison of the output yet). LMK what you think.
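
As a rough illustration of the missing comparison step, the two output directories could be diffed along these lines (file names and JSONL format are assumptions, not what's in the WIP branch):

    import json
    from pathlib import Path

    def load_records(output_dir: Path) -> list[dict]:
        """Read the records a connector run wrote, one JSON object per line."""
        return [json.loads(line) for line in (output_dir / "records.jsonl").read_text().splitlines() if line]

    def missing_from_target(control_dir: Path, target_dir: Path) -> list[dict]:
        """Return control records that do not appear in the target run's output."""
        target = {json.dumps(r, sort_keys=True) for r in load_records(target_dir)}
        return [r for r in load_records(control_dir) if json.dumps(r, sort_keys=True) not in target]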

@clnoll clnoll closed this Feb 26, 2024
@clnoll clnoll deleted the regression-testing-version-comparison branch May 8, 2024 19:11