Skip to content

🐛 Destination Redshift: Fix DATs blind spot - connection check #13090

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

alexandertsukanov
Copy link
Contributor

What

Investigating the issue I was able to find that problem appears due to redshiftDataTmpTableMode which was passed into RedshiftCopyS3Destination.java (Deprecated class) and subsequently fixed in the next PR:

https://github.com/airbytehq/airbyte/pull/12002/files

Looks like getSqlOperations() was called in some parent class (abstract class) and this was the root cause of the issue.
Currently, the CopyConsumerFactory approach was replaced with StagingOperations, and Redshift no longer uses CopyConsumerFactory.
(PR https://github.com/airbytehq/airbyte/pull/12601/files#diff-d6c1570221cad1a2b3eeb081801329eb)

How

Locally, I was not able to reproduce the issue with the latest Dest. redshift version. The only thing I have done is remove redshiftDataTmpTableMode as we don't need it anymore. The main purpose of mentioned ENUM was to support multiple
DATA TMP table types (e.g. VARCHAR and SUPER) - this is not required, as we have only SUPER now. Let's follow the KISS principle.

Also, I found that we have the next integration tests:

  • io/airbyte/integrations/destination/redshift/RedshiftS3StagingInsertDestinationAcceptanceTest.java (Designed to test S3 COPY)
  • io/airbyte/integrations/destination/redshift/RedshiftInsertDestinationAcceptanceTest.java (Designed to test standard INSERTS)

Both are the child of DestinationAcceptanceTest which contains testCheckConnection() method to verify whether the connection with the DB source was established successfully. ("Test connection button" on UI)

Recommended reading order

  1. x.java

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/SUMMARY.md
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here
Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here
Connector Generator
  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@github-actions github-actions bot added the area/connectors Connector related issues label May 23, 2022
@alexandertsukanov alexandertsukanov linked an issue May 23, 2022 that may be closed by this pull request
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label May 23, 2022
@alexandertsukanov
Copy link
Contributor Author

alexandertsukanov commented May 23, 2022

/test connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2372000504
✅ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2372000504
Python tests coverage:

Name                                                              Stmts   Miss  Cover
-------------------------------------------------------------------------------------
normalization/transform_config/__init__.py                            2      0   100%
normalization/transform_catalog/reserved_keywords.py                 13      0   100%
normalization/transform_catalog/__init__.py                           2      0   100%
normalization/destination_type.py                                    13      0   100%
normalization/__init__.py                                             4      0   100%
normalization/transform_catalog/destination_name_transformer.py     155      8    95%
normalization/transform_config/transform.py                         159     31    81%
normalization/transform_catalog/table_name_registry.py              174     34    80%
normalization/transform_catalog/utils.py                             38      9    76%
normalization/transform_catalog/dbt_macro.py                         22      7    68%
normalization/transform_catalog/catalog_processor.py                147     80    46%
normalization/transform_catalog/transform.py                         61     38    38%
normalization/transform_catalog/stream_processor.py                 543    352    35%
-------------------------------------------------------------------------------------
TOTAL                                                              1333    559    58%

Copy link
Contributor

@edgao edgao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

I think the goal of the ticket was more focused on understanding why DestinationAcceptanceTest#testCheckConnection didn't fail. Were you able to checkout a broken version and investigate that? (the issue says 6ef2f80 was the latest broken version)

@alexandertsukanov
Copy link
Contributor Author

:shipit:

I think the goal of the ticket was more focused on understanding why DestinationAcceptanceTest#testCheckConnection didn't fail. Were you able to checkout a broken version and investigate that? (the issue says 6ef2f80 was the latest broken version)

@edgao
Yep, I had also checked out 6ef2f80eba24675e03a290e6acc368cf64e4c0e8 and I tried to reproduce this issue but without success, on both combinations of options (With S3 Staging and without). All attempts of connection testing were successful.

Also, I think maybe it is somehow related to connection version bumping and the difference in the connector's specification (before the issue the version was 0.3.28). I have tried to download 0.3.28, save the settings, and migrate to 6ef2f80eba24675e03a290e6acc368cf64e4c0e8, but still without any result.

@alexandr-shegeda alexandr-shegeda marked this pull request as ready for review May 24, 2022 13:22
…tion_check

# Conflicts:
#	docs/integrations/destinations/redshift.md
…pot_connection_check' into otsukanov/12459_Fix_DATs_blind_spot_connection_check

# Conflicts:
#	docs/integrations/destinations/redshift.md
@alexandertsukanov
Copy link
Contributor Author

alexandertsukanov commented May 25, 2022

/publish connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2385254820
❌ Failed to publish connectors/destination-redshift
❌ Couldn't auto-bump version for connectors/destination-redshift

@alexandertsukanov
Copy link
Contributor Author

alexandertsukanov commented May 25, 2022

/publish connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2385469786
❌ Failed to publish connectors/destination-redshift
❌ Couldn't auto-bump version for connectors/destination-redshift

@alexandertsukanov
Copy link
Contributor Author

alexandertsukanov commented May 25, 2022

/publish connector=connectors/destination-redshift

🕑 connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2385550096
🚀 Successfully published connectors/destination-redshift
🚀 Auto-bumped version for connectors/destination-redshift
✅ connectors/destination-redshift https://github.com/airbytehq/airbyte/actions/runs/2385550096

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets May 25, 2022 16:13 Inactive
@alexandertsukanov alexandertsukanov merged commit 5371f87 into master May 25, 2022
@alexandertsukanov alexandertsukanov deleted the otsukanov/12459_Fix_DATs_blind_spot_connection_check branch May 25, 2022 16:14
@edgao
Copy link
Contributor

edgao commented May 25, 2022

this was my bad, the issue had the wrong commit hash. I think ae8c4f2b3a should be broken, can you try again with that hash?

@alexandertsukanov
Copy link
Contributor Author

@edgao I have switched to the commit, looks like the problem was here:

AbstractJdbcDestination.attemptSQLCreateAndDropTableOperations(outputSchema, database, nameTransformer, getSqlOperations());

and here:

@Override
public SqlOperations getSqlOperations() {
throw new UnsupportedOperationException("RedshiftCopyS3Destination.getSqlOperations() without arguments not supported in Redshift destination connector.");
}

As Destination Redshift operated RedshiftCopyS3Destination.java before, for S3 Staging processing - the reasonable exception was thrown. Also, I have tried RedshiftCopyDestinationAcceptanceTest.testCheckConnection() and test was failed, what is correct behaviour. So, I don't have an idea why this part of the code wasn't caught by the integration test before.

Screenshot 2022-05-26 at 11 22 46

The issue we faced.

@edgao
Copy link
Contributor

edgao commented May 27, 2022

something might be weird with how the CI tests work - I made a branch corresponding to that commit (#13246), where /test succeeds but running tests locally fails. I think we should reopen #12459 and investigate that behavior.

@alexandertsukanov
Copy link
Contributor Author

alexandertsukanov commented May 27, 2022

@edgao found why this doesn't work on CI. The problem in GCM, the file SECRET_DESTINATION-REDSHIFT__CREDS doesn't contain info about the bucket that's why S3 flow never triggers. 🙂

Here is the log https://github.com/airbytehq/airbyte/runs/6621069306?check_suite_focus=true#step:11:8804
The logic in destination-redshift is pretty straightforward:

public static DestinationType determineUploadMode(final JsonNode config) {
final var bucketNode = config.get("s3_bucket_name");
final var regionNode = config.get("s3_bucket_region");
final var accessKeyIdNode = config.get("access_key_id");
final var secretAccessKeyNode = config.get("secret_access_key");
// Since region is a Json schema enum with an empty string default, we consider the empty string an
// unset field.
final var emptyRegion = regionNode == null || regionNode.asText().equals("");
if (bucketNode == null && emptyRegion && accessKeyIdNode == null && secretAccessKeyNode == null) {
LOGGER.warn("The \"standard\" upload mode is not performant, and is not recommended for production. " +
"Please use the Amazon S3 upload mode if you are syncing a large amount of data.");
return DestinationType.INSERT_WITH_SUPER_TMP_TYPE;
}
if (bucketNode == null || regionNode == null || accessKeyIdNode == null || secretAccessKeyNode == null) {
throw new RuntimeException("Error: Partially missing S3 Configuration.");
}
return DestinationType.COPY_S3_WITH_SUPER_TMP_TYPE;
}

@alexandertsukanov
Copy link
Contributor Author

Updated secrets. Can now switch my job to the detective I think. 😁

@edgao
Copy link
Contributor

edgao commented May 27, 2022

nice catch! thanks for looking into it 👍

jscottpolevault pushed a commit to jscottpolevault/airbyte that referenced this pull request Jun 1, 2022
…tehq#13090)

* airbyte-12459: Removed RedshiftDataTmpTableMode.java as not needed enum. Code refactoring and clean-up.

* airbyte-12459: Removed RedshiftDataTmpTableMode.java as not needed enum. Code refactoring and clean-up.

* airbyte-12459: Merged master.

* airbyte-12459: Merged master.

* auto-bump connector version

Co-authored-by: Oleksandr Sheheda <[email protected]>
Co-authored-by: Octavia Squidington III <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix DATs blind spot: connection check
4 participants