Skip to content

🚨 Validate source JDBC url parameters 🚨 #14586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jul 15, 2022

Conversation

ryankfu
Copy link
Contributor

@ryankfu ryankfu commented Jul 11, 2022

What

Adding in validating logic for the check functionality within Postgres Source connector #13851

EDIT: bumped version for postgres-strict-encrypt here: #14735

How

The solution follows similar to the Postgres Destination connector where the code verifies that JDBC parameters do not overlap with the custom set parameters by the user in jdbc_url_params

NOTE: The order of validation mirrors the AbstractJdbcDestination class where the validation check comes after a toJdbcConfig has been created. There is some questions regarding the bifurcation of why the naming of methods between Source and Destination are not the same for creating a database config but that has been punted in favor of getting this functionality matching the Destination class

Recommended reading order

  1. AbstractJdbcDestination#getDataSource.java
  2. PostgresDestination.java
  3. AbstractJdbcSource#createDataSource.java
  4. PostgresSource.java

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

Should not have any breaking changes but it CAN due to the situation where the code now throws an IllegalArgumentException because in the past we did not notify the user their jdbc_url_params had conflicting parameters to the custom parameters set within Postgres

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here
Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here
Connector Generator
  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@ryankfu ryankfu requested a review from a team as a code owner July 11, 2022 16:54
@CLAassistant
Copy link

CLAassistant commented Jul 11, 2022

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@grishick grishick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall - great job on refactoring! My main concern is the behavior immediate after we upgrade a JDBC source connector to a version that contains this change. We need to avoid breaking sync jobs that were working before.

@@ -62,4 +65,8 @@ public static Map<String, String> parseJdbcParameters(final String jdbcPropertie
return parameters;
}

public static boolean useSsl(final JsonNode config) {
return !config.has(SSL_KEY) || config.get(SSL_KEY).asBoolean();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am reading this correctly, useSsl returns true if there is no SSL_KEY in config? Is that the intended behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be quite frank, I'm not completely sure the reason for why useSsl would want to return true if there is no SSL_KEY. I mainly refactored this portion of code from other parts of the code base

However, it seem that all the places I saw this usage of useSsl if no SSL_KEY has been set in the config object that we want to set ssl in the config. My presumption would be because these connections require ssl to operate correctly. I can update the function with javadoc comment that the code checks for if no SSL_KEY exists we also return true so it's clear in case there's a misunderstanding on the functionality of this method. Would there be a group or individual that knows more about why this method operates the way it does?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was changed here f308690#diff-f14bc8e9216a505150004f54d41ca7aeeba9a9460fc4a660b11225dd7abb274fR81 and my guess is that if that the default behaviour should be ssl true if the field is not present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to clarify the logic and will be adding some tests to codify the logic but the asBoolean is the key to understanding this method. The asBoolean function will map a value to a boolean if it can (e.g. <int>0 -> <boolean>false, <string>false -> <boolean>false, <int>3 -> <boolean> true). Basically the logic is:

If SSL_KEY is not set OR (if SSL_KEY is set AND the value of the SSL_KEY is mapped to true) THEN we set ssl fields

@@ -66,9 +67,6 @@ public JsonNode toJdbcConfig(final JsonNode config) {
return Jsons.jsonNode(configBuilder.build());
}

private boolean useSsl(final JsonNode config) {
return !config.has("ssl") || config.get("ssl").asBoolean();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job factoring out this copy-pasta!

* @return A mapping of the default connection properties
*/
protected Map<String, String> getDefaultConnectionProperties(final JsonNode config) {
return JdbcUtils.parseJdbcParameters(config, "connection_properties", getJdbcParameterDelimiter());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use "connection_properties" string in 10+ places in Java code, let's make it a public static String constant in JdbcUtils

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as with the usage of ssl and sslmode for constants. I'm happy to make the necessary commit to update these as constants, however I left them out merely to focus on getting the main functionality addressed

final Map<String, String> defaultParameters) {
for (final String key : defaultParameters.keySet()) {
if (customParameters.containsKey(key) && !Objects.equals(customParameters.get(key), defaultParameters.get(key))) {
throw new IllegalArgumentException("Cannot overwrite default JDBC parameter " + key);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: when Airbyte tries to run a sync job, will this exception be thrown if a user has created a connection with an older version of the connector that did not perform this validation and then upgraded to the version of connector that performs this validation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there will be an exception thrown upon upgrading to this new version where validation would exist. Based on that, it seems like I should note this commit will potentially cause breaking changes.

I'm curious to know if when the validation check was added to the AbstractJdbcDestination did we notify customers ahead of time as well because the validation check did not originally exist to the best of my knowledge. @girarda it seems you were the person that last touched this assertCustomParametersDontOverwriteDefaultParameters method in AbstractJdbcDestination, can you confirm if we notified customers that their jdbc_url parameters could break or how did we approach this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We didn't notify the customers because it wasn't yet possible to set jdbc_url parameters when we added that check

Copy link
Contributor Author

@ryankfu ryankfu Jul 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we would like to have feature parity of validation between Sources and Destinations it seems that this is a concern that I would defer the decision to someone who can make that executive call.

My thoughts would be, while it may be unfortunate that users will have breaking changes, the scenario for a breaking change is mostly if Postgres users have conflicting ssl settings in their jdbc_url in which case they may even already have unexpected results since the jdbc_url parameters may be overwriting our own custom ssl configurations. However, I cannot say with 100% certainty that there might not be other scenarios that could break a customer's jdbc_url parameters so will want to proceed with caution

EDIT: The conclusion reached is to confirm if this validation check is in the critical path for syncs, meaning that this change would break users who already configured their jdbc_url parameters improperly but the code was not aware. If the validation check is in the critical path, opt to instead of throwing an IllegalArgumentException to have a AirbyteTraceMessage. Otherwise if the logic only affects the UI side then remain with throwing an Exception

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its fine, we should improve the error message to include which property is causing the problem and ask user to remove that property from the setup form

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed with Subodh


static final String DRIVER_CLASS = DatabaseDriver.POSTGRESQL.getDriverClassName();
static final Map<String, String> SSL_JDBC_PARAMETERS = ImmutableMap.of(
"ssl", "true",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should use shared constants for "ssl" and "sslmode"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree as well, I opted not to add this change into this commit since it's not quite in scope. However, I'm not married to the idea that it should be a separate ticket, if anything it should be a separate commit in case we want to roll back that change. What are your thoughts? Should I include the shared constants in this PR or create a separate ticket to address the ssl and sslmode constants?

final Builder<Object, Object> configBuilder = ImmutableMap.builder()
.put("username", config.get("username").asText())
.put("jdbc_url", jdbcUrl.toString());
final ImmutableMap.Builder<Object, Object> configBuilder = ImmutableMap.builder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good job getting rid of strings

@ryankfu ryankfu changed the title Validate source JDBC url parameters 🚨 Validate source JDBC url parameters 🚨 Jul 11, 2022
Copy link
Contributor

@subodh1810 subodh1810 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense! I would approve once you run all the tests for each of the connector that we are editing here.

*/
protected Map<String, String> getConnectionProperties(final JsonNode config) {
final Map<String, String> customProperties = JdbcUtils.parseJdbcParameters(config, JdbcUtils.JDBC_URL_PARAMS_KEY);
final Map<String, String> defaultProperties = getDefaultConnectionProperties(config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The getDefaultConnectionProperties in master uses the jdbcConfig (output of final JsonNode jdbcConfig = toDatabaseConfig(config); ) not the config but we are changing it to use the config, is that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the reason I chose to have config as opposed to jdbcConfig is because it will match the same arguments that AbstractJdbcDestination#getDefaultConnectionProperties gets passed from AbstractJdbcDestination#getDataSource. Also jdbcConfig is merely a subset of config so it will have the same information overall

final Map<String, String> defaultParameters) {
for (final String key : defaultParameters.keySet()) {
if (customParameters.containsKey(key) && !Objects.equals(customParameters.get(key), defaultParameters.get(key))) {
throw new IllegalArgumentException("Cannot overwrite default JDBC parameter " + key);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its fine, we should improve the error message to include which property is causing the problem and ask user to remove that property from the setup form

@@ -62,4 +65,8 @@ public static Map<String, String> parseJdbcParameters(final String jdbcPropertie
return parameters;
}

public static boolean useSsl(final JsonNode config) {
return !config.has(SSL_KEY) || config.get(SSL_KEY).asBoolean();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was changed here f308690#diff-f14bc8e9216a505150004f54d41ca7aeeba9a9460fc4a660b11225dd7abb274fR81 and my guess is that if that the default behaviour should be ssl true if the field is not present.

@ryankfu ryankfu temporarily deployed to more-secrets July 12, 2022 21:38 Inactive
@grishick
Copy link
Contributor

grishick commented Jul 12, 2022

/test connector=source-postgres

🕑 source-postgres https://github.com/airbytehq/airbyte/actions/runs/2659819852
❌ source-postgres https://github.com/airbytehq/airbyte/actions/runs/2659819852
🐛 https://gradle.com/s/yeuz4wutxwmqu

Build Failed

Test summary info:

Could not find result summary

@ryankfu ryankfu force-pushed the ryan/validate-source-jdbc-params branch from 6d38f20 to b96e09e Compare July 12, 2022 23:29
@ryankfu ryankfu temporarily deployed to more-secrets July 12, 2022 23:32 Inactive
@ryankfu ryankfu temporarily deployed to more-secrets July 13, 2022 01:14 Inactive
Copy link
Contributor

@grishick grishick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is a compilation error after the rebase, because @tuliren moved isCdC from PostgresSource to PostgresUtils

@ryankfu ryankfu force-pushed the ryan/validate-source-jdbc-params branch from d8af95a to 724e21b Compare July 13, 2022 04:39
@ryankfu ryankfu temporarily deployed to more-secrets July 13, 2022 04:42 Inactive
@ryankfu
Copy link
Contributor Author

ryankfu commented Jul 13, 2022

/test connector=source-postgres

🕑 source-postgres https://github.com/airbytehq/airbyte/actions/runs/2664977213
✅ source-postgres https://github.com/airbytehq/airbyte/actions/runs/2664977213
No Python unittests run

Build Passed

Test summary info:

All Passed

@ryankfu ryankfu requested a review from grishick July 13, 2022 17:55
@ryankfu ryankfu force-pushed the ryan/validate-source-jdbc-params branch from 724e21b to de6750f Compare July 13, 2022 17:57
@ryankfu ryankfu temporarily deployed to more-secrets July 13, 2022 17:59 Inactive
@ryankfu
Copy link
Contributor Author

ryankfu commented Jul 13, 2022

/test connector=source-postgres

🕑 source-postgres https://github.com/airbytehq/airbyte/actions/runs/2665683874
✅ source-postgres https://github.com/airbytehq/airbyte/actions/runs/2665683874
No Python unittests run

Build Passed

Test summary info:

All Passed

@ryankfu ryankfu temporarily deployed to more-secrets July 13, 2022 21:07 Inactive
@ryankfu ryankfu temporarily deployed to more-secrets July 13, 2022 21:45 Inactive
@ryankfu ryankfu force-pushed the ryan/validate-source-jdbc-params branch from de6750f to 43240fe Compare July 13, 2022 23:49
@ryankfu
Copy link
Contributor Author

ryankfu commented Jul 14, 2022

/publish connector=connectors/source-postgres

🕑 Publishing the following connectors:
connectors/source-postgres
https://github.com/airbytehq/airbyte/actions/runs/2673696734


Connector Did it publish? Were definitions generated?
connectors/source-postgres

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@ryankfu ryankfu temporarily deployed to more-secrets July 14, 2022 23:23 Inactive
@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets July 15, 2022 00:46 Inactive
@grishick
Copy link
Contributor

@ryankfu you can merge this PR now

@ryankfu ryankfu force-pushed the ryan/validate-source-jdbc-params branch from 6d6c4e2 to 68e14e9 Compare July 15, 2022 01:51
@ryankfu ryankfu temporarily deployed to more-secrets July 15, 2022 01:53 Inactive
@ryankfu ryankfu temporarily deployed to more-secrets July 15, 2022 02:10 Inactive
@ryankfu ryankfu temporarily deployed to more-secrets July 15, 2022 02:34 Inactive
@ryankfu ryankfu force-pushed the ryan/validate-source-jdbc-params branch from 68e14e9 to a8dcaf6 Compare July 15, 2022 04:44
@ryankfu ryankfu temporarily deployed to more-secrets July 15, 2022 04:46 Inactive
@ryankfu ryankfu force-pushed the ryan/validate-source-jdbc-params branch from a8dcaf6 to f1bff81 Compare July 15, 2022 15:33
@ryankfu ryankfu temporarily deployed to more-secrets July 15, 2022 15:36 Inactive
@ryankfu ryankfu temporarily deployed to more-secrets July 15, 2022 17:21 Inactive
@ryankfu ryankfu force-pushed the ryan/validate-source-jdbc-params branch from f1bff81 to 2e5988f Compare July 15, 2022 17:46
@ryankfu ryankfu temporarily deployed to more-secrets July 15, 2022 17:49 Inactive
@ryankfu ryankfu merged commit 5edf3fe into master Jul 15, 2022
@ryankfu ryankfu deleted the ryan/validate-source-jdbc-params branch July 15, 2022 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants