Description
Tell us about the problem you're trying to solve
As described in this doc, we want to support TLS encryption when connecting to the major databases and warehouses.
Note that we do NOT need to support certificate verification as part of this issue -- just encryption of data over the wire. In other words, the focus is protecting against eavesdropping, not man-in-the-middle attacks. See the document linked for more details.
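To make the distinction concrete, here is a minimal Python sketch that uses only the standard library `ssl` module. It is an illustration rather than Airbyte connector code, and the function name `encryption_only_context` is hypothetical. It builds a client-side TLS context that encrypts traffic but skips certificate and hostname verification, which is the level of protection this issue targets.

```python
import ssl


def encryption_only_context() -> ssl.SSLContext:
    """Build a client TLS context that encrypts traffic but does not verify the server."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Hostname checking has to be disabled before verify_mode can be set to CERT_NONE.
    ctx.check_hostname = False
    # Accept any server certificate: protects against eavesdropping, not man-in-the-middle.
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


ctx = encryption_only_context()
```

A context like this could be passed to `ctx.wrap_socket(...)` by a client; adding certificate verification later would mean leaving `verify_mode` at its default and supplying trusted CA certificates.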
Here are the DBs we should support:
Must-haves for Airbyte Cloud
- Add support for MSSQL source/destination via TLS/SSL #6007
- Add support for Postgres source via TLS/SSL #6008
- Add support for Postgres destination via TLS/SSL #6009
- Add support for MySQL source via TLS/SSL #6010
- Add support for Oracle source via TLS/SSL #6011
- Add support for MySQL destination via TLS/SSL #6012
- Add support for Oracle destination via TLS/SSL #6013
- Add support for MongoDB source via TLS/SSL #6014
- Add support for MongoDB destination via TLS/SSL #6015
Nice-to-haves for Airbyte Cloud
- Redshift Src/Destination #6429
- BigQuery Src/Destination #6431
- Snowflake Src/Destination #6432
- DB2 source #6434
- Clickhouse source TLS #6435
- Kafka source/destination #6436
Plus any other warehouses/DBs I'm missing.
Describe the solution you’d like
Go through each source/destination in the must-have list. If the connector doesn't support encryption at all, create a ticket to support TLS/SSL for it.
The acceptance criteria for each ticket are:
- Implement encryption support in the connector if not already implemented. Where possible, support encryption by default. If encryption-by-default is a bad idea (for example, if most MySQL versions do not support encryption and would require special work from the DB administrator) then expose it as an option in the connector specification, and encrypt when the user requests it.
- The external documentation of the connector mentions that encryption is supported
- If encryption is exposed as an option, add a recommendation to use it in the connector spec and docs (for example, the MSSQL source currently says that encryption without server certificate verification is for testing purposes only, which is not true; see the doc above)
- Encrypted connections are tested as part of either a custom integration test or acceptance tests. Where possible, test using a test container. If that's impossible and it must be tested on a real DB instance, create a DB instance in AWS, ideally using Terraform (if Terraform is too hard, create it manually and make a ticket to encode it in Terraform)
- Create a PR
Implementation hints
There is a difference when implementing this for sources & destinations because destinations might need to change normalization as well.
When implementing this for sources, it's probably as simple as setting a flag. For example, the MySQL client uses the --ssl-mode=REQUIRED flag.
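As a rough sketch of what "setting a flag" could look like for a JDBC-based source, the Python snippet below appends an encryption parameter to a connection URL. The function name `build_jdbc_url` and the `SSL_PARAMS` table are hypothetical and not part of any Airbyte connector. The parameter names are assumed to be the ones documented for MySQL Connector/J (`sslMode=REQUIRED`) and the PostgreSQL JDBC driver (`sslmode=require`), and should be checked against the driver version actually in use.

```python
# Hypothetical mapping from engine to the JDBC URL parameter that requests an
# encrypted connection without certificate verification (assumed from driver docs).
SSL_PARAMS = {
    "mysql": "sslMode=REQUIRED",      # MySQL Connector/J
    "postgresql": "sslmode=require",  # PostgreSQL JDBC driver
}


def build_jdbc_url(engine: str, host: str, port: int, database: str, ssl: bool = True) -> str:
    """Return a JDBC URL, adding the engine's encryption parameter when ssl is True."""
    url = f"jdbc:{engine}://{host}:{port}/{database}"
    if ssl:
        # Encrypt by default; callers opt out explicitly with ssl=False.
        url += "?" + SSL_PARAMS[engine]
    return url


mysql_url = build_jdbc_url("mysql", "db.example.com", 3306, "airbyte")
plain_url = build_jdbc_url("postgresql", "db.example.com", 5432, "airbyte", ssl=False)
```

Defaulting `ssl` to `True` mirrors the encryption-by-default acceptance criterion; a connector that cannot encrypt by default would instead read the value from a user-facing option in its specification.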
Implementing this for destinations may be very similar, but there are two places to edit: the destination connector itself and the normalization module. If the normalization change is just adding a flag, it's ideal to implement it yourself, since you'll learn a bit about normalization, though that is not a primary goal of this ticket. If it turns out to be more involved, it might be easiest to ask the Python team to implement the normalization piece. The goal is to support TLS as soon as possible.