Description
Connector Name
destination-snowflake
Connector Version
3.15.4
What step the error happened?
During the sync
Relevant information
I’m using Airbyte OSS to sync data from Postgres to Snowflake in incremental + append mode, with CDC (Change Data Capture) enabled.
- My Postgres source is configured in CDC mode, reading changes from the write-ahead log (WAL).
- In Airbyte, the connection is set to:
  - “Append historical changes” + “Append new rows and updates only”
  - Tables are configured as “incremental + append”
  - “Detect and propagate schema changes” is set to “Propagate field changes only”
Recently, I encountered an issue when a column in one of my source tables was renamed. Airbyte detected this as a deletion of the old column and an addition of a new one (with the updated name). This makes sense from a schema evolution perspective, but the problem occurred on the destination side in Snowflake:
- The old column was completely removed from Snowflake.
- A new column (with the updated name) was added.
- Historical data in the old column was lost because the column itself was deleted.
- The new column remains NULL for all existing records unless they get updated in the future (since I’m using CDC).
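To illustrate why a rename shows up as a deletion plus an addition, here is a minimal sketch that compares two schema snapshots by column name only. This is not Airbyte's actual schema-detection code; the function `diff_columns` and the column names (`full_name`, `display_name`) are hypothetical and only serve the example.

```python
def diff_columns(old_columns, new_columns):
    """Return (removed, added) column names between two schema snapshots.

    Only names are compared, so a renamed column cannot be told apart
    from one column being dropped and another being created.
    """
    old_set, new_set = set(old_columns), set(new_columns)
    removed = sorted(old_set - new_set)
    added = sorted(new_set - old_set)
    return removed, added


# Hypothetical source table where "full_name" was renamed to "display_name".
before = ["id", "full_name", "created_at"]
after_rename = ["id", "display_name", "created_at"]

removed, added = diff_columns(before, after_rename)
print("removed:", removed)  # removed: ['full_name']
print("added:", added)      # added: ['display_name']
```

Since nothing in a name-based comparison links the two columns, the destination has no way to carry the old values over to the new column, which matches the NULL values I see for existing records.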
This behavior contradicts what I understand from the official documentation, which states:
“The old field will be retained in the destination, but stop updating with updated values. If the connection is ever cleared or refreshed, the field and its historical data will be removed entirely.”
However, in my case the old column was not retained at all: it was dropped from Snowflake, which caused data loss for past records. I expected Airbyte to keep the old column in Snowflake, even if it stopped receiving updates.
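The gap between the documented and the observed behavior can be written down as a small sketch. It models only the column lists (ignoring Airbyte's own metadata columns) and is my reading of the quoted documentation, not Airbyte's implementation. The function names and the example columns are hypothetical.

```python
def documented_destination_columns(destination_columns, new_source_columns):
    """Columns I expected in the destination, based on the quoted docs:
    fields removed from the source are retained, new fields are added."""
    result = list(destination_columns)
    for column in new_source_columns:
        if column not in result:
            result.append(column)
    return result


def observed_destination_columns(destination_columns, new_source_columns):
    """Columns I actually saw: the destination mirrors the new source schema,
    so fields removed from the source are dropped."""
    return list(new_source_columns)


# Hypothetical table where "full_name" was renamed to "display_name".
destination = ["id", "full_name", "created_at"]
new_source = ["id", "display_name", "created_at"]

expected = documented_destination_columns(destination, new_source)
observed = observed_destination_columns(destination, new_source)
lost_columns = [column for column in expected if column not in observed]

print("expected:", expected)  # expected: ['id', 'full_name', 'created_at', 'display_name']
print("observed:", observed)  # observed: ['id', 'display_name', 'created_at']
print("lost:", lost_columns)  # lost: ['full_name']
```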
Additional Context:
I was able to reproduce this issue in a local environment using:
- Postgres 17
- Airbyte OSS 1.4
- Postgres source connector 3.6.28
- Snowflake destination connector 3.15.4
To investigate further, I also tested simply deleting a column (without renaming it) in my local environment. I observed the same behavior: the column was dropped from Snowflake and its historical data was lost. So the issue is not specific to column renames; it affects any column removed from the source.
Relevant log output
Contribute
- Yes, I want to contribute