Destination Connector and version: destination-s3 0.3.5
Step where error happened: Sync job
Current Behavior
We have some connection syncs that sometimes succeed only after multiple failed attempts. I am fine with that (sync time is not important to us). However, the data is duplicated in the destination (S3 bucket) because the S3 destination does not clean up the data written by failed attempts.
Expected Behavior
The connector should keep track of the files written during an attempt and delete them if the attempt fails.
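A minimal sketch of the requested behavior, not the connector's actual code: `AttemptFileTracker`, `run_attempt`, and the in-memory `bucket` dict are hypothetical names used only for illustration, and the dict stands in for the S3 bucket so the example runs without any AWS dependency. The idea is to record every object key written during an attempt and delete those keys if the attempt raises.

```python
from typing import Callable, Dict, List


class AttemptFileTracker:
    """Hypothetical helper: remembers keys written during one sync attempt."""

    def __init__(self, delete_object: Callable[[str], None]) -> None:
        self._delete_object = delete_object
        self._written_keys: List[str] = []

    def record(self, key: str) -> None:
        # Call this right after each object is written to the destination.
        self._written_keys.append(key)

    def __enter__(self) -> "AttemptFileTracker":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # On failure, remove everything this attempt wrote.
        if exc_type is not None:
            for key in self._written_keys:
                self._delete_object(key)
        return False  # never swallow the original error


# In-memory stand-in for the S3 bucket (assumption for this sketch).
bucket: Dict[str, bytes] = {}


def run_attempt(attempt: int, fail: bool) -> None:
    with AttemptFileTracker(lambda key: bucket.pop(key, None)) as tracker:
        for part in range(3):
            key = f"my_stream/attempt-{attempt}/part-{part}.csv"
            bucket[key] = b"rows"
            tracker.record(key)
        if fail:
            raise RuntimeError("simulated sync failure")
```

With a real bucket, the `delete_object` callback would wrap whatever delete call the connector's S3 client exposes. A hard crash of the process would skip the cleanup, so a real implementation would probably also need to persist the list of written keys somewhere.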
Are you willing to submit a PR?
Probably, if only the S3 destination connector is affected by the change (not Airbyte Core).
Hey @nord-ine,
The problem you mention is not specific to MSSQL or S3 but is more global on our database connectors.
Airbyte connectors keep track of the offset of the records they consumed in a state object. For database connectors, this state is stored only at the end of a successful sync. This means that on the next sync after a failure, the same data will be replicated again. We need to improve this behavior by implementing intermittent checkpointing to reduce the number of duplicates. This is already available for API connectors but not for database connectors. I created an issue for this improvement here: please subscribe to it to follow updates.
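To illustrate the difference described above, here is a small simulation. It is a sketch only and does not use Airbyte's actual state API: `sync`, `count_duplicates`, the `offset` key, and the `checkpoint_every` parameter are all assumptions made for this example. It compares saving the state only at the end of a successful sync with saving it every few records.

```python
from typing import Dict, List, Optional


def sync(
    source_rows: List[int],
    destination: List[int],
    state: Dict[str, int],
    checkpoint_every: Optional[int],
    fail_at: Optional[int] = None,
) -> None:
    """Copy rows starting from the saved offset, optionally checkpointing."""
    start = state.get("offset", 0)
    for position in range(start, len(source_rows)):
        if fail_at is not None and position == fail_at:
            raise RuntimeError("simulated sync failure")
        destination.append(source_rows[position])
        written = position + 1
        # Intermittent checkpoint: persist the offset every N records.
        if checkpoint_every and written % checkpoint_every == 0:
            state["offset"] = written
    # State saved at the end of a successful sync.
    state["offset"] = len(source_rows)


def count_duplicates(checkpoint_every: Optional[int]) -> int:
    rows = list(range(10))
    destination: List[int] = []
    state: Dict[str, int] = {}
    try:
        sync(rows, destination, state, checkpoint_every, fail_at=7)
    except RuntimeError:
        pass  # first attempt failed after writing 7 rows
    sync(rows, destination, state, checkpoint_every)  # retry
    return len(destination) - len(set(destination))


print(count_duplicates(None))  # state only at end of sync: 7 duplicates
print(count_duplicates(5))     # checkpoint every 5 rows: 2 duplicates
```

Checkpointing reduces the duplicates but does not remove them: rows written after the last checkpoint and before the failure are still replicated again on the retry.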
@alafanechere Hi, I am facing an issue where disk space gets used up over multiple failed attempts. Can you tell me where Airbyte stores this intermediate state object or the intermediate records? I have not been able to find the data, even in the Airbyte DB. I removed the Airbyte Docker containers, but I am still not able to reclaim more than 60 GB of local disk space.