Description
Environment
- Airbyte version: 0.32.0-alpha
- OS Version / Instance: Linux/UNIX
- Deployment: Docker
- Memory / Disk: 30GB SSD (c5a.2xlarge)
- Source Connector and version: Kafka 0.1.2
- Destination Connector and version: Postgres (the issue is independent of the destination)
- Severity: Medium
- Step where error happened: During reset.
The following scenario assumes that Kafka retains the data (or some part of it). Data beyond the retention period is expected to be unrecoverable.
Current Behavior
- With other source connectors, resetting a connection also resets the source's cursor, so the next sync reads data from the beginning of the table. The Kafka connector does not behave this way. When a connection is reset, the destination table is dropped, but the re-sync brings in no data because the Kafka offset is not moved back to its initial position.
Expected Behavior
- On reset, the Kafka offset should be reset along with the deletion of data from the destination. Otherwise a reset is a problem for the user, who will never be able to bring back the old data.
At the least, this would align the Kafka connector with the behaviour of the other connectors.
Logs
Not required. No errors appear in the logs.
How to replicate / Steps to reproduce
Ingest data from Kafka, then reset the connection. After the reset completes, re-sync the connection. No data is ingested unless a Kafka producer has published new data in the meantime. Even then, only the new data is ingested; none of the previous data comes back.
Suggestion
Airbyte should be able to reset the offset using Airbyte's Kafka consumer.
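To make the difference concrete, here is a minimal in-memory sketch of the scenario. It is not Airbyte or Kafka client code: `ToyTopic`, `ToyConsumerGroup`, and `run_scenario` are hypothetical names that only model a single partition and one committed offset. In the real connector, the rewind step might be done with something like `seekToBeginning` on the Java `KafkaConsumer`, but how it would fit into Airbyte's reset flow is an assumption on my side.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTopic:
    """In-memory stand-in for one Kafka partition (hypothetical, not a real client)."""
    records: list = field(default_factory=list)
    log_start: int = 0  # earliest offset still retained by the broker

    def produce(self, value):
        self.records.append(value)

    def end_offset(self):
        return self.log_start + len(self.records)


@dataclass
class ToyConsumerGroup:
    """Keeps one committed offset, roughly as a Kafka consumer group would."""
    topic: ToyTopic
    committed: int = 0

    def sync(self):
        # Read from the committed offset to the end of the log, then commit.
        start = max(self.committed, self.topic.log_start)
        batch = self.topic.records[start - self.topic.log_start:]
        self.committed = self.topic.end_offset()
        return batch

    def reset_offset(self):
        # Proposed behaviour: rewind to the earliest retained offset on reset.
        self.committed = self.topic.log_start


def run_scenario(rewind_on_reset):
    topic = ToyTopic()
    for value in ("a", "b", "c"):
        topic.produce(value)
    group = ToyConsumerGroup(topic)

    destination = list(group.sync())  # initial sync ingests all records
    destination.clear()               # reset drops the destination data
    if rewind_on_reset:
        group.reset_offset()          # the step this issue asks for
    destination.extend(group.sync())  # re-sync after the reset
    return destination


if __name__ == "__main__":
    print("without rewind:", run_scenario(rewind_on_reset=False))
    print("with rewind:   ", run_scenario(rewind_on_reset=True))
```

Without the rewind, the re-sync starts at the already committed offset and returns nothing. With it, everything still retained is read again, which matches what the other connectors do on reset.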