Sync job fails/retries itself after successfully transferring all the data. #5870

Closed
@gui0506

Description

Environment

  • Airbyte version: 0.29.13-alpha
  • OS Version / Instance: AWS EKS
  • Deployment: Kubernetes
  • Source Connector and version: Postgres 0.3.11
  • Destination Connector and version: BigQuery 0.1.1
  • Severity: Very Low / Low / Medium / High / Critical
  • Step where error happened: Deploy / Sync job / Setup new connection / Update connector / Upgrade Airbyte

Current Behavior

The sync job fails after successfully transferring all the data. The Kubernetes pods terminate gracefully with Completed status.

Expected Behavior

The sync job should not fail after successfully transferring all the data.

Logs

logs-185-0.txt

2021-09-06 23:56:25 ERROR () DefaultReplicationWorker(run):148 - Sync worker failed.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: Cannot find pod while trying to retrieve exit code. This probably means the Pod was not correctly created.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:?]
    at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:?]
    at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:140) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
    at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:52) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
    at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:146) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
    at java.lang.Thread.run(Thread.java:832) [?:?]
    Suppressed: java.lang.RuntimeException: Cannot find pod while trying to retrieve exit code. This probably means the Pod was not correctly created.
        at io.airbyte.workers.process.KubePodProcess.getReturnCode(KubePodProcess.java:548) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
        at io.airbyte.workers.process.KubePodProcess.exitValue(KubePodProcess.java:573) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
        at java.lang.Process.hasExited(Process.java:333) ~[?:?]
        at java.lang.Process.isAlive(Process.java:323) ~[?:?]
        at io.airbyte.workers.WorkerUtils.gentleCloseWithHeartbeat(WorkerUtils.java:111) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
        at io.airbyte.workers.WorkerUtils.gentleCloseWithHeartbeat(WorkerUtils.java:95) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
        at io.airbyte.workers.protocols.airbyte.DefaultAirbyteSource.close(DefaultAirbyteSource.java:126) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
        at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:121) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
        at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:52) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
        at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:146) ~[io.airbyte-airbyte-workers-0.29.12-alpha.jar:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]
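
The suppressed trace shows `Process.isAlive()` calling `exitValue()` during `DefaultAirbyteSource.close`, and the exit-code lookup throwing because the pod could not be found. My guess (not confirmed) is that the pod had already completed and been removed by the time the close-time liveness check ran. Below is a minimal, self-contained sketch of that mechanism. `FakePodProcess`, `probe`, and `runScenario` are hypothetical names for illustration only and are not Airbyte's `KubePodProcess`; the only real behavior relied on is that `java.lang.Process.isAlive()` treats `IllegalThreadStateException` from `exitValue()` as "still running" and lets any other exception propagate.

```java
import java.io.InputStream;
import java.io.OutputStream;

class PodLookupSketch {

    /** Hypothetical stand-in for a pod-backed process; NOT Airbyte's KubePodProcess. */
    static class FakePodProcess extends Process {
        private final boolean cacheExitCode;
        private boolean podExists = true;
        private boolean finished = false;
        private Integer cachedExitCode = null;

        FakePodProcess(boolean cacheExitCode) {
            this.cacheExitCode = cacheExitCode;
        }

        void finishPod() { finished = true; }   // container exits with code 0
        void deletePod() { podExists = false; } // pod object disappears from the cluster

        @Override
        public int exitValue() {
            if (cachedExitCode != null) {
                return cachedExitCode; // no cluster lookup needed any more
            }
            if (!podExists) {
                // Same message as in the log above; a plain RuntimeException
                // is not swallowed by Process.isAlive().
                throw new RuntimeException("Cannot find pod while trying to retrieve exit code.");
            }
            if (!finished) {
                // Contract of Process.exitValue() for a process that is still running.
                throw new IllegalThreadStateException("pod has not exited yet");
            }
            if (cacheExitCode) {
                cachedExitCode = 0; // remember the code once it has been observed
            }
            return 0;
        }

        @Override public int waitFor() { return exitValue(); }
        @Override public void destroy() { }
        @Override public OutputStream getOutputStream() { return OutputStream.nullOutputStream(); }
        @Override public InputStream getInputStream() { return InputStream.nullInputStream(); }
        @Override public InputStream getErrorStream() { return InputStream.nullInputStream(); }
    }

    /** Mirrors a close-time liveness check and reports what happened. */
    static String probe(Process process) {
        try {
            return process.isAlive() ? "alive" : "exited";
        } catch (RuntimeException e) {
            return "error: " + e.getMessage();
        }
    }

    static String runScenario(boolean cacheExitCode) {
        FakePodProcess process = new FakePodProcess(cacheExitCode);
        process.finishPod();
        process.exitValue(); // exit code observed once while the pod still exists
        process.deletePod(); // pod is gone before the liveness check runs
        return probe(process);
    }

    public static void main(String[] args) {
        System.out.println(runScenario(false)); // error: Cannot find pod while trying to retrieve exit code.
        System.out.println(runScenario(true));  // exited
    }
}
```

Remembering the exit code only helps if it was read at least once before the pod disappeared, so this is an illustration of the failure mode rather than a proposed patch.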

Steps to Reproduce

  1. Set up a connection with Postgres (CDC) as the source and BigQuery as the destination.
  2. Sync a large table (in my case, 10.18 GB | 37,656,941 records).
  3. Sometimes the sync fails and retries even though all the data was transferred. This does not happen every time, only in about 20% of runs.

Are you willing to submit a PR?
