Destination Redshift: Connector v0.6.5 runs out of memory  #30995

Open
@honggyu-rr

Description

Connector Name

destination-redshift

Connector Version

0.6.5

What step the error happened?

During the sync

Relevant information

I'm moving a fairly large table from Postgres to Redshift with S3 staging. The sync first hit an OOM with 2 GB of memory (which seemed reasonable at that limit for this fairly wide and tall 500 GB table), so I raised the limit to 8, then 16, then 24 GB, but it failed with the same OOM error in the same way every time. I verified this by checking the destination container's memory with docker stats close to the time of the error.

After reverting to 0.6.3, I haven't seen memory usage rise above 2.5 GB.
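For anyone who wants to repeat the memory check, here is a minimal sketch of how a single sample could be taken and parsed. It assumes the docker CLI is on the PATH and that the MemUsage column uses binary units (KiB, MiB, GiB); the function names are made up for illustration and are not part of Airbyte.

```python
import re
import subprocess

# Binary units as printed in the MemUsage column of `docker stats` (assumption).
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}
_SIZE = re.compile(r"([\d.]+)\s*(B|KiB|MiB|GiB|TiB)")


def parse_mem_usage(mem_usage: str) -> tuple[int, int]:
    """Convert a string such as '2.5GiB / 24GiB' into (used_bytes, limit_bytes)."""
    used, limit = (part.strip() for part in mem_usage.split("/"))
    sizes = []
    for part in (used, limit):
        match = _SIZE.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognised size: {part!r}")
        sizes.append(round(float(match.group(1)) * _UNITS[match.group(2)]))
    return sizes[0], sizes[1]


def sample_container_memory(container: str) -> tuple[int, int]:
    """Take one memory sample for a container (requires the docker CLI)."""
    out = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.MemUsage}}", container],
        check=True,
        capture_output=True,
        text=True,
    ).stdout.strip()
    return parse_mem_usage(out)
```

Calling `sample_container_memory(...)` with the name or ID of the destination container in a loop shortly before the expected failure time would give a rough picture of how close the container gets to its limit.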

Relevant log output

2023-10-02 18:43:52 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):143 Flush Worker (dff4a) -- Worker picked up work.
2023-10-02 18:43:52 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):145 Flush Worker (dff4a) -- Attempting to read from queue namespace: null, stream: profiles.
2023-10-02 18:43:52 destination > INFO i.a.i.d.FlushWorkers(lambda$flush$1):158 Flush Worker (dff4a) -- Batch contains: 25007 records, 50 MB bytes.
2023-10-02 18:43:52 destination > INFO i.a.i.d.r.BaseSerializedBuffer(flush):172 Finished writing data to ae67d319-a012-4d19-87f1-696bb4ffb2406406735843580315446.csv.gz (6 MB)
2023-10-02 18:43:52 destination > INFO i.a.i.d.s.AsyncFlush(flush):94 Flushing CSV buffer for stream profiles (6 MB) to staging
2023-10-02 18:43:52 destination > INFO a.m.s.StreamTransferManager(getMultiPartOutputStreams):329 Initiated multipart upload to airbyte-self/airbyte_profiles/2023_10_02_18_d91e0d24-71ba-415e-a5ea-423163cb6f47/3cc7d4ef-867a-4424-8c2b-1c119aabc836.csv.gz with full ID mL2I2g4VpOhzeBOqnDEXIyAmPPvrSAdtg9eYP1kWYlXHVP9c.V2nbZld.MHPp4GKL15vmbS7TpAVlC7cBWT4s8h3LduoI1N5GRTI7JX6p1uMDhyllLb4smrZKHw4FErW
2023-10-02 18:43:52 destination > [1199.090s][warning][gc,alloc] pool-5-thread-1: Retried waiting for GCLocker too often allocating 2097154 words
2023-10-02 18:43:52 destination > Terminating due to java.lang.OutOfMemoryError: Java heap space
2023-10-02 18:43:53 destination > Destination process done (exit code 3)
2023-10-02 18:43:53 destination > Skipping in-connector normalization
        at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:445) ~[?:?]
        at java.io.OutputStream.write(OutputStream.java:164) ~[?:?]
        at java.io.BufferedOutputStream.implWrite(BufferedOutputStream.java:216) ~[?:?]
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:205) ~[?:?]
        at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:313) ~[?:?]
        at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:409) ~[?:?]
        at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:414) ~[?:?]
        at sun.nio.cs.StreamEncoder.lockedFlush(StreamEncoder.java:218) ~[?:?]
        at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:205) ~[?:?]
        at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:263) ~[?:?]
        at java.io.BufferedWriter.implFlush(BufferedWriter.java:372) ~[?:?]
        at java.io.BufferedWriter.flush(BufferedWriter.java:359) ~[?:?]
        at io.airbyte.workers.internal.DefaultAirbyteMessageBufferedWriter.flush(DefaultAirbyteMessageBufferedWriter.java:31) ~[io.airbyte-airbyte-commons-worker-0.50.30.jar:?]
        at io.airbyte.workers.internal.DefaultAirbyteDestination.notifyEndOfInput(DefaultAirbyteDestination.java:117) ~[io.airbyte-airbyte-commons-worker-0.50.30.jar:?]
        at io.airbyte.workers.general.BufferedReplicationWorker.writeToDestination(BufferedReplicationWorker.java:420) ~[io.airbyte-airbyte-commons-worker-0.50.30.jar:?]
        at io.airbyte.workers.general.BufferedReplicationWorker.lambda$runAsync$2(BufferedReplicationWorker.java:229) ~[io.airbyte-airbyte-commons-worker-0.50.30.jar:?]
        at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.lang.Thread.run(Thread.java:1589) ~[?:?]
2023-10-02 18:43:53 INFO i.a.w.g.BufferedReplicationWorker(writeToDestination):428 - writeToDestination: done. (forDest.isDone:false, isDestRunning:false)
2023-10-02 18:43:54 INFO i.a.w.g.BufferedReplicationWorker(readFromSource):353 - readFromSource: exception caught
java.lang.IllegalStateException: Source process is still alive, cannot retrieve exit value.
        at com.google.common.base.Preconditions.checkState(Preconditions.java:502) ~[guava-31.1-jre.jar:?]
        at io.airbyte.workers.internal.DefaultAirbyteSource.getExitValue(DefaultAirbyteSource.java:114) ~[io.airbyte-airbyte-commons-worker-0.50.30.jar:?]
        at io.airbyte.workers.general.BufferedReplicationWorker.readFromSource(BufferedReplicationWorker.java:340) ~[io.airbyte-airbyte-commons-worker-0.50.30.jar:?]
        at io.airbyte.workers.general.BufferedReplicationWorker.lambda$runAsyncWithHeartbeatCheck$3(BufferedReplicationWorker.java:236) ~[io.airbyte-airbyte-commons-worker-0.50.30.jar:?]
        at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.lang.Thread.run(Thread.java:1589) ~[?:?]
2023-10-02 18:43:54 INFO i.a.w.g.BufferedReplicationWorker(readFromSource):356 - readFromSource: done. (source.isFinished:false, fromSource.isClosed:true)
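As a reading aid for the GCLocker warning in the log above, the failed allocation size can be converted from words to bytes. This is a small sketch that assumes a 64-bit HotSpot JVM, where one heap word is 8 bytes; the interpretation of the result is an inference, not something the log states.

```python
HEAP_WORD_BYTES = 8  # assumption: 64-bit HotSpot JVM, one heap word = 8 bytes
MIB = 1024 ** 2


def words_to_bytes(words: int) -> int:
    """Convert a size reported in heap words to bytes."""
    return words * HEAP_WORD_BYTES


# Value taken from "allocating 2097154 words" in the warning.
requested = words_to_bytes(2097154)
whole_mib, remainder = divmod(requested, MIB)
print(requested, whole_mib, remainder)  # 16777232 16 16
```

Under that assumption the request is 16 MiB plus 16 bytes, which would be consistent with a single 16 MiB array plus its object header, though the log alone does not confirm what was being allocated.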

Contribute

  • Yes, I want to contribute