|
| 1 | +# Profile Java Connector Memory Usage |
| 2 | + |
| 3 | +This tutorial demonstrates how to profile the memory usage of a Java connector with VisualVM. Such profiling is useful for debugging memory leaks or optimizing the connector's memory footprint. |
| 4 | + |
| 5 | +The example focuses on a Docker deployment because it is more straightforward. The same procedure can also be applied to Kubernetes deployments. |
| 6 | + |
| 7 | +## Prerequisites |
| 8 | +- [Docker](https://www.docker.com/products/personal) running locally. |
| 9 | +- [VisualVM](https://visualvm.github.io/) preinstalled. |
| 10 | + |
| 11 | +## Step-by-Step |
| 12 | +1. Enable JMX in `airbyte-integrations/connectors/<connector-name>/build.gradle`, and expose it on port 6000. The port is chosen arbitrarily; any available port number works. |
| 13 | + |
| 14 | + ```gradle |
| 15 | + application { |
| 16 | + mainClass = 'io.airbyte.integrations.<connector-main-class>' |
| 17 | + applicationDefaultJvmArgs = [ |
| 18 | + '-XX:+ExitOnOutOfMemoryError', |
| 19 | + '-XX:MaxRAMPercentage=75.0', |
| 20 | + |
| 21 | + // add the following JVM arguments to enable JMX: |
| 22 | + '-XX:NativeMemoryTracking=detail', |
| 23 | + '-XX:+UsePerfData', |
| 24 | + '-Djava.rmi.server.hostname=localhost', |
| 25 | + '-Dcom.sun.management.jmxremote=true', |
| 26 | + '-Dcom.sun.management.jmxremote.port=6000', |
| 27 | + '-Dcom.sun.management.jmxremote.rmi.port=6000', |
| 28 | + '-Dcom.sun.management.jmxremote.local.only=false', |
| 29 | + '-Dcom.sun.management.jmxremote.authenticate=false', |
| 30 | + '-Dcom.sun.management.jmxremote.ssl=false', |
| 31 | + |
| 32 | + // optionally, add a max heap size to limit the memory usage |
| 33 | + '-Xmx2000m', |
| 34 | + ] |
| 35 | + } |
| 36 | + ``` |
| 37 | + |
| 38 | +2. Modify `airbyte-integrations/connectors/<connector-name>/Dockerfile` to expose the JMX port. |
| 39 | + |
| 40 | + ```dockerfile |
| 41 | + # optionally install procps to enable the ps command in the connector container |
| 42 | + RUN apt-get update && apt-get install -y procps && rm -rf /var/lib/apt/lists/* |
| 43 | + |
| 44 | + # expose the same JMX port specified in the previous step |
| 45 | + EXPOSE 6000 |
| 46 | + ``` |
| 47 | + |
| 48 | +3. Expose the same port in `airbyte-workers/src/main/java/io/airbyte/workers/process/DockerProcessFactory.java`. |
| 49 | + |
| 50 | + ```java |
| 51 | + // map local 6000 to the JMX port from the container |
| 52 | + if (imageName.startsWith("airbyte/<connector-name>")) { |
| 53 | + LOGGER.info("Exposing image {} port 6000", imageName); |
| 54 | + cmd.add("-p"); |
| 55 | + cmd.add("6000:6000"); |
| 56 | + } |
| 57 | + ``` |
| 58 | + |
| 59 | + Disable the [`host` network mode](https://docs.docker.com/network/host/) by _removing_ the following code block in the same file. This is necessary because under the `host` network mode, published ports are discarded. |
| 60 | + |
| 61 | + ```java |
| 62 | + if (networkName != null) { |
| 63 | + cmd.add("--network"); |
| 64 | + cmd.add(networkName); |
| 65 | + } |
| 66 | + ``` |
| 67 | + |
| 68 | + (For reference, see this [commit](https://github.com/airbytehq/airbyte/pull/10394/commits/097ec57869a64027f5b7858aa8bb9575844e8b76). It reverts the two changes described in this step, so apply the opposite of its diff.) |
| 69 | + |
| 70 | +4. Build and launch Airbyte locally. A rebuild is necessary because we have modified `DockerProcessFactory.java`. |
| 71 | + |
| 72 | + ```sh |
| 73 | + SUB_BUILD=PLATFORM ./gradlew build -x test |
| 74 | + VERSION=dev docker compose up |
| 75 | + ``` |
| 76 | + |
| 77 | +5. Build the connector to be profiled locally. This creates a local image tagged `dev`: `airbyte/<connector-name>:dev`. |
| 78 | + |
| 79 | + ```sh |
| 80 | + ./gradlew :airbyte-integrations:connectors:<connector-name>:airbyteDocker |
| 81 | + ``` |
| 82 | + |
| 83 | +6. Connect to the locally launched Airbyte server at `localhost:8000`, go to the `Settings` page, and change the version of the connector to be profiled to `dev`, which was built in the previous step. |
| 84 | + |
| 85 | +7. Create a connection using the connector to be profiled. |
| 86 | + - Set the `Replication frequency` of this connection to `manual` so that we can control when it starts. |
| 87 | + - We can use the e2e test connectors as either the source or destination for convenience. |
| 88 | + - The e2e test connectors are usually very reliable and require little configuration. |
| 89 | + - For example, if we are profiling a source connector, create an e2e test destination at the other end of the connection. |
| 90 | + |
| 91 | +8. Profile the connector in question. |
| 92 | + - Launch a data sync run. |
| 93 | + - After the run starts, open VisualVM, and click `File` / `Add JMX Connection...`. A modal will show up. Type in `localhost:6000`, and click `OK`. |
| 94 | + - A new connection now appears under the `Local` category on the left, and VisualVM retrieves the information about the connector's JVM. |
| 95 | + |
| 96 | +  |