Commit 80a2b64

📝 Document java connector memory profiling (#10983)
* Add doc for java connector memory profiling
* Add doc to summary
* Update doc
* Update doc
* Update doc
1 parent 0c064f1 commit 80a2b64

2 files changed

+97
-0
lines changed

docs/SUMMARY.md

+1
@@ -231,6 +231,7 @@
  * [Building a Python Source](connector-development/tutorials/building-a-python-source.md)
  * [Building a Python Destination](connector-development/tutorials/building-a-python-destination.md)
  * [Building a Java Destination](connector-development/tutorials/building-a-java-destination.md)
+ * [Profile Java Connector Memory](connector-development/tutorials/profile-java-connector-memory.md)
  * [Connector Development Kit (Python)](connector-development/cdk-python/README.md)
  * [Basic Concepts](connector-development/cdk-python/basic-concepts.md)
  * [Defining Stream Schemas](connector-development/cdk-python/schemas.md)

docs/connector-development/tutorials/profile-java-connector-memory.md

@@ -0,0 +1,96 @@
# Profile Java Connector Memory Usage

This tutorial demonstrates how to profile the memory usage of a Java connector with VisualVM. Such profiling is useful for debugging memory leaks or reducing the connector's memory footprint.

The example focuses on Docker deployments because they are more straightforward. The same procedure can also be applied to Kubernetes deployments.

## Prerequisites
- [Docker](https://www.docker.com/products/personal) running locally.
- [VisualVM](https://visualvm.github.io/) installed.

## Step-by-Step
1. Enable JMX in `airbyte-integrations/connectors/<connector-name>/build.gradle`, and expose it on port 6000. The port is chosen arbitrarily; any available port number works.

```gradle
application {
    mainClass = 'io.airbyte.integrations.<connector-main-class>'
    applicationDefaultJvmArgs = [
        '-XX:+ExitOnOutOfMemoryError',
        '-XX:MaxRAMPercentage=75.0',

        // add the following JVM arguments to enable JMX:
        '-XX:NativeMemoryTracking=detail',
        '-XX:+UsePerfData',
        '-Djava.rmi.server.hostname=localhost',
        '-Dcom.sun.management.jmxremote=true',
        '-Dcom.sun.management.jmxremote.port=6000',
        '-Dcom.sun.management.jmxremote.rmi.port=6000',
        '-Dcom.sun.management.jmxremote.local.only=false',
        '-Dcom.sun.management.jmxremote.authenticate=false',
        '-Dcom.sun.management.jmxremote.ssl=false',

        // optionally, add a max heap size to limit the memory usage
        '-Xmx2000m',
    ]
}
```
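
If the JMX connection later fails, one thing worth ruling out is that these arguments never reached the connector's JVM. The following is a minimal sketch, not part of Airbyte: the class `JmxFlagCheck` and the method `missingFlags` are hypothetical names introduced here for illustration. It relies only on the JDK's `RuntimeMXBean.getInputArguments()`, and it is only meaningful when run inside the connector process, for example by temporarily calling it from the connector's main class.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not an Airbyte class): reports which JMX flags are absent. */
public class JmxFlagCheck {

  // Prefixes of the JMX-related arguments listed in the build.gradle snippet.
  static final List<String> REQUIRED_PREFIXES = Arrays.asList(
      "-Djava.rmi.server.hostname=",
      "-Dcom.sun.management.jmxremote.port=",
      "-Dcom.sun.management.jmxremote.rmi.port=",
      "-Dcom.sun.management.jmxremote.authenticate=",
      "-Dcom.sun.management.jmxremote.ssl=");

  /** Returns the required prefixes that no argument in jvmArgs starts with. */
  static List<String> missingFlags(List<String> jvmArgs) {
    List<String> missing = new ArrayList<>();
    for (String prefix : REQUIRED_PREFIXES) {
      boolean present = jvmArgs.stream().anyMatch(arg -> arg.startsWith(prefix));
      if (!present) {
        missing.add(prefix);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // Arguments the current JVM was started with (excludes program arguments).
    List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
    System.out.println("Missing JMX flags: " + missingFlags(jvmArgs));
  }
}
```

An empty list suggests that the Gradle `applicationDefaultJvmArgs` were picked up; a non-empty list points at the build configuration rather than at Docker networking.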

2. Modify `airbyte-integrations/connectors/<connector-name>/Dockerfile` to expose the JMX port.

```dockerfile
# optionally install procps to enable the ps command in the connector container
RUN apt-get update && apt-get install -y procps && rm -rf /var/lib/apt/lists/*

# expose the same JMX port specified in the previous step
EXPOSE 6000
```

3. Expose the same port in `airbyte-workers/src/main/java/io/airbyte/workers/process/DockerProcessFactory.java`.

```java
// map local 6000 to the JMX port from the container
if (imageName.startsWith("airbyte/<connector-name>")) {
  LOGGER.info("Exposing image {} port 6000", imageName);
  cmd.add("-p");
  cmd.add("6000:6000");
}
```

Disable the [`host` network mode](https://docs.docker.com/network/host/) by _removing_ the following code block in the same file. This is necessary because published ports are discarded under the `host` network mode.

```java
if (networkName != null) {
  cmd.add("--network");
  cmd.add(networkName);
}
```

(This [commit](https://github.com/airbytehq/airbyte/pull/10394/commits/097ec57869a64027f5b7858aa8bb9575844e8b76) can be used as a reference. It reverts the two changes described in this step, so apply the opposite of its diff.)

4. Build and launch Airbyte locally. A rebuild is necessary because `DockerProcessFactory.java` has been modified.

```sh
SUB_BUILD=PLATFORM ./gradlew build -x test
VERSION=dev docker compose up
```

5. Build the connector to be profiled locally. This creates a local image tagged `dev`: `airbyte/<connector-name>:dev`.

```sh
./gradlew :airbyte-integrations:connectors:<connector-name>:airbyteDocker
```

6. Connect to the locally launched Airbyte server at `localhost:8000`, go to the `Settings` page, and change the version of the connector to be profiled to `dev`, the image built in the previous step.

7. Create a connection using the connector to be profiled.
    - The `Replication frequency` of this connection should be `manual` so that we can control when it starts.
    - We can use the e2e test connectors as either the source or destination for convenience.
    - The e2e test connectors are usually very reliable and require little configuration.
    - For example, if we are profiling a source connector, create an e2e test destination at the other end of the connection.

8. Profile the connector in question.
    - Launch a data sync run.
    - After the run starts, open VisualVM, and click `File` / `Add JMX Connection...`. A modal will show up. Type in `localhost:6000`, and click `OK`.
    - A new connection now shows up under the `Local` category on the left, and VisualVM retrieves the information about the connector's JVM.

![visual vm screenshot](https://visualvm.github.io/images/visualvm_screenshot_20.png)
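
The same JMX endpoint can also be read without the VisualVM GUI, which may be handy for logging heap usage over the course of a sync. The sketch below assumes the connector is running with the JMX settings from step 1 (host `localhost`, port 6000, no authentication, no SSL). `JmxHeapProbe`, `serviceUrl`, and `describeHeap` are hypothetical names introduced for illustration; only the `javax.management` and `java.lang.management` calls come from the JDK.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Hypothetical probe (not an Airbyte class): samples a remote JVM's heap over JMX. */
public class JmxHeapProbe {

  /** Builds the standard RMI-based JMX service URL for a host and port. */
  static String serviceUrl(String host, int port) {
    return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
  }

  /** Formats the heap usage reported by a (local or remote) MemoryMXBean. */
  static String describeHeap(MemoryMXBean memory) {
    MemoryUsage heap = memory.getHeapMemoryUsage();
    long mib = 1024L * 1024L;
    // getMax() returns -1 when the maximum heap size is undefined.
    String max = heap.getMax() < 0 ? "undefined" : (heap.getMax() / mib) + " MiB";
    return String.format("heap used=%d MiB, committed=%d MiB, max=%s",
        heap.getUsed() / mib, heap.getCommitted() / mib, max);
  }

  public static void main(String[] args) throws Exception {
    // Assumption: port 6000 on localhost is mapped to the connector container.
    JMXServiceURL url = new JMXServiceURL(serviceUrl("localhost", 6000));
    try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
      MBeanServerConnection connection = connector.getMBeanServerConnection();
      MemoryMXBean remoteMemory = ManagementFactory.newPlatformMXBeanProxy(
          connection, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
      // Take a few samples, two seconds apart, while the sync is running.
      for (int i = 0; i < 5; i++) {
        System.out.println(describeHeap(remoteMemory));
        Thread.sleep(2000);
      }
    }
  }
}
```

If the optional `-Xmx2000m` argument was applied, the reported `max` should be close to 2000 MiB. A `used` value that keeps growing across samples and does not drop after garbage collection is one hint of a leak worth inspecting further in VisualVM.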
