Skip to content

[exporter/clickhouse] Integration test hits a panic #32530

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
crobert-1 opened this issue Apr 18, 2024 · 9 comments · Fixed by #38798
Closed

[exporter/clickhouse] Integration test hits a panic #32530

crobert-1 opened this issue Apr 18, 2024 · 9 comments · Fixed by #38798

Comments

@crobert-1
Copy link
Member

Component(s)

exporter/clickhouse

Describe the issue you're reporting

Failing CI/CD link

For context, some of these tests haven't been run for a while (if ever), so this may not be a recent break. See #32207.

Failure output:

Running target 'mod-integration-test' in module 'exporter/clickhouseexporter' as part of group 'exporter-0'
make --no-print-directory -C exporter/clickhouseexporter mod-integration-test
go: downloading github.com/klauspost/compress v1.17.7
running go integration test ./... in /home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter
/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/.tools/gotestsum --rerun-fails=1 --packages="./..." -- -race -timeout 360s -parallel 4 -tags=integration,""
go: downloading github.com/jmoiron/sqlx v1.3.5
go: downloading github.com/testcontainers/testcontainers-go v0.30.0
go: downloading dario.cat/mergo v1.0.0
go: downloading github.com/containerd/containerd v1.7.12
go: downloading github.com/cpuguy83/dockercfg v0.3.1
go: downloading github.com/moby/patternmatcher v0.6.0
go: downloading github.com/moby/term v0.5.0
go: downloading golang.org/x/exp v0.0.0-20230711023510-fffb14384f22
go: downloading go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.49.0
go: downloading github.com/containerd/log v0.1.0
go: downloading github.com/moby/sys/sequential v0.5.0
go: downloading github.com/morikuni/aec v1.0.0
go: downloading github.com/moby/sys/user v0.1.0
∅  internal/metadata
✓  internal (1.02s)
✖  . (13.602s)

=== Failed
=== FAIL: . TestIntegration/test_clickhouse_24-alpine (13.53s)
2024/04/18 18:48:46 github.com/testcontainers/testcontainers-go - Connected to docker: 
  Server Version: 24.0.9
  API Version: 1.43
  Operating System: Ubuntu 22.04.4 LTS
  Total Memory: 15981 MB
  Resolved Docker Host: unix:///var/run/docker.sock
  Resolved Docker Socket Path: /var/run/docker.sock
  Test SessionID: a51b8c75e41a407120f69d0b2cfc5ad0ce071bcc50ee005aac596474ef9bc907
  Test ProcessID: aa1bae38-c496-4218-9763-4306ba02d328
2024/04/18 18:48:48 🐳 Creating container for image testcontainers/ryuk:0.7.0
2024/04/18 18:48:48 ✅ Container created: 8f6246042821
2024/04/18 18:48:48 🐳 Starting container: 8f6246042821
2024/04/18 18:48:48 ✅ Container started: 8f6246042821
2024/04/18 18:48:48 🚧 Waiting for container id 8f6246042821 image: testcontainers/ryuk:0.7.0. Waiting for: &{Port:8080/tcp timeout:<nil> PollInterval:100ms}
2024/04/18 18:48:48 🔔 Container is ready: 8f6246042821
2024/04/18 18:48:58 🐳 Creating container for image clickhouse/clickhouse-server:24-alpine
2024/04/18 18:48:58 ✅ Container created: 5b2768928837
2024/04/18 18:48:58 🐳 Starting container: 5b2768928837
2024/04/18 18:48:58 ✅ Container started: 5b2768928837
2024/04/18 18:48:58 🚧 Waiting for container id 5b2768928837 image: clickhouse/clickhouse-server:24-alpine. Waiting for: &{Port:9000 timeout:0xc0004dc398 PollInterval:100ms}
2024/04/18 18:48:59 🔔 Container is ready: 5b2768928837
2024/04/18 18:48:59 🐳 Starting container: 5b2768928837
2024/04/18 18:48:59 ✅ Container started: 5b2768928837
2024/04/18 18:48:59 🚧 Waiting for container id 5b2768928837 image: clickhouse/clickhouse-server:24-alpine. Waiting for: &{Port:9000 timeout:0xc0004dc398 PollInterval:100ms}
2024/04/18 18:48:59 🔔 Container is ready: 5b2768928837
2024/04/18 18:48:59 🐳 Terminating container: 5b2768928837
2024/04/18 18:48:59 🚫 Container terminated: 5b2768928837

=== FAIL: . TestIntegration (13.53s)
panic: Fail in goroutine after TestTracesTableEngineConfig has completed [recovered]
	panic: Fail in goroutine after TestTracesTableEngineConfig has completed

goroutine 141 [running]:
testing.tRunner.func1.2({0x1626740, 0xc00011fe00})
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1545 +0x3f7
testing.tRunner.func1()
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1548 +0x716
panic({0x1626740?, 0xc00011fe00?})
	/opt/hostedtoolcache/go/1.21.9/x64/src/runtime/panic.go:920 +0x270
testing.(*common).Fail(0xc0003efa00)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:952 +0x1b4
testing.(*common).Fail(0xc00049c680)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:946 +0x85
testing.(*common).Errorf(0xc00049c680, {0x17fd5ea, 0x3}, {0xc00011fdb0, 0x1, 0x1})
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1069 +0xaf
github.com/stretchr/testify/assert.Fail({0x1a894c0, 0xc00049c680}, {0x181da90, 0x21}, {0x0, 0x0, 0x0})
	/home/runner/go/pkg/mod/github.com/stretchr/[email protected]/assert/assertions.go:363 +0x436
github.com/stretchr/testify/assert.Error({0x1a894c0, 0xc00049c680}, {0x0, 0x0?}, {0x0, 0x0, 0x0})
	/home/runner/go/pkg/mod/github.com/stretchr/[email protected]/assert/assertions.go:1562 +0xa5
github.com/stretchr/testify/require.Error({0x1a8e260, 0xc00049c680}, {0x0, 0x0}, {0x0, 0x0, 0x0})
	/home/runner/go/pkg/mod/github.com/stretchr/[email protected]/require/require.go:290 +0x97
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.testTableEngineConfig.func1.1({0xc0004a8700, 0x68c}, {0xc0004e4bf8?, 0xc0004e4bf8?, 0x441c5e?})
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/exporter_sql_test.go:170 +0x11d
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.(*testClickhouseDriverStmt).Exec(0xc0000d34a0, {0x2382d40, 0x0, 0x0})
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/exporter_logs_test.go:245 +0x89
database/sql.ctxDriverStmtExec({0x1a95198, 0x2382d40}, {0x1a950f0, 0xc0000d34a0}, {0x2382d40?, 0x0, 0x0})
	/opt/hostedtoolcache/go/1.21.9/x64/src/database/sql/ctxutil.go:77 +0x34f
database/sql.resultFromStatement({0x1a95198, 0x2382d40}, {0x1a91558, 0xc0001202d0}, 0xc0004e4fa8, {0x0, 0x0, 0x0})
	/opt/hostedtoolcache/go/1.21.9/x64/src/database/sql/sql.go:2642 +0x1e9
database/sql.(*DB).execDC(0x7f2562b82401?, {0x1a95198?, 0x2382d40}, 0xc00040e630, 0xc0004e5148, {0xc0004a8700, 0x68c}, {0x0, 0x0, 0x0})
	/opt/hostedtoolcache/go/1.21.9/x64/src/database/sql/sql.go:1694 +0x6cd
database/sql.(*DB).exec(0x4adc39?, {0x1a95198, 0x2382d40}, {0xc0004a8700, 0x68c}, {0x0, 0x0, 0x0}, 0x0?)
	/opt/hostedtoolcache/go/1.21.9/x64/src/database/sql/sql.go:1655 +0x15b
database/sql.(*DB).ExecContext.func1(0x0?)
	/opt/hostedtoolcache/go/1.21.9/x64/src/database/sql/sql.go:1634 +0xd0
database/sql.(*DB).retry(0x4?, 0xc0004e52e8)
	/opt/hostedtoolcache/go/1.21.9/x64/src/database/sql/sql.go:[153](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8743113381/job/23992945839?pr=32529#step:5:154)8 +0x4b
database/sql.(*DB).ExecContext(0xc0002f3e10, {0x1a95198?, 0x2382d40}, {0xc0004a8700, 0x68c}, {0x0, 0x0, 0x0})
	/opt/hostedtoolcache/go/1.21.9/x64/src/database/sql/sql.go:1633 +0x194
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.createLogsTable({0x1a95198, 0x2382d40}, 0xc0004baea0?, 0x1a894c0?)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/exporter_logs.go:230 +0x71
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.(*logsExporter).start(0xc000261b60, {0x1a95198, 0x2382d40}, {0x0?, 0x0?})
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/exporter_logs.go:49 +0xc5
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.newTestLogsExporter(0xc000500680, {0xc000589608, 0x14}, {0x0, 0x0, 0x0})
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/exporter_logs_test.go:141 +0x12e
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/clickhouseexporter.TestIntegration.func1(0xc000500680)
	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/clickhouseexporter/integration_test.go:62 +0x50f
ERROR rerun aborted because previous run had a suspected panic and some test may not have run
make[2]: *** [../../Makefile.Common:142: mod-integration-test] Error 3
make[1]: *** [Makefile:165: exporter/clickhouseexporter] Error 2
make: *** [Makefile:122: gointegration-test] Error 2
testing.tRunner(0xc000500680, 0xc0004d6f10)
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:[159](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/runs/8743113381/job/23992945839?pr=32529#step:5:160)5 +0x262
created by testing.(*T).Run in goroutine 140
	/opt/hostedtoolcache/go/1.21.9/x64/src/testing/testing.go:1648 +0x846

=== FAIL: . TestLoadConfig (unknown)

=== FAIL: . TestTableEngineConfigParsing (unknown)

=== FAIL: . TestClusterString (unknown)

=== FAIL: . TestExporter_pushMetricsData (unknown)

DONE 82 tests, 6 failures in 28.348s
make[1]: Leaving directory '/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib'
@crobert-1 crobert-1 added the needs triage New item requiring triage label Apr 18, 2024
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jun 18, 2024
@SpencerTorres
Copy link
Member

Not sure if we have any ideas to improve reliability for these tests, definitely want to keep this issue open though

@SpencerTorres
Copy link
Member

I would like to group this issue with #32275 since they both reference the integration tests, we need to revisit these and improve stability. I have it working locally, but the CI seems unreliable.

@github-actions github-actions bot removed the Stale label Jun 18, 2024
@crobert-1
Copy link
Member Author

I would like to group this issue

As far as grouping together, it's nice to have open issues for different test failures to track frequencies independently. It's totally fine to open a single PR that fixes both issues though for something like this, if that works for you.

Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@SpencerTorres
Copy link
Member

Still need to investigate integration test stability further. For now these are disabled, and when I manually run them they work fine. Let me know if anyone else here has any ideas

Copy link
Contributor

github-actions bot commented Dec 2, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Dec 2, 2024
Copy link
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 31, 2025
songy23 pushed a commit that referenced this issue Apr 15, 2025
…tests (#38798)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
clickhouse exporter tests use their own sql drivers to validate the
queries sent during the tests.
Each driver has a unique name (which is the same as the test name). This
name must be passed to the sql.Open function during database
initialization.

Unfortunately, instead of explicitly passing the driver name into the
buildDB function, it was passed implicitly through a global variable.
This approach works in production but leads to data races when tests are
executed in parallel because each test writes its own name to
driverName. As a result, integration tests expected driverName to be set
to the standard "clickhouse", but instead, it retained the value set by
the previous test. This caused the validation callbacks to be triggered
incorrectly, leading to test failures (panics).

There were two possible solutions to fix this issue:

1. Hardcode the "clickhouse" constant in integration tests.
2. Pass driverName explicitly into sql.Open.
Although the first option might work, I believe it is not a careful way
to write tests and code. A global variable that changes during tests
introduces bugs that are difficult to debug and reproduce. Therefore, I
prefer the second option.


<!-- Issue number (e.g. #1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes
Close
#32530

---------

Co-authored-by: Antoine Toulme <[email protected]>
akshays-19 pushed a commit to akshays-19/opentelemetry-collector-contrib that referenced this issue Apr 23, 2025
…tests (open-telemetry#38798)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
clickhouse exporter tests use their own sql drivers to validate the
queries sent during the tests.
Each driver has a unique name (which is the same as the test name). This
name must be passed to the sql.Open function during database
initialization.

Unfortunately, instead of explicitly passing the driver name into the
buildDB function, it was passed implicitly through a global variable.
This approach works in production but leads to data races when tests are
executed in parallel because each test writes its own name to
driverName. As a result, integration tests expected driverName to be set
to the standard "clickhouse", but instead, it retained the value set by
the previous test. This caused the validation callbacks to be triggered
incorrectly, leading to test failures (panics).

There were two possible solutions to fix this issue:

1. Hardcode the "clickhouse" constant in integration tests.
2. Pass driverName explicitly into sql.Open.
Although the first option might work, I believe it is not a careful way
to write tests and code. A global variable that changes during tests
introduces bugs that are difficult to debug and reproduce. Therefore, I
prefer the second option.


<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes
Close
open-telemetry#32530

---------

Co-authored-by: Antoine Toulme <[email protected]>
Fiery-Fenix pushed a commit to Fiery-Fenix/opentelemetry-collector-contrib that referenced this issue Apr 24, 2025
…tests (open-telemetry#38798)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
clickhouse exporter tests use their own sql drivers to validate the
queries sent during the tests.
Each driver has a unique name (which is the same as the test name). This
name must be passed to the sql.Open function during database
initialization.

Unfortunately, instead of explicitly passing the driver name into the
buildDB function, it was passed implicitly through a global variable.
This approach works in production but leads to data races when tests are
executed in parallel because each test writes its own name to
driverName. As a result, integration tests expected driverName to be set
to the standard "clickhouse", but instead, it retained the value set by
the previous test. This caused the validation callbacks to be triggered
incorrectly, leading to test failures (panics).

There were two possible solutions to fix this issue:

1. Hardcode the "clickhouse" constant in integration tests.
2. Pass driverName explicitly into sql.Open.
Although the first option might work, I believe it is not a careful way
to write tests and code. A global variable that changes during tests
introduces bugs that are difficult to debug and reproduce. Therefore, I
prefer the second option.


<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes
Close
open-telemetry#32530

---------

Co-authored-by: Antoine Toulme <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants