Skip to content

fix(kafkaexporter): Make Sarama's ConfigurationError as permanent to prevent retries #38608

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

an-mmx
Copy link
Contributor

@an-mmx an-mmx commented Mar 13, 2025

Description

This fix unifies message send error handling for all types of telemetry. It is designed to identify whether the error was caused by a ConfigurationError and then reclassify it as a permanent consumer error to prevent further retries.

Link to tracking issue

#38604

Testing

Unit test coverage added

Documentation

No changes

@an-mmx an-mmx requested review from MovieStoreGuy and a team as code owners March 13, 2025 13:45
@github-actions github-actions bot requested a review from pavolloffay March 13, 2025 13:46
@an-mmx an-mmx force-pushed the feat/kafkaexporter_permanent_config_error branch from 3e2ada1 to 1577beb Compare March 13, 2025 15:15
@MovieStoreGuy
Copy link
Contributor

If there a configuration error, should it not fail to start up?

@MovieStoreGuy
Copy link
Contributor

Reading the issue, it sounds like this should be part of the configuration validation step, not part of the execution.

@an-mmx
Copy link
Contributor Author

an-mmx commented Mar 17, 2025

Unfortunately, it's impossible to validate size of a message upon configuration validation.

The fact is that Sarama client produce such kind of error and this error should be processed by otelcol as a permanent as this is some kind idempotent state.

Comment on lines 314 to 334
if !errors.As(err, &producerErrs) || len(producerErrs) == 0 {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of errors.As is that will check that it was cast into the expected type, and work with the interface.

It shouldn't mutate the passed value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The errors.As writes result to the second argument and returns boolean result of type casting. Thus it works with variable, not interface.
https://pkg.go.dev/errors#As

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does my previous response clarify your concern?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@atoulme
Copy link
Contributor

atoulme commented Mar 20, 2025

Please address review and mark the PR as ready for review afterwards.

@atoulme atoulme marked this pull request as draft March 20, 2025 06:13
Copy link
Contributor

github-actions bot commented Apr 5, 2025

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added Stale and removed Stale labels Apr 5, 2025
@an-mmx an-mmx force-pushed the feat/kafkaexporter_permanent_config_error branch from 1577beb to d9c998a Compare April 10, 2025 10:37
@github-actions github-actions bot requested a review from axw April 10, 2025 10:38
@an-mmx an-mmx marked this pull request as ready for review April 10, 2025 10:38
return consumererror.NewPermanent(confErr)
}

return kafkaErrors{len(prodErr), prodErr[0].Err.Error()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like we are loosing some information performing this. I might follow up in a seperate PR.

@MovieStoreGuy MovieStoreGuy merged commit 735fef2 into open-telemetry:main May 7, 2025
173 checks passed
@github-actions github-actions bot added this to the next release milestone May 7, 2025
dragonlord93 pushed a commit to dragonlord93/opentelemetry-collector-contrib that referenced this pull request May 23, 2025
…prevent retries (open-telemetry#38608)

#### Description
This fix unifies message send error handling for all types of telemetry.
It is designed to identify whether the error was caused by a
ConfigurationError and then reclassify it as a permanent consumer error
to prevent further retries.

#### Link to tracking issue

open-telemetry#38604

#### Testing
Unit test coverage added

#### Documentation
No changes

Co-authored-by: Antoine Toulme <[email protected]>
Co-authored-by: Sean Marciniak <[email protected]>
dd-jasminesun pushed a commit to DataDog/opentelemetry-collector-contrib that referenced this pull request Jun 23, 2025
…prevent retries (open-telemetry#38608)

#### Description
This fix unifies message send error handling for all types of telemetry.
It is designed to identify whether the error was caused by a
ConfigurationError and then reclassify it as a permanent consumer error
to prevent further retries.

#### Link to tracking issue

open-telemetry#38604

#### Testing
Unit test coverage added

#### Documentation
No changes

Co-authored-by: Antoine Toulme <[email protected]>
Co-authored-by: Sean Marciniak <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants