Support CDC source from log compacted kafka topic #9267
Comments
IIUC, current implementation doesn't distinguish between CDC source w/ or w/o log compaction. So we may need to just ensure the three properties you mentioned above? |
To be more specific, there will be two major changes in all CDC format parser:
|
I think we can unify the handling of kafka source with debezium format:
Summary
cc @tabVersion @KveinAxel @idx0-dev One caveat here is that DELETE row emitted to downstream can contain a full row ( |
In short, the materialized executor with |
Can we close this issue now? |
Yes, it is done by #9944 |
Is your feature request related to a problem? Please describe.
Many CDC source connectors are designed to work with log-compacted topics in Kafka (for example, PG, MongoDB, MySQL).
In this case, the old record in the CDC message can be wrong or missing, and the current implementation may not handle it well.
In the current design, a CDC source must be materialized and defined with a primary key constraint. When there is a PK conflict, the current behavior is to overwrite. Therefore, the implementation enforces UPSERT semantics on all CDC sources, in which case the old record in the CDC message is unused.
Describe the solution you'd like
Given that we enforce the following properties for CDC source:
We can unify the processing of CDC sources with and without log compaction by ignoring the old value in the CDC message, without breaking compatibility. We should also explicitly assert the above properties in the implementation. Any change to those properties would be new behavior anyway and would require further discussion.
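To illustrate the idea, here is a minimal sketch of applying Debezium-style events with upsert semantics while never reading the old value. It is not the project's actual parser: the function name `apply_cdc_event` and the in-memory `table` dict are hypothetical, and it assumes the Debezium envelope fields `op`, `before`, and `after`, with a null Kafka value treated as a compaction tombstone.

```python
from typing import Any, Dict, Optional


def apply_cdc_event(
    table: Dict[Any, dict], key: Any, value: Optional[dict]
) -> None:
    """Apply one CDC event to a table keyed by primary key.

    Hypothetical helper: `before` is never consulted, so the result is
    the same whether or not the topic is log compacted.
    """
    # A null value is a tombstone; "d" is a Debezium delete event.
    # Both remove the row by primary key only.
    if value is None or value.get("op") == "d":
        table.pop(key, None)
        return
    # Create ("c"), update ("u"), and snapshot read ("r") all overwrite
    # whatever is stored under the key: upsert semantics.
    table[key] = value["after"]


table: Dict[Any, dict] = {}
apply_cdc_event(table, 1, {"op": "c", "before": None, "after": {"id": 1, "v": "a"}})
# The `before` image here is deliberately wrong; it has no effect.
apply_cdc_event(
    table, 1, {"op": "u", "before": {"id": 1, "v": "stale"}, "after": {"id": 1, "v": "b"}}
)
```

Because deletes are keyed only by the primary key, a missing or incorrect `before` image cannot corrupt the materialized state in this sketch.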
Describe alternatives you've considered
No response
Additional context
No response