Do Not Continue Processing Events in Batch Mode for Kinesis/DDB Streams #1820
Comments
Hey @belugabehr, thank you for opening this issue and even taking the time to find the relevant code inside the library. I would like to understand your use case better. Can you elaborate on this? I don't fully understand what you mean.
If we stopped processing the batch on the first failure, the same (failing) message would still be re-processed again. Am I missing something?
Regarding the behavior itself: this is the expected implementation. If a batch fails partially, we still finish processing the batch and then report the failed events back to the DDB Streams service, so that a checkpoint is left at the index of the failed item in the batch. We have a nice diagram for this in the Powertools for AWS Lambda (Python) documentation: https://docs.powertools.aws.dev/lambda/python/latest/utilities/batch/#kinesis-and-dynamodb-streams.
To confirm my understanding of your request: are you looking for …
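For reference, this is roughly how the current batch module is wired up for DDB Streams (a sketch based on the Powertools for AWS Lambda (Java) batch documentation; method names may vary slightly across versions, and the `processRecord` body is a placeholder). Every record in the batch is attempted, and records that throw are reported back as `batchItemFailures` so the checkpoint lands on the first failed item:

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.DynamodbEvent;
import com.amazonaws.services.lambda.runtime.events.StreamsEventResponse;
import software.amazon.lambda.powertools.batch.BatchMessageHandlerBuilder;
import software.amazon.lambda.powertools.batch.handler.BatchMessageHandler;

public class DdbStreamBatchHandler implements RequestHandler<DynamodbEvent, StreamsEventResponse> {

    private final BatchMessageHandler<DynamodbEvent, StreamsEventResponse> handler =
            new BatchMessageHandlerBuilder()
                    .withDynamoDbBatchHandler()
                    .buildWithRawMessageHandler(this::processRecord);

    @Override
    public StreamsEventResponse handleRequest(DynamodbEvent event, Context context) {
        // All records are attempted; failures are collected and returned as
        // batchItemFailures, so the checkpoint is left at the first failed record.
        return handler.processBatch(event, context);
    }

    private void processRecord(DynamodbEvent.DynamodbStreamRecord record) {
        // Placeholder business logic; throwing here marks the record as failed.
    }
}
```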
Hey, thanks for the feedback. Yes, we have a DDB Stream hooked up to an SNS FIFO topic. The issue is that if we have three events, (C)reate, (U)pdate, and (D)elete, we need to process them in order. For example, if we receive the C and then fail to handle the U, we do not want to continue in the stream and process the D. We just need the checkpoint to move up to the U event and wait until the issue clears. So maybe we just need a flag on the existing batch processor to exit early instead of continuing to process messages (and reset the checkpoint back to the latest processed offset).
Hey @belugabehr, thanks for explaining your use case. I will get back to you with some new information next week and will do some tests in the meantime.
Thanks for opening this issue @belugabehr! This can also happen with a Kinesis stream, and if you are using the bisect configuration it can have some other side effects. I have a few additional thoughts to work through before we make a decision, and I hope to share them by Monday. Thanks
Hello, and thank you for the engagement.
Once I realized that the current implementation of Powertools batching sets a checkpoint on the broken event but continues to process the batch, I found that surprising. As I understand it, if the batch size is 10 (for example) and the first item in the batch fails, then the entire batch of 10 items will be retried repeatedly, even if the last nine items are processed successfully.
When I looked into it more, I came across the aforementioned example from the AWS docs, which seemed to align with my understanding of the situation (the early return).
It would be great if you could clarify and provide guidance.
In our particular use case, we are sending the events out via SNS FIFO for fan-out. We use the event sequence ID as the deduplication ID to add some extra protection on replay.
Also, as an aside, we process large Kinesis batches (e.g., 1000 records) into smaller batches and publish the events to SNS (max batch of ten). So I think there is an interesting thought exercise here: Powertools could support this use case by providing configuration for both input and output batch sizes. To maintain the existing behavior, one would specify an output size of one.
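To make that concrete, here is a rough sketch of the fan-out step described above. It is an assumption-laden example, not code from this issue: it uses the AWS SDK for Java v2 SNS client, a hypothetical FIFO_TOPIC_ARN environment variable, and the Kinesis sequence number as both the batch entry ID and the deduplication ID:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
import software.amazon.awssdk.services.sns.SnsClient;
import software.amazon.awssdk.services.sns.model.PublishBatchRequest;
import software.amazon.awssdk.services.sns.model.PublishBatchRequestEntry;

public class SnsFanOut {

    private static final int SNS_MAX_BATCH = 10; // PublishBatch hard limit

    private final SnsClient sns = SnsClient.create();
    private final String topicArn = System.getenv("FIFO_TOPIC_ARN"); // hypothetical env var

    /** Re-publishes a (possibly large) Kinesis batch to an SNS FIFO topic in chunks of ten. */
    public void fanOut(List<KinesisEvent.KinesisEventRecord> records) {
        for (int start = 0; start < records.size(); start += SNS_MAX_BATCH) {
            List<PublishBatchRequestEntry> entries = new ArrayList<>();
            for (KinesisEvent.KinesisEventRecord record :
                    records.subList(start, Math.min(start + SNS_MAX_BATCH, records.size()))) {
                String payload = StandardCharsets.UTF_8.decode(record.getKinesis().getData()).toString();
                entries.add(PublishBatchRequestEntry.builder()
                        .id(record.getKinesis().getSequenceNumber())            // unique within the request
                        .message(payload)
                        .messageGroupId(record.getKinesis().getPartitionKey())  // preserves per-key ordering
                        // Sequence number doubles as the deduplication ID, as described above
                        .messageDeduplicationId(record.getKinesis().getSequenceNumber())
                        .build());
            }
            sns.publishBatch(PublishBatchRequest.builder()
                    .topicArn(topicArn)
                    .publishBatchRequestEntries(entries)
                    .build());
        }
    }
}
```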
Also, just for completeness' sake, note that DynamoDB Streams captures a time-ordered sequence of item-level modifications:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html
Again, order matters here, and the default behavior, IMHO, should be to fail fast if events cannot be processed in order.
I could maybe see a "fancy solution" whereby items are inserted into a bag based on the DDB event partition/sort keys and processed that way, but that feels like overkill.
Thanks.
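For what it's worth, the grouping part of that "fancy solution" could be as simple as bucketing records by their key image and then processing each bucket sequentially. This is only a sketch, assuming the aws-lambda-java-events model classes; the string key built via toString() is a crude but workable grouping key:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import com.amazonaws.services.lambda.runtime.events.DynamodbEvent;

public class KeyGrouping {

    /**
     * Groups stream records by the item's key image (partition + sort key) so each
     * group can be replayed in order, independently of records for other items.
     */
    public static Map<String, List<DynamodbEvent.DynamodbStreamRecord>> groupByItemKey(
            List<DynamodbEvent.DynamodbStreamRecord> records) {
        return records.stream().collect(Collectors.groupingBy(
                r -> r.getDynamodb().getKeys().toString(), // key image as grouping key
                LinkedHashMap::new,                        // keep first-seen group order
                Collectors.toList()));                     // keep per-group record order
    }
}
```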
When processing a DDB Stream in batch mode, I want to stop processing when a failure is reached. Since this is a stream, and the ordering of messages is important to me, processing should stop immediately.
That is to say, if my data is partitioned on Purchase ID, I want to ensure all events related to the same purchase are replayed in order. If a failure occurs, processing of the stream should stop and be retried later.
Expected Behavior
When an error occurs, the offending event should be checkpointed, and processing should stop.
https://docs.aws.amazon.com/lambda/latest/dg/services-ddb-batchfailurereporting.html
Current Behavior
For DDB Stream batch processing, the stream will continue to be reprocessed, and the same messages will be repeated again and again.
powertools-lambda-java/powertools-batch/src/main/java/software/amazon/lambda/powertools/batch/handler/DynamoDbBatchMessageHandler.java
Lines 59 to 75 in 8dcdddf
Possible Solution
Return on any error. Take a look at the following example as a reference:
https://docs.aws.amazon.com/lambda/latest/dg/services-ddb-batchfailurereporting.html
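A minimal sketch of that fail-fast approach (not a snippet from this issue) might look like the following, assuming aws-lambda-java-events with StreamsEventResponse support and ReportBatchItemFailures enabled on the event source mapping:

```java
import java.util.Collections;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.DynamodbEvent;
import com.amazonaws.services.lambda.runtime.events.StreamsEventResponse;

/**
 * Fail-fast variant: stop at the first failure, report only that record's sequence
 * number, and let the event source mapping retry from there. Later records in the
 * batch are not attempted.
 */
public class FailFastDdbStreamHandler implements RequestHandler<DynamodbEvent, StreamsEventResponse> {

    @Override
    public StreamsEventResponse handleRequest(DynamodbEvent event, Context context) {
        for (DynamodbEvent.DynamodbStreamRecord record : event.getRecords()) {
            try {
                process(record);
            } catch (Exception e) {
                // Checkpoint at the failed record and stop processing the batch.
                return StreamsEventResponse.builder()
                        .withBatchItemFailures(Collections.singletonList(
                                StreamsEventResponse.BatchItemFailure.builder()
                                        .withItemIdentifier(record.getDynamodb().getSequenceNumber())
                                        .build()))
                        .build();
            }
        }
        // No failures: report an empty list so the whole batch is checkpointed.
        return StreamsEventResponse.builder()
                .withBatchItemFailures(Collections.emptyList())
                .build();
    }

    private void process(DynamodbEvent.DynamodbStreamRecord record) {
        // Placeholder business logic; throwing here simulates a failure.
    }
}
```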