Backend streams failing after some time #60

Closed
yatharthranjan opened this issue Feb 27, 2018 · 5 comments · Fixed by #77
Comments

@yatharthranjan
Member
We need to investigate this and find a solution. The streams are currently failing one by one, with the following log:

[2018-02-27 15:33:36 UTC] ERROR [kafka-producer-network-thread | org.radarcns.stream.phone.PhoneAccelerationStream-0.2.2-SNAPSHOT1209600090-35e3e339-c114-424c-b52b-10dc4225425d-StreamThread-56-producer] (RecordCollectorImpl.java:113) - task [1_0] Error sending record to topic org.radarcns.stream.phone.PhoneAccelerationStream-0.2.2-SNAPSHOT1209600090-From-android_phone_acceleration-To-android_phone_acceleration_1week-changelog. No more offsets will be recorded for this task and the exception will eventually be thrown (org.apache.kafka.streams.processor.internals.RecordCollectorImpl)
[2018-02-27 15:33:36 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_usage_event> to KafkaTopic<android_phone_usage_event_output> (org.radarcns.stream.phone.PhoneUsageStream)
[2018-02-27 15:33:38 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_usage_event_output> to KafkaTopic<android_phone_usage_event_aggregated> (org.radarcns.stream.phone.PhoneUsageAggregationStream)
[2018-02-27 15:33:39 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_battery_level> to KafkaTopic<android_phone_battery_level_10sec> (org.radarcns.stream.phone.PhoneBatteryStream)
[2018-02-27 15:33:39 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_battery_level> to KafkaTopic<android_phone_battery_level_1min> (org.radarcns.stream.phone.PhoneBatteryStream)
[2018-02-27 15:33:40 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_battery_level> to KafkaTopic<android_phone_battery_level_10min> (org.radarcns.stream.phone.PhoneBatteryStream)
[2018-02-27 15:33:40 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_battery_level> to KafkaTopic<android_phone_battery_level_1hour> (org.radarcns.stream.phone.PhoneBatteryStream)
[2018-02-27 15:33:40 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_battery_level> to KafkaTopic<android_phone_battery_level_1day> (org.radarcns.stream.phone.PhoneBatteryStream)
[2018-02-27 15:33:40 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_battery_level> to KafkaTopic<android_phone_battery_level_1week> (org.radarcns.stream.phone.PhoneBatteryStream)
[2018-02-27 15:33:42 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_acceleration> to KafkaTopic<android_phone_acceleration_10sec> (org.radarcns.stream.phone.PhoneAccelerationStream)
[2018-02-27 15:33:43 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_acceleration> to KafkaTopic<android_phone_acceleration_1min> (org.radarcns.stream.phone.PhoneAccelerationStream)
[2018-02-27 15:33:44 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_acceleration> to KafkaTopic<android_phone_acceleration_10min> (org.radarcns.stream.phone.PhoneAccelerationStream)
[2018-02-27 15:33:44 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_acceleration> to KafkaTopic<android_phone_acceleration_1hour> (org.radarcns.stream.phone.PhoneAccelerationStream)
[2018-02-27 15:33:44 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_acceleration> to KafkaTopic<android_phone_acceleration_1day> (org.radarcns.stream.phone.PhoneAccelerationStream)
[2018-02-27 15:33:45 UTC]  INFO [pool-1-thread-1] (Monitor.java:50) - 0 records have been read from KafkaTopic<android_phone_acceleration> to KafkaTopic<android_phone_acceleration_1week> (org.radarcns.stream.phone.PhoneAccelerationStream)
[2018-02-27 15:33:54 UTC] ERROR [kafka-producer-network-thread | org.radarcns.stream.phone.PhoneAccelerationStream-0.2.2-SNAPSHOT172800090-34c3106d-e7a4-4e2d-96ef-f0d5229fdac4-StreamThread-55-producer] (RecordCollectorImpl.java:113) - task [1_0] Error sending record to topic org.radarcns.stream.phone.PhoneAccelerationStream-0.2.2-SNAPSHOT172800090-From-android_phone_acceleration-To-android_phone_acceleration_1day-changelog. No more offsets will be recorded for this task and the exception will eventually be thrown (org.apache.kafka.streams.processor.internals.RecordCollectorImpl)
@yatharthranjan
Member Author

yatharthranjan commented Feb 27, 2018

According to https://groups.google.com/d/msg/confluent-platform/IFtTgEkct4k/cRvPTDKgAwAJ

setting the stream properties as follows:

final Properties props = new Properties();
...
// increase to 10 from the default of 0
props.put(ProducerConfig.RETRIES_CONFIG, 10);
// increase to infinity from the default of 300 s
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, Integer.toString(Integer.MAX_VALUE));

along with

// increase to 20000 from the default of 10000
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 20000);

should help solve the problem.

@yatharthranjan
Member Author

Also getting these errors:

org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.

Increasing the producer's max request size, with a corresponding increase in the broker's max message size, should solve this.
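For reference, a sketch of the relevant settings (the 5 MB value is illustrative, not tuned for RADAR): the producer's max.request.size must not exceed the broker's message.max.bytes, and replicas must be able to fetch the resulting batches.

```properties
# Producer / streams configuration
# default is 1048576 (1 MB); raised to 5 MB here as an example
max.request.size=5242880

# Broker configuration (server.properties)
# must be at least as large as the biggest producer request
message.max.bytes=5242880
# followers must also be able to replicate the larger messages
replica.fetch.max.bytes=5242880
```

A per-topic override (max.message.bytes) can be used instead of the broker-wide setting if only the changelog topics need larger messages.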

@blootsvoets
Contributor

blootsvoets commented Mar 5, 2018

Ah, that’s an additional clue as to why the week-state is getting too large whereas other entries are seemingly fine. The double value collector contains a list with all history in a window, in order to compute the quartiles. To prevent these errors, the value collector should decrease memory consumption, although this would decrease quartile accuracy somewhat. Options: reservoir sampling (seems relatively easy to implement) or an algorithm in any of the papers referenced in a related StackOverflow question.
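The reservoir-sampling option mentioned above could look roughly like this. This is a minimal sketch of Algorithm R, not code from the RADAR codebase (the class and method names are hypothetical): it keeps a bounded, uniformly random sample of the values seen so far, so quartiles can be estimated from the reservoir instead of the full window history.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Bounded uniform sample of a value stream (reservoir sampling, Algorithm R). */
public class ReservoirSampler {
    private final List<Double> reservoir = new ArrayList<>();
    private final int capacity;
    private long count = 0;
    private final Random random = new Random();

    public ReservoirSampler(int capacity) {
        this.capacity = capacity;
    }

    public void add(double value) {
        count++;
        if (reservoir.size() < capacity) {
            // Fill the reservoir until it reaches capacity.
            reservoir.add(value);
        } else {
            // Replace a random element with probability capacity / count,
            // which keeps the sample uniform over everything seen so far.
            long idx = (long) (random.nextDouble() * count);
            if (idx < capacity) {
                reservoir.set((int) idx, value);
            }
        }
    }

    /** Number of values currently held, at most the configured capacity. */
    public int size() {
        return reservoir.size();
    }
}
```

Memory use becomes constant per window at the cost of some quartile accuracy, which degrades gracefully as the window grows beyond the reservoir capacity.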

@yatharthranjan
Member Author

With the current stream properties configuration I am getting stable output so far. But if the message size scales proportionally with the amount of data, then yes, we need to look at alternatives.

@yatharthranjan
Member Author

Closed by RADAR-base/radar-commons#46 and #62
