Fix: Add keep-alive options to Redis clients to prevent idle timeouts and socket closing. #4377


Merged
6 commits merged into FlowiseAI:main on May 14, 2025

Conversation

nikitas-novatix
Contributor

Problem:

When Flowise is deployed with Redis (especially in Queue Mode or in environments like Azure Cache for Redis with idle timeouts), Redis connections can be closed by the server after a period of inactivity (e.g., 10-15 minutes). This results in `uncaughtException: Socket closed unexpectedly` errors, crashing worker processes or causing failures in other Redis-dependent components.

This issue is tracked in #2186.

Solution:

This PR adds keep-alive mechanisms to the Redis clients used throughout Flowise to ensure connections remain active even during idle periods. The following settings were added to the respective client configurations:

  • Core Queue Mode (@redis/client): Added socket: { keepAlive: 60000 } and pingInterval: 60000 to the clients used by RedisEventPublisher and RedisEventSubscriber.
  • BullMQ (QueueManager - ioredis): Added keepAlive: 60000 and enableReadyCheck: true to the ioredis connection configuration used by BullMQ.
  • Other ioredis Clients: Added keepAlive: 60000 and a default retryStrategy to clients used in CachePool, RateLimiterManager, RedisCache, RedisEmbeddingsCache, and RedisBackedChatMemory nodes.
  • Other @redis/client Clients: Added socket: { keepAlive: 60000 } and pingInterval: 60000 to the client used in the Redis Vector Store node.

The keep-alive/ping intervals are set to 60000ms (1 minute) as a robust default, significantly shorter than typical server timeouts, ensuring connections are maintained without excessive background traffic.
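As a sketch, the two option shapes look roughly like this (plain option objects following the ioredis and @redis/client v4 option formats; the actual Flowise call sites differ in detail):

```typescript
// ioredis-style options (BullMQ QueueManager, CachePool, rate limiter, cache nodes).
// keepAlive enables TCP keep-alive probes on the socket every 60 s.
const ioredisOptions = {
    keepAlive: 60000,
    enableReadyCheck: true // BullMQ connection: wait for the server's ready signal
}

// @redis/client-style options (event publisher/subscriber, Redis Vector Store node).
// socket.keepAlive covers the TCP level; pingInterval additionally sends a
// protocol-level PING every 60 s so the server sees application traffic too.
const nodeRedisOptions = {
    socket: { keepAlive: 60000 },
    pingInterval: 60000
}
```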

Testing:

  • Successfully reproduced the "Socket closed unexpectedly" error locally using Queue Mode, SQLite, and a Dockerized Redis instance configured with a short (--timeout 10) idle timeout.
  • Verified that applying the keep-alive fixes with a short interval (e.g., 5000ms) prevented the original error under the same test conditions.

Closes:

Fixes #2186

@nikitas-novatix
Contributor Author

@HenryHengZJ thanks for your comments.

I did the following changes:

  • Added a `REDIS_KEEP_ALIVE` env variable, defaulting to 60000 (1 min).
  • Added a one-liner based on your suggestion, but made it robust to mistakes in the environment variable: if someone sets a value that isn't a number, the server still runs (just without the keep-alive mechanism).
  • Simplified the code for consistency by removing unnecessary parts (`pingInterval`, the `retryStrategy`).
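A minimal sketch of that guard (the function and variable names here are illustrative, not the exact Flowise code):

```typescript
// Parse the REDIS_KEEP_ALIVE environment variable defensively:
// - unset         -> default of 60000 ms (1 min)
// - valid number  -> that value
// - anything else -> undefined, i.e. run without the keep-alive option
//                    instead of crashing at startup
const parseKeepAlive = (raw: string | undefined, fallback = 60000): number | undefined => {
    if (raw === undefined) return fallback
    const parsed = Number(raw)
    return Number.isFinite(parsed) && parsed > 0 ? parsed : undefined
}
```

For example, `parseKeepAlive('5000')` yields 5000, while `parseKeepAlive('abc')` yields `undefined` and the client is created without a keep-alive setting.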

@HenryHengZJ HenryHengZJ merged commit eadf1b1 into FlowiseAI:main May 14, 2025
2 checks passed
@nikitas-novatix
Contributor Author

Hey @HenryHengZJ

I realized `pingInterval` is actually necessary for `@redis/client` instances: they need both `socket.keepAlive` (TCP-level) and `pingInterval` (Redis protocol-level) to prevent timeouts, whereas ioredis handles this with `keepAlive` alone. Updated the PR to use `REDIS_KEEP_ALIVE` for both settings.
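The follow-up change can be pictured as mapping one `REDIS_KEEP_ALIVE` value into both libraries' option shapes (a sketch of the idea, not the literal patch in #4431):

```typescript
// ioredis only needs the TCP-level option.
const toIoredisOptions = (keepAlive: number) => ({ keepAlive })

// @redis/client needs both levels: socket.keepAlive for TCP probes and
// pingInterval for Redis protocol-level PINGs, driven by the same value.
const toNodeRedisOptions = (keepAlive: number) => ({
    socket: { keepAlive },
    pingInterval: keepAlive
})
```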

I tested this thoroughly this time, sorry for the inconvenience.

I created a new PR with the patch in #4431
