Replies: 2 comments 4 replies
-
If the fiber pool is counted in the constraint, then a user can spawn a lot of client fibers and the fiber pool can stop working. As a result, iproto will not function and some system tasks that are executed in the pool will hang. The fiber pool has its own limit. I guess limiting just the client fibers will fix the issue when a client has a fork bomb (or code that behaves like a fork bomb).
It is a global configuration option. I'd set it with …
-
Regarding the document

Please add context/terminology information

As someone who doesn't remember all the details at the moment, I would like the document to include context information that defines the terms we use later:

I would also mention when a fiber is picked up from a pool and when it is created from scratch. After this context information it should be easier to understand where we need an additional limit and why. In particular, it is unclear to me whether the proposed limit covers only out-of-pool fibers or whether it also includes fibers from the iproto/tx_user pools.

Please add a user scenario

We can discuss limiting the fiber pool that serves iproto requests, limiting the overall number of fibers, or something different, but we should verify each of the variants against some user scenario: is it enough to solve it? Does it solve similar problems of this kind? I think we should write the scenario down to make the discussion more structured.

I propose to perform a dynamic analysis

The document has two statements:

Assuming that a page is 4 KiB, these statements look conflicting. Let's describe things in terms of VMAs and VM pages -- this way it is clear that there can be free virtual pages while the VMA threshold is reached; these are different things. Also, I guess that the numbers in the document were obtained using a static analysis of the code. Let's perform some dynamic analysis to verify the code observations. I propose to perform some experiments with a varying balance between request processing time (using …).

What happens if the limit is reached?

I see at least three options (in the context of serving an iproto request, which seems to be our user scenario):

I guess that the RFC assumes fail-fast, but it is not written down anywhere in an explicit way.

YAML configuration

Let's also include in the document how the new options are configured in the new YAML-configuration-based startup flow.

Regarding the idea

I have doubts about the idea of the new static limit, which seems similar to …. The user sees a 'cannot allocate memory' error without any details and asks for help. It appears that the default …. So, is the whole problem in the diagnostics? Maybe we should consider something that would give statistics about fibers: system/user, iproto_pool/tx_user_pool/background, and the count of VMAs in each of the categories (if possible) or overall. I would also mention the kernel option name (…). We can also consider exporting the VMA count (as a percentage of all available?) into metrics to let admins set up alerting based on it. What do you think about reconsidering this task as an activity toward better observability?

Footnotes
-
Reviewers
Tickets
core: fibers engine backpressure #10169
In some cases, a single Tarantool instance could potentially use too much memory, leading to resource exhaustion.
To prevent such scenarios, it is proposed to introduce a limit on the number of user-created fibers.
Context and Terminology
System and User Fibers
Tarantool employs fibers (lightweight cooperative threads) to execute tasks. These fibers can be categorized into two groups:
User fibers: fibers created by users via the `fiber` Lua API (e.g. `fiber.create()`), as well as fibers that handle client requests or user transactions. User fibers can be short-lived or long-lived depending on the application logic, and their count can grow or shrink dynamically based on client behavior.

Existing Fiber Pools (iproto and tx_user_pool)

Tarantool uses fiber pools to recycle fibers for frequent operations, reducing the overhead of constantly creating and destroying fibers. There are two notable fiber pools in the current design:

The network (iproto) thread utilizes a pool of fibers to handle incoming client requests (which are parsed from the network). Each incoming request message may be assigned a fiber from this pool for execution in the transaction thread (tx thread). Tarantool limits the number of concurrent fibers processing network requests via the configuration parameter `iproto.net_msg_max`. By capping concurrent in-flight messages, it indirectly limits the number of fibers used for network requests. This prevents unlimited fiber proliferation from a flood of network requests.

The transaction-thread pool (`tx_user_pool`): in the transaction (tx) thread, there is a similar pool for fibers that execute user transactions or requests forwarded from the iproto thread. This pool reuses fibers for running the Lua or internal transaction routines that constitute user requests. It works in tandem with the iproto fiber pool: when a network request arrives, the iproto thread passes it to the tx thread, which takes a fiber from `tx_user_pool` to execute the request. The size or concurrency of this pool is effectively governed by the same `iproto.net_msg_max` limit, since that limits how many requests can be processed in parallel in the tx thread.

Each fiber pool has an upper bound on the number of fibers it will create. If a request arrives and a free fiber is available in the pool, it will be reused; if no free fiber is available but the pool has not reached its limit, a new fiber will be created and added to the pool. If the pool is at its maximum and all fibers are busy, the incoming request will be queued or blocked until a fiber becomes available.
Fibers Created Outside Any Pool
Not all fibers in the system come from these managed pools. Fibers created outside any pool are those spawned directly via the fiber API or by other subsystems without using the iproto/tx fiber pooling. For instance, when a user calls `fiber.create(function() ... end)` in a Lua application, a new fiber is allocated on the fly. These fibers are not drawn from a pre-allocated pool because their usage is entirely application-driven and unpredictable. Similarly, some internal modules or background tasks may create fibers on demand outside of the iproto request flow (for example, a background fiber for periodic tasks in a module).

The behavior of fibers created outside a pool is straightforward: each `fiber.create()` allocates a new fiber structure and a stack for it (typically via a memory allocation or mmap call to reserve stack space, including a guard page for safety). Tarantool does maintain a registry of all live fibers in a cord thread, but there is currently no built-in limit on how many fibers can be created outside of the pools. The only practical limit is memory. This means a user could create thousands of fibers via the API, which might eventually exhaust system resources or degrade performance.
To summarize the types of fibers:
It’s important to note that system fibers (like the ones for replication or WAL) are typically created at specific events (configuring replication or starting the instance) and are not drawn from a pool either — they are just created as needed, but these are few in number (usually on the order of the number of replicas or subsystems active). They also fall outside the iproto and tx user fiber pools, but since they are finite and critical, we consider them separately from user-created fibers. In the current implementation, any fiber that is not obtained from a pool can be considered either a user fiber or a system fiber depending on who triggered its creation. This RFC’s scope is limiting client-triggered (user) fiber creation, and it will exclude system fibers and fiber pool fibers from its new limiting mechanism.
Goal
Introduce a mechanism to limit the number of user-created fibers in Tarantool in order to improve stability and prevent uncontrolled resource consumption. User applications or clients can inadvertently create thousands of fibers, for example by starting a new fiber for each task or request without any limit. This can lead to excessive memory usage (each fiber has a stack and a control block) and even exhaustion of OS resources (e.g., hitting the Linux `vm.max_map_count` limit on virtual memory areas due to too many stacks). The goal is to provide a safety net that: … (`iproto.net_msg_max`).

To summarize the current state: … (`net_msg_max`).
Each fiber allocates two pages of memory; the simultaneous creation of thousands of fibers can therefore lead to significant memory consumption.
In the system, each fiber's memory allocation is composed of two major parts:
Fiber Stack:
The default fiber stack size is set using a CMake variable. In our sources, we have the following macro:
This means that, by default, each fiber is allocated 524,288 bytes (which is 512 KB) of stack memory.
Fiber Structure:
Aside from the stack, the fiber itself has a control structure, which occupies approximately 456 bytes.
Note on External Constraints:
Option name and operating principle
It is suggested to name the option (under discussion):
How it works:
When the number of live user fibers reaches `client_fiber_limit = N`, no additional fibers are created, and the next attempt to create a fiber returns an error, for example: …

This gives a clear signal: the limit has been reached, and no new fibers will be started. The system does not wait for resources to be freed or remove guard pages -- it simply refuses to create a new fiber.