Summary
A Regular Expression Denial of Service (ReDoS) vulnerability exists in the file vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py of the vLLM project. The root cause is the use of a highly complex and nested regular expression for tool call detection, which can be exploited by an attacker to cause severe performance degradation or make the service unavailable.
Details
The following regular expression is used to match tool/function call patterns:
r"\[([a-zA-Z]+\w*\(([a-zA-Z]+\w*=.*,\s*)*([a-zA-Z]+\w*=.*\s)?\),\s*)*([a-zA-Z]+\w*\(([a-zA-Z]+\w*=.*,\s*)*([a-zA-Z]+\w*=.*\s*)?\)\s*)+\]"
This pattern contains multiple nested quantifiers (*, +), optional groups, and inner repetitions, which make it vulnerable to catastrophic backtracking.
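To see why nested quantifiers are dangerous, consider the textbook pattern (a+)+$. It is not the vLLM regex, but it has the same shape: a repeated group whose body also repeats. The sketch below is illustrative only, and the helper names count_ways and fails_to_match are hypothetical. It counts how many ways a run of n characters can be divided among the iterations of the outer group; a backtracking engine may have to try each of them before it can report that the match fails.

```python
import re

# Textbook nested-quantifier pattern (illustrative only, not the vLLM regex).
NESTED = re.compile(r"(a+)+$")


def count_ways(n: int) -> int:
    """Number of ways to split n 'a' characters into consecutive non-empty runs.

    Each split is one way (a+)+ can consume the same n characters, so it is
    one path a backtracking engine may explore before giving up.
    """
    # Each of the n - 1 gaps between characters either is a run boundary or not.
    return 2 ** (n - 1) if n > 0 else 0


def fails_to_match(n: int) -> bool:
    """True if n 'a's followed by '!' do not match. Keep n small when running."""
    return NESTED.match("a" * n + "!") is None


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, count_ways(n), fails_to_match(n))
```

The number of candidate splits doubles with every extra character, which is the behavior the term "catastrophic backtracking" refers to.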
Attack Example:
A malicious input such as
[A(A= )A(A=, )A(A=, )A(A=, )... (repeated dozens of times) ...]
or
"[A(A=" + "\t)A(A=,\t" * repeat
can cause the regular expression engine's CPU time to grow exponentially with the input length, effectively freezing or crashing the server (DoS).
Proof of Concept:
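A Python script demonstrates that matching such a crafted string with the above regex takes time exponential in the input length. In the measurements below, each additional 16 characters multiplies the matching time by roughly 20 to 25, so an input of about 100 characters already takes more than five minutes: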
Length: 22, Time: 0.0000 seconds, Match: False
Length: 38, Time: 0.0010 seconds, Match: False
Length: 54, Time: 0.0250 seconds, Match: False
Length: 70, Time: 0.5185 seconds, Match: False
Length: 86, Time: 13.2703 seconds, Match: False
Length: 102, Time: 319.0717 seconds, Match: False
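The script that produced these figures is not included in the advisory. A minimal sketch of such a timing harness follows, assuming the pattern is compiled without flags and using the second attack string from the Attack Example. The names build_payload and time_match are hypothetical, and the lengths it prints will not necessarily line up with the figures above, since the exact payload used for them is not shown.

```python
import re
import time

# Pattern copied from the Details section. Compiling it without flags is an
# assumption; the parser may compile it differently.
TOOL_CALL_REGEX = re.compile(
    r"\[([a-zA-Z]+\w*\(([a-zA-Z]+\w*=.*,\s*)*([a-zA-Z]+\w*=.*\s)?\),\s*)*"
    r"([a-zA-Z]+\w*\(([a-zA-Z]+\w*=.*,\s*)*([a-zA-Z]+\w*=.*\s*)?\)\s*)+\]"
)


def build_payload(repeat: int) -> str:
    """Build the second attack string from the Attack Example."""
    return "[A(A=" + "\t)A(A=,\t" * repeat


def time_match(repeat: int) -> tuple[int, float, bool]:
    """Return (payload length, seconds spent matching, whether it matched)."""
    payload = build_payload(repeat)
    start = time.perf_counter()
    matched = TOOL_CALL_REGEX.match(payload) is not None
    elapsed = time.perf_counter() - start
    return len(payload), elapsed, matched


if __name__ == "__main__":
    # Keep repeat small: the point of the advisory is that larger values
    # can take minutes or longer.
    for repeat in range(1, 6):
        length, elapsed, matched = time_match(repeat)
        print(f"Length: {length}, Time: {elapsed:.4f} seconds, Match: {matched}")
```

The payload never contains a closing bracket, so the match is guaranteed to fail; the time is spent entirely on exploring and rejecting alternatives.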
Impact
- Denial of Service (DoS): An attacker can trigger a denial of service by sending specially crafted payloads to any API or interface that invokes this regex, causing excessive CPU usage and making the vLLM service unavailable.
- Resource Exhaustion and Memory Retention: As this regex is invoked during function call parsing, the matching process may hold on to significant CPU and memory resources for extended periods (due to catastrophic backtracking). In the context of vLLM, this also means that the associated KV cache (used for model inference and typically stored in GPU memory) is not released in a timely manner. This can lead to GPU memory exhaustion, degraded throughput, and service instability.
- Potential for Broader System Instability: Resource exhaustion from stuck or slow requests may cascade into broader system instability or service downtime if not mitigated.
Fix
- vllm-project/vllm#18454
- Note that while this change has significantly improved performance, the regex may still be problematic: the change reduces the worst-case time complexity from exponential, O(2^N), to quadratic, O(N^2).
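Because quadratic matching time can still be significant for long inputs, one generic way to bound the worst case is to cap the length of the text handed to the regex. The sketch below is not part of the referenced fix; guarded_match and MAX_TOOL_CALL_CHARS are hypothetical names, and the limit is an arbitrary placeholder that would need tuning for a real deployment.

```python
import re
from typing import Optional

# Hypothetical limit, chosen only for illustration.
MAX_TOOL_CALL_CHARS = 4096


def guarded_match(
    pattern: re.Pattern,
    text: str,
    max_chars: int = MAX_TOOL_CALL_CHARS,
) -> Optional[re.Match]:
    """Run pattern.match only when the input is short enough.

    Over-long inputs are treated as "no match" instead of being handed to
    the regex engine, which bounds the worst-case matching cost.
    """
    if len(text) > max_chars:
        return None
    return pattern.match(text)
```

Treating an over-long input as a non-match is a design choice: it trades the ability to parse unusually long tool calls for a predictable upper bound on matching time.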
References