Skip to content

Integer Overflow Vulnerability in sync_queries() Leading to Buffer Overflow #4717

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Asuk4 opened this issue May 28, 2025 · 0 comments
Open
Labels
Bug Bug in learning semantics, critical by default

Comments

@Asuk4
Copy link

Asuk4 commented May 28, 2025

Describe the bug

Hello VowpalWabbit team!

As part of our ongoing audit of this project, we've identified a critical integer overflow vulnerability in the sync_queries() function that could lead to a buffer overflow. This vulnerability occurs when processing user-controlled data in the all-reduce operation, specifically when the all_reduce_type is set to SOCKET.

Mechanism of Vulnerability

  1. The sync_queries() function calculates total_sum by summing up sizes[i] values without proper overflow checks.
  2. The sizes array values originate from user-controlled unflushed_bytes_count data.
  3. An attacker could manipulate the input to cause total_sum to overflow, resulting in a small wrapped value.
  4. This wrapped value is then used to allocate a buffer via calloc_or_throw().
  5. Subsequent operations attempt to read more data than the allocated buffer can hold, leading to a buffer overflow.

Relevant Code Snippets

kernel_svm.cc - sync_queries()

size_t* sizes = VW::details::calloc_or_throw<size_t>(all.runtime_state.all_reduce->total);
sizes[all.runtime_state.all_reduce->node] = b->unflushed_bytes_count();
VW::details::all_reduce<size_t, add_size_t>(all, sizes, all.runtime_state.all_reduce->total);

size_t prev_sum = 0, total_sum = 0;
for (size_t i = 0; i < all.runtime_state.all_reduce->total; i++)
{
    if (i <= (all.runtime_state.all_reduce->node - 1)) { prev_sum += sizes[i]; }
    total_sum += sizes[i];  // Integer overflow possible here
}

if (total_sum > 0)
{
    queries = VW::details::calloc_or_throw<char>(total_sum);  // Buffer allocation with overflowed size
    // ... subsequent operations using the under-allocated buffer
}

Required Conditions

  • all.runtime_config.selected_all_reduce_type must be set to SOCKET
  • Attacker must be able to control the unflushed_bytes_count values

Suggested Action

The code should be modified to:

  1. Add overflow checks when calculating total_sum
  2. Implement upper bounds validation for the allocation size
  3. Consider using a safer integer type or implementing proper bounds checking
  4. Add validation to ensure the allocated buffer size is sufficient for the actual data size

Example fix:

size_t total_sum = 0;
for (size_t i = 0; i < all.runtime_state.all_reduce->total; i++)
{
    if (i <= (all.runtime_state.all_reduce->node - 1)) { prev_sum += sizes[i]; }
    if (total_sum > SIZE_MAX - sizes[i]) {
        THROW("Integer overflow detected in total_sum calculation");
    }
    total_sum += sizes[i];
}

How to reproduce

Version

9.10.0

OS

Linux

Language

C

Additional context

No response

@Asuk4 Asuk4 added the Bug Bug in learning semantics, critical by default label May 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Bug in learning semantics, critical by default
Projects
None yet
Development

No branches or pull requests

1 participant