
Excessive memory footprint in gltf_loader #942

Open
@jherico

Description


The current gltf loader code uses threading to load images from disk into memory before pushing them to the GPU.

However, it does this by creating a thread pool sized to std::thread::hardware_concurrency():

auto thread_count = std::thread::hardware_concurrency();
thread_count      = thread_count == 0 ? 1 : thread_count;
ctpl::thread_pool thread_pool(thread_count);

auto image_count = to_u32(model.images.size());
std::vector<std::future<std::unique_ptr<sg::Image>>> image_component_futures;
for (size_t image_index = 0; image_index < image_count; image_index++)
{
    auto fut = thread_pool.push(
        [this, image_index](size_t) {
            auto image = parse_image(model.images[image_index]);
            LOGI("Loaded gltf image #{} ({})", image_index, model.images[image_index].uri.c_str());
            return image;
        });
    image_component_futures.push_back(std::move(fut));
}

The code attempts to mitigate this later on by limiting staging-buffer creation to batches of 64 MB at a time:

// Upload images to GPU. We do this in batches of 64MB of data to avoid needing
// double the amount of memory (all the images and all the corresponding buffers).
// This helps keep memory footprint lower which is helpful on smaller devices.
size_t image_index = 0;
while (image_index < image_count)
{
    std::vector<core::Buffer> transient_buffers;
    auto &command_buffer = device.request_command_buffer();
    command_buffer.begin(VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, 0);
    size_t batch_size = 0;
    // Deal with 64MB of image data at a time to keep memory footprint low
    while (image_index < image_count && batch_size < 64 * 1024 * 1024)
    {

but IMO the "damage" is already done by that point: every image has been decoded into memory before any upload begins. My machine peaks at over 5 GB of RAM usage when loading the large scenes used in the performance examples.

While the threading might be useful for speeding up the overall load, consuming every CPU on the machine to do so feels excessive, and I would suggest that at least one core be left in reserve for the main thread.

Additionally, it should be possible to limit the number of images loaded concurrently by using a blocking queue or a (C++) semaphore to block the thread pool from doing more work while there are pending images waiting for upload.

Labels: framework (This is relevant to the framework)