Reduce Peak Memory Consumption #16

Open
@william-silversmith

This package is currently several times more memory intensive than SciPy. There are a few avenues for reducing memory consumption:

  • Strip out union-by-size from union-find. I'm not sure what the theoretical justification is (union-by-size is supposed to be faster!), but removing it seems to be both faster and lower memory. It's possible that union-by-size is more useful for arbitrary graphs than for the graph structures implied by images.
  • Allow use of uint16_t or uint8_t for output images. It's pretty rare for a typical image to use more than 65k labels. We would need a good way to estimate when this is safe, or else allow users to specify uint16.
  • As in "feat: support arbitrary labels" (#11), we could use std::unordered_map in union-find. For the many images that only sparsely utilize the union-find array, this would result in large memory reductions; for images that use it densely, it would use more. It also supports labels larger than the maximum index of the image. However, it is slower than the array implementation, so we should allow the user to choose which implementation is right for them. (Whenever I try this it's embarrassingly slow; this one would be too slow.)
  • Allocate the output array in the pyx file and pass it by reference to avoid a copy.
  • Is it possible to do this in-place? Might be restricted to data-types uint16 or bigger. (No, you need to be able to check the original labels.)
  • Allow binary images as input and represent them bit-packed using vector<bool>.
  • Limit the memory used for binary images based on the maximum possible number of prospective labels.
  • Estimate the number of provisional labels before allocating.
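
A rough sketch of how three of these ideas might fit together: a union-find that keeps only a parent array (no union-by-size bookkeeping), a worst-case bound on provisional labels for a binary image, and a helper that picks the narrowest unsigned label width. Every name here (`DisjointSet`, `max_provisional_labels_2d`, `min_label_bytes`) is hypothetical and not the package's actual API. It is 2D only for brevity, and the bound assumes a new provisional label is issued only when no previously visited neighbor is foreground.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Worst-case number of provisional labels for an sx * sy binary image.
// Assumption: a new label is only issued when no previously visited
// neighbor is foreground, so no two "new label" pixels are adjacent.
inline std::size_t max_provisional_labels_2d(
    std::size_t sx, std::size_t sy, int connectivity) {
  assert(connectivity == 4 || connectivity == 8);
  if (connectivity == 8) {
    // At most one new label per 2x2 block.
    return ((sx + 1) / 2) * ((sy + 1) / 2);
  }
  // 4-connected: the checkerboard pattern is the worst case.
  return (sx * sy + 1) / 2;
}

// Smallest unsigned width (in bytes) that can hold labels 1..num_labels,
// with 0 reserved for background.
inline int min_label_bytes(std::size_t num_labels) {
  if (num_labels <= std::numeric_limits<std::uint8_t>::max()) return 1;
  if (num_labels <= std::numeric_limits<std::uint16_t>::max()) return 2;
  if (num_labels <= std::numeric_limits<std::uint32_t>::max()) return 4;
  return 8;
}

// Union-find storing only parent ids: no size or rank array, so it uses
// one T per label. Path halving keeps the trees shallow.
template <typename T>
class DisjointSet {
 public:
  // capacity must be at least (largest label + 1).
  explicit DisjointSet(std::size_t capacity) : ids_(capacity, 0) {}

  // Register label p as its own root. Call before root() or unify().
  void add(T p) {
    assert(static_cast<std::size_t>(p) < ids_.size());
    ids_[p] = p;
  }

  T root(T p) {
    while (ids_[p] != p) {
      ids_[p] = ids_[ids_[p]];  // path halving: point at the grandparent
      p = ids_[p];
    }
    return p;
  }

  // Plain union: attach b's root under a's root, with no size comparison.
  void unify(T a, T b) {
    const T ra = root(a);
    const T rb = root(b);
    if (ra != rb) ids_[rb] = ra;
  }

  std::size_t bytes() const { return ids_.size() * sizeof(T); }

 private:
  std::vector<T> ids_;
};
```

Under these assumptions, a caller could size the structure as `DisjointSet<std::uint16_t> ds(max_provisional_labels_2d(sx, sy, 8) + 1)` whenever `min_label_bytes` reports 2 or fewer for that bound, instead of allocating one 32-bit entry per pixel.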

Labels: performance (lower memory or faster computation)
