Skip to content

bli_pool: fix potential insufficient pool-growing #559

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 12, 2021

Conversation

hominhquan
Copy link
Contributor

  • The current growing mechanism doubles the length of the block_ptrs array
    everytime the array length needs to be re-adjusted, but did not take in account
    the new total number of blocks, that in some cases, doubling the block_ptrs
    array length is not enough.
  • Two illustrating examples are given in the code.
  • This commit fixes it by taking max with num_blocks_new, too.
  • This commit also fixes incidentally the memory corruption of the growing
    process of any empty-, hence ill-initialized pool (block_ptrs array length = 0).

- The current growing mechanism doubles the length of the block_ptrs array
everytime the array length needs to be re-adjusted, but did not take in account
the new total number of blocks, that in some cases, doubling the block_ptrs
array length is not enough.
- Two illustrating examples are given in the code.
- This commit fixes it by taking max with num_blocks_new, too.
- This commit also fixes incidentally the memory corruption of the growing
process of any empty-, hence ill-initialized pool (block_ptrs array length = 0).
@fgvanzee
Copy link
Member

Yep, that's definitely a bug. Great work on tracking this down, @hominhquan.

@fgvanzee fgvanzee merged commit 327481a into flame:master Oct 12, 2021
@hominhquan hominhquan deleted the pool_grow branch October 12, 2021 21:32
fgvanzee pushed a commit that referenced this pull request Sep 10, 2022
Details:
- The current mechanism for growing a pool_t doubles the length of the
  block_ptrs array every time the array length needs to be increased
  due to new blocks being added. However, that logic did not take in
  account the new total number of blocks, and the fact that the caller
  may be requesting more blocks that would fit even after doubling the
  current length of block_ptrs. The code comments now contain two
  illustrating examples that show why, even after doubling, we must
  always have at least enough room to fit all of the old blocks plus
  the newly requested blocks.
- This commit also happens to fix a memory corruption issue that stems
  from growing any pool_t that is initialized with a block_ptrs length
  of 0. (Previously, the memory pool for packed buffers of C was
  initialized with a block_ptrs length of 0, but because it is unused
  this bug did not manifest by default.)
- Co-authored-by: Minh Quan Ho <[email protected]>
- (cherry picked from commit 327481a)
pradeeptrgit pushed a commit to amd/blis that referenced this pull request Nov 13, 2022
Details:
- The current mechanism for growing a pool_t doubles the length of the
  block_ptrs array every time the array length needs to be increased
  due to new blocks being added. However, that logic did not take in
  account the new total number of blocks, and the fact that the caller
  may be requesting more blocks that would fit even after doubling the
  current length of block_ptrs. The code comments now contain two
  illustrating examples that show why, even after doubling, we must
  always have at least enough room to fit all of the old blocks plus
  the newly requested blocks.
- This commit also happens to fix a memory corruption issue that stems
  from growing any pool_t that is initialized with a block_ptrs length
  of 0. (Previously, the memory pool for packed buffers of C was
  initialized with a block_ptrs length of 0, but because it is unused
  this bug did not manifest by default.)
- Co-authored-by: Minh Quan Ho <[email protected]>

Change-Id: Ie4963c56e03cbc197d26e29f2def6494f0a6046d
sireeshasanga pushed a commit to amd/blis that referenced this pull request Oct 11, 2024
* commit '81e10346':
  Alloc at least 1 elem in pool_t block_ptrs. (flame#560)
  Fix insufficient pool-growing logic in bli_pool.c. (flame#559)
  Arm SVE C/ZGEMM Fix FMOV 0 Mistake
  SH Kernel Unused Eigher
  Arm SVE C/ZGEMM Support *beta==0
  Arm SVE Config armsve Use ZGEMM/CGEMM
  Arm SVE: Update Perf. Graph
  Arm SVE CGEMM 2Vx10 Unindex Process Alpha=1.0
  Arm SVE ZGEMM 2Vx10 Unindex Process Alpha=1.0
  A64FX Config Use ZGEMM/CGEMM
  Arm SVE Typo Fix ZGEMM/CGEMM C Prefetch Reg
  Arm SVE Add SGEMM 2Vx10 Unindexed
  Arm SVE ZGEMM Support Gather Load / Scatt. St.
  Arm SVE Add ZGEMM 2Vx10 Unindexed
  Arm SVE Add ZGEMM 2Vx7 Unindexed
  Arm SVE Add ZGEMM 2Vx8 Unindexed
  Update Travis CI badge
  Armv8 Trash New Bulk Kernels
  Enable testing 1m in `make check`.
  Config ArmSVE Unregister 12xk. Move 12xk to Old
  Revert __has_include(). Distinguish w/ BLIS_FAMILY_**
  Register firestorm into arm64 Metaconfig
  Armv8 DGEMMSUP Fix Edge 6x4 Switch Case Typo
  Armv8 DGEMMSUP Fix 8x4m Store Inst. Typo
  Add test for Apple M1 (firestorm)
  Firestorm CPUID Dispatcher
  Armv8 GEMMSUP Edge Cases Require Signed Ints
  Make error checking level a thread-local variable.
  Fix data race in testsuite.
  Update .appveyor.yml
  Firestorm Block Size Fixes
  Armv8 Handle *beta == 0 for GEMMSUP ??r Case.
  Move unused ARM SVE kernels to "old" directory.
  Add an option to control whether or not to use @rpath.
  Fix $ORIGIN usage on linux.
  Arm micro-architecture dispatch (flame#344)
  Use @path-based install name on MacOS and use relocatable RPATH entries for testsuite inaries.
  Armv8 Handle *beta == 0 for GEMMSUP ?rc Case.
  Armv8 Fix 6x8 Row-Maj Ukr
  Apply patch from @xrq-phys.
  Add explicit handling for beta == 0 in armsve sd and armv7a d gemm ukrs.
  bli_error: more cleanup on the error strings array
  Arm SVE Exclude SVE-Intrinsic Kernels for GCC 8-9
  Arm SVE: Correct PACKM Ker Name: Intrinsic Kers
  Fix config_name in bli_arch.c
  Arm Whole GEMMSUP Call Route is Asm/Int Optimized
  Arm: DGEMMSUP `Macro' Edge Cases Stop Calling Ref
  Header Typo
  Arm: DGEMMSUP ??r(rv) Invoke Edge Size
  Arm: DGEMMSUP ?rc(rd) Invoke Edge Size
  Arm: Implement GEMMSUP Fallback Method
  Arm64 Fix: Support Alpha/Beta in GEMMSUP Intrin
  Added Apple Firestorm (A14/M1) Subconfig
  Arm64 8x4 Kernel Use Less Regs
  Armv8-A Supplimentary GEMMSUP Sizes for RD
  Armv8-A Fix GEMMSUP-RD Kernels on GNU Asm
  Armv8-A Adjust Types for PACKM Kernels
  Armv8-A GEMMSUP-RD 6x8m
  Armv8-A GEMMSUP-RD 6x8n
  Armv8-A s/d Packing Kernels Fix Typo
  Armv8-A Introduced s/d Packing Kernels
  Armv8-A DGEMMSUP 6x8m Kernel
  Armv8-A DGEMMSUP Adjustments
  Armv8-A Add More DGEMMSUP
  Armv8-A Add GEMMSUP 4x8n Kernel
  Armv8-A Add Part of GEMMSUP 8x4m Kernel
  Armv8A DGEMM 4x4 Kernel WIP. Slow
  Armv8-A Add 8x4 Kernel WIP

AMD-Internal: [CPUPL-2698]
Change-Id: I194ff69356740bb36ca189fd1bf9fef02eec3803
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants