-
Notifications
You must be signed in to change notification settings - Fork 565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
experiment: Make Tsavorite allocators unmanaged #1112
base: main
Are you sure you want to change the base?
Conversation
* Using NativeMemory alignment APIs
* Make BlittableFrame NativeMemory based
Also remove one unnecessary fixed statement usage for copying memory
…ious RecordInfo issues remain
…cessing invalid queued buffers while log itself is being disposed
* We could return empty sentinel instance?
* Also do over-sized alloc for BlittableFrames
Trying to get tests pass so I can start to properly diagnose what parts of the allocator's record (de)serialization is messing around beyond the allocated bounds that the code explicitly requests for (the code currently is seemingly relying on the pool to allocate much more behind during alignment than what it expects per the "contract", i.e. the length of the memory returned from pool is the smallest segment to fulfill that request) The changes made to the way allocator handles the memory buffers (that were GC tracked arrays before this PR) has "exposed" that the allocators pages were never explicitly unrooted by the usual disposal procedures. By changing to unmanaged buffers, we must of course explicitly free them and this seems to have upsetted asynchronous/multithreaded log tests in couple cases that I could test and diagnose properly. The issue is that Azure device tests seems to now be failing due to this change, i.e. was relying on those buffers to be available after disposal of the log. I currently have no capability to try setup Azure blob storage for testing (I'm broke). Little bit out of the depth in certain parts of the storage logic, so thought to ask first: Is there existing mechanism in the storage peristence layer to either make the tests wait for all log device IO to finish before starting the clean-up logic (in-memory device had blocking logic in its disposal, do we want to build IAsyncDisposable one?), or existing logic/approach to telling the device to "abort everything, the buffers you (the device) have queued from allocator maybe/are bad". This is what I think is currently going wrong in the Azure device tests. cc @badrishc or anyone |
* Avoid temporary buffer and copying, operate on the buffer we get from pool directly. * Pool documentation adjustments
We may (dont recall 100% at the moment) be simply letting the GC handle the clean up of pages, for the reason that it was overhead to track and ensure the actual completion of IO-in-flight, and this was thought to be unnecessary as it was a test-only problem. So we just let the GC handle it. With explicit native memory, things get more complicated. We will need to drain the IO. There is a CompletePending API for the KV, but it is possible other parts of the code do not have similar mechanisms yet? TsavoriteLog uses an cc @TedHartMS |
@PaulusParssinen - can you show an example where this is occurring? |
wip. Testing if we can stop using POH for the various Tsavorite buffers and making the aligned allocations faster and simpler by using the new
NativeMemory.Aligned*
APIs. (i.e. delegate to same allocation logic and APIs as original FASTER C++ impl)Some notes about the progress
ValidOffset
,RequiredBytes
andAvailableBytes
seem to be record allocation concepts and probably don't belong to the aligned memory class. Consider pulling them out while we're here.