gh-xxxxx: Add type-local type cache #130135
Adds a new L1 cache for types in the free-threaded build. The L1 cache is a small type-local cache which requires no locking to look up values.
The cache only adds entries and never replaces them, meaning that once we've confirmed the name matches, we know the value is correct. The cache also stores the version it's valid for, guaranteeing that we report the correct version for the lookup.
The cache is stored in the type object and is freed via QSBR when the type object is modified. Objects in the cache are eagerly marked as maybe weakref'd, allowing the lookup to do a simple incref.
Unlike the normal method cache, this cache supports probing. Values that are successfully stored in the L1 cache are not duplicated into the L2 cache, reducing the pressure that causes thrashing in the L2 cache.
We have slightly different invariants about what we cache here. We won't cache lookups against types which have an unusual tp_getattro. Modules and metaclasses are pretty representative of these types, and lookups against them tend to fill up the cache while continuing to miss.
We also stop caching once a type has exhausted its version-tag allocation, to prevent repeatedly re-allocating the local type cache on the same type.
This shows pretty good results on the benchmarks (~2% faster), and it might be better still in real-world cases. What is now the L2 cache can see a lot of thrashing, where values bounce in and out of the cache when collisions happen. In the free-threaded build that thrashing now involves taking the type lock to search the MRO, which is much more expensive than the thrashing in non-free-threaded builds. So the real-world impact when lots of methods are contending for cache space may be more dramatic.
This cache might be able to serve other purposes as well. One possibility is that we could use it to provide access to non-deferred objects in the specializing interpreter: we can cache a "hint" location in the cache, and as long as the type isn't modified we'd have safe access to the reference in the cache. I'm not sure this is actually worth it, as the things we frequently can't defer seem to be things we can't specialize anyway.
https://github.com/facebookexperimental/free-threading-benchmarking/blob/main/results/bm-20250213-3.14.0a5%2B-68b535b-NOGIL/bm-20250213-vultr-x86_64-DinoV-local_type_cache-3.14.0a5%2B-68b535b-vs-base.md