
feat: Initial foyer layer implementation #5906


Draft · wants to merge 29 commits into main

Conversation

@jorgehermo9 (Contributor) commented Mar 28, 2025

Closes #5678

I'm opening this as a draft to get some feedback on what I'm doing right now. Tests and documentation are still missing, but I would like to get some feedback before working more on this.

My main concern is the scope of this cache layer: should we only cache read calls, or also cache list or stat calls? If so, I could implement that in this PR as well, but I want to know if it makes sense first. Right now, only async read calls are cached.

I will fix CI and add tests and documentation once I get this initial feedback!

Thanks a lot~

@github-actions github-actions bot added the releases-note/feat The PR implements a new feature or has a title that begins with "feat" label Mar 28, 2025
@meteorgan (Contributor)

Hi, I've got a question: since it's a cache, when do we clear out expired data? I couldn't spot the code handling that.

@jorgehermo9 (Contributor, Author) commented Mar 30, 2025

@meteorgan I think there is no concept of TTL in foyer. You specify a maximum cache size in bytes, and entries are added until the cache is full. If a new key is going to be inserted while the cache is full, another entry is evicted (see https://foyer.rs/docs/tutorial/in-memory-cache#22-count-entries-by-weight)
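
For illustration, here is a rough sketch of such a size-bounded in-memory cache based on the linked tutorial (the builder method names and weighter signature are assumptions and may differ between foyer versions):

```rust
use foyer::{Cache, CacheBuilder};

// Sketch only: the capacity acts as a total weight budget. With a weighter
// that returns each value's size in bytes, eviction starts once the summed
// sizes of cached entries exceed 64 MiB.
fn build_cache() -> Cache<String, Vec<u8>> {
    CacheBuilder::new(64 * 1024 * 1024)
        .with_weighter(|_key: &String, value: &Vec<u8>| value.len())
        .build()
}
```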

@meteorgan (Contributor)

> @meteorgan I think there is no concept of TTL in foyer. You specify a maximum cache size in bytes, and entries are added until the cache is full. If a new key is going to be inserted while the cache is full, another entry is evicted (see https://foyer.rs/docs/tutorial/in-memory-cache#22-count-entries-by-weight)

What I mean is, if a key gets deleted or updated, we should refresh the cache. Otherwise, we'll end up with stale data.
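
For illustration only, a conceptual sketch of what invalidating on delete could look like (this is not the PR's actual code; a `remove`-style method on the cache and a plain path-based key are both assumptions here):

```rust
use opendal::{Buffer, Operator, Result};

// Conceptual sketch: whatever wraps the delete path should also drop the
// cached entry so later reads don't serve stale data.
async fn delete_and_invalidate(
    op: &Operator,
    cache: &foyer::HybridCache<String, Buffer>,
    path: &str,
) -> Result<()> {
    op.delete(path).await?;
    // Assumed API: remove the cached entry keyed the same way reads cache it.
    cache.remove(&path.to_string());
    Ok(())
}
```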

@Xuanwo (Member) left a comment

Thank you @jorgehermo9 for the work! I believe we are very close to having a working demo.

where
    Self: Sized,
{
    let mut buf = Vec::new();
Member

cc @MrCroxx, this could be much better if we could know the size before decoding.

@jorgehermo9 (Contributor, Author) commented Apr 15, 2025

Thanks for the review @Xuanwo. I was still working on the previous reviews and didn't have everything done; that's why I hadn't mentioned you yet. I will work a lot more on this today though :)

I didn't forget/ignore the previous comments; I have all of them in mind, I've just been working through them slowly.

Comment on lines 123 to 125
pub use self::foyer::CacheKey;
#[cfg(feature = "layers-foyer")]
pub use self::foyer::CacheValue;
@jorgehermo9 (Contributor, Author) — Apr 15, 2025

I have to expose both CacheKey and CacheValue because they are part of the public API (they are the generic params of the HybridCache we receive in FoyerLayer::new).

Should we expose those two this way, or use pub mod foyer instead?

Member

Negative. I don't want to expose anything other than FoyerLayer. The HybridCache should be HybridCache<String, Buffer>, which means we don't need any additional types. It's the user's responsibility to use compatible versions of Foyer.
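
As a rough sketch of what that would look like from the user's side (assuming FoyerLayer::new ends up accepting the cache directly, which is still being settled in this thread; the backend here is just an example):

```rust
use opendal::{layers::FoyerLayer, services, Buffer, Operator};

// `cache` is assumed to be an already-built foyer::HybridCache<String, Buffer>,
// constructed by the user with whichever foyer version they depend on.
fn build_operator(cache: foyer::HybridCache<String, Buffer>) -> opendal::Result<Operator> {
    Ok(Operator::new(services::Memory::default())?
        .finish()
        .layer(FoyerLayer::new(cache)))
}
```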

@jorgehermo9 (Contributor, Author)

I think we could do that for the CacheKey, but the cache value can't be Buffer because of what you suggested here:
[screenshot of the referenced review suggestion]

Since Code should not be implemented for the raw Buffer, the cache value should be the CacheValue wrapper. We can't use Buffer as the cache value because it does not implement Code (CacheValue does).
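
To make the constraint concrete, a minimal sketch of the wrapper (the conversion helpers are illustrative only, and the actual Code impl from the diff is elided):

```rust
use opendal::Buffer;

/// Newtype around Buffer so foyer's Code (de)serialization trait can be
/// implemented on a local type instead of directly on Buffer.
#[derive(Debug, Clone)]
pub struct CacheValue(Buffer);

impl From<Buffer> for CacheValue {
    fn from(buffer: Buffer) -> Self {
        CacheValue(buffer)
    }
}

impl From<CacheValue> for Buffer {
    fn from(value: CacheValue) -> Self {
        value.0
    }
}
```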

@jorgehermo9 (Contributor, Author) commented Apr 15, 2025

Hi @Xuanwo, I addressed all the comments I could, but left some doubts in the open threads. Could you take another look?

#[derive(Debug, Clone)]
pub struct CacheValue(Buffer);

impl Code for CacheValue {
@jorgehermo9 (Contributor, Author)

I wonder, @MrCroxx, how we should take care of CacheKey and CacheValue evolution over time. Should we worry about breaking changes, such as adding a new mandatory field to the CacheKey struct?

For example, if we add a new mandatory field to CacheKey, will the previous version's keys in the disk cache fail to deserialize? How should we handle this? Must we always make backwards-compatible changes to both CacheKey and CacheValue?

Member

There is no deserialization for CacheKey. The cache just tries to read the key we input. And CacheValue itself is a plain Buffer which contains the raw content.

I don't think we need to worry about backwards compatibility.

@jorgehermo9 (Contributor, Author)

I think deserialization is done for the cache key.

Member

> I think deserialization is done for the cache key.

Good point!

cc @MrCroxx, what happens if the cache key changes? Will Foyer silently ignore it and remove the stale cache value, or will we encounter an error?

    Self: Sized,
{
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf).map_err(CodeError::Io)?;
@jorgehermo9 (Contributor, Author)

Is it okay to read all contents from the reader at once? I don't know how to avoid this copy.
Is there any way to wrap the reader we receive in the parameters and return it instead of copying to an intermediate buffer? Or at least to read it lazily on demand rather than all up front?

I'm worried about #5906 (comment)

Member

> Is it okay to read all contents from the reader at once? I don't know how to avoid this copy.

That’s why we need to limit the cache’s read size. The current cache layer doesn’t handle chunking for users; it’s up to users to ensure the read size is optimal.
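
As an example of keeping each cached entry bounded, a caller can issue ranged reads instead of reading whole objects (a sketch; the chunk size and offsets are entirely up to the user):

```rust
use opendal::{Buffer, Operator, Result};

// Read at most `chunk` bytes starting at `offset`, so each cache entry stays
// bounded instead of holding an arbitrarily large object.
async fn read_bounded(op: &Operator, path: &str, offset: u64, chunk: u64) -> Result<Buffer> {
    op.read_with(path).range(offset..offset + chunk).await
}
```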

@jorgehermo9 (Contributor, Author)

We will handle this situation with #5906 (comment), right?

@jorgehermo9 (Contributor, Author) commented Apr 16, 2025

Hi @Xuanwo, I think I addressed everything (I left some comments in a few threads). I plan to work on documentation and tests once we are confident about what the final implementation looks like.

Thank you very much for your reviews; I learned a lot from them.

use super::*;

#[test]
fn test_build_cache_key_with_default_args() {
@jorgehermo9 (Contributor, Author)

I wonder if it would be better to have one parametrized test covering all these cases instead of one test per case. What is the current approach in the codebase?
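
For reference, a table-driven variant could look roughly like this (the helper name and expected values are placeholders, not the PR's actual API):

```rust
#[test]
fn test_build_cache_key_cases() {
    // Placeholder cases: (case name, input path, expected key string).
    let cases = [
        ("default args", "dir/file", "dir/file"),
        ("nested path", "a/b/c", "a/b/c"),
    ];

    for (name, path, expected) in cases {
        // `build_cache_key` stands in for whatever helper these tests exercise.
        assert_eq!(build_cache_key(path), expected, "failed case: {name}");
    }
}
```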

Labels
releases-note/feat — The PR implements a new feature or has a title that begins with "feat"

Successfully merging this pull request may close these issues:

new feature: Add cache layer for opendal

5 participants