feat: Initial foyer layer implementation #5906
base: main
Conversation
Hi, I've got a question: since it's a …
@meteorgan I think there is no concept of TTL in foyer. You specify a max size in bytes for the cache, and it will be filled until the cache is full. If a new key is going to be inserted while the cache is full, another entry is evicted (see https://foyer.rs/docs/tutorial/in-memory-cache#22-count-entries-by-weight).
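For readers unfamiliar with weight-based capacity, here is a toy sketch of the "evict until the new entry fits" idea. This is not foyer's API or eviction policy (foyer uses real cache algorithms, not FIFO); the struct and names are purely illustrative:

```rust
use std::collections::VecDeque;

/// Toy weight-based cache: total weight (bytes) is bounded, and inserts
/// evict old entries until the new one fits. Illustrative only.
struct WeightedCache {
    capacity: usize, // max total weight in bytes
    used: usize,     // current total weight
    entries: VecDeque<(String, Vec<u8>)>,
}

impl WeightedCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, used: 0, entries: VecDeque::new() }
    }

    fn insert(&mut self, key: String, value: Vec<u8>) {
        let weight = value.len();
        // Evict oldest entries until the new one fits (FIFO here for simplicity).
        while self.used + weight > self.capacity {
            match self.entries.pop_front() {
                Some((_, evicted)) => self.used -= evicted.len(),
                None => break, // value heavier than the whole cache
            }
        }
        self.used += weight;
        self.entries.push_back((key, value));
    }
}

fn main() {
    let mut cache = WeightedCache::new(8);
    cache.insert("a".into(), vec![0; 4]);
    cache.insert("b".into(), vec![0; 4]); // cache now full (8/8 bytes)
    cache.insert("c".into(), vec![0; 4]); // "a" is evicted to make room
    assert_eq!(cache.entries.len(), 2);
    assert!(cache.entries.iter().all(|(k, _)| k != "a"));
}
```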
What I mean is, if a key gets deleted or updated, we should refresh the cache. Otherwise, we'll end up with stale data.
Thank you @jorgehermo9 for the work! I believe we are very close to having a working demo.
```rust
where
    Self: Sized,
{
    let mut buf = Vec::new();
```
cc @MrCroxx, this can be much better if we can know the size before decode.
Thanks for the review @Xuanwo. I was still working on the previous reviews and didn't have everything done, which is why I hadn't mentioned you yet. I will work a lot more on this today though :) I didn't forget/ignore the previous comments; I have all of them in mind, I had just been working through them slowly.
core/src/layers/mod.rs (Outdated)
```rust
pub use self::foyer::CacheKey;
#[cfg(feature = "layers-foyer")]
pub use self::foyer::CacheValue;
```
I have to expose both `CacheKey` and `CacheValue` because they are part of the public API (generic params of the `HybridCache` we receive in `FoyerLayer::new`).

Should we expose those two this way, or do `pub mod foyer` instead?
Negative. I don't want to expose anything other than `FoyerLayer`. The `HybridCache` should be `HybridCache<String, Buffer>`, which means we don't need any additional types. It's the user's responsibility to use compatible versions of foyer.
I think we could do that for the `CacheKey`, but the cache value can't be `Buffer` because of what you suggested here. Since `Code` should not be implemented for the raw `Buffer`, the cache value should be the `CacheValue` wrapper. We can't have `Buffer` as the cache value because it does not implement `Code` (`CacheValue` does).
Hi @Xuanwo, I addressed all the comments I could, but left some doubts in the open threads. Could you revisit that?
```rust
#[derive(Debug, Clone)]
pub struct CacheValue(Buffer);

impl Code for CacheValue {
```
I wonder @MrCroxx how we should take care of `CacheKey` and `CacheValue` evolution through time. Should we worry about breaking changes such as including a new mandatory field in the `CacheKey` struct?

For example, if we add a new mandatory field to `CacheKey`, will a previous version's keys in the disk cache fail to deserialize? How should we handle this? Must we always make backward-compatible changes to both `CacheKey` and `CacheValue`?
There is no deserialization for `CacheKey`. The cache just tries to read the key we input. And `CacheValue` itself is a plain `Buffer` which contains the raw content. I think we don't need to be concerned about backward compatibility.
I think deserialization is done for the cache key.
> I think deserialization is done for the cache key.
Good point!
cc @MrCroxx, what happens if the cache key changes? Will Foyer silently ignore it and remove the stale cache value, or will we encounter an error?
```rust
    Self: Sized,
{
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf).map_err(CodeError::Io)?;
```
Is it okay to read all contents from the reader at once? I don't know how to avoid this copy. Is there any way to wrap the `reader` we receive in the parameters and return it instead of copying to an intermediate buffer? Or to at least read it lazily on demand rather than up front?

I'm afraid of #5906 (comment)
> Is it okay to read all contents from the reader at once? I don't know how to avoid this copy.
That’s why we need to limit the cache’s read size. The current cache layer doesn’t handle chunking for users; it’s up to users to ensure the read size is optimal.
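A related note on the copy itself: `read_to_end` cannot avoid moving bytes out of the reader, but when the object size is known up front (e.g. from metadata) the destination buffer can at least be pre-allocated to avoid repeated reallocation and re-copying as the `Vec` grows. A hedged sketch with illustrative names only (the function below is not part of the PR):

```rust
use std::io::{self, Read};

/// Read the whole stream into a Vec. If a size hint is available,
/// pre-allocate so read_to_end doesn't have to grow (and re-copy)
/// the buffer repeatedly; the single reader-to-buffer copy remains.
fn read_all_with_hint(mut reader: impl Read, size_hint: Option<usize>) -> io::Result<Vec<u8>> {
    let mut buf = match size_hint {
        Some(n) => Vec::with_capacity(n),
        None => Vec::new(),
    };
    reader.read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    // A byte slice implements Read, so it serves as a stand-in reader.
    let data = b"some cached object".as_slice();
    let buf = read_all_with_hint(data, Some(data.len()))?;
    assert_eq!(buf, b"some cached object");
    Ok(())
}
```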
We will handle this situation with #5906 (comment), right?
Hi @Xuanwo, I think I addressed everything (I left some comments in a few threads). I plan to address documentation & tests once we are confident about what the final implementation looks like. Thank you very much for your reviews, I learned a lot from them.
```rust
use super::*;

#[test]
fn test_build_cache_key_with_default_args() {
```
I wonder if it would be better to have one parametrized test covering all these cases instead of one test per case. What is the current approach in the codebase?
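A parametrized (table-driven) test would look roughly like the sketch below. Note that `build_cache_key` here is a hypothetical stand-in with made-up semantics, included only to keep the example self-contained; the real function lives in the foyer layer under review:

```rust
/// Hypothetical key builder, only here to make the sketch compile;
/// the real logic is in the foyer layer.
fn build_cache_key(path: &str, version: Option<&str>) -> String {
    match version {
        Some(v) => format!("{path}@{v}"),
        None => path.to_string(),
    }
}

/// All cases live in one table, instead of one #[test] per case.
fn check_all_cases() {
    let cases = [
        ("a.txt", None, "a.txt"),
        ("a.txt", Some("v1"), "a.txt@v1"),
        ("dir/b.bin", Some("v2"), "dir/b.bin@v2"),
    ];
    for (path, version, expected) in cases {
        assert_eq!(build_cache_key(path, version), expected, "case: {path}");
    }
}

#[test]
fn test_build_cache_key_cases() {
    check_all_cases();
}

fn main() {
    check_all_cases();
}
```

The trade-off is the usual one: a single table gives less granular failure reporting than one test per case, which the assert message (`"case: {path}"`) partly compensates for.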
Closes #5678
I open this as a draft to get some feedback about what I'm doing right now. There are missing tests and documentation, but I would like to get some feedback before working more on this.

My main concern is: what would be the scope of this cache layer? Should we only cache `read` calls? Should we also cache `list` or `stat` calls? If so, I could implement that in this PR too, but I want to know if it makes sense first. Right now, only async `read` calls are cached. I will fix CI and add tests and documentation once I get this initial feedback!
Thanks a lot~