Code Review for blocks/blockstore/bloom_cache.go #3140
Comments
cc @Kubuxu
Thank you for those remarks; they are very valuable. I will limit the access and rework the logic a bit to make it safer.
I rebenchmarked it and RWMutex adds only about 10%, so I will go with that.
It will be a bit harder, but possible: I need the writes to go to the new bloom filter. Currently we never disable the filter, so it isn't a big problem, but we want to do that in the future. I need a tristate structure: no bloom; bloom available for writing; bloom available for reading and writing. I started working on something in the feat/bc/bloom-rework branch.
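For illustration, a minimal Go sketch of such a tristate (the names here are hypothetical, not taken from the branch):

```go
// bloomState is a hypothetical tristate for the cache's bloom filter.
type bloomState int32

const (
	bloomNone      bloomState = iota // no bloom filter at all
	bloomWriteOnly                   // filter is being (re)built: writes go in, reads bypass it
	bloomReadWrite                   // filter fully built: consulted on reads, updated on writes
)
```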
@Kubuxu can this be closed? |
Not really. |
Code Review for https://github.com/ipfs/go-ipfs/blob/master/blocks/blockstore/bloom_cache.go
Note: this is minor, because this cache is used correctly in go-ipfs. But the abstraction has rough edges that may hurt people modifying the calling code, hence it is worth fixing at some point.
It took me a bit to understand why exiting during rebuilding is correct -- I thought the context cancellation would leave the cache in a bad state. It is fine because the bloom filter is not "activated" (`b.active` is left as 0), so it would still need to get "rebuilt" to work.
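Roughly, the property being relied on looks like the following simplified sketch (helper names such as `addKey` are made up; this is not the exact code in bloom_cache.go):

```go
// rebuild populates the filter from the backing store. b.active is only
// flipped to 1 after the filter is fully populated, so returning early on
// cancellation leaves the cache exactly as it started: inactive, and still
// in need of a rebuild before the filter is trusted.
func (b *bloomcache) rebuild(ctx context.Context) {
	keys, err := b.blockstore.AllKeysChan(ctx) // stream of keys in the backing store
	if err != nil {
		return
	}
	for {
		select {
		case k, ok := <-keys:
			if !ok {
				atomic.StoreInt32(&b.active, 1) // only now does hasCached consult the filter
				return
			}
			b.addKey(k) // hypothetical helper: add k to the bloom filter
		case <-ctx.Done():
			return // filter stays inactive; a later rebuild starts from scratch
		}
	}
}
```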
The code is not the most readable, and it could use some clearer naming and more commenting (e.g. why it's OK to exit here, why `active` exists, and so on).
I also think it doesn't make sense to use `atomic.StoreInt...` instead of a proper lock abstraction -- it's harder to understand. It's also not fully correct -- you can call `Rebuild` while other goroutines are in `hasCached`, which is probably wrong. There's no mutual exclusion there, and it may answer incorrectly or, worse, leave things in a bad state. Probably `Rebuild` should be unexported.
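For comparison, a rough sketch of the lock-based alternative (the RWMutex variant mentioned in the comments above); the types and names are illustrative only:

```go
import "sync"

// bloomFilter stands in for whichever bloom implementation the cache uses;
// only the operations this sketch needs are listed.
type bloomFilter interface {
	Has(k []byte) bool
	Add(k []byte)
}

// rwBloomCache serialises (re)builds against readers with an RWMutex,
// so a reader can never observe a half-built filter.
type rwBloomCache struct {
	mu     sync.RWMutex
	filter bloomFilter // nil until the first build completes
}

// hasCached answers "is this key definitely absent?" under a read lock.
func (b *rwBloomCache) hasCached(k []byte) bool {
	b.mu.RLock()
	defer b.mu.RUnlock()
	if b.filter == nil {
		return false // no filter yet; the caller must check the backing blockstore
	}
	return !b.filter.Has(k) // a bloom miss means the key is definitely not stored
}

// build swaps in a freshly built filter under the write lock.
func (b *rwBloomCache) build(f bloomFilter) {
	b.mu.Lock()
	b.filter = f
	b.mu.Unlock()
}
```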
For performance you want to avoid locking, which makes perfect sense, but this way is not the cleanest and is not fully correct for the callers. The way to do that is to do the building on `New` and leave the cache well constructed after that -- i.e. the user cannot call `Rebuild` -- or make it very clear to the caller that it is a destructive operation with data races (i.e. the caller must enforce their own locking). (Though this is a clunky interface, because the design allows calls to `Has` while the bloom filter is being built...)
I'd rename `bloomCached` to `newBloomCachedBS` to make it absolutely clear (Go-idiomatic) that the cache must be constructed by calling `newBloomCachedBS`; otherwise it reads as just a helper that may or may not be called. (i.e. some of these data structures are ready to use after allocation -- no init needed -- and this is not one of them.)
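As an illustration of that constructor-centred shape (the signature and field names are made up for the sketch, not the actual go-ipfs API):

```go
// newBloomCachedBS wraps a blockstore and builds the bloom filter before
// returning, so the cache the caller receives is always well constructed
// and nothing like Rebuild has to be exported.
func newBloomCachedBS(ctx context.Context, backing Blockstore) (*bloomcache, error) {
	bc := &bloomcache{blockstore: backing}
	bc.rebuild(ctx) // build the filter up front; see the rebuild sketch above
	if err := ctx.Err(); err != nil {
		return nil, err // construction was interrupted; don't hand out a half-built cache
	}
	return bc, nil
}
```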
I can call `Rebuild` twice in a row with bad consequences (`b.rebuildChan` is closed twice).
I'd try to get rid of the channel being allocated in `Invalidate`; it increases the number of "bad possible states". What I would probably do is make `Rebuild` return a channel that signals when it will be done:
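(The snippet that originally followed here is not preserved in this copy; the sketch below is only a guess at the shape being suggested, with hypothetical names:)

```go
// build starts the (re)build and returns a channel that is closed once the
// new filter is active. The channel lives only in the caller's hands, so
// there is no rebuildChan field on the struct to end up in a bad state.
func (b *bloomcache) build(ctx context.Context) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		b.rebuild(ctx) // unexported worker that populates the filter
	}()
	return done
}
```

A caller that needs the filter ready can wait with `<-bc.build(ctx)`, and calling it twice simply yields two independent channels instead of closing one channel twice.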
In general, with things like this, if calling a weird sequence of exported functions can put the cache in a bad state, then there is a design problem. It's like giving users a device that has to be started by pressing buttons in the right order, and if you mess up, the device is bricked.

I would probably unexport `rebuild`; if you want to export something that does it, place a mutex around it.

In summary, I would:
- unexport `Rebuild` and rename it: `Rebuild -> build`
- remove `rebuildChan` from the struct and return it from the `build` func
- rename `bloomCached -> newBloomCachedBS`