
Code Review for blocks/blockstore/bloom_cache.go #3140

Open
jbenet opened this issue Aug 29, 2016 · 6 comments
Labels: help wanted, need/community-input, status/deferred

Comments

@jbenet (Member) commented Aug 29, 2016

Code Review for https://github.com/ipfs/go-ipfs/blob/master/blocks/blockstore/bloom_cache.go

Note: this is minor, because this cache is used correctly in go-ipfs. But the abstraction has rough edges that may hurt people modifying the calling code. Hence worth fixing at some point.

  • It took me a bit to understand why exiting during rebuilding is correct-- I thought the context cancellation would leave the cache in a bad state. This is fine because the bloom filter is not "activated" (b.active is left as 0), so it would still need to be "rebuilt" to work.

  • The code is not the most readable, and it could use clearer naming and more commenting (e.g. why it's OK to exit here, what active means, and so on).

  • I also think it doesn't make sense to use atomic.StoreInt... instead of a proper lock abstraction-- it's harder to understand.

  • It's also not fully correct-- you can call Rebuild while other goroutines are in hasCached, which is probably wrong. There's no mutual exclusion there, and it may answer incorrectly or, worse, leave things in a bad state. Probably Rebuild should be unexported.

  • For performance you want to avoid locking, which makes perfect sense. But this approach seems not the cleanest, and it is not fully correct for callers.

  • The way to do that is to do the building in New and leave the cache well constructed after that-- i.e. the user cannot call Rebuild, or it is made very clear to the caller that Rebuild is a destructive operation with data races (i.e. the caller must enforce their own locking). (Though this is a clunky interface, because the design allows calls to Has while the bloom filter is being built.)

  • I'd rename bloomCached to newBloomCachedBS to make it absolutely clear (Go-idiomatic) that this must be constructed by calling newBloomCachedBS; otherwise it reads as just a helper that may or may not be called. (Some of these data structures are ready to use after allocation -- i.e. no init -- and this is not one of them.)

  • I can call Rebuild twice in a row with bad consequences (b.rebuildChan is closed twice).

  • I'd try to get rid of the channel being allocated in Invalidate. It increases the number of "bad possible states". What I would probably do is make Rebuild return a channel that signals when it is done:

    func (b *bloomcache) Rebuild(ctx context.Context) (done <-chan bool) { ... }
  • In general, with things like this, if calling a weird sequence of exported functions can put the cache in a bad state, then there is a design problem. It's like giving users a device that has to be started by pressing buttons in the right order, and if you mess up, the device is bricked.

  • I would probably unexport rebuild. If you want to export something that does it, put a mutex around it.

  • In summary, I would (a sketch of this shape follows the list):

    • rework the locking/active state,
    • OR (the easy way) unexport Rebuild and rename it build,
    • remove the rebuildChan from the struct and return it from the build func,
    • rename bloomCached -> newBloomCachedBS.
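
To make that concrete, here is a minimal, self-contained sketch of the suggested shape. The types are illustrative stand-ins (a map instead of the real bloom filter, a key channel instead of the blockstore walk), not the actual bloom_cache.go code:

    package blockstore

    import (
        "context"
        "sync/atomic"
    )

    // bloomcache stands in for the struct in bloom_cache.go; the map is a
    // placeholder for the real bloom filter.
    type bloomcache struct {
        active int32
        filter map[string]bool
    }

    // newBloomCachedBS is the only way to obtain a cache, so callers can
    // never observe a half-initialized filter through the exported API.
    func newBloomCachedBS(ctx context.Context, keys <-chan string) *bloomcache {
        b := &bloomcache{filter: make(map[string]bool)}
        b.build(ctx, keys) // fire-and-forget; hasCached falls through until active == 1
        return b
    }

    // build is unexported, so outside callers cannot re-enter it and
    // double-close anything. It returns a fresh done channel instead of
    // storing one on the struct, shrinking the set of reachable bad states.
    func (b *bloomcache) build(ctx context.Context, keys <-chan string) (done <-chan struct{}) {
        ch := make(chan struct{})
        go func() {
            defer close(ch)
            for {
                select {
                case <-ctx.Done():
                    return // safe to bail: active stays 0, the filter is never consulted
                case k, ok := <-keys:
                    if !ok {
                        atomic.StoreInt32(&b.active, 1) // activate only once fully built
                        return
                    }
                    b.filter[k] = true
                }
            }
        }()
        return ch
    }

    // hasCached answers from the filter only once it is active; ok == false
    // means the caller must ask the backing blockstore itself.
    func (b *bloomcache) hasCached(k string) (has bool, ok bool) {
        if atomic.LoadInt32(&b.active) == 0 {
            return false, false
        }
        return b.filter[k], true
    }

With this shape, canceling the build context simply leaves active at 0, which matches the observation in the first bullet about why exiting mid-rebuild is safe.
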
@jbenet (Member, Author) commented Aug 29, 2016

cc @Kubuxu

@Kubuxu self-assigned this Aug 29, 2016
@Kubuxu (Member) commented Aug 29, 2016

Thank you for those remarks; they are very valuable.

I will limit the access and rework the logic a bit to make it safer.
I didn't use a lock on active because it introduced a major performance hit when I benchmarked it (about +50%; the bloom code is just that fast). Looking closer at the code, I see a way to remove the possible edge-case races without negatively influencing performance.
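
For reference, the gap can be reproduced with a parallel microbenchmark along these lines (an illustrative sketch that would live in a _test.go file, not the exact harness behind the +50% figure):

    package blockstore

    import (
        "sync"
        "sync/atomic"
        "testing"
    )

    var (
        active int32 = 1
        mu     sync.RWMutex
        ready        = true
    )

    // The current fast path: one atomic load per check.
    func BenchmarkActiveAtomic(b *testing.B) {
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() {
                _ = atomic.LoadInt32(&active) != 0
            }
        })
    }

    // The same check behind an RWMutex read lock.
    func BenchmarkActiveRWMutex(b *testing.B) {
        b.RunParallel(func(pb *testing.PB) {
            for pb.Next() {
                mu.RLock()
                _ = ready
                mu.RUnlock()
            }
        })
    }

Run with go test -bench=Active to compare the two under contention.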

@em-ly added the need/community-input label Aug 29, 2016
@Kubuxu (Member) commented Aug 30, 2016

I re-benchmarked it, and RWMutex adds only about 10%, so I will go with that.
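
The shape I have in mind is roughly this (a sketch of the idea, not the actual patch):

    package blockstore

    import "sync"

    // bloomFilter is a stand-in for the real filter type.
    type bloomFilter struct{ bits map[string]bool }

    func (f *bloomFilter) Has(k string) bool { return f.bits[k] }

    type bloomcache struct {
        mu     sync.RWMutex
        active bool
        bloom  *bloomFilter
    }

    // hasCached takes a read lock, so it can no longer race with a rebuild.
    // ok == false means the filter is not ready and the caller must fall
    // through to the backing blockstore.
    func (b *bloomcache) hasCached(k string) (has bool, ok bool) {
        b.mu.RLock()
        defer b.mu.RUnlock()
        if !b.active {
            return false, false
        }
        return b.bloom.Has(k), true
    }

    // swapFilter installs a freshly built filter under the write lock --
    // the mutual exclusion the atomic version was missing.
    func (b *bloomcache) swapFilter(f *bloomFilter) {
        b.mu.Lock()
        b.bloom, b.active = f, true
        b.mu.Unlock()
    }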

@Kubuxu (Member) commented Aug 30, 2016

It will be a bit harder, but possible, as I need the writes to go to the new bloom filter. Currently we never disable the filter, so it isn't a big problem, but we want to do that in the future.

I need a tristate structure: no bloom; bloom available for writing; bloom available for reading and writing.
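
As a Go sketch (the state names are guesses at the intent, not taken from the branch):

    type bloomState int32

    const (
        bloomNone      bloomState = iota // no filter: Has falls through to the blockstore
        bloomWriteOnly                   // rebuilding: new blocks must be added, but Has can't be answered yet
        bloomReadWrite                   // fully built: answers Has and keeps receiving writes
    )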

I started working on something in the feat/bc/bloom-rework branch.

@whyrusleeping added the help wanted label Sep 14, 2016
@flyingzumwalt added the status/deferred label Sep 26, 2016
@whyrusleeping (Member) commented
@Kubuxu can this be closed?

@Kubuxu (Member) commented Sep 3, 2017

Not really.
