Skip to content

DavidFluck/rust-bloom-filter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Rust Bloom Filter

Introduction

A Bloom filter is a space-efficient, probabilistic set data structure. False positives are possible, but false negatives are not, making it useful as a negative cache.

A Bloom filter consists of a bit array of length m and a number, k, of hash functions, h (in practice the same hash function with a different seed, typically 0 to k - 1). To insert a given key K, we perform h(K) k - 1 times. For each h(K), we constrain it modulo m, which gives us a value in the range [0, m - 1]. (This technically introduces some bias, but in practice, it's rarely an issue.) We set this bit in the bit array for each iteration until we complete k - 1 rounds. The result is a bit vector with up to k bits set.

Lookup is almost identical to insertion, except we leave the bits alone and only check if each one is set. For each k, we again perform h(K), each time checking if that corresponding bit is set. If it is, we continue; if not, we break out of the loop and return a falsey value, meaning the key is definitely not in the set. If we don't fail, however, we can say that the key is probably in the set. We can't say it with certainty because it's possible that one or more other keys set the same bits as the given key. This probability changes based on the values for m and k, and there is a way to work out the optimal values for a desired false positive rate.

Features

Future Work

So far, we can only use strings as hash keys. I would like to support hashing any generically encodable type.

is_member returns either true or false. I would like to wrap this in an Option<T>, or perhaps something a bit more complex, such as a type that can represent either "not in the set" or "probably in the set, with this probability of a false positive".

About

Basic bloom filter written in Rust.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages