public class BlockSplitBloomFilter extends Object implements BloomFilter
Nested classes inherited from interface BloomFilter: BloomFilter.Algorithm, BloomFilter.Compression, BloomFilter.HashStrategy

| Modifier and Type | Field and Description |
|---|---|
| static double | DEFAULT_FPP |
| static int | HEADER_SIZE |
| static int | LOWER_BOUND_BYTES |
| static int | UPPER_BOUND_BYTES |

| Constructor and Description |
|---|
| BlockSplitBloomFilter(byte[] bitset): Constructs the Bloom filter from the given bitset; used when reconstructing a Bloom filter from a Parquet file. |
| BlockSplitBloomFilter(int numBytes): Constructs a block-based Bloom filter. |
| BlockSplitBloomFilter(int numBytes, int maximumBytes): Constructs a block-based Bloom filter. |
| BlockSplitBloomFilter(int numBytes, int minimumBytes, int maximumBytes, BloomFilter.HashStrategy hashStrategy): Constructs a block-based Bloom filter. |

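The constructors that take numBytes normalize the requested size: the value is brought into the allowed byte range and rounded up to a power of 2. The following is a minimal sketch of that rule, not the library's code. The class BitsetSizing, the method effectiveNumBytes, and the order of the steps (raise to the minimum, round up, then cap at the maximum) are assumptions made for illustration.

```java
// Hypothetical helper, not part of the Parquet API. It mirrors the sizing
// rule described for the numBytes constructors under the stated assumptions.
final class BitsetSizing {
    private BitsetSizing() {}

    // Assumes minimumBytes and maximumBytes are positive and minimumBytes <= maximumBytes.
    static int effectiveNumBytes(int numBytes, int minimumBytes, int maximumBytes) {
        // Raise requests that fall below the lower bound.
        int size = Math.max(numBytes, minimumBytes);
        // Round up to the next power of two if the size is not one already.
        if ((size & (size - 1)) != 0) {
            size = Integer.highestOneBit(size) << 1;
        }
        // Cap at the upper bound; a negative value means the shift overflowed.
        if (size > maximumBytes || size < 0) {
            size = maximumBytes;
        }
        return size;
    }
}
```

Under these assumptions, a request of 100 bytes with bounds of 32 and 1024 bytes would yield a 128-byte bitset, while a request of 5000 bytes would be capped at 1024.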
| Modifier and Type | Method and Description |
|---|---|
| boolean | canMergeFrom(BloomFilter otherBloomFilter): Determines whether a given Bloom filter can be merged into this Bloom filter. |
| boolean | equals(Object object): Compares this Bloom filter to the specified object. |
| boolean | findHash(long hash): Determines whether an element may be in the set (false positives are possible). |
| BloomFilter.Algorithm | getAlgorithm(): Returns the algorithm that the Bloom filter applies. |
| int | getBitsetSize(): Gets the number of bytes in this Bloom filter's bitset. |
| BloomFilter.Compression | getCompression(): Returns the compression algorithm that the Bloom filter applies. |
| BloomFilter.HashStrategy | getHashStrategy(): Returns the hash strategy that the Bloom filter applies. |
| long | hash(Binary value): Computes the hash of a Binary value from its plain encoding. |
| long | hash(double value): Computes the hash of a double value from its plain encoding. |
| long | hash(float value): Computes the hash of a float value from its plain encoding. |
| long | hash(int value): Computes the hash of an int value from its plain encoding. |
| long | hash(long value): Computes the hash of a long value from its plain encoding. |
| long | hash(Object value): Computes the hash of an Object value from its plain encoding. |
| void | insertHash(long hash): Inserts an element into the Bloom filter; the element is represented by the hash of its plain encoding. |
| void | merge(BloomFilter otherBloomFilter): Merges another Bloom filter into this one by performing a bitwise OR of the underlying bitsets. |
| static int | optimalNumOfBits(long n, double p): Calculates the optimal size from the number of distinct values and the false positive probability. |
| void | writeTo(OutputStream out): Writes the Bloom filter to an output stream. |

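To make the contracts of insertHash, findHash, canMergeFrom, and merge concrete, here is a toy Bloom filter over a plain byte array. It is a simplified sketch and not the split-block layout this class implements. ToyBloomFilter, NUM_PROBES, and the double-hashing scheme in bitIndex are hypothetical, and compatibility in canMergeFrom is assumed to mean equal bitset sizes only.

```java
// Toy Bloom filter for illustration only; NOT Parquet's split-block algorithm.
final class ToyBloomFilter {
    // Hypothetical number of bits set per inserted hash.
    private static final int NUM_PROBES = 4;

    private final byte[] bitset;

    ToyBloomFilter(int numBytes) {
        // A power-of-two size lets a bit mask stand in for a modulo.
        if (numBytes <= 0 || (numBytes & (numBytes - 1)) != 0) {
            throw new IllegalArgumentException("numBytes must be a positive power of two");
        }
        this.bitset = new byte[numBytes];
    }

    // Derives the i-th bit position from a 64-bit hash via simple double hashing.
    private int bitIndex(long hash, int i) {
        long step = (hash >>> 32) | 1L; // odd step so successive probes differ
        long numBits = (long) bitset.length * 8;
        return (int) ((hash + i * step) & (numBits - 1));
    }

    // Sets every probe bit for the given hash.
    void insertHash(long hash) {
        for (int i = 0; i < NUM_PROBES; i++) {
            int bit = bitIndex(hash, i);
            bitset[bit >>> 3] |= (byte) (1 << (bit & 7));
        }
    }

    // Returns false if the hash was definitely never inserted, true if it may have been.
    boolean findHash(long hash) {
        for (int i = 0; i < NUM_PROBES; i++) {
            int bit = bitIndex(hash, i);
            if ((bitset[bit >>> 3] & (1 << (bit & 7))) == 0) {
                return false;
            }
        }
        return true;
    }

    // Assumption: two toy filters are compatible when their bitsets have equal length.
    boolean canMergeFrom(ToyBloomFilter other) {
        return other != null && other.bitset.length == bitset.length;
    }

    // Bitwise OR of the bitsets: the result reports every hash inserted into either filter.
    void merge(ToyBloomFilter other) {
        if (!canMergeFrom(other)) {
            throw new IllegalArgumentException("incompatible Bloom filter");
        }
        for (int i = 0; i < bitset.length; i++) {
            bitset[i] |= other.bitset[i];
        }
    }
}
```

The OR-based merge works because an inserted hash only ever sets bits. After merging, every bit set in either input remains set, so a lookup for any previously inserted hash still succeeds.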
public static final int LOWER_BOUND_BYTES
public static final int UPPER_BOUND_BYTES
public static final int HEADER_SIZE
public static final double DEFAULT_FPP
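These constants relate to sizing: optimalNumOfBits(n, p) derives a size from the number of distinct values and the false positive probability, and the byte bounds limit how small or large a bitset may be. The sketch below uses the textbook Bloom filter estimate m = -k * n / ln(1 - p^(1/k)) for k bits set per value. The exact formula and rounding used by the library are not stated here, so the class BloomSizing, both method names, and the parameter k are assumptions.

```java
// Hypothetical sizing helper, not part of the Parquet API.
final class BloomSizing {
    private BloomSizing() {}

    // Textbook estimate of the bits needed for n distinct values at false
    // positive probability p when each value sets k bits.
    static long estimateNumBits(long n, double p, int k) {
        if (n < 0 || k <= 0 || !(p > 0.0 && p < 1.0)) {
            throw new IllegalArgumentException("require n >= 0, k > 0, and 0 < p < 1");
        }
        double m = -k * (double) n / Math.log(1.0 - Math.pow(p, 1.0 / k));
        return (long) Math.ceil(m);
    }

    // Clamps a bit count into the range implied by lower and upper byte bounds.
    static long clampToByteBounds(long numBits, int lowerBoundBytes, int upperBoundBytes) {
        long lowerBits = (long) lowerBoundBytes * 8;
        long upperBits = (long) upperBoundBytes * 8;
        return Math.max(lowerBits, Math.min(upperBits, numBits));
    }
}
```

A stricter false positive probability or a larger number of distinct values increases the estimate, which is why the result is then limited by an upper bound.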

public BlockSplitBloomFilter(int numBytes)
Parameters:
numBytes - The number of bytes for the Bloom filter bitset. The value should be within [LOWER_BOUND_BYTES, UPPER_BOUND_BYTES]; a value outside this range is moved to the nearest bound. The size is also rounded up to a power of 2. This constructor uses XXH64 as its hash function.

public BlockSplitBloomFilter(int numBytes, int maximumBytes)
Parameters:
numBytes - The number of bytes for the Bloom filter bitset. The value should be within [LOWER_BOUND_BYTES, maximumBytes]; a value outside this range is moved to the nearest bound. The size is also rounded up to a power of 2. This constructor uses XXH64 as its hash function.
maximumBytes - The maximum number of bytes of the Bloom filter.

public BlockSplitBloomFilter(int numBytes, int minimumBytes, int maximumBytes, BloomFilter.HashStrategy hashStrategy)
Parameters:
numBytes - The number of bytes for the Bloom filter bitset. The value should be within [minimumBytes, maximumBytes]; a value outside this range is moved to the nearest bound. The size is also rounded up to a power of 2.
minimumBytes - The minimum number of bytes of the Bloom filter.
maximumBytes - The maximum number of bytes of the Bloom filter.
hashStrategy - The hash strategy adopted by the Bloom filter.

public BlockSplitBloomFilter(byte[] bitset)
Parameters:
bitset - The bitset from which to construct the Bloom filter.

public void writeTo(OutputStream out) throws IOException
Specified by: writeTo in interface BloomFilter
Parameters:
out - the output stream to write to
Throws: IOException

public void insertHash(long hash)
Specified by: insertHash in interface BloomFilter
Parameters:
hash - the hash of the element

public boolean findHash(long hash)
Specified by: findHash in interface BloomFilter
Parameters:
hash - the hash of the element's plain encoding

public static int optimalNumOfBits(long n, double p)
Parameters:
n - The number of distinct values.
p - The false positive probability.

public int getBitsetSize()
Specified by: getBitsetSize in interface BloomFilter

public long hash(Object value)
Specified by: hash in interface BloomFilter
Parameters:
value - the value to hash

public boolean equals(Object object)
Specified by: equals in interface BloomFilter
Overrides: equals in class Object

public BloomFilter.HashStrategy getHashStrategy()
Specified by: getHashStrategy in interface BloomFilter

public BloomFilter.Algorithm getAlgorithm()
Specified by: getAlgorithm in interface BloomFilter

public BloomFilter.Compression getCompression()
Specified by: getCompression in interface BloomFilter

public long hash(int value)
Specified by: hash in interface BloomFilter
Parameters:
value - the value to hash

public long hash(long value)
Specified by: hash in interface BloomFilter
Parameters:
value - the value to hash

public long hash(double value)
Specified by: hash in interface BloomFilter
Parameters:
value - the value to hash

public long hash(float value)
Specified by: hash in interface BloomFilter
Parameters:
value - the value to hash

public long hash(Binary value)
Specified by: hash in interface BloomFilter
Parameters:
value - the value to hash

public boolean canMergeFrom(BloomFilter otherBloomFilter)
Specified by: canMergeFrom in interface BloomFilter
Parameters:
otherBloomFilter - The Bloom filter to merge with this Bloom filter.

public void merge(BloomFilter otherBloomFilter) throws IOException
Specified by: merge in interface BloomFilter
Parameters:
otherBloomFilter - The Bloom filter to merge with this Bloom filter.
Throws: IOException

Copyright © 2024 The Apache Software Foundation. All rights reserved.