How to construct the Filter

Published by Mario Oettler on

Last Updated on 3. May 2023 by Martin Schuster

BIP157 doesn’t explicitly say which filters are suitable. The standard implementation, though, is BIP158 which describes Golomb Coded Sets.

In this topic, you learn what a Golomb Coded Set is. Creating such a set consists of three basic steps.

Step 1 – Hashing: The full node hashes the transactions to the range [0,F), where F = N * M.

N: number of items,

M: False-Positive parameter.

BIP158 proposes SipHash as hash function. In a Golomb Coded Set it is possible but not necessary to use cryptographic hash functions like SHA-256.

Step 2 –Truncating: The full node then truncates the result to a length of 64 bit. The easiest way would be using modulo. But since this involves “expensive” division operation, the BIP-authors chose another way. They multiply the result with a factor Fa and cut the redundant bits. This is faster than the modulo operation.

Step 3 – Filling. In this step, a technique called Golomb-Rice-Coding is applied by the filter creators. This method is space-efficient in order to store data if they are uniformly distributed. First, the data are sorted. Then the difference between the sorted values is calculated. Only the difference is stored in the filter. You can find further details about Golomb Rice here: https://giovanni.bajo.it/post/47119962313/golomb-coded-sets-smaller-than-bloom-filters

Block-filters have a link to their ancestor, similar like blocks in a blockchain. This is achieved by hash pointers. Each filter contains the hash of the previous filter. Using hash-pointers makes it easier to verify filters and detect splits in a blockchain (forks).

Categories: