## SPV Bloom filter construction: are false positives taken from the blockchain, or are they just due to matching with other addresses?

5

It is my understanding that the false positive addresses generated by an SPV bloom filter do not necessarily need to be on the blockchain (i.e., be addresses that have already been used). Is that a correct way of thinking about SPV bloom filters?

5

Bloom filters are probabilistic: each match you attempt against one has a specific chance of being a false positive. The false positive rate is determined by the construction of the filter (how wide it is, how many hash functions it uses, and how many elements have been added to it). A transaction either probably matches or certainly doesn't match a given bloom filter.
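To make the dependence on filter construction concrete, here is a minimal sketch using the standard textbook approximation for Bloom filters, p ≈ (1 - e^(-kn/m))^k. The function name and the parameter values are my own choices for illustration, not anything taken from BIP37.

```python
import math


def false_positive_rate(m_bits: int, k_hashes: int, n_items: int) -> float:
    """Approximate false positive probability of a Bloom filter.

    m_bits:   width of the filter in bits
    k_hashes: number of hash functions
    n_items:  number of elements already inserted
    """
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


# Same width and hash count, but more inserted elements: the rate goes up.
sparse = false_positive_rate(m_bits=1024, k_hashes=5, n_items=10)
crowded = false_positive_rate(m_bits=1024, k_hashes=5, n_items=200)
print(f"sparse={sparse:.8f} crowded={crowded:.8f}")
```

A client that wants more (or fewer) false positives can, in principle, adjust the width or the number of inserted elements accordingly.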

BIP37 does not include Bitcoin addresses in the filter matching system. Instead, for every transaction in the block, the filter is tested against the TXID, every data element in each output script, the outpoint spent by each input, and every data element in each input script. If any element matches, the transaction is sent to the client; whether it is a false positive or not is up to the client to decide.
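The matching described above can be sketched roughly as follows. This is a toy, not the real protocol: SHA-256 stands in for the MurmurHash3 that BIP37 actually uses, the transaction is a plain dict with hypothetical field names, and special cases such as filter update flags are ignored.

```python
import hashlib


class ToyBloomFilter:
    """Small Bloom filter; SHA-256 is an assumed stand-in for MurmurHash3."""

    def __init__(self, m_bits: int, k_hashes: int):
        self.m_bits = m_bits
        self.k_hashes = k_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: bytes):
        # Derive k bit positions by prefixing the item with the hash index.
        for i in range(self.k_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def tx_elements(tx: dict):
    """Yield every byte string the filter is tested against (hypothetical layout)."""
    yield tx["txid"]
    for script in tx["output_scripts"]:  # each script is a list of data elements
        yield from script
    yield from tx["outpoints"]           # serialized outpoints spent by the inputs
    for script in tx["input_scripts"]:
        yield from script


def matches(tx: dict, bloom: ToyBloomFilter) -> bool:
    """True if any element probably matches; False means certainly no match."""
    return any(bloom.might_contain(element) for element in tx_elements(tx))


wallet_filter = ToyBloomFilter(m_bits=4096, k_hashes=5)
pubkey_hash = bytes.fromhex("11" * 20)  # made-up 20-byte hash
wallet_filter.add(pubkey_hash)

paying_tx = {
    "txid": bytes(32),
    "output_scripts": [[pubkey_hash]],
    "outpoints": [],
    "input_scripts": [],
}
print(matches(paying_tx, wallet_filter))
```

Note that nothing here is an address: the filter only ever sees raw byte strings such as hashes, public keys, and outpoints.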

> It is my understanding that the false positive addresses generated by an SPV bloom filter do not necessarily need to be on the blockchain (i.e., be addresses that have already been used).

BIP37 SPV filtering will only ever return transactions that match the given filter, taken either from a block (together with the merkle tree path to the block header) or from the peer's memory pool. False positives aren't created out of nowhere; they are simply other transactions that happen to fit the criteria given by the client.

The intent was originally that clients would tune the level of false positive transactions to obscure their real intention from the peer they are talking to, but in practice this appears to be completely ineffective.

1

> false positive addresses ... do not need to necessarily be on the blockchain

You can think of bloom filter matching like a function.

```python
def matches(tx, filter):
    ...
```


You could construct a transaction for which this function returns true with a given filter, even though the filter was never meant to match that transaction, so false positives can exist outside of the blockchain data. In practice, though, a false positive you actually see will almost always come from the blockchain or the P2P network.
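As a rough illustration of that point, the sketch below brute-forces a byte string that was never inserted into a filter (and never appeared on any chain) yet still passes the membership test. Everything here is an assumption made for the demo: the deliberately tiny filter, SHA-256 as the hash, and the made-up element names.

```python
import hashlib

M_BITS, K_HASHES = 64, 2  # deliberately tiny so false positives are common


def positions(item: bytes):
    """Bit positions for an item; SHA-256 is an assumed stand-in hash."""
    return [
        int.from_bytes(hashlib.sha256(bytes([i]) + item).digest()[:8], "big") % M_BITS
        for i in range(K_HASHES)
    ]


def build_filter(items) -> int:
    bits = 0
    for item in items:
        for pos in positions(item):
            bits |= 1 << pos
    return bits


def might_contain(bits: int, item: bytes) -> bool:
    return all((bits >> pos) & 1 for pos in positions(item))


# Elements the "wallet" actually cares about (made-up values).
inserted = {f"wallet-element-{i}".encode() for i in range(8)}
bits = build_filter(inserted)

# Search for data that was never inserted but matches anyway.
forged = next(
    candidate
    for candidate in (f"never-broadcast-{n}".encode() for n in range(100_000))
    if candidate not in inserted and might_contain(bits, candidate)
)
print(forged)
```

With a realistically sized filter the search takes longer, but the idea is the same: the filter has no notion of whether the data it is tested against exists on the blockchain.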