The use of Bloom filters in SPV nodes

9

I know what a Bloom filter (BF) is, and I'm aware of this post and this paper, but I don't understand how it's used in Bitcoin.

I can imagine that an SPV node encodes the transaction(s) it's interested in into a BF and sends it to a full node. What I don't understand is what the full node does after that.

Does it traverse all the previous transactions, check which ones match the BF, and send those to the SPV node?

It's also not clear how private this approach is: in the end, the full node retrieves what the SPV node wants and sends it to that node, so it knows what the SPV node is interested in.


If the goal is to preserve the privacy of the SPV node's query, then more sophisticated approaches such as private information retrieval or oblivious RAM would have to be used, but they are not efficient enough for the Bitcoin setting.

4xx

Posted 2017-07-30T16:20:16.060

Reputation: 91

Answers

5

I can imagine that an SPV node encodes the transaction(s) it's interested in into a BF and sends it to a full node. What I don't understand is what the full node does after that.

The full node goes through every transaction on the blockchain (or, at least, every transaction that happened after the wallet was created). For each transaction, it checks the following things against the Bloom filter, in order:

  1. Test the hash of the transaction itself.
  2. For each output, test each data element of the output script. This means each hash and key in the output script is tested independently. Important: if an output matches whilst testing a transaction, the node might need to update the filter by inserting the serialized COutPoint structure. See below for more details.
  3. For each input, test the serialized COutPoint structure.
  4. For each input, test each data element of the input script (note: input scripts only ever contain data elements).
  5. Otherwise there is no match.

(Source.)

If a transaction matches, it's sent to the SPV node. The SPV node then discards the false positives, i.e. the matches that don't actually involve its wallet.
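Here's a rough Python sketch of the matching steps listed above, in case it helps to see them as code. It's a toy, not the real implementation: the `BloomFilter` class, the `Tx` layout and the `outpoint` helper are names I made up for illustration, SHA-256 stands in for the MurmurHash3 function that BIP 37 actually uses, and the filter is always updated on an output match (the real protocol makes that update depend on a flag set by the client).

```python
import hashlib
from dataclasses import dataclass
from typing import Iterator, List


class BloomFilter:
    """Toy Bloom filter. SHA-256 is a stand-in for BIP 37's MurmurHash3."""

    def __init__(self, size_bits: int, num_hashes: int) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes) -> Iterator[int]:
        # Derive one bit position per hash function by salting with its index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "little") + item).digest()
            yield int.from_bytes(digest[:8], "little") % self.size_bits

    def insert(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


@dataclass
class Tx:
    """Hypothetical, heavily simplified transaction layout."""
    txid: bytes
    output_script_data: List[List[bytes]]  # per output: data elements in its script
    input_outpoints: List[bytes]           # per input: serialized outpoint it spends
    input_script_data: List[List[bytes]]   # per input: data elements in its script


def outpoint(txid: bytes, index: int) -> bytes:
    """Serialize (txid, output index) the way this sketch expects."""
    return txid + index.to_bytes(4, "little")


def tx_matches(f: BloomFilter, tx: Tx) -> bool:
    matched = f.contains(tx.txid)                             # step 1
    for index, elements in enumerate(tx.output_script_data):  # step 2
        if any(f.contains(e) for e in elements):
            matched = True
            # Insert the outpoint so a later spend of this output also matches.
            f.insert(outpoint(tx.txid, index))
    if matched:
        return True
    if any(f.contains(op) for op in tx.input_outpoints):      # step 3
        return True
    for elements in tx.input_script_data:                     # step 4
        if any(f.contains(e) for e in elements):
            return True
    return False                                              # step 5


# The SPV node builds the filter from what its wallet cares about.
wallet_filter = BloomFilter(size_bits=4096, num_hashes=3)
wallet_filter.insert(b"alice-pubkey-hash")

funding = Tx(b"\x01" * 32, [[b"alice-pubkey-hash"]],
             [outpoint(b"\xaa" * 32, 0)], [[b"sig", b"carol-pubkey"]])
spend = Tx(b"\x02" * 32, [[b"bob-pubkey-hash"]],
           [outpoint(funding.txid, 0)], [[b"sig2", b"some-pubkey"]])

# The full node tests transactions in order and forwards the ones that match.
to_send = [tx for tx in (funding, spend) if tx_matches(wallet_filter, tx)]
```

The point of the outpoint insertion in step 2 is that `spend` contains nothing the wallet originally put in the filter; it only matches because processing `funding` first added the outpoint it spends.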

If the goal is to preserve the privacy of the SPV node's query, then more sophisticated approaches such as private information retrieval or oblivious RAM would have to be used, but they are not efficient enough for the Bitcoin setting.

I don't know of any private protocol that doesn't require the full node to scan through the entire blockchain for every request.

If you know of one, let me know. :) Always interested to hear about that kind of thing.

Nick ODell

Posted 2017-07-30T16:20:16.060

Reputation: 27 521

Thanks for the answer and source. The links I provided in my question say the Bloom filter is also used to hide (to some extent) the query of SPV nodes, but it's not clear to me how that's achieved. – 4xx – 2017-07-30T20:57:40.523

Bloom filters naturally have false positives. If a key matches a Bloom filter, you can't tell whether it matches because that key is in the user's wallet or because it's a random false positive. However, there are ways to undermine this, like checking both the key and the pubkeyhash, as the paper you linked describes. – Nick ODell – 2017-07-30T21:00:11.493
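To make the false-positive point concrete, here is a small Python experiment. It's only a sketch under my own assumptions: the filter is a plain integer bitmask, SHA-256 replaces the hash function Bitcoin actually uses, and the key names and parameters (10 wallet keys, 64 bits, 2 hash functions) are invented. It compares the usual approximation for the false-positive rate, (1 - e^(-kn/m))^k, with the rate observed on keys that were never inserted.

```python
import hashlib
import math
from typing import Iterable, List


def bit_positions(item: bytes, size_bits: int, num_hashes: int) -> List[int]:
    # One position per hash function; the index byte acts as a salt.
    return [
        int.from_bytes(hashlib.sha256(bytes([i]) + item).digest()[:8], "little")
        % size_bits
        for i in range(num_hashes)
    ]


def make_filter(items: Iterable[bytes], size_bits: int, num_hashes: int) -> int:
    bits = 0
    for item in items:
        for pos in bit_positions(item, size_bits, num_hashes):
            bits |= 1 << pos
    return bits


def matches(bits: int, item: bytes, size_bits: int, num_hashes: int) -> bool:
    return all((bits >> pos) & 1
               for pos in bit_positions(item, size_bits, num_hashes))


def expected_fp_rate(n_items: int, size_bits: int, num_hashes: int) -> float:
    # Standard approximation: (1 - e^(-k*n/m))^k
    return (1 - math.exp(-num_hashes * n_items / size_bits)) ** num_hashes


SIZE_BITS, NUM_HASHES = 64, 2
wallet_keys = [f"wallet-key-{i}".encode() for i in range(10)]
other_keys = [f"other-key-{i}".encode() for i in range(10_000)]

bits = make_filter(wallet_keys, SIZE_BITS, NUM_HASHES)

# No false negatives: every wallet key matches.
wallet_hits = sum(matches(bits, k, SIZE_BITS, NUM_HASHES) for k in wallet_keys)
# False positives: keys that were never inserted but match anyway.
decoy_hits = sum(matches(bits, k, SIZE_BITS, NUM_HASHES) for k in other_keys)

observed = decoy_hits / len(other_keys)
predicted = expected_fp_rate(len(wallet_keys), SIZE_BITS, NUM_HASHES)
print(f"wallet keys matched: {wallet_hits}/{len(wallet_keys)}")
print(f"decoys matched: {decoy_hits}, observed rate {observed:.3f}, "
      f"predicted about {predicted:.3f}")
```

From the full node's side, the matching set is the wallet keys mixed in with the decoys, and the filter alone doesn't say which is which. A smaller or more heavily loaded filter gives more decoys (and more bandwidth wasted on false positives); a larger one gives fewer decoys and less cover.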