## What indexes does a Bitcoin Core node maintain to serve Bloom-filtered requests and SPV peers?

The Wiki makes this statement:

It [getdata message] can be used to retrieve transactions, but only if they are in the memory pool or relay set - arbitrary access to transactions in the chain is not allowed to avoid having clients start to depend on nodes having full transaction indexes (which modern nodes do not). [my emphasis]

https://en.bitcoin.it/wiki/Protocol_documentation#getdata

Yet "Connection Bloom filtering" (BIP-37) seems to require just such an index system:

The filter can be tested against arbitrary pieces of data, to see if that data was inserted by the client. Therefore the question arises of what pieces of data should be inserted/tested.

To determine if a transaction matches the filter, the following algorithm is used. Once a match is found the algorithm aborts.

An ordered list of criteria is then given.

It appears that a node supporting such a filter would also need to maintain an index (or several), or risk denial of service attacks.

I saw this question:

70% of nodes accept Bloom filters, despite DoS attack vector?

I also saw this question:

Is there a way to index transactions so that filterload commands can be answered without iterating through all transactions in a block?

the only answer to which implies, without stating explicitly, that no indexes are created at all.

What indexes does a node create for the sole purpose of supporting Bloom filtering for SPV nodes?

If none are created and transactions are filtered on the fly, how can this statement in the linked question above be true?

Thus you can easily trigger exactly the same DoS attack by just using regular getdata requests on large blocks over and over. You don't need Bloom filtering. If you don't want to actually download the blocks just don't TCP ACK the packets and then FIN after a few seconds .... the data will all have been loaded and be sitting in the send buffers.

Bitcoin Core maintains an index of blocks and their locations on disk. When a peer requests a block, the node reads the block from disk and, if the peer has loaded a BIP 37 filter, runs the block's transactions through that filter. The block index is required for normal node operation; no other indexes are created. The only index-like structure is the mempool, which is kept solely in memory. That is where individually requested transactions come from; confirmed transactions are not in the mempool, so they cannot be relayed on their own.
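To make the distinction concrete, here is a minimal sketch of the two lookups described above. It is not Bitcoin Core's code: the `Node` and `BlockLocation` classes, the 4-byte length prefix, and the in-memory "files" are all assumptions made for illustration. The point is only that blocks are found by hash through an index of disk positions, while a transaction request can only be answered from the mempool.

```python
from dataclasses import dataclass, field


@dataclass
class BlockLocation:
    file_number: int  # which block file holds the block (hypothetical layout)
    offset: int       # byte position of the block record inside that file


@dataclass
class Node:
    block_index: dict = field(default_factory=dict)  # block hash -> BlockLocation
    block_files: dict = field(default_factory=dict)  # file number -> bytearray (stand-in for disk)
    mempool: dict = field(default_factory=dict)      # txid -> raw unconfirmed transaction

    def store_block(self, block_hash, raw_block, file_number=0):
        data = self.block_files.setdefault(file_number, bytearray())
        self.block_index[block_hash] = BlockLocation(file_number, len(data))
        # Length-prefixed record so the block can be read back from its offset.
        data += len(raw_block).to_bytes(4, "little") + raw_block

    def get_block(self, block_hash):
        loc = self.block_index.get(block_hash)
        if loc is None:
            return None
        data = self.block_files[loc.file_number]
        size = int.from_bytes(data[loc.offset:loc.offset + 4], "little")
        start = loc.offset + 4
        return bytes(data[start:start + size])

    def get_tx(self, txid):
        # There is no txid -> block mapping, so only the mempool is consulted.
        return self.mempool.get(txid)


node = Node()
node.store_block("blockhash-1", b"raw block one")
node.store_block("blockhash-2", b"raw block two")
node.mempool["txid-unconfirmed"] = b"raw tx"

print(node.get_block("blockhash-2"))    # found through the block index
print(node.get_tx("txid-unconfirmed"))  # found in the mempool
print(node.get_tx("txid-confirmed"))    # None: nothing maps a confirmed txid to its block
```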

For BIP 37, data is tested against the filter only at the point where it is about to be relayed to a peer that has set a filter. The only things that will ever be relayed are unconfirmed transactions and entire blocks.

If none are created and transactions are filtered on the fly, how can this statement in the linked question above be true?

The statement is true because a node can set a filter and request blocks and unconfirmed transactions from the node. The node will then pass those blocks and transactions through the filter before relaying them.
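Filtering "on the fly" can be sketched as a plain linear scan over a block that has already been loaded. The code below is an illustration under stated assumptions, not the BIP 37 wire format: `SimpleBloom` uses SHA-256 as a stand-in hash (BIP 37 specifies MurmurHash3 with a per-filter tweak), and each transaction is reduced to a hypothetical dict holding its txid and the data elements that would be tested.

```python
import hashlib


class SimpleBloom:
    """Toy Bloom filter; the hash function is a stand-in, not BIP 37's."""

    def __init__(self, n_bits=8192, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8)

    def _positions(self, element: bytes):
        for i in range(self.n_hashes):
            digest = hashlib.sha256(bytes([i]) + element).digest()
            yield int.from_bytes(digest[:4], "little") % self.n_bits

    def insert(self, element: bytes):
        for p in self._positions(element):
            self.bits[p // 8] |= 1 << (p % 8)

    def contains(self, element: bytes) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(element))


def filter_block(block_txs, bloom):
    """Test every transaction in the block; no per-transaction index is consulted."""
    return [
        tx["txid"]
        for tx in block_txs
        if any(bloom.contains(element) for element in tx["elements"])
    ]


# The peer's filter, as it would be after a filterload.
bloom = SimpleBloom()
bloom.insert(b"hash160-bob")

# A block already read from disk via the block index.
block = [
    {"txid": "aa", "elements": [b"hash160-alice"]},
    {"txid": "bb", "elements": [b"hash160-bob", b"hash160-carol"]},
    {"txid": "cc", "elements": [b"hash160-dave"]},
]

print(filter_block(block, bloom))
```

The work is proportional to the number of transactions and data elements in the block, which is why the cost shows up as CPU per request rather than as extra storage.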

Fragmenting transactions and running the pieces through a Bloom filter as BIP-37 recommends consumes node resources, doesn't it? If a node just sends without filtering, then those resources need not be consumed. Doesn't filtering transactions on the fly open a DoS attack vector that doesn't exist when blocks and mempool are sent without filtering? – Rich Apodaca – 2017-08-13T18:08:32.773

Yes, it does consume resources and can potentially lead to DoS attacks. There are also ways to mitigate DoS attacks such as rate limiting. That is why most developers consider BIP 37 to be bad, but so far, it is the only thing that supports lightweight wallets in a decentralized manner. – Andrew Chow – 2017-08-13T18:18:11.667

The design of BIP 37 is more or less incompatible with any kind of indexing in any case, as the serving node is required to continually modify the filter, in order, on a transaction-by-transaction basis. – G. Maxwell – 2018-07-31T03:37:47.130
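A rough sketch of why that per-transaction mutation defeats a precomputed index: BIP 37 defines update flags (`BLOOM_UPDATE_NONE`, `BLOOM_UPDATE_ALL`, `BLOOM_UPDATE_P2PUBKEY_ONLY`), and under `BLOOM_UPDATE_ALL` a matching output causes its outpoint to be added to the filter so that a later spend also matches. The code below models only that behaviour. As a loud assumption, a Python `set` stands in for the Bloom filter (so there are no false positives), and the transaction dicts are hypothetical.

```python
def filter_block_stateful(block_txs, watched, update=True):
    """Scan transactions in block order, optionally updating the filter on matches."""
    watched = set(watched)  # copy: the per-peer filter is mutated during the scan
    matched = []
    for tx in block_txs:
        hit = False
        for n, script_data in enumerate(tx["outputs"]):
            if script_data in watched:
                hit = True
                if update:
                    # Remember the outpoint so a later spend of this output matches too.
                    watched.add((tx["txid"], n))
        if any(outpoint in watched for outpoint in tx["inputs"]):
            hit = True
        if hit:
            matched.append(tx["txid"])
    return matched


block = [
    {"txid": "a1", "inputs": [("f0", 0)], "outputs": ["alice-key-hash"]},
    {"txid": "b2", "inputs": [("a1", 0)], "outputs": ["carol-key-hash"]},
    {"txid": "c3", "inputs": [("e9", 1)], "outputs": ["dave-key-hash"]},
]

print(filter_block_stateful(block, {"alice-key-hash"}))                # ['a1', 'b2']
print(filter_block_stateful(block, {"alice-key-hash"}, update=False))  # ['a1']
```

Whether `b2` matches depends on what happened to the filter while `a1` was processed, so the answer for one transaction cannot be looked up independently of the peer's filter state and the transactions before it.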