Is the SPV client model scalable?



SPV clients typically connect to full node peers and set filters on what information they download. How many SPV peers can a single full node support, typically?

Is it reasonable to expect millions of SPV node users connecting to ~6,000 full nodes?


Posted 2015-05-11T14:31:25.240




Generally speaking, BIP37 Bloom-filtering SPV scales atrociously, though it is hard to say exactly how poorly in the real world.

  • Every peer must sync the entire block chain from the point at which it last had contact with the network; in the worst case this is approximately 50GB. The node must load every single block from disk, filter it to the client's specifications and return the result. The amount will grow until the end of Bitcoin or time, whichever comes first. Without some protocol changes, pruning is incompatible with BIP37 SPV, as it expects all blocks to be available at all hosts.

  • For every wallet which is synchronized, each incoming block and transaction must be individually filtered. This involves a non-negligible amount of CPU time, and must be done for each peer in turn for every inventory item.

    • It is unclear exactly how much CPU power an average node has, but at least some portion of nodes are running on bottom-of-the-barrel hardware like a Raspberry Pi, which is very unlikely to be able to happily serve more than a couple of peers at a time.

      Similarly, a large portion of listening nodes are hosted on budget VPS providers which offer a single shared core and extremely poor performance. OVH has 300, Hetzner has 300, Digital Ocean has 124, and CloudAtCost is truly at the bottom of the pile with 6.

  • BIP37 is vulnerable to trivial denial of service attacks; demonstration code is available that can cripple nodes by issuing rapid inventory requests through filters, causing continuous disk seeks and high CPU usage. It is tempting to say that clients could use proof of work (but this is impossible on a battery-powered device like a phone) or micropayments (impossible if a node doesn't know it has received money yet), but neither really offers a clear solution.

  • There are also likely far fewer than 6000 listening nodes: some are pruned, which is fairly useless for an SPV client to sync with, and many are listening on both IPv4 and IPv6 and are counted twice in that number as a result. The real number of actual nodes is probably more along the lines of 5000, and the number of them not on dialup-speed uplinks is substantially lower.
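To make the per-peer filtering cost concrete, here is a rough Python sketch of what a serving node has to do for each block. It is not BIP37's actual implementation: the real protocol uses MurmurHash3 with a client-supplied tweak, whereas this toy filter substitutes SHA-256 for simplicity, and the names `ToyBloomFilter` and `filter_block_for_peers` are made up for illustration. The point is only that the number of membership tests grows with (connected peers) × (transaction elements in the block).

```python
import hashlib


class ToyBloomFilter:
    """Simplified Bloom filter. Assumption: SHA-256 stands in for BIP37's MurmurHash3."""

    def __init__(self, size_bits: int, num_hashes: int):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by prefixing the item with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "little") + item).digest()
            yield int.from_bytes(digest[:8], "little") % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False positives are possible, false negatives are not.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def filter_block_for_peers(block, peer_filters):
    """block: list of transactions, each a list of byte-string elements.

    Returns (matching tx indexes per peer, total membership tests performed).
    """
    tests = 0
    matches = []
    for bloom in peer_filters:          # repeated in full for every connected peer
        hits = []
        for tx_index, tx in enumerate(block):
            for element in tx:
                tests += 1
                if bloom.might_contain(element):
                    hits.append(tx_index)
                    break               # one matching element is enough for this tx
        matches.append(hits)
    return matches, tests
```

With a block of a couple of thousand transactions and a hundred filtered peers, this loop runs hundreds of thousands of hash-based tests per block, before counting the disk reads needed to serve historical blocks.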

Is it reasonable to expect millions of SPV node users connecting to ~6,000 full nodes?

Absolutely not, if only because the code defaults to a maximum of 117 incoming connections, which would be around half a million total available sockets in the network (the majority of which are already consumed today).
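The socket arithmetic can be checked with a few lines of Python. The 117-connection default and the 5000 to 6000 node range come from the answer above; the figure of 4 peers per SPV client is purely an assumption for illustration, as are the function names.

```python
MAX_INBOUND_PER_NODE = 117  # default incoming connection limit cited in the answer


def total_inbound_slots(listening_nodes: int, per_node: int = MAX_INBOUND_PER_NODE) -> int:
    """Upper bound on incoming sockets across the whole network."""
    return listening_nodes * per_node


def max_simultaneous_clients(listening_nodes: int, peers_per_client: int) -> int:
    """Clients that fit if every slot were free (it is not; most are already in use)."""
    return total_inbound_slots(listening_nodes) // peers_per_client


print(total_inbound_slots(5000))          # 585000
print(total_inbound_slots(6000))          # 702000
print(max_simultaneous_clients(5000, 4))  # 146250, assuming 4 peers per client
```

Even under the generous assumption that every slot is free, the ceiling is in the low hundreds of thousands of simultaneously connected clients, well short of millions.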


Posted 2015-05-11T14:31:25.240
