Wouldn't bitcoin benefit from distributing partial blockchains over each node?

3

1

Why doesn't each node store only a portion of the blockchain?

For example, if you had 100 nodes, each node would store 10% of the blockchain. This would amount to significant overlap of each aspect of the blockchain, and as nodes came and left there would be increased or decreased sharing of the blockchain.

Is it infeasible? If so, why? I am interested in pursuing this as a project because I think it could solve a lot of bitcoin's scalability/decentralization issues.

From my understanding of the blockchain, let's assume a particular node starts at the genesis block. Let's assume another node stores 1% of the blockchain past the genesis block. Couldn't multiple nodes verify block hashes for their shared 9% at a rate of once every 10 minutes? It wouldn't really use that much data -- you could hash the 9% of the blockchain to a particular value, produce a public key, and see if the other nodes can match a private key to said public key.

A related question: Is this what Electrum does? Or, does Electrum simply store the entire blockchain on multiple servers that you can connect to?

2

Why doesn't each node store only a portion of the blockchain?

This has been proposed before and is possible with Bitcoin, but it's not clear how it would be executed. The blocks aren't an important part of a running node; for most people, you can throw them out as soon as you have processed them into your UTXO database.
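As a rough illustration of "processing blocks into a UTXO database", here is a minimal Python sketch. The `Tx` type, the `apply_block` helper, and the toy blocks are all hypothetical simplifications (no scripts, signatures, or proof-of-work checks), not Bitcoin Core's actual data structures. It only shows why the raw block is no longer needed once its effects have been applied.

```python
from typing import Dict, List, NamedTuple, Tuple

# Hypothetical, simplified types: real transactions also carry scripts,
# signatures, and more. An outpoint is (txid, output index).
OutPoint = Tuple[str, int]


class Tx(NamedTuple):
    txid: str
    inputs: List[OutPoint]  # outpoints this transaction spends
    outputs: List[int]      # output values in satoshis


def apply_block(utxo: Dict[OutPoint, int], block: List[Tx]) -> None:
    """Update the UTXO set in place with the effects of one block."""
    for tx in block:
        for outpoint in tx.inputs:
            if outpoint not in utxo:
                raise ValueError(f"missing or already spent output: {outpoint}")
            del utxo[outpoint]  # spent outputs leave the set
        for index, value in enumerate(tx.outputs):
            utxo[(tx.txid, index)] = value  # new outputs join the set


utxo: Dict[OutPoint, int] = {}
block1 = [Tx("a", [], [50])]              # coinbase-like tx: no inputs
block2 = [Tx("b", [("a", 0)], [30, 20])]  # spends a:0, creates b:0 and b:1
for block in (block1, block2):
    apply_block(utxo, block)
    # After this point the raw block is not needed to validate later blocks;
    # only the UTXO set is consulted.

print(utxo)
```

Validating the next block only requires looking up its inputs in `utxo`, which is why a node can discard old blocks without losing the ability to follow the chain.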

The problem comes when deciding which parts to store, and signalling to other people in the network that you have a certain subset of blocks. You can't advertise a list of hashes, as that would be massive and inefficient (many megabytes for each peer), and making a deterministic random selection is extremely problematic. Choosing ranges of blocks is not ideal because the sizes are not consistent: 1,000 blocks at the beginning of the chain are several orders of magnitude smaller than 1,000 blocks at the tip. In a naive implementation of random selection, peers attempting to sync with the network may need to make thousands of connections to find a single peer that has a specific block, which is obviously not feasible.
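To put rough numbers on the two issues above, here is a small Python sketch. The 500,000-block chain length and the block sizes are made-up illustrative values, and `hash_list_bytes` and `size_balanced_ranges` are hypothetical helpers, not part of any Bitcoin software. The first shows the cost of advertising one 32-byte hash per block; the second shows one possible way to cut ranges by cumulative size instead of by a fixed number of blocks.

```python
from typing import List, Tuple

HASH_BYTES = 32  # a Bitcoin block hash is 32 bytes


def hash_list_bytes(num_blocks: int) -> int:
    """Bytes needed to advertise one hash per stored block."""
    return num_blocks * HASH_BYTES


def size_balanced_ranges(block_sizes: List[int],
                         target_bytes: int) -> List[Tuple[int, int]]:
    """Group consecutive heights into ranges of roughly target_bytes each.

    Returns inclusive (first_height, last_height) pairs.
    """
    ranges: List[Tuple[int, int]] = []
    start = 0
    total = 0
    for height, size in enumerate(block_sizes):
        total += size
        if total >= target_bytes:
            ranges.append((start, height))
            start = height + 1
            total = 0
    if start < len(block_sizes):
        ranges.append((start, len(block_sizes) - 1))  # trailing partial range
    return ranges


# Assumed chain length, for illustration only.
print(hash_list_bytes(500_000))  # 16000000 bytes, i.e. 16 MB per peer

# Made-up sizes: ten tiny early blocks followed by four 1 MB blocks.
sizes = [200] * 10 + [1_000_000] * 4
print(size_balanced_ranges(sizes, 1_000_000))
# [(0, 10), (11, 11), (12, 12), (13, 13)]
```

Size-balanced ranges even out the storage burden, but they do not by themselves solve the signalling problem: peers would still need an agreed, compact way to say which ranges they hold.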

From my understanding of the blockchain, let's assume a particular node starts at the genesis block. Let's assume another node stores 1% of the blockchain past the genesis block. Couldn't multiple nodes verify block hashes for their shared 9% at a rate of once every 10 minutes? It wouldn't really use that much data -- you could hash the 9% of the blockchain to a particular value, produce a public key, and see if the other nodes can match a private key to said public key.

This is not necessary and is highly vulnerable to Sybil attacks; you don't need to verify blocks again once you have seen them. Pruned nodes work like this today, only they throw away every block and store absolutely nothing more than they are specified to store. They don't advertise which blocks they have because there is no mechanism for that.

You might want to read up on how the autoprune patch works and get a better grasp on the threat models that need to be tackled here. You've missed some of the operation slightly: note that pruning for nodes does not operate in the way described in the white paper; the UTXO database is a newish concept added in 0.8.0-era Bitcoin Core.

A related question: Is this what Electrum does?

Not at all. An Electrum server is a full node plus an extremely heavy address-based index, almost doubling the storage size. The client is lightweight, but the server most definitely is not. There's no sensible way of maintaining these indexes in a sharded way, though you could break addresses into multiple pools and hope that people maintained enough of each shard for clients to be able to request subscriptions for all of their addresses. Ideally, systems would be designed that don't require full indexes, as they are getting increasingly unwieldy to work with.

"The problem" - I don't think that's really a problem as much as simply a design choice that hasn't been made yet. You could easily split the blockchain up into approximately 50-100MB chunks and identify them by hashing the entire chunk. – B T – 2017-12-27T02:28:20.087

That’s not verifying, you need to be sure there’s not something invalid in the rest of the information. – Anonymous – 2017-12-27T02:29:08.873

Right, well I wasn't talking about verifying. I was talking about sharing information about what parts of the blockchain you have on-hand. In response to your second paragraph – B T – 2017-12-27T07:32:23.750

I think I may not quite be responding in the full context of the OP. – B T – 2017-12-27T07:46:58.387

0

Something like this has been implemented and released in v0.11.0 of the Bitcoin Core client. It's called pruning, and you can now use the -prune option to specify how much blockchain data you want to store.

-prune=N: where N is the number of MB to allot for raw block & undo data


This does not yet work with the wallet, so when pruning you cannot use the wallet in the Core client. It also stops relaying blocks, though they are working on ways to allow pruned nodes to still relay them.

This does not mean that nodes store different parts of the blockchain though.

Note that a new node in pruning mode will still download and work its way through all blocks, but will not keep all old blocks.
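The behaviour described here can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Bitcoin Core's implementation: `sync_pruned` is a hypothetical helper, blocks are reduced to `(height, size_bytes)` pairs, validation is only marked by a comment, and the real client applies its own rules about what and when to delete.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple


def sync_pruned(blocks: Iterable[Tuple[int, int]],
                budget_bytes: int) -> List[int]:
    """Process every block in chain order, keeping only the newest ones.

    blocks yields (height, size_bytes). Returns the heights still on disk.
    """
    kept: Deque[Tuple[int, int]] = deque()
    stored = 0
    for height, size in blocks:
        # Every block is still downloaded and validated here, in order.
        kept.append((height, size))
        stored += size
        # Drop the oldest blocks once the storage budget is exceeded,
        # but always keep at least the newest block.
        while stored > budget_bytes and len(kept) > 1:
            _, old_size = kept.popleft()
            stored -= old_size
    return [height for height, _ in kept]


# Ten made-up blocks of 100 bytes each, with a 350-byte budget.
chain = [(height, 100) for height in range(10)]
print(sync_pruned(chain, 350))  # [7, 8, 9]
```

Every pruned node following this scheme ends up holding the same recent slice of the chain, which is why pruning on its own does not spread different parts of the history across different nodes.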

It's my understanding that this will be updated to support using the wallet in pruning mode in the future. EDIT: Apparently the master branch of Bitcoin Core already supports using the wallet in pruning mode.

Master at least supports a wallet and basic pruned mode. – Anonymous – 2015-09-21T16:03:53.477

This answer isn't correct. Pruning is about forgetting old, irrelevant data, and has been described as early as the whitepaper. The question asked about sharing the load of the relevant data between different nodes, which would require significant algorithmic innovations. – Meni Rosenfeld – 2015-09-21T18:58:48.570

@MeniRosenfeld Worthwhile to note that the client modes described in the whitepaper don't really function like that at all in the real world, I'm not convinced it's a thing to be pointing people to that information anymore. – Anonymous – 2015-09-22T12:04:40.527