Why is pruning not considered already at the moment?

14

1

I understand that Bitcoin scales in several senses (Scalability), and pruning is one important concept of it (SE Question). I also understand that a "Simplified Payment Verification" (SPV) client needs to place a lot of trust in the peer it obtains the block chain from.

A very conservative pruning (e.g. of transactions that are spent and older than six months) wouldn't do much harm, especially if it were only a configuration option for bitcoin-qt. That way the default is the full node, but it's easy to have a "small node".

But I don't see it coming anytime soon. Is there a reason? Is it so important to have full nodes at the moment that the devs say "either you go for all or nothing"? Or is the development effort the bottleneck? IMHO a large network of non-SPV nodes is more important than a small network of full nodes.

Edit: Let's put it more concretely: Is there a major security issue in not having the complete transaction history of the world back to the genesis block?

Borph

Posted 2013-05-24T11:44:27.517

Reputation: 405

Answers

12

Pruning is being considered; in fact, it was taken into account when designing the 0.8 database format. The unspent transaction outputs (which are the only essential data necessary for validation) are already kept in a separate database, so technically removing old blocks is perfectly possible. It'll likely require some small changes to make sure the code doesn't break when block data that no longer exists is requested, but this is easy.
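To illustrate why the unspent outputs are enough for validation, here is a minimal Python sketch. It is not Bitcoin Core's code: the `Node`, `Tx`, and `OutPoint` names are hypothetical, signatures and scripts are ignored, and a transaction without inputs is simply treated as a coinbase. It only shows that new blocks can still be checked after old block data has been deleted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutPoint:
    """Reference to one output of a transaction (hypothetical, simplified)."""
    txid: str
    index: int


@dataclass
class Tx:
    txid: str
    inputs: list   # list of OutPoint being spent; empty means "coinbase" here
    outputs: list  # list of integer amounts


class Node:
    def __init__(self):
        self.utxo = {}    # OutPoint -> amount: the only state validation needs
        self.blocks = {}  # height -> list of Tx: raw history, safe to prune

    def connect_block(self, height, txs):
        utxo = dict(self.utxo)  # work on a copy so a bad block changes nothing
        for tx in txs:
            spent = 0
            for outpoint in tx.inputs:
                if outpoint not in utxo:
                    raise ValueError(f"{tx.txid}: input missing or already spent")
                spent += utxo.pop(outpoint)
            if tx.inputs and sum(tx.outputs) > spent:
                raise ValueError(f"{tx.txid}: outputs exceed inputs")
            for i, amount in enumerate(tx.outputs):
                utxo[OutPoint(tx.txid, i)] = amount
        self.utxo = utxo
        self.blocks[height] = txs

    def prune_below(self, height):
        """Delete stored block data below `height`; the UTXO set is untouched."""
        for h in [h for h in self.blocks if h < height]:
            del self.blocks[h]


node = Node()
node.connect_block(0, [Tx("a", [], [50])])
node.connect_block(1, [Tx("b", [OutPoint("a", 0)], [30, 20])])
node.prune_below(2)  # all history gone, validation still works
node.connect_block(2, [Tx("c", [OutPoint("b", 1)], [20])])
```

After `prune_below(2)` the node can no longer serve blocks 0 and 1 to anyone else, which is exactly the network-level concern discussed next.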

The reason it's not implemented is its effect on the network as a whole. If a large number of nodes start pruning old block data away, it will become harder for new nodes starting up to find the historic data to verify. This is not a problem as such (I expect enough copies will remain), but we need a discovery mechanism so nodes don't need to arbitrarily try peers until they happen to reach one that has the blocks they need. In fact, there's a discussion going on on the bitcoin-development mailing list about this right now.
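As a rough sketch of what such a discovery mechanism buys you, assume each peer advertises the lowest block height it still stores. This is purely an assumption for illustration, not the actual Bitcoin P2P protocol; `Peer`, `first_block`, `peers_serving`, and `plan_download` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    address: str
    first_block: int  # lowest height still stored (0 means full history)
    tip: int          # highest height the peer has


def peers_serving(peers, height):
    """Return the peers that advertise having the block at `height`."""
    return [p for p in peers if p.first_block <= height <= p.tip]


def plan_download(peers, start, end):
    """Assign each height in [start, end] to a peer known to have it."""
    plan = {}
    for height in range(start, end + 1):
        candidates = peers_serving(peers, height)
        if not candidates:
            raise LookupError(f"no known peer serves block {height}")
        # Spread requests across the candidates instead of trying peers blindly.
        plan[height] = candidates[height % len(candidates)].address
    return plan


peers = [Peer("archive", 0, 100), Peer("pruned", 90, 100)]
plan = plan_download(peers, 89, 91)
```

With the advertised ranges, old blocks are requested only from the archival peer, while recent blocks can be fetched from pruned peers as well, so no request is wasted on a peer that cannot answer it.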

EDIT: Pruning was implemented in Bitcoin Core 0.11, and has been fully functional since 0.12.

Pieter Wuille

Posted 2013-05-24T11:44:27.517

Reputation: 64 874

So the current problem to solve is on the network level! Nodes have to find nodes with certain information! A DHT sounds very promising to me to solve this. – Borph – 2013-05-31T08:53:07.247

Extra question: Imagine we really would lose the historic data. Would it stop Bitcoin? Or would there be a, hum, "consensus on the assets, although we don't know how we came here"? – Borph – 2013-05-31T08:56:56.120

Presumably, nodes offering full history would be more expensive to maintain than nodes that have been pruned. What is the incentive for anyone to run nodes with full history? – snapfractalpop – 2015-03-26T17:29:09.687

What is the incentive to make anyone's full node accessible to others at all? – Pieter Wuille – 2015-06-25T08:58:06.623

-1

AFAIK, all transaction outputs are purged from the database when they are spent. Not from the block database, though, but it just does not make much sense to remove them from the block database, since that would only worsen the node's performance.

And you cannot purge unspent outputs, no matter how old they are, for quite obvious reasons.

Gigi

Posted 2013-05-24T11:44:27.517

Reputation: 616

I think you might be misunderstanding the question. It does make sense to remove some/all transactions from the client's copy of the blockchain (i.e. "block database") as it would save a lot of disk space for the clients. The protocol specifically allows for these partially purged clients (SPV clients), that's why the protocol uses a merkle root of the transactions and not a simple list of the transactions. But SPV isn't part of the standard client at this time and this question is really about why that doesn't seem to be a priority. – David Ogren – 2013-05-24T15:16:33.360

Well, considering that the entire blockchain is now barely 8GB, we obviously have a different understanding of what a lot of disk space is. Moreover, if you remove transactions from your database, you won't be able to serve blocks anymore, so you won't be able to act as an actual node. – Gigi – 2013-05-24T15:36:47.693

That 8GB isn't a lot of disk space for current desktops and laptops might not be a bad answer. (I somewhat agree with that, although it's really the downloading and verifying that seems to cause the most issues, not the disk space itself.) My comment was more that your answer was addressing something not asked, i.e. purging from the lookup databases when the question was pretty specifically about the blockchain itself and SPV clients. (Also, someone who uses a laptop with a single 128GB SSD might disagree with your assessment that 8GB isn't a lot.) – David Ogren – 2013-05-24T15:59:37.493

Oh, I thought SPV clients are only interested in the headers and their own transactions, as opposed to a pruning node, which has all unspent transactions. – Borph – 2013-05-24T16:01:03.047

@Gigi: yes, when you throw away old spent transactions, the world would forget about them if really all nodes behaved like this. But what is considered to be "unable to act as a node"? – Borph – 2013-05-24T16:03:01.367

At least to my understanding, SPV just means that the client can request transaction details "on demand" rather than downloading and verifying them all upfront. An SPV client might just keep their own transactions, but they really could keep any subset. – David Ogren – 2013-05-24T16:05:15.500

@Borph, by "unable to act as a node" I meant that the main feature of a bitcoin node is serving blocks for the P2P network: full blocks, and all of them, starting from #1. I understand that in theory you can serve only partial blocks (thus skipping the spent transactions), but the problem is that the current protocol does not support it. And if you want to change the protocol and then make sure that all the nodes are upgraded before you start serving them only partial blocks... well, that will surely take a while; June 2025 would be my rough estimate :) – Gigi – 2013-05-24T17:38:45.937