How is difficulty calculated?

63

21

Can anyone explain to me in plain English how difficulty is calculated? I have a very approximate understanding that it is calculated based on the amount of hash power in the whole bitcoin community over a specific period of time. But this is very vague.

Also, I understand it can change very rapidly. Can it only increase? Is there any formula to calculate or predict it?

Thanks for a detailed answer, Meni Rosenfeld. Just to make sure I got everything right: I sum up all the time it took to generate the last 2016 blocks, and then apply the formula.

Salvador Dali

Posted 2012-12-18T18:30:56.720

Reputation: 3 160

2I think follow up questions are better as comments to the answer. Basically yes, but no summing is actually needed - you can just take the timestamps of the last block and of the one 2016 blocks before, and subtract. – Meni Rosenfeld – 2012-12-19T14:18:30.203

Answers

79

The Bitcoin difficulty started at 1 (and can never go below that). Then for every 2016 blocks that are found, the timestamps of the blocks are compared to find out how much time it took to find 2016 blocks, call it T. We want 2016 blocks to take 2 weeks, so if T is different, we multiply the difficulty by (2 weeks / T) - this way, if the hashrate continues the way it was, it will now take 2 weeks to find 2016 blocks.

For example, if it took only 10 days it means difficulty is too low and thus will be increased by 40%.

The difficulty can increase or decrease depending on whether it took less or more than 2 weeks to find 2016 blocks. Generally, the difficulty will decrease after the network hashrate drops.

If the correction factor is greater than 4 (or less than 1/4), then 4 or 1/4 is used instead, to prevent the change from being too abrupt.
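The retarget rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `next_difficulty` and its parameters are made up for this example, it works on difficulty as a float rather than on the 256-bit target the real client uses, and it clamps the measured time to the range [2 weeks / 4, 2 weeks * 4], which has the same effect as clamping the correction factor to [1/4, 4].

```python
TWO_WEEKS = 14 * 24 * 60 * 60  # 1,209,600 seconds, the intended time for 2016 blocks


def next_difficulty(old_difficulty: float, actual_timespan: float) -> float:
    """Return the difficulty for the next 2016-block period (illustrative only).

    actual_timespan is T, the measured time in seconds for the previous period.
    """
    # Limit T so that the correction factor stays between 1/4 and 4.
    clamped = min(max(actual_timespan, TWO_WEEKS / 4), TWO_WEEKS * 4)
    factor = TWO_WEEKS / clamped
    # The difficulty can never go below 1.
    return max(1.0, old_difficulty * factor)


if __name__ == "__main__":
    ten_days = 10 * 24 * 60 * 60
    # Blocks came in too fast (10 days instead of 14), so difficulty rises by 40%.
    print(next_difficulty(1000.0, ten_days))
```

With T equal to 10 days the factor is 14/10 = 1.4, which is the 40% increase from the example above.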

There is a bug in the implementation, due to which the calculation is based on the time to find the last 2015 blocks rather than 2016. Fixing it would require a hard fork and is thus deferred for now.

It is possible to give a rough estimate for the next difficulty change, based on the time to find the recent blocks. Nobody can make longer-term predictions for the future difficulty reliably, but anyone is free to speculate based on exchange rate trends, Moore's law and other hardware advances.

Meni Rosenfeld

Posted 2012-12-18T18:30:56.720

Reputation: 19 132

Is that bug still there? Shouldn't there be a GitHub issue for that so that people can keep track of the status... With such bugs in the protocol, it will be hard for the Bitcoin ecosystem to move away from the situation in which the Satoshi client devs set all the rules. – Steven Roose – 2013-06-11T22:23:06.347

1@StevenRoose: AFAIK it is, but I will leave it to people who are more involved with the core code to comment... This is adequate for a separate SE question. – Meni Rosenfeld – 2013-06-12T15:19:32.937

3Good answer, but one small but crucial point is left out: how do the nodes in the network agree on what the difficulty is? – deadalnix – 2014-03-14T18:35:56.613

6@deadalnix: The difficulty of a block is a deterministic calculation based on the data of the previous blocks. All nodes independently do the same calculation and get the same result. – Meni Rosenfeld – 2014-03-15T21:36:15.107

@MeniRosenfeld thank you for that answer. But to me, it just moves the problem around. As I understand your answer, the computation is made from the timestamps available in the blocks. But then, how does a p2p network agree on the timestamp? – deadalnix – 2014-03-16T01:18:52.630

3@deadalnix: The timestamp is a part of the block, which means that whoever found the block decides what to put in it. The timestamp must be no sooner than the median of the past 11 blocks. Also, if a node receives a block with a timestamp more than 2 hours in the future it will reject it and not propagate it. – Meni Rosenfeld – 2014-03-16T16:10:23.307

1References to the Bitcoin specifications or codebase would be much appreciated. Revising your answer is strongly preferred over shoving stuff into the comments. – Indolering – 2014-07-18T18:25:45.933

@Indolering: The stuff in the comments is not part of the core answer. They are replies to specific followup questions. – Meni Rosenfeld – 2014-07-20T04:38:26.167

@Meni, so if I want to calculate the round completion should I do CURRENT_BLOCK_COUNT_IN_THIS_ROUND/2015 instead of CURRENT_BLOCK_COUNT_IN_THIS_ROUND/2016 ??? – mixdev – 2014-08-05T03:08:13.183

1@mixdev: The "round" (difficulty retarget) is done every 2016 blocks. But the recalculation currently is not prev_target * TIME_FOR_LAST_2016_Blocks / 1209600, it's prev_target * TIME_FOR_LAST_2015_Blocks / 1209600. – Meni Rosenfeld – 2014-08-05T09:35:28.060

So this basically means that errors will add up as every time only the last epoch time is considered? This would mean that stating in 2140 the last bitcoin will be mined is wrong as it could be years or even decades before that. – tobi – 2018-06-05T08:51:21.977

@tobi: The off-by-one bug is responsible for blocks being found 0.05% slower on average than the "official" 2016 per two weeks. There is a separate issue that causes another 0.05% slowdown. Together they amount to only 1 month slowdown over a century. However, there is a separate effect that causes blocks to be found faster due to the increase in hashrate. This is more significant and the last coin will be mined probably 1-2 years too early because of this. – Meni Rosenfeld – 2018-06-05T09:50:02.670

@tobi: The total amount of speedup due to hashrate increase can be calculated as roughly log (difficulty) * 2 weeks. This is the main reason block halving parties are at a different date each time - November in 2012, July in 2016 etc. – Meni Rosenfeld – 2018-06-05T09:53:02.120

yes I was referring to the speedup due to increasing hashrate and not the 2015 blocks bug – tobi – 2018-06-05T12:24:46.420

2@tobi: Oh ok. Previous comments were about the bug, and "errors" hinted that it's the bug, so I assumed we're talking about that. So, yes. If we assume that around 2140 the hashrate will be around *1B what it is now, the schedule will be ahead by 96 weeks, or almost two years. But there is yet another effect - a delay caused by the fact that in the beginning, the difficulty was 1 even though the hashrate wasn't enough to justify it. – Meni Rosenfeld – 2018-06-05T13:15:37.803

16

Meni's answer is good. I just want to add some practical detail about how the difficulty is calculated, which may be helpful for future readers of this question.

Let's take a look at Satoshi's genesis block header (part of related info):

$ bitcoin-cli getblockhash 0
000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f

$ bitcoin-cli getblockheader 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
{
  ...
  "height": 0,
  ...
  "bits": "1d00ffff",
  "difficulty": 1,
  ...
}

As we can see above, the genesis block has a difficulty of '1' and bits of '1d00ffff'. The bitcoin bits field encodes the 'target' hash value. A newly generated block must meet one condition: the double SHA-256 hash of the block header must be less than this 'target' value.

The '1d00ffff' bits value in genesis block means the 'target' value:

[0x00000000,0xffff,{0x00..0x00}]
                   {0x00..0x00} at above has 26 bytes 0x00.
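As a rough sketch of how the compact 'bits' value expands into the full 256-bit target, here is a small Python helper. The name `bits_to_target` is just for this example, and the sketch ignores the sign bit of the compact format, assuming a normal positive target.

```python
def bits_to_target(bits: int) -> int:
    """Expand the 32-bit compact 'bits' field into the full integer target.

    The top byte is a size in bytes (exponent), the low 3 bytes are the
    most significant bytes of the target (mantissa).
    Simplification: the sign bit (0x00800000) is masked out and ignored.
    """
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


if __name__ == "__main__":
    target = bits_to_target(0x1D00FFFF)
    # Print as 32 bytes (64 hex digits): 4 zero bytes, ffff, then 26 zero bytes.
    print(f"{target:064x}")
```

For '1d00ffff' the exponent is 0x1d = 29, so the 3-byte mantissa 0x00ffff is followed by 29 - 3 = 26 zero bytes, as in the layout above.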

Then, to find a new block, you must search over the 32-bit nNonce value (and also nTime and hashMerkleRoot) until the block hash has 4 leading zero bytes. By the way, nNonce is one of the fields in the block header structure:

 struct header_structure{            // BYTES   NAME
     uint32_t nVersion;              // 4       version
     uint8_t hashPrevBlock[32];      // 32      previous block header hash
     uint8_t hashMerkleRoot[32];     // 32      merkle root hash
     uint32_t nTime;                 // 4       time
     uint32_t nBits;                 // 4       target
     uint32_t nNonce;                // 4       nonce
 };

Because the SHA-256 algorithm (like any cryptographically secure hash algorithm) produces output that looks like a uniformly random sequence, 'trial and error' is the only practical way to find a new block that meets the condition. The probability that a given hash has 4 leading zero bytes is 1/(2^32), so it takes about 2^32 (i.e. 4G) 'trial and error' attempts on average.
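To put a number on the 'trial and error' cost, here is a minimal Python sketch that treats each hash as a uniform 256-bit value and computes the expected number of attempts for the genesis target. The function names are made up for this example, and the uniform model is an assumption, not something the protocol guarantees.

```python
from fractions import Fraction

# Genesis target from bits 1d00ffff: 0x00ffff followed by 26 zero bytes.
GENESIS_TARGET = 0xFFFF * 256**26


def success_probability(target: int) -> Fraction:
    """Probability that one uniformly random 256-bit hash is below the target."""
    return Fraction(target, 2**256)


def expected_hashes(target: int) -> Fraction:
    """Average number of attempts until a hash below the target is found."""
    return 1 / success_probability(target)


if __name__ == "__main__":
    print(int(expected_hashes(GENESIS_TARGET)))  # whole-number part of the expectation
    print(2**32)                                 # the 4G rule of thumb, for comparison
```

The genesis target is slightly below 2^224, so the expected count comes out a little above 2^32; the difference is tiny, which is why 4G is used as the rule of thumb.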

To make this 'target' hash value easier for humans to understand, we define the term 'difficulty' as the average number of 'trial and error' attempts needed to find a block that meets the 'target' condition. We also define the 'difficulty' unit: 1 'difficulty' = 4G hashes

As of today, the bitcoin blockchain height has reached 501509. Let's take a look at its header:

$ bitcoin-cli getblockheader 0000000000000000006c5532f4fd9ee03e07f94df165c556b89c495e97680147
{
  ...
  "height": 501509,
  ...
  "bits": "18009645",
  "difficulty": 1873105475221.611,
  ...
}

Block 501509 has bits = 0x18009645. This is the compact format of a 256-bit integer; its full 256-bit form is:

[0x00000000,0x00000000,0x009645,{0x00..0x00}]
                                {0x00..0x00} at above has 21 bytes 0x00.
that is 0x009645 * (256 ^ 21)
The genesis block's target is (0x00ffff * 256 ^ 26), which is the difficulty unit '1.0'.
So, the difficulty
= (0x00ffff * 256 ^ 26)/ (0x009645 * 256 ^ 21)
= 65535/38469 * (256^5)
= 1.703579505575918 * 2^40
= 1873105475221.611
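The same arithmetic can be checked with a short Python sketch. The helper names here are only illustrative, and the compact-format decoding ignores the sign bit for simplicity.

```python
def bits_to_target(bits: int) -> int:
    """Expand compact 'bits' into the integer target (sign bit ignored)."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


# Target of the genesis block, which defines difficulty 1.
DIFFICULTY_1_TARGET = bits_to_target(0x1D00FFFF)


def difficulty(bits: int) -> float:
    """Difficulty = (difficulty-1 target) / (target encoded by bits)."""
    return DIFFICULTY_1_TARGET / bits_to_target(bits)


if __name__ == "__main__":
    print(difficulty(0x1D00FFFF))  # genesis block
    print(difficulty(0x18009645))  # block 501509
```

The second value should agree with the "difficulty" field that bitcoin-cli reported for block 501509 above, up to floating-point rounding.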

You now have all the details of how the 'difficulty' is calculated. In some cases, we also use a shorter form such as 1.7T to state the difficulty. In the example above:

 (1.703579505575918 * 2^40) = 1.703579505575918T
 1T = 2^40 = 1024^4

gary

Posted 2012-12-18T18:30:56.720

Reputation: 411

1d is 29 in Dec(not 26). SHS is SHA – Boris Ivanov – 2018-08-06T18:31:37.997

thanks @BorisIvanov, the typo SHS has been fixed. But 1d indeed means a tail of 26 zero bytes instead of 29, please read the example details shown above. – gary – 2018-08-07T08:13:25.227

ah yeah. Significand – Boris Ivanov – 2018-08-07T15:29:10.150

3

I would like to give my 2 cents here, by making explicit the relationship between the probability of mining a block given the current target t and the corresponding difficulty d as it is calculated in bitcoin core.

So cryptographic hash functions are idealized by the random oracle abstraction [https://en.wikipedia.org/wiki/Random_oracle]. We can therefore model the output of the doubleSHA256 hash function used in PoW as a uniform variable in the space {0,1}^256, i.e. arrays of 256 bits. Thus the probability of a single hash h being a valid hash is:

p = P(h < t) = t /( 2^{256} - 1 )

On the other hand, d is calculated as follows, just as @gary explained before, only written in decimal notation:

d = ( (2^{16} - 1) * 2^{8*26} ) / t = ( (2^{16} -1) * 2^{208} ) / t

The implementation is in [https://github.com/bitcoin/bitcoin/blob/master/src/rpc/blockchain.cpp], line 60, function GetDifficulty. Actually if someone can explain how exactly the code maps to the formula above, that would be helpful. Combining those two formulas we obtain:

d = ( (2^{16} -1) * 2^{208} ) / ( p * (2^{256} - 1) ) ~ 2^{-32} / p
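As a quick numerical sanity check of the approximation d ~ 2^{-32} / p, here is a small Python sketch using exact fractions. It plugs in the target of block 501509 from @gary's answer (0x009645 * 256^21) as an example value; the variable names are only for illustration.

```python
from fractions import Fraction

# Example target: block 501509 (bits 0x18009645), taken from the answer above.
t = 0x009645 * 256**21

# Probability of a valid hash, using the formula p = t / (2^256 - 1).
p = Fraction(t, 2**256 - 1)

# Difficulty, using the formula d = (2^16 - 1) * 2^208 / t.
d = Fraction((2**16 - 1) * 2**208, t)

# The approximation claimed in the text: d ~ 2^-32 / p.
approx = Fraction(1, 2**32) / p

# Exact ratio between the true d and the approximation.
ratio = d / approx

if __name__ == "__main__":
    print(float(d))
    print(float(approx))
    print(float(ratio))  # very close to 1, so the approximation is tight
```

The ratio works out to (2^16 - 1) * 2^240 / (2^256 - 1), which is about 65535/65536 regardless of t, so the approximation is off by roughly 0.0015%.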

Analyzing this last expression, the difficulty is the ratio between the probability of obtaining a hash lower than 2^{224} (the hashes below this value are exactly those whose 256-bit binary representation starts with 32 zero bits) and the probability of obtaining a valid hash based on the current target t. This is a direct implication of defining, in the genesis block, difficulty 1 as the one associated with the hexadecimal target 0x1d00ffff, expressed in what I think is called the 32-bit compact form for 256-bit numbers.

A nice question I believe is why this specific compact form was chosen for representing the target.

giancarloGiuffra

Posted 2012-12-18T18:30:56.720

Reputation: 31

Upvoted! The compact form provides 3 most significant bytes for the target, in the min difficulty the 3 most significant bytes are 00ffff. – James C. – 2019-01-27T14:48:54.383