## The bitcoin mining algorithm from a programmer's viewpoint


The Blocks wiki page says that mining amounts to solving a mathematical problem, but reading the Block hashing algorithm page didn't help much. I also tried reading the bitcoind source code, but reading code takes much more time than reading documentation :)

I have also written a simple JSON-RPC client that calls the getwork() method to fetch the "data", but what should I do with this "data" next?

Could anyone explain the mining process from a programmer's point of view?

"but what should I do next to this "data"?" Here's a short reference implementation: https://github.com/jgarzik/pyminer/blob/master/pyminer.py – Nick ODell – 2013-08-11T02:43:50.643

Also, explain the mining process in programmer's view is a pretty broad question. What specific problem are you trying to solve? – Nick ODell – 2013-08-11T02:45:04.100

@NickODell Thanks for your code! Now I understand how to create new blocks. But is each submitted block valid (that is, will the bitcoin network send bitcoin to the person who submitted it)? The Bitcoin wiki says only 6 blocks are created every hour. Assume that 100 people get the block header using getwork at the same time with the same difficulty; I think many more than 6 blocks would be created by these people. Am I right? – Mark Ma – 2013-08-11T05:12:09.657

Answering in order: No. No. – Nick ODell – 2013-08-11T05:53:44.273

possible duplicate of I am unable to figure the getwork api – cdecker – 2013-08-12T20:58:47.447

@NickODell Sorry for the delay, something urgent came up these past few days. If not every submitted piece of work is valid, how can the bitcoin network tell which submitted solution is valid? Thanks – Mark Ma – 2013-08-15T05:14:47.147

37

The Mining Algorithm is as follows:

• Step 0 - Retrieve the hash of the previous block from the network.

• Step 1 - Gather a list of potential transactions known as a "block". This list of transactions comes from the peer-to-peer bitcoin network.

• Step 2 - Calculate a hash for a block of potential transactions along with a random number.
• Step 3 - If the hash is below the target set by the current difficulty level, then you have mined that block. If not, start over from Step 1. Any additions to the list of transactions from Step 1, along with a change in the random number from Step 2, mean that there's a chance that the criterion will be met on the next go-around.

From a programmer's view, the pseudocode might look something like this:

P := The hash of the previously mined block
B := A block of transactions
H := A hash function
D := Target value set by the difficulty level (a higher difficulty means a lower target)

0 Retrieve P
1 Construct/Modify B
2 IF H(P, B, Some Random Number) < D END
3 GOTO 1
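
For readers who prefer something runnable, here is a minimal Python sketch of that loop. It is an illustration only, not the real block format: `mine`, the byte layout of the candidate, and the very easy target are all assumptions made for the example. It treats a hash as a solution when, read as a number, it falls below a target value, which is the check the real network uses.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(prev_hash: bytes, block_data: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces until the hash, read as an integer, is below the target.

    Hypothetical helper: the real block header has a fixed 80-byte layout,
    which this sketch does not reproduce.
    """
    for nonce in range(max_nonce):
        candidate = prev_hash + block_data + struct.pack("<I", nonce)
        digest = double_sha256(candidate)
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
    return None  # nonce space exhausted; a real miner would modify the block


# A deliberately easy target: roughly 1 hash in 4096 succeeds.
easy_target = 1 << 244
found = mine(b"\x00" * 32, b"example transactions", easy_target)
print(found)
```

With a realistic target the loop would need vastly more attempts, which is the whole point of the difficulty setting.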


I should warn you that there are a few inaccuracies in that description, but for the most part, that should be good enough. And here are a few more useful clarifications:

What's a hash?

A hash is a function that converts data into a number within a certain range. It has the property that its output is essentially unpredictable (within the given range) until you actually compute it. The specific hash function used for bitcoin mining is SHA256 applied twice.
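
To make the "unpredictable" part concrete, here is a small Python sketch using the standard `hashlib` module. The inputs `b"hello 0"` and `b"hello 1"` are arbitrary examples chosen for illustration; the point is that a one-character change in the input gives a completely different digest.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


a = double_sha256(b"hello 0")
b = double_sha256(b"hello 1")
print(a.hex())
print(b.hex())

# Count how many of the 256 output bits differ between the two digests.
diff_bits = bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")
print(diff_bits)
```

For a well-behaved hash function you would expect about half of the 256 bits to differ.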

How does the difficulty level work?

This unpredictable nature of the hash function means that putting in random data (the transactions + the random number) will essentially produce a random number within a certain range. Further restricting the range of the desired output affects how likely one is to find it in a single round. This creates a way to probabilistically determine how often a solution will be found based on the number of times the algorithm can be run on the network. Specifically, when you hear the term "gigahashes" or "terahashes", this refers to the number of hash attempts (Steps 2 and 3) that can be run per second. As the number of hashes per second across the entire network grows, the network automatically raises the difficulty such that a solution will be found about once every 10 minutes on average.
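
The arithmetic behind this can be sketched in a few lines of Python. This assumes a 256-bit hash whose output is uniformly distributed and a solution defined as "hash below a target"; the function names, the target value, and the hash rate are made-up numbers for illustration, not real network figures.

```python
def success_probability(target: int) -> float:
    """Chance that one hash attempt lands below the target (256-bit hash)."""
    return target / 2**256


def expected_seconds(target: int, hashes_per_second: float) -> float:
    """Average time until a solution is found at the given hash rate."""
    return 1 / (success_probability(target) * hashes_per_second)


target = 1 << 224            # hypothetical target: 1 attempt in 2**32 succeeds
p = success_probability(target)
t = expected_seconds(target, 1e9)        # one gigahash per second
t_harder = expected_seconds(target // 2, 1e9)  # half the target, twice the work

print(p, t, t_harder)
```

Halving the target doubles the expected time, which is how the network can steer the average block interval back toward 10 minutes as the total hash rate changes.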

What happens when a block is mined?

When a block is mined, the miner sends the block to all other miners on the network as evidence that it has found it. This block contains a list of transactions, the found hash, the specific random number, and a reference to the previous hash. As each miner receives the newly mined block, it removes all transactions that it is currently mining that exist within the block (because they've already been confirmed in the block chain) and broadcasts the block to other miners that do the same thing. The propagation happens pretty quickly.

Note: the original miner of the block gets a "miner's fee", which is a reward consisting of any unspent coins from transactions in addition to a "coinbase" reward. The coinbase reward started out at 50 bitcoins and halves after every 210,000 blocks (about once every 4 years). The coinbase reward will eventually get so small that it will be minuscule compared to miners' fees.
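
The halving schedule is easy to express in code. Here is a minimal Python sketch; `block_subsidy` is a name chosen for this example, and amounts are in satoshis (1 bitcoin = 100,000,000 satoshis) so that integer halving works cleanly.

```python
HALVING_INTERVAL = 210_000          # blocks between halvings
INITIAL_SUBSIDY = 50 * 100_000_000  # 50 bitcoins, in satoshis


def block_subsidy(height: int) -> int:
    """Coinbase reward in satoshis for a block at the given height."""
    halvings = height // HALVING_INTERVAL
    # Integer right shift halves the reward once per completed interval.
    return INITIAL_SUBSIDY >> halvings


for height in (0, 209_999, 210_000, 420_000):
    print(height, block_subsidy(height) / 100_000_000)

# Rough check of the "about 4 years" figure at 10 minutes per block.
years_per_halving = HALVING_INTERVAL * 10 / (60 * 24 * 365.25)
print(years_per_halving)
```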

1

The reward is halved every 210000 blocks. -> Just saw that now and think it is neat: http://bitcoinclock.com/ – Murch – 2013-08-22T13:47:19.603

Oh, and the Difficulty is adjusted every 2016 blocks by assuming that the network will continue to operate with the average hash rate of said last 2016 blocks and setting the new difficulty such that this hashrate would result in an approximately 10 minute block cycle. – Murch – 2013-08-22T13:52:25.190
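
A rough Python sketch of the adjustment described in this comment, expressed in terms of the target (a lower target means a higher difficulty). `retarget` and the example numbers are assumptions for illustration; the real client applies additional limits that this sketch leaves out.

```python
BLOCKS_PER_PERIOD = 2016
TARGET_TIMESPAN = BLOCKS_PER_PERIOD * 10 * 60  # seconds, at 10 minutes per block


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last 2016 blocks actually took.

    Blocks found too quickly -> smaller target -> higher difficulty.
    """
    return old_target * actual_timespan // TARGET_TIMESPAN


old_target = 1_000_000
same = retarget(old_target, TARGET_TIMESPAN)         # on schedule: unchanged
harder = retarget(old_target, TARGET_TIMESPAN // 2)  # twice as fast: target halves
easier = retarget(old_target, TARGET_TIMESPAN * 2)   # twice as slow: target doubles
print(same, harder, easier)
```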

I could be wrong, but I believe that the block also contains the hash from the last block. And that is how blocks are chained backwards, leading to the term blockchain. – Tarandeep Gill – 2014-02-03T04:31:09.360

@Murch, the difficulty is actually adjusted every block based on how long it took to find the previous block, and at this point, the difficulty can only go up. Every 2016 blocks, the difficulty is adjusted with respect to the last 2016 blocks and it can go either up or down. This helps guard against sudden drops in computing power, which would otherwise result in the time to find a block increasing to more than 10 minutes. – John Henry – 2014-04-30T12:56:46.910

@Tarandeep-Gill, that's correct! The hash of the previous block is hashed along with the list of transactions. This actually makes me think that my explanation of hashing is a bit incomplete -- The specific hash function is not SHA256 applied twice as stated, but rather that function applied to various parts of the block (including transactions and the hash of the previous block) in different ways. This is one of those "inaccuracies" I warned you about. I wonder if you can find the others... – John Henry – 2014-04-30T13:05:15.017

Are we talking about network difficulty or some other difficulty? Because the network difficulty does not change every block, it exclusively changes every 2016 blocks. See for example: https://en.bitcoin.it/wiki/Difficulty#How_often_does_the_network_difficulty_change.3F – Murch – 2014-04-30T13:10:49.113

I think omitting the hash of the previous block from the explanation is a bad idea even if it makes the answer easier to read. People could be like "Aha" that's where the name blockchain comes from when they read this answer as suggested by @TarandeepGill – Emre Kenci – 2015-04-05T18:48:06.803

@AntonA., I think it's important not to conflate answers that don't necessarily have anything to do with each other. While I agree that this information is useful, the term "blockchain" is not specifically mentioned in the question or my answer. This information may be more useful here: http://ethereum.stackexchange.com/questions/459/what-is-a-blockchain-what-is-the-concept-behind-it – John Henry – 2017-02-13T16:07:40.687

1

The purpose of solving a 'puzzle' is (a) to delay the mining of a block to an average of 10 minutes and (b) to incur real-world costs for mining a block (spending CPU power, and thus energy). The costs are there to prevent a Sybil attack (putting many miner machines to work to do a 51% attack).

The delay is put in to allow a good block to propagate around the globe to all other miners, without giving the miner who just minted the new block a head-start advantage. For that, the block time (10 minutes) needs to be orders of magnitude larger than the propagation delay (a few seconds).

So the kind of puzzle is in a sense irrelevant, it could as well be a giant Sudoku.

0

Any hash is a valid hash. The question is whether your hash meets our criteria. What you hash is actually a couple of things (we'll come back to this) that, lined up together, make a string of a specific length. You then hash that entire string. Think of your resulting hash as a number. What we want is for that resulting number to be less than a target number. So it's like rolling a billion-sided die and coming up with a number under the target number. That target number is what's referred to as the "difficulty". As more people roll the die, we lower the target number to reduce the likelihood that any one die roll will hit.

Importantly, some of the things that we line up in the string that we hash are things that we're allowed to adjust. Hashing isn't exactly like rolling the die, because if we hash the same thing over and over we get the same result over and over. However, even a minor adjustment in what we hash can have a major impact on the resulting hash. The primary item that we can manipulate is called the "nonce". Basically, we pick a nonce to use in the string, hash the whole string, and see what we get. If the hash isn't a hit, we modify the nonce and try again. The nonce itself isn't very large, so there's a limited number of variations of the nonce that you can try. The next item we can change is a timestamp. We're allowed to wiggle the timestamp too. Every wiggle of the timestamp lets us go through the entire set of possible nonce values all over again. We repeat this process of exhausting the nonce possibilities, then wiggling the timestamp, over and over until the resulting hash is lower than the difficulty target number.
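
Here is a small Python sketch of that nonce-then-timestamp search. Everything about the layout is a simplification for illustration: `search`, the `prefix` bytes, and the tiny 8-bit nonce space are assumptions made so that the rollover into the next timestamp actually happens in a quick demo (the real nonce is 32 bits).

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def search(prefix: bytes, start_time: int, target: int,
           nonce_bits: int = 8, max_time_steps: int = 1000):
    """Exhaust the nonce range, then bump the timestamp and start over."""
    for timestamp in range(start_time, start_time + max_time_steps):
        for nonce in range(1 << nonce_bits):
            header = prefix + struct.pack("<II", timestamp, nonce)
            value = int.from_bytes(double_sha256(header), "little")
            if value < target:
                return timestamp, nonce
    return None  # gave up; a real miner would change something else in the block


demo_target = 1 << 246  # roughly 1 hash in 1024 succeeds
start = 1_376_000_000   # arbitrary example timestamp
hit = search(b"simplified header fields", start, demo_target)
print(hit)
```

With 256 nonces per timestamp and a 1-in-1024 chance per attempt, the search will typically need to wiggle the timestamp a few times before it hits.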

When that happens, we then proclaim to the world that this timestamp plus this nonce works to solve the block. Others validate it to be true and it's added to the block chain. A block is considered "Validated" once it's a certain number of blocks "deep" in the block chain, meaning it's a historical block compared to the current block. Validation is a little bit of a misnomer here because it's not that the block isn't already known to be valid. What we're validating is the proof of work, meaning once that historical block is buried far enough the amount of effort involved to create that history is insurmountable for someone else to try to create a different variation of the history. They would have to create their own version of that block (the only reason being to write in their own version of the transactions, i.e. steal coins) then solve it themselves, then solve the next block and the one after and so on and "catch up" with everybody else. This means they would have to out-race the world in the dice rolling game. Maybe once in the history of the universe someone might get lucky on the nonce with two or three blocks back to back, but with the 120 blocks that most mining pools and exchanges require now? Not going to happen, ever.