There is a good explanation by Craig Gidney here (he also has other great content, including a circuit simulator, on his blog).

Essentially, Grover's algorithm applies when you have a function which returns `True` for one of its possible inputs, and `False` for all the others. The job of the algorithm is to find the one that returns `True`.

To do this we express the inputs as bit strings, and encode these using the $|0\rangle$ and $|1\rangle$ states of a string of qubits. So the bit string `0011` would be encoded in the four qubit state $|0011\rangle$, for example.
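As a rough illustration of this encoding, here is a small classical sketch that writes $|0011\rangle$ as a list of $2^4$ amplitudes. The helper name `basis_state` and the convention that the bit string, read as a binary number, gives the index of the non-zero amplitude are assumptions made for this example.

```python
def basis_state(bits):
    """Return the state vector of |bits> as a list of 2**n amplitudes."""
    n = len(bits)
    state = [0.0] * (2 ** n)
    # Assumed convention: the bit string read as a binary number is the index.
    state[int(bits, 2)] = 1.0
    return state

state = basis_state("0011")
print(len(state), state.index(1.0))  # 16 3
```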

We also need to be able to implement the function using quantum gates. Specifically, we need to find a sequence of gates that will implement a unitary $U$ such that

$U | a \rangle = - | a \rangle, \qquad U | b \rangle = | b \rangle$

where $a$ is the bit string for which the function would return `True` and $b$ is any for which it would return `False`.
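To make the action of $U$ concrete, here is a minimal classical simulation of it acting on a list of amplitudes. This is not a gate sequence, just the sign flip it is meant to produce. The names `is_solution` and `apply_oracle`, and the choice of `0011` as the marked string, are hypothetical.

```python
def is_solution(bits):
    # Hypothetical search function: True only for the marked bit string.
    return bits == "0011"

def apply_oracle(state, f):
    """Flip the sign of every amplitude whose bit string satisfies f."""
    n = len(state).bit_length() - 1  # number of qubits, since len(state) == 2**n
    return [-amp if f(format(i, f"0{n}b")) else amp
            for i, amp in enumerate(state)]

uniform = [0.25] * 16            # equal superposition over 4 qubits
marked = apply_oracle(uniform, is_solution)
print(marked[3], marked[4])      # -0.25 0.25
```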

If we start with a superposition of all possible bit strings, which is pretty easy to do by just Hadamarding everything, all inputs start off with the same amplitude of $\frac{1}{\sqrt{2^n}}$ (where $n$ is the length of the bit strings we are searching over, and therefore the number of qubits we are using). But if we then apply the oracle $U$, the amplitude of the state we are looking for will change to $-\frac{1}{\sqrt{2^n}}$.
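The "Hadamarding everything" step can be checked with a small classical sketch. It assumes the state is stored as a plain list of $2^n$ amplitudes starting in $|0\dots0\rangle$, and the function name `hadamard_all` is made up for this example.

```python
import math

def hadamard_all(state):
    """Apply a Hadamard gate to every qubit of a state vector (list of 2**n amplitudes)."""
    state = list(state)
    n = len(state).bit_length() - 1
    for q in range(n):
        step = 1 << q
        for i in range(len(state)):
            if (i & step) == 0:
                # Mix the pair of amplitudes that differ only in qubit q.
                a, b = state[i], state[i | step]
                state[i] = (a + b) / math.sqrt(2)
                state[i | step] = (a - b) / math.sqrt(2)
    return state

n = 4
all_zeros = [1.0] + [0.0] * (2 ** n - 1)   # the state |0000>
uniform = hadamard_all(all_zeros)
# Every amplitude should be close to 1/sqrt(2**n) = 0.25
print(all(abs(a - 1 / math.sqrt(2 ** n)) < 1e-12 for a in uniform))
```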

This is not an easily observable difference, since a sign flip leaves every measurement probability unchanged, so we need to amplify it. To do this we use the *Grover Diffusion Operator*, $D$. The effect of this operator is essentially to look at how each amplitude differs from the mean amplitude, and then invert this difference. So if a certain amplitude was a certain amount larger than the mean amplitude, it will become that same amount less than the mean, and vice-versa.

Specifically, if you have a superposition of bit strings $b_j$, the diffusion operator has the effect

$D: \,\,\,\, \sum_j \alpha_j \, | b_j \rangle \,\,\,\,\,\, \mapsto \,\,\,\,\,\, \sum_j (2\mu \, - \, \alpha_j) \, | b_j \rangle$

where $\mu = \frac{1}{2^n} \sum_j \alpha_j$ is the mean amplitude, with the sum running over all $2^n$ bit strings. So any amplitude $\mu + \delta$ gets turned into $\mu - \delta$. To see why it has this effect, and how to implement it, see these lecture notes.
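The "inversion about the mean" can be sketched classically in a couple of lines, taking the mean as the sum of the amplitudes divided by how many there are. The function name `diffuse` is an assumption for this example, which uses two qubits with the last amplitude already sign-flipped by the oracle.

```python
def diffuse(amplitudes):
    """Map each amplitude a to 2*mu - a, where mu is the mean amplitude."""
    mu = sum(amplitudes) / len(amplitudes)
    return [2 * mu - a for a in amplitudes]

# Two qubits: uniform amplitudes of 1/2, with the marked one flipped to -1/2.
amps = [0.5, 0.5, 0.5, -0.5]
result = diffuse(amps)
print(result)  # [0.0, 0.0, 0.0, 1.0]
```

In this two-qubit case a single round already moves all of the amplitude onto the marked state. Larger searches need more rounds.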

Most of the amplitudes will be a tiny bit larger than the mean (due to the effect of the single $-\frac{1}{\sqrt{2^n}}$), so they will become a tiny bit less than the mean through this operation. Not a big change.

The state we are looking for will be affected more strongly. Its amplitude is a lot less than the mean, and so will become a lot greater than the mean after the diffusion operator is applied. The end effect of the diffusion operator is therefore to cause an interference effect which skims a little amplitude from each of the wrong answers and, in the early rounds, adds roughly $\frac{2}{\sqrt{2^n}}$ to the right one. By repeating this process about $\frac{\pi}{4}\sqrt{2^n}$ times, we get to the point where our solution stands out from the crowd so much that we can identify it.
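Putting the oracle and diffusion steps together, here is a minimal classical simulation of the repeated process. It is a sketch under stated assumptions: the marked input is given directly as an index (`target`), the function name `grover` is made up, and the iteration count is the standard choice of $\lfloor \frac{\pi}{4}\sqrt{2^n} \rfloor$.

```python
import math

def grover(n, target):
    """Simulate Grover iterations on n qubits; return final amplitudes and round count."""
    N = 2 ** n
    amps = [1 / math.sqrt(N)] * N                 # uniform superposition
    iterations = math.floor(math.pi / 4 * math.sqrt(N))
    for _ in range(iterations):
        amps[target] = -amps[target]              # oracle: flip the marked amplitude
        mu = sum(amps) / N                        # mean amplitude
        amps = [2 * mu - a for a in amps]         # diffusion: invert about the mean
    return amps, iterations

amps, iterations = grover(4, 0b0011)
success_probability = amps[0b0011] ** 2
print(iterations, round(success_probability, 2))  # 3 rounds, probability about 0.96
```

For four qubits, three rounds are enough to make the marked state the overwhelmingly likely measurement outcome, compared with a 1 in 16 chance at the start.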

Of course, this all goes to show that all the work is done by the diffusion operator. Searching is just an application that we can connect to it.

See the answers to other questions for details on how the functions and diffusion operator are implemented.

https://www.quantamagazine.org/how-pi-connects-colliding-blocks-to-a-quantum-search-algorithm-20200121/ and https://arxiv.org/pdf/1912.02207.pdf – Condo – 2020-09-30T17:08:48.783