This phenomenon is sometimes known as the *discretization of errors*. It is a property of certain error correcting codes that allows them to work. It is described (somewhat briefly) in Section 10.2 of Nielsen and Chuang.

Suppose that we have an arbitrary error that affects just one qubit, and suppose that we represent this error by a channel $\Phi$ mapping one qubit to one qubit. Such a channel can be expressed in Kraus form as
$$
\Phi(\rho) = A_1 \rho A_1^{\dagger} + \cdots + A_m \rho A_m^{\dagger}
$$
for some choice of Kraus operators $A_1,\ldots,A_m$. (For a qubit channel we can always take $m = 4$ if we want, but this doesn't matter for this answer.)

Each of the Kraus operators $A_k$ can be expressed as a linear combination of Pauli operators, because the Pauli operators form a basis for the space of 2 by 2 complex matrices:
$$
A_k = a_k I + b_k X + c_k Y + d_k Z.
$$
If you now expand out the Kraus representation of $\Phi$ above, you will obtain a messy expression where $\Phi(\rho)$ looks like a linear combination of operators of the form $P_i \rho P_j$ where $i,j\in\{1,2,3,4\}$ and $P_1 = I$, $P_2 = X$, $P_3 = Y$, and $P_4 = Z$.
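If it helps to see this expansion concretely, here is a minimal NumPy sketch. The choice of channel (amplitude damping with `gamma = 0.3`) and the helper names `pauli_coefficients` and `chi_matrix` are my own illustrative assumptions, not something from Nielsen and Chuang. It computes the coefficients $a_k, b_k, c_k, d_k$ for each Kraus operator, collects the coefficients of the terms $P_i \rho P_j$ into a 4 by 4 matrix, and checks that the resulting expansion agrees with the Kraus form.

```python
import numpy as np

# Pauli operators, ordered as P_1 = I, P_2 = X, P_3 = Y, P_4 = Z.
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I, X, Y, Z]

def pauli_coefficients(A):
    """Return (a, b, c, d) such that A = a I + b X + c Y + d Z."""
    # The Paulis are orthogonal under the trace inner product:
    # Tr(P_i P_j) = 2 if i == j, and 0 otherwise.
    return np.array([np.trace(P @ A) / 2 for P in PAULIS])

def chi_matrix(kraus_ops):
    """chi[i, j] is the coefficient of P_i rho P_j in Phi(rho)."""
    coeffs = np.array([pauli_coefficients(A) for A in kraus_ops])  # shape (m, 4)
    # chi[i, j] = sum_k coeffs[k, i] * conj(coeffs[k, j])
    return coeffs.T @ coeffs.conj()

# Illustrative channel (an assumption for this sketch): amplitude damping.
gamma = 0.3
kraus_ops = [
    np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
    np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex),
]

chi = chi_matrix(kraus_ops)

# Compare the Kraus form with the Pauli expansion on a test state |+><+|.
rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
via_kraus = sum(A @ rho @ A.conj().T for A in kraus_ops)
via_paulis = sum(
    chi[i, j] * PAULIS[i] @ rho @ PAULIS[j]
    for i in range(4)
    for j in range(4)
)

print(np.allclose(via_kraus, via_paulis))
```

The off-diagonal entries of `chi` are the coefficients of the cross terms $P_i \rho P_j$ with $i \neq j$; those are the terms that the syndrome measurement later removes.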

Now imagine that you have a quantum error correcting code that protects against an $X$, $Y$, or $Z$ error on one qubit. The usual way this works is that some extra qubits in the 0 state are tacked on to the encoded data and a unitary operation is performed that reversibly computes into these extra qubits a syndrome describing which error occurred, if any, and which qubit was affected.

Supposing that the arbitrary error $\Phi$ happened on the first qubit for simplicity, after the syndrome computation you will end up with a state that looks like a linear combination of terms like this:
$$
P_i |\psi\rangle \langle \psi| P_j \otimes |P_i\: \text{syndrome}\rangle\langle P_j\:\text{syndrome}|.
$$
The assumption here is that $|\psi\rangle$ represents the encoded data without any noise, $P_i$ and $P_j$ act on the first qubit, and that "$P_i$ syndrome" and "$P_j$ syndrome" refer to the standard basis states that indicate that these errors have occurred on the first qubit. (The situation is similar for the error affecting any other qubit; I'm just trying to keep the notation simple by assuming the error happened to the first qubit.)

Now the key is that you *measure* (with respect to the standard basis) the syndrome to see what error occurred, and all of the cross terms disappear because of the measurement. You are left with a probabilistic mixture of states that look like
$$
P_i |\psi\rangle \langle \psi| P_i \otimes |P_i\: \text{syndrome}\rangle\langle P_i\:\text{syndrome}|.
$$
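Because the syndrome now tells you which Pauli operator $P_i$ was applied, you can undo it by applying $P_i$ again (recall that $P_i^2 = I$), so the error is corrected and the original state is recovered. In effect, by measuring the syndrome, you "project" or "collapse" the error to a Pauli operator.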
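Here is a small numerical illustration of this collapse, under some simplifying assumptions that I should state plainly. It uses the 3-qubit bit-flip code, which only protects against $X$ errors, so the "arbitrary" error is restricted to a coherent rotation $\cos(\theta) I - i\sin(\theta) X$ on the first qubit. Rather than computing the syndrome into extra qubits, the sketch projects directly onto the joint eigenspaces of the stabilizers $Z_1 Z_2$ and $Z_2 Z_3$, which has the same effect on the encoded data as measuring a syndrome register. The values of `alpha`, `beta`, and `theta` are arbitrary choices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def kron(*ops):
    """Tensor product of single-qubit operators, first qubit leftmost."""
    out = np.array([[1]], dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out

# Encoded state alpha|000> + beta|111> (arbitrary normalized amplitudes).
alpha, beta = 0.6, 0.8j
psi = np.zeros(8, dtype=complex)
psi[0b000] = alpha
psi[0b111] = beta

# A coherent (non-Pauli) error on the first qubit.
theta = 0.4
E = np.cos(theta) * I2 - 1j * np.sin(theta) * X
noisy = kron(E, I2, I2) @ psi

# Stabilizers of the bit-flip code.
S1 = kron(Z, Z, I2)
S2 = kron(I2, Z, Z)

# Map from syndrome (eigenvalues of S1, S2) to the correcting operator.
corrections = {
    (+1, +1): kron(I2, I2, I2),  # no error
    (-1, +1): kron(X, I2, I2),   # X on qubit 1
    (-1, -1): kron(I2, X, I2),   # X on qubit 2
    (+1, -1): kron(I2, I2, X),   # X on qubit 3
}

results = {}
for (s1, s2), fix in corrections.items():
    # Projector onto the subspace with this syndrome.
    projector = (np.eye(8) + s1 * S1) @ (np.eye(8) + s2 * S2) / 4
    branch = projector @ noisy
    prob = np.vdot(branch, branch).real
    if prob > 1e-12:
        recovered = fix @ (branch / np.sqrt(prob))
        fidelity = abs(np.vdot(psi, recovered)) ** 2
        results[(s1, s2)] = (prob, fidelity)

for syndrome, (prob, fidelity) in results.items():
    print(syndrome, round(prob, 6), round(fidelity, 6))
```

Only two syndromes occur for this input: "no error" with probability $\cos^2\theta$ and "$X$ on qubit 1" with probability $\sin^2\theta$. In both branches the recovered state matches the original up to a global phase, even though the error that was applied was not itself a Pauli operator.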

Let me acknowledge that this answer was mostly cut-and-paste from one of my previous answers (where it seems that it did not actually answer the question).