Here is an example of grouping for this Hamiltonian:

$$H = 5 \cdot XI + 3 \cdot XZ - 2 \cdot YI + 1.5 \cdot IY$$

By linearity, its expectation value is:

$$\langle H \rangle = 5 \cdot \langle XI \rangle + 3 \cdot \langle XZ \rangle - 2 \cdot \langle YI \rangle + 1.5 \cdot \langle IY \rangle$$

I will group the terms as follows: the first group contains $XI$ and $XZ$, the second contains $YI$ and $IY$. Importantly, the members of a group must commute with each other, so that all of them can be measured with the same circuit. This is not the only possible grouping.
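The commutation requirement can be checked mechanically. Below is a minimal Python sketch; the helper names `paulis_commute` and `group_is_valid` are made up for illustration and do not come from any library. It relies on the standard fact that two Pauli strings commute exactly when they differ on an even number of positions where both are non-identity.

```python
from itertools import combinations


def paulis_commute(p: str, q: str) -> bool:
    """Return True if two Pauli strings such as 'XZ' and 'IY' commute.

    Single-qubit Paulis anticommute when both are non-identity and
    different; the full strings commute when the number of such
    positions is even.
    """
    if len(p) != len(q):
        raise ValueError("Pauli strings must act on the same number of qubits")
    anticommuting_sites = sum(
        1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b
    )
    return anticommuting_sites % 2 == 0


def group_is_valid(group) -> bool:
    """Return True if every pair of Pauli strings in the group commutes."""
    return all(paulis_commute(p, q) for p, q in combinations(group, 2))


# The grouping used in this answer.
groups = [["XI", "XZ"], ["YI", "IY"]]
for group in groups:
    print(group, "commuting group:", group_is_valid(group))

# XI and YI anticommute, so they cannot be placed in the same group.
print("XI, YI commute:", paulis_commute("XI", "YI"))
```

The same check shows that $XI$ and $IY$ also commute, so they could share a group instead, which is one way to see that the grouping is not unique.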

For the first circuit (the $XI$, $XZ$ group):

\begin{align}
&\langle X I \rangle = p(\text{00 or 01 measurements}) - p(\text{10 or 11 measurements})
\\
&\langle X Z \rangle = p(\text{00 or 11 measurements}) - p(\text{10 or 01 measurements})
\end{align}

For the second circuit (the $YI$, $IY$ group):

\begin{align}
&\langle Y I \rangle = p(\text{00 or 01 measurements}) - p(\text{10 or 11 measurements})
\\
&\langle I Y \rangle = p(\text{00 or 10 measurements}) - p(\text{01 or 11 measurements})
\end{align}

where $p$ denotes the probability of the measurement outcomes listed in parentheses. The main idea is that, for a given Pauli term $P$, the expectation value equals:

$$\langle P \rangle = p_+ - p_-$$

where $p_+$ ($p_-$) is the probability of measuring an eigenstate of $P$ with eigenvalue $+1$ ($-1$). More details can be found in this answer about expectation value estimation. The reason $HS^{\dagger}$ is applied in the second circuit is explained in this answer.
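The formulas above can be checked numerically. The following NumPy sketch is only an illustration under stated assumptions: the first circuit is assumed to apply $H$ to the first qubit before measuring, the second circuit is assumed to apply $HS^{\dagger}$ to each qubit, and the first character of a bitstring is taken to be the first qubit. It uses exact outcome probabilities of a random state rather than sampled shots, and all names (`TERMS`, `outcome_probabilities`, `pauli_expectation`) are made up for this example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S_DG = np.array([[1, 0], [0, -1j]], dtype=complex)  # S^dagger
PAULI = {"I": I2, "X": X, "Y": Y, "Z": Z}


def pauli_matrix(label):
    """Matrix of a two-qubit Pauli string; first character = first qubit."""
    return np.kron(PAULI[label[0]], PAULI[label[1]])


# Assumed basis rotations applied before the computational-basis measurement.
R1 = np.kron(H, I2)    # first circuit: H on the first qubit only
U_Y = H @ S_DG         # S^dagger first, then H
R2 = np.kron(U_Y, U_Y)  # second circuit: H S^dagger on each qubit

# term -> (coefficient, rotation, outcomes counted with eigenvalue +1)
TERMS = {
    "XI": (5.0, R1, ("00", "01")),
    "XZ": (3.0, R1, ("00", "11")),
    "YI": (-2.0, R2, ("00", "01")),
    "IY": (1.5, R2, ("00", "10")),
}


def outcome_probabilities(state, rotation):
    """Exact probabilities of outcomes 00, 01, 10, 11 after the rotation."""
    probs = np.abs(rotation @ state) ** 2
    return dict(zip(("00", "01", "10", "11"), probs))


def pauli_expectation(probs, plus_outcomes):
    """Compute p_+ - p_- from outcome probabilities."""
    p_plus = sum(p for bits, p in probs.items() if bits in plus_outcomes)
    p_minus = sum(p for bits, p in probs.items() if bits not in plus_outcomes)
    return p_plus - p_minus


# Arbitrary normalized two-qubit test state.
rng = np.random.default_rng(seed=0)
state = rng.normal(size=4) + 1j * rng.normal(size=4)
state /= np.linalg.norm(state)

energy = 0.0        # <H> assembled from the two circuits
exact_energy = 0.0  # <H> computed directly from the Pauli matrices
for label, (coeff, rotation, plus_outcomes) in TERMS.items():
    probs = outcome_probabilities(state, rotation)
    from_probs = pauli_expectation(probs, plus_outcomes)
    exact = np.vdot(state, pauli_matrix(label) @ state).real
    print(f"<{label}>: from probabilities {from_probs:+.6f}, exact {exact:+.6f}")
    energy += coeff * from_probs
    exact_energy += coeff * exact

print(f"<H>: from probabilities {energy:+.6f}, exact {exact_energy:+.6f}")
```

The two columns agree because $(HS^{\dagger})^{\dagger} Z (HS^{\dagger}) = Y$ and $HZH = X$, so after the rotation a $Z$-basis measurement gives the statistics of the original $Y$ or $X$ operator. On hardware or a shot-based simulator, the probabilities would be replaced by measured frequencies.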

Thank you Davit. That is a very clear explanation! Essentially we group a set of operators together if there is one unitary operation that rotates all of them to be diagonal (consisting of only Z and I). I have a second question: is there a way, either in Qiskit or with any other manually written algorithm, to group a given arbitrary set of operators? Many thanks! – fagd – 2020-09-26T21:19:05.037

@fagd, you are welcome :). I don't know about the existence of software solutions for this problem. – Davit Khachatryan – 2020-09-26T21:30:13.423