First, the operators $a$ and $b$ ($a^{\dagger}$ and $b^{\dagger}$) are the *annihilation* (*creation*) operators of the two photonic modes in your problem. For an introduction to the subject, I recommend looking for some decent lecture notes on quantum optics. A very readable introductory book is Mark Fox's "Quantum Optics -- An Introduction", and a more advanced read is Grynberg, Aspect and Fabre's "An Introduction to Quantum Optics". That said, the parts you are interested in may already be well explained in Nielsen and Chuang. Also this document I found while googling might be of interest.

But back to the original question: why do the paper you mention and the Wikipedia article define the beamsplitter as
\begin{align}
U = \mathrm{e}^{i \theta (b^{\dagger} a + b a^{\dagger})}
\end{align}
with a plus sign, contrary to Nielsen and Chuang?

The truth is that both operators are "correct"; they just describe polarizing and non-polarizing beamsplitters. The beamsplitters usually encountered in labs are polarizing, so the reflected photon picks up an additional phase shift.

This is actually explained quite neatly in Box 7.3 of Nielsen and Chuang, where they show that there is an isomorphism between the transformations of two photonic modes and $SU(2)$. The plus convention corresponds to a Pauli $X$ rotation, whereas the minus convention corresponds to a Pauli $Y$ rotation. As pointed out before, there is an additional phase shift, embodied by the *phase operator*
\begin{align}
S = \begin{bmatrix} 1 & 0 \\ 0 & i \end{bmatrix}
\end{align}
that can be used to relate the two transformations, since $S X S^{\dagger} = Y$. For details, see page 51 of "Introduction to Optical Quantum Information Processing" by Kok and Lovett.
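
If you want to convince yourself numerically, here is a minimal sketch in plain Python. It assumes (following the $SU(2)$ correspondence above) that the two conventions act on the pair of modes as the $2 \times 2$ matrices $\mathrm{e}^{i\theta X}$ and $\mathrm{e}^{i\theta Y}$; the helper names and the value of `theta` are made up for illustration. It checks that conjugating with the phase operator $S$ turns the $X$ rotation into the $Y$ rotation.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rotation(P, theta):
    """exp(i*theta*P) = cos(theta)*I + i*sin(theta)*P, valid because P*P = I."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * (1 if i == j else 0) + 1j * s * P[i][j] for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

X = [[0, 1], [1, 0]]        # Pauli X
Y = [[0, -1j], [1j, 0]]     # Pauli Y
S = [[1, 0], [0, 1j]]       # phase operator from the text

theta = 0.3                 # arbitrary illustrative angle (assumption)

R_X = rotation(X, theta)    # plus convention, as a 2x2 mode-mixing matrix
R_Y = rotation(Y, theta)    # minus convention, as a 2x2 mode-mixing matrix

# Conjugating the X rotation with S should give the Y rotation.
conjugated = matmul(matmul(S, R_X), dagger(S))

print(close(matmul(matmul(S, X), dagger(S)), Y))
print(close(conjugated, R_Y))
```

Both checks compare matrices entry by entry, so the two conventions differ only by the extra phase that $S$ puts on one of the modes.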