It's mostly about simplicity and adopted convention. In the end, this is basically the same question as "why should I pick a universal set of gates A rather than a universal set B?" (see here). Experimentalists would pick the universal set they have available. Theorists just pick something that they like to work with, and eventually a convention is adopted. But it doesn't matter which convention they adopt because any universal set is easily converted into any other universal set, and it is (or should be) understood that the quantum circuits describing algorithms are not what you actually want to run on a quantum computer: you need to recompile them for the available gate set and optimise based on the available architecture (and this process is unique to each architecture).

You could use operations such as $\sqrt{X}$, but they are a little bit more fiddly because of all the imaginary numbers that appear. Or there's $\sqrt{Y}$ which gives an even more direct comparison to $H$, avoiding imaginary numbers.
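To make the "imaginary numbers" point concrete, here is a small NumPy sketch. It assumes the principal square root (eigenvalue $+1\mapsto 1$, eigenvalue $-1\mapsto i$); the helper name `principal_sqrt` is purely illustrative and other root conventions differ by signs or phases.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

def principal_sqrt(pauli):
    """Principal square root of an operator with eigenvalues +1 and -1.

    (I + P)/2 and (I - P)/2 project onto the +1 and -1 eigenspaces;
    the principal roots of those eigenvalues are 1 and i.
    """
    return (I2 + pauli) / 2 + 1j * (I2 - pauli) / 2

sqrt_X = principal_sqrt(X)
sqrt_Y = principal_sqrt(Y)

# Both really are square roots.
assert np.allclose(sqrt_X @ sqrt_X, X)
assert np.allclose(sqrt_Y @ sqrt_Y, Y)

# Remove the common global phase e^{i*pi/4} and compare.
phase = np.exp(-1j * np.pi / 4)
print(np.round(phase * sqrt_X, 3))  # off-diagonal entries stay imaginary
print(np.round(phase * sqrt_Y, 3))  # every entry is real
```

After the global phase is stripped, $\sqrt{Y}$ is a real rotation matrix, whereas $\sqrt{X}$ keeps a relative factor of $-i$ between its diagonal and off-diagonal entries, so no choice of global phase makes it real.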

One of the main purposes of $H$ in a quantum circuit is to prepare uniform superpositions: $H|0\rangle=(|0\rangle+|1\rangle)/\sqrt{2}$. But $\sqrt{Y}$ also does this: taking the principal square root, $\sqrt{Y}|0\rangle=(|0\rangle+|1\rangle)/\sqrt{2}$ up to a global phase. When you start combining multiple Hadamards on unknown input states (i.e. the Hadamard transform), however, $H$ has a particularly convenient structure:
$$
H^{\otimes n}=\frac{1}{\sqrt{2^n}}\sum_{x,y\in\{0,1\}^n}(-1)^{x\cdot y}|x\rangle\langle y|.
$$
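As a sanity check of this formula, the sketch below builds $H^{\otimes n}$ by repeated Kronecker products and compares it with the $(-1)^{x\cdot y}$ expression entry by entry. The function names are illustrative, and $x\cdot y$ is taken to be the bitwise inner product modulo 2.

```python
from functools import reduce

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def hadamard_transform(n):
    """H tensored with itself n times (n >= 1)."""
    return reduce(np.kron, [H] * n)

def hadamard_from_formula(n):
    """Matrix with entries (-1)^(x.y) / sqrt(2^n)."""
    dim = 2 ** n
    M = np.empty((dim, dim))
    for x in range(dim):
        for y in range(dim):
            # x.y mod 2 is the parity of the bitwise AND of x and y.
            parity = bin(x & y).count("1") % 2
            M[x, y] = (-1) ** parity
    return M / np.sqrt(dim)

n = 3
assert np.allclose(hadamard_transform(n), hadamard_from_formula(n))
```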

The Hadamard gives you some very nice inter-relations (reflecting basis changes between pairs of mutually unbiased bases),
$$
HZH=X\qquad HXH=Z \qquad HYH=-Y.
$$
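These three identities are easy to confirm numerically; a minimal NumPy check, using the standard Pauli matrices:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# H is its own inverse, so conjugation by H needs no dagger.
assert np.allclose(H @ H, np.eye(2))

assert np.allclose(H @ Z @ H, X)
assert np.allclose(H @ X @ H, Z)
assert np.allclose(H @ Y @ H, -Y)
```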
It also enables relations between controlled-not and controlled phase, and between controlled-not in two different directions (swapping control and target). There are similar relations for $\sqrt{Y}$ (taking the principal square root, which is a $\pi/2$ rotation about the $Y$ axis up to a global phase):
$$
\sqrt{Y}Z\sqrt{Y}^\dagger=X \qquad \sqrt{Y}X\sqrt{Y}^\dagger=-Z\qquad \sqrt{Y}Y\sqrt{Y}^\dagger=Y
$$
Part of the reason the Hadamard relations look (slightly) nicer is that, as stated in the question, $H^2=\mathbb{I}$: $H$ is its own inverse, so no daggers are needed.
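The controlled-not relations mentioned above can be checked in the same way. This sketch assumes the usual ordering in which the first qubit is the most significant bit of the basis label; the variable names are illustrative.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I2 = np.eye(2)

# Controlled-not with control on the first qubit, target on the second.
CNOT_01 = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]])
# Controlled-not with control on the second qubit, target on the first.
CNOT_10 = np.array([[1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0],
                    [0, 1, 0, 0]])
# Controlled phase (controlled-Z) is symmetric in the two qubits.
CZ = np.diag([1, 1, 1, -1])

# Hadamards on the target turn controlled phase into controlled-not.
IH = np.kron(I2, H)
assert np.allclose(IH @ CZ @ IH, CNOT_01)

# Hadamards on both qubits swap control and target.
HH = np.kron(H, H)
assert np.allclose(HH @ CNOT_01 @ HH, CNOT_10)
```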

One way that many courses introduce the basic idea of quantum computation, and interference, is to use the Mach-Zehnder interferometer. This consists of two beam splitters which, mathematically, should be described by $\sqrt{X}$ (or $\sqrt{Y}$ would do). Indeed, this is important for a first demonstration because these operations are "square root of not", which you can prove is logically impossible classically. However, once that initial introduction is over, theorists will often replace the beam splitter operation with a Hadamard, just because it makes everything slightly easier.
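A toy version of the interferometer, as a rough sketch: the two paths are modelled as the basis states $|0\rangle$ and $|1\rangle$, each beam splitter is taken to be the principal $\sqrt{X}$, and the helper `output_probabilities` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# Principal square root of X, standing in for a single beam splitter.
sqrt_X = 0.5 * np.array([[1 + 1j, 1 - 1j],
                         [1 - 1j, 1 + 1j]])
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
ket0 = np.array([1, 0], dtype=complex)

def output_probabilities(beam_splitter, passes):
    """Detection probabilities on the two paths after `passes` beam splitters."""
    state = ket0
    for _ in range(passes):
        state = beam_splitter @ state
    return np.abs(state) ** 2

print(output_probabilities(sqrt_X, 1))  # one beam splitter: 50/50
print(output_probabilities(sqrt_X, 2))  # two beam splitters: always path 1 (a NOT)
print(output_probabilities(H, 2))       # Hadamard version: always path 0
```

Either way a single beam splitter gives a 50/50 outcome while two in sequence give a deterministic one, which is the interference effect; the Hadamard version simply returns to the input path rather than swapping paths.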

Why are you talking about $\pi/2$ rotations about the $X$ basis? What you want is a $\pi/2$ rotation about the $Y$ axis, which indeed acts almost like a Hadamard, as it also maps between $X$ and $Z$ eigenstates. – Norbert Schuch – 2018-08-06T09:53:39.943

@NorbertSchuch Thank you. I just checked and you are right. Do you mind writing an answer where you talk about the comparison between Hadamard and $\frac{\pi}{2}$ rotation about $Y$? – Ntwali B. – 2018-08-06T14:52:03.123

I don't see how this would make sense. On the one hand, this is not the question. On the other hand, take the answer of DaftWullie and strip the part about $\sqrt{X}$ not being real, and you probably get what I would write. – Norbert Schuch – 2018-08-06T22:18:53.947