My question is about the addition of probabilities. I am reading about computational learning theory in Tom Mitchell's *Machine Learning* book.
In chapter 7, when proving the upper bound on the probability of failing to $\epsilon$-exhaust the version space (Theorem 7.1), the proof says that the probability that at least one of the $k$ hypotheses in $H$ with true error greater than $\epsilon$ is consistent with the $m$ training examples is at most $k(1-\epsilon)^m$.
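To make sure I am restating the step correctly, the chain of inequalities in the proof of Theorem 7.1 seems to be (with $h_1, \dots, h_k$ denoting the hypotheses in $H$ whose true error exceeds $\epsilon$):

$$
\Pr\big[\exists\, i \in \{1,\dots,k\} : h_i \text{ is consistent with all } m \text{ examples}\big] \;\le\; k(1-\epsilon)^m \;\le\; |H|(1-\epsilon)^m \;\le\; |H|\,e^{-\epsilon m}.
$$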
I understand that the probability that a single such hypothesis $h$ is consistent with $m$ training examples is at most $(1-\epsilon)^m$. However, why is it valid to simply add the probabilities across the $k$ hypotheses? And couldn't the resulting bound $k(1-\epsilon)^m$ be greater than 1 in that case?
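For reference, my understanding of the single-hypothesis step (writing $c$ for the target concept and $\mathcal{D}$ for the distribution from which the $m$ examples $x_1, \dots, x_m$ are drawn independently) is:

$$
\Pr\big[h \text{ is consistent with all } m \text{ examples}\big] \;=\; \prod_{i=1}^{m} \Pr\big[h(x_i) = c(x_i)\big] \;\le\; (1-\epsilon)^m,
$$

since each factor equals $1 - \text{error}_{\mathcal{D}}(h) < 1-\epsilon$ when $\text{error}_{\mathcal{D}}(h) > \epsilon$. It is only the jump from one hypothesis to $k$ hypotheses that I am unsure about.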