## Convert a PAC-learning algorithm into another one which requires no knowledge of the parameter


This is part of exercise 2.13 in the book Foundations of Machine Learning (page 28). You can refer to chapter 2 for the notation.

Consider a family of concept classes $$\left\{\mathcal{C}_{s}\right\}_{s}$$ where $$\mathcal{C}_{s}$$ is the set of concepts in $$\mathcal{C}$$ with size at most $$s.$$ Suppose we have a PAC-learning algorithm $$\mathcal{A}$$ that can be used for learning any concept class $$\mathcal{C}_{s}$$ when $$s$$ is given. Can we convert $$\mathcal{A}$$ into a PAC-learning algorithm $$\mathcal{B}$$ that does not require the knowledge of $$s ?$$ This is the main objective of this problem.

To do this, we first introduce a method for testing a hypothesis $$h$$ with high probability. Fix $$\epsilon>0$$, $$\delta>0$$, and $$i \geq 1$$, and define the sample size $$n$$ by $$n=\frac{32}{\epsilon}\left[i \log 2+\log \frac{2}{\delta}\right].$$ Suppose we draw an i.i.d. sample $$S$$ of size $$n$$ according to some unknown distribution $$\mathcal{D}$$. We will say that a hypothesis $$h$$ is accepted if its empirical error on $$S$$ is at most $$\frac{3}{4} \epsilon$$, and that it is rejected otherwise. Thus, $$h$$ is accepted iff $$\widehat{R}(h) \leq \frac{3}{4} \epsilon.$$
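To make the acceptance test concrete, here is a small Python sketch of it (my own illustration, not from the book; `h` and `draw_example` are hypothetical interfaces standing in for the hypothesis and for drawing a labeled example from $$\mathcal{D}$$):

```python
import math

def sample_size(eps, delta, i):
    # n = (32/eps) * (i*log 2 + log(2/delta)), rounded up to an integer
    return math.ceil((32.0 / eps) * (i * math.log(2) + math.log(2.0 / delta)))

def is_accepted(h, draw_example, eps, delta, i):
    """Accept h iff its empirical error on a fresh i.i.d. sample of size n
    is at most (3/4)*eps, i.e. it makes at most (3/4)*eps*n mistakes."""
    n = sample_size(eps, delta, i)
    errors = 0
    for _ in range(n):
        x, y = draw_example()        # one labeled example drawn from D
        if h(x) != y:
            errors += 1
    return errors <= 0.75 * eps * n  # \widehat{R}(h) <= (3/4) eps
```

As a sanity check, a hypothesis with zero error is always accepted, and one that errs on every example is always rejected (whenever $$\epsilon < 4/3$$).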

(a) Assume that $$R(h) \geq \epsilon .$$ Use the (multiplicative) Chernoff bound to show that in that case $$\mathbb{P}_{S \sim \mathcal{D}^{n}}[h \text { is accepted}] \leq \frac{\delta}{2^{i+1}}.$$
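For reference, here is the computation I believe (a) reduces to, using the lower-tail multiplicative Chernoff bound $$\mathbb{P}[\widehat{R}(h) \leq (1-\gamma) R(h)] \leq e^{-n R(h) \gamma^{2} / 2}$$ (a sketch of my own, included so others can check against it):

```latex
% Since R(h) >= eps, acceptance implies
% \widehat{R}(h) \le \frac{3}{4}\epsilon \le (1 - \frac{1}{4}) R(h),
% so take \gamma = 1/4:
\mathbb{P}[h \text{ is accepted}]
  \le \mathbb{P}\Big[\widehat{R}(h) \le \big(1 - \tfrac{1}{4}\big) R(h)\Big]
  \le e^{-n R(h)/32}
  \le e^{-n \epsilon/32}
  = e^{-\left(i \log 2 + \log \frac{2}{\delta}\right)}
  = 2^{-i} \cdot \frac{\delta}{2}
  = \frac{\delta}{2^{i+1}}.
```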

(b) Assume that $$R(h) \leq \epsilon / 2 .$$ Use the (multiplicative) Chernoff bound to show that in that case $$\mathbb{P}_{S \sim \mathcal{D}^{n}}[h \text { is rejected }] \leq \frac{\delta}{2^{i+1}}.$$
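Similarly, (b) follows from the upper-tail bound $$\mathbb{P}[\widehat{R}(h) \geq (1+\gamma) p] \leq e^{-n p \gamma^{2} / 3}$$ with $$p = R(h)$$ (again my own sketch; if $$p = 0$$ the rejection probability is $$0$$, so assume $$p > 0$$):

```latex
% Rejection means \widehat{R}(h) > \frac{3}{4}\epsilon = (1+\gamma)p
% with \gamma = \frac{3\epsilon}{4p} - 1 \ge \frac{1}{2}, so
\mathbb{P}[h \text{ is rejected}]
  \le \mathbb{P}\big[\widehat{R}(h) \ge (1+\gamma) p\big]
  \le e^{-n p \gamma^{2}/3}
  = e^{-n \left(\frac{3}{4}\epsilon - p\right)^{2}/(3p)}
  \le e^{-n \epsilon/24}
  \le e^{-n \epsilon/32}
  = \frac{\delta}{2^{i+1}},
% where the third inequality holds because
% (\frac{3}{4}\epsilon - p)^2 / p is decreasing in p on (0, \frac{3}{4}\epsilon),
% hence minimized over p \le \epsilon/2 at p = \epsilon/2.
```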

(c) Algorithm $$\mathcal{B}$$ is defined as follows: we start with $$i=1$$ and, at each round $$i \geq 1,$$ we guess the size parameter $$s$$ to be $$\widetilde{s}=\left\lfloor 2^{(i-1) / \log \frac{2}{\delta}}\right\rfloor .$$ We draw a sample $$S$$ of size $$n$$ (which depends on $$i$$) to test the hypothesis $$h_{i}$$ returned by $$\mathcal{A}$$ when it is trained with a sample of size $$S_{\mathcal{A}}(\epsilon / 2,1 / 2, \widetilde{s}),$$ i.e., the sample complexity of $$\mathcal{A}$$ for a target precision of $$\epsilon / 2,$$ confidence $$1 / 2,$$ and size $$\widetilde{s}$$ (we ignore the size of the representation of each example here). If $$h_{i}$$ is accepted, the algorithm stops and returns $$h_{i};$$ otherwise it proceeds to the next iteration. Show that if at iteration $$i$$ the estimate $$\widetilde{s}$$ is larger than or equal to $$s,$$ then $$\mathbb{P}\left[h_{i} \text { is accepted}\right] \geq 3 / 8.$$
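For intuition, the loop defining $$\mathcal{B}$$ can be sketched in Python as follows (my own illustration; `train_A`, `sample_complexity_A`, and `draw_example` are assumed interfaces for $$\mathcal{A}$$, its sample complexity $$S_{\mathcal{A}}$$, and sampling from $$\mathcal{D}$$, and `max_rounds` is only there to keep the sketch finite):

```python
import math

def algorithm_B(train_A, sample_complexity_A, draw_example,
                eps, delta, max_rounds=100):
    for i in range(1, max_rounds + 1):
        # Guess the size parameter: s~ = floor(2^{(i-1)/log(2/delta)})
        s_tilde = int(2 ** ((i - 1) / math.log(2.0 / delta)))
        # Train A with precision eps/2, confidence 1/2, and guessed size s~
        m = sample_complexity_A(eps / 2.0, 0.5, s_tilde)
        h_i = train_A([draw_example() for _ in range(m)], s_tilde)
        # Test h_i on a fresh sample of size n = (32/eps)(i log 2 + log(2/delta))
        n = math.ceil((32.0 / eps) * (i * math.log(2) + math.log(2.0 / delta)))
        errors = 0
        for _ in range(n):
            x, y = draw_example()
            if h_i(x) != y:
                errors += 1
        if errors <= 0.75 * eps * n:  # h_i is accepted: stop and return it
            return h_i
    return None                       # sketch-only fallback, not in the exercise
```

Note that each round uses fresh samples both for training and for testing, which is what makes the per-round bounds from (a) and (b) applicable independently at every $$i$$.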

Questions (a) and (b) are easy to prove, but I am having trouble with question (c). More specifically, I don't see how to use the condition that $$\widetilde{s} \geq s$$. Can anyone help?