I will try to answer the question in a less formal (and hopefully still correct) way.

NOTE: I have used $V_{\pi}$ and $v_{\pi}$ interchangeably.

We start from LHS:

$$\max_s \Bigl\lvert \mathbb{E}_{\pi} \left[ G_{t:t+n} \mid S_t = s \right] - v_{\pi}(s) \Bigr\rvert$$

This can be written in terms of trajectories. Say the probability of observing an $n$-step trajectory $j$ (for an $n$-step return) starting from state $S_t = s$ is $p_j^s$. Then we can write the expected return as the sum of the returns of all trajectories, each weighted by its probability:

$$\mathbb{E}_{\pi} [G_{t:t+n}|S_t = s] = \sum_j p_j^s G_{t:t+n}^j = \sum_j p_j^s [R_{t+1}^j + \gamma R_{t+2}^j + \dots + \gamma^{n-1}R_{t+n}^j + \gamma^n V_{t+n-1}(S_{t+n}^j)]$$
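As a sanity check, this trajectory-sum form can be verified by brute-force enumeration. The sketch below uses a small, entirely made-up two-state Markov reward process under a fixed policy (the transition probabilities `P`, rewards `R`, and the value estimate `V` are all hypothetical numbers chosen only for illustration):

```python
import itertools

# Hypothetical 2-state Markov reward process under a fixed policy.
# P[s][s'] = transition probability, R[s][s'] = reward on that transition.
P = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.4, 1: 0.6}}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 2.0, 1: -1.0}}
V = {0: 0.5, 1: 1.5}          # current (not yet converged) value estimates
gamma, n = 0.9, 3
states = [0, 1]

def expected_n_step_return(s):
    """Sum over all n-step trajectories j of p_j^s * G_{t:t+n}^j."""
    total = 0.0
    for traj in itertools.product(states, repeat=n):   # (S_{t+1}, ..., S_{t+n})
        prob, ret, prev = 1.0, 0.0, s
        for k, nxt in enumerate(traj):
            prob *= P[prev][nxt]          # accumulate trajectory probability
            ret += gamma**k * R[prev][nxt]  # discounted reward sum R_{t:t+n}^j
            prev = nxt
        ret += gamma**n * V[traj[-1]]     # bootstrap term gamma^n V(S_{t+n}^j)
        total += prob * ret
    return total

for s in states:
    print(s, expected_n_step_return(s))
```

Enumerating full trajectories like this gives the same number as taking expectations one step at a time, which is exactly what the trajectory-sum notation asserts.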

We use @Dennis's notation for the $n$-step reward sum, i.e.

$R_{t:t+n}^j \doteq R_{t + 1}^j + \gamma R_{t + 2}^j + \dots + \gamma^{n - 1} R_{t + n}^j$.

Now, $v_{\pi}(s)$ is nothing but $\mathbb{E}_{\pi} [G_{t:t+n}^{\pi}|S_t = s]$, where $G_{t:t+n}^{\pi}$ denotes the return we would get if the value function used for bootstrapping were fully consistent with the policy, i.e. if the policy had been evaluated completely for every state (using infinitely many episodes, perhaps): $G_{t:t+n}^{\pi} = R_{t:t+n} + \gamma^n V_{\pi}(S_{t + n})$.

So now, evaluating the left-hand side:

$$\max_s \Bigl\lvert \mathbb{E}_{\pi} \left[ G_{t:t+n} \mid S_t = s \right] - v_{\pi}(s) \Bigr\rvert = \max_s \Bigl\lvert \mathbb{E}_{\pi} \left[ G_{t:t+n} \mid S_t = s \right] - \mathbb{E}_{\pi} [G_{t:t+n}^{\pi}|S_t = s]\Bigr\rvert$$

which can further be written in terms of trajectory probabilities (for easier comprehension):

$$\max_s \Bigl\lvert \sum_j p_j^s(R_{t:t+n}^j+\gamma^n V_{t+n-1}(S_{t+n}^j)) - \sum_j p_j^s(R_{t:t+n}^j+\gamma^n V_{\pi}(S_{t+n}^j)) \Bigr\rvert$$

Now the equation can be simplified by cancelling the reward terms $R_{t:t+n}^j$: they are identical within each trajectory $j$ and therefore common to both sums. We get:
$$\max_s \Bigl\lvert \gamma^n\sum_j p_j^s( V_{t+n-1}(S_{t+n}^j)-V_{\pi}(S_{t+n}^j))\Bigr\rvert$$

This is basically the expectation, over trajectories starting from $S_t = s$, of the deviation of the value estimate at the state reached after $n$ steps from its true value $V_{\pi}$, multiplied by the discount factor $\gamma^n$.

Now using the fact that $\lvert \mathbb{E}[X] \rvert \leq \max \lvert X \rvert$ we get:

$$\max_s \Bigl\lvert \gamma^n\sum_j p_j^s( V_{t+n-1}(S_{t+n}^j)-V_{\pi}(S_{t+n}^j))\Bigr\rvert \leq \max_{s,j} \Bigl\lvert \gamma^n( V_{t+n-1}(S_{t+n}^j)-V_{\pi}(S_{t+n}^j))\Bigr\rvert$$
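The step above relies only on the fact that a probability-weighted average can never exceed the largest term in absolute value. A minimal sketch, with made-up weights and values:

```python
import random

random.seed(1)

# Arbitrary probability weights p_j and values x_j (all made up).
p = [random.random() for _ in range(10)]
z = sum(p)
p = [w / z for w in p]                      # normalize so the weights sum to 1
x = [random.uniform(-5, 5) for _ in range(10)]

expectation = sum(w * v for w, v in zip(p, x))
assert abs(expectation) <= max(abs(v) for v in x)
```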

Which can finally be written as:
$$\max_s \Bigl\lvert \mathbb{E}_{\pi} \left[ G_{t:t+n} \mid S_t = s \right] - v_{\pi}(s) \Bigr\rvert \leq \max_{s,j} \Bigl\lvert \gamma^n( V_{t+n-1}(S_{t+n}^j)-V_{\pi}(S_{t+n}^j))\Bigr\rvert$$

Now the $\max$ on the RHS only ranges over those states reachable from $S_t = s$ via some trajectory, but since enlarging the set over which we maximize can never decrease the maximum, we can include the whole state space in the $\max$ operation without any problem, to finally write:

$$\max_s \Bigl\lvert \mathbb{E}_{\pi} \left[ G_{t:t+n} \mid S_t = s \right] - v_{\pi}(s) \Bigr\rvert \leq \max_s \Bigl\lvert \gamma^n( V_{t+n-1}(s)-V_{\pi}(s))\Bigr\rvert$$

which concludes the proof.
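For readers who like a numerical sanity check, the final bound can also be verified directly on a small MDP. Everything below (the random transitions, rewards, and the perturbed estimate `V`) is hypothetical and serves only to illustrate the inequality:

```python
import random

random.seed(0)
S = 4                      # number of states
gamma, n = 0.9, 3

# Random transition matrix and rewards under a fixed (hypothetical) policy.
P = []
for s in range(S):
    row = [random.random() for _ in range(S)]
    z = sum(row)
    P.append([p / z for p in row])
R = [[random.uniform(-1, 1) for _ in range(S)] for _ in range(S)]

# Solve for v_pi to high precision by repeated Bellman backups.
v_pi = [0.0] * S
for _ in range(10_000):
    v_pi = [sum(P[s][t] * (R[s][t] + gamma * v_pi[t]) for t in range(S))
            for s in range(S)]

# An arbitrary (wrong) estimate V_{t+n-1}: v_pi plus random noise.
V = [v + random.uniform(-1, 1) for v in v_pi]

def exp_return(s, k):
    """E[G_{t:t+n} | S_t = s], bootstrapping with V after k more steps."""
    if k == 0:
        return V[s]
    return sum(P[s][t] * (R[s][t] + gamma * exp_return(t, k - 1))
               for t in range(S))

lhs = max(abs(exp_return(s, n) - v_pi[s]) for s in range(S))
rhs = gamma**n * max(abs(V[s] - v_pi[s]) for s in range(S))
assert lhs <= rhs + 1e-9
print(lhs, "<=", rhs)
```

The assertion holds for any choice of MDP, policy, and estimate `V`, which is exactly the error-reduction property proved above.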

Your bullet points explaining the "trick" make intuitive sense. Could you provide a link to someone showing the formal steps? – Philip Raeisghasem – 2019-03-18T09:13:10.397

@PhilipRaeisghasem I don't know of any such links by heart, so can't provide one, sorry :( – Dennis Soemers – 2019-03-18T19:18:00.933

I'd argue that, if it's that hard to find this information, then including it in the post would greatly increase the post's value. Or maybe as a separate, linked question? – Philip Raeisghasem – 2019-03-18T19:50:23.123

@PhilipRaeisghasem Absolutely, either would be fine... maybe separate question better, since this one's already fairly long. I'm currently working towards a paper submission deadline though, so I really don't personally feel like working out the math and writing it all down any time soon. – Dennis Soemers – 2019-03-18T20:23:01.803

Can we just say that s_t+1 is a subset of s therefore its max over s_t+1 must <= max over s? – Phizaz – 2019-08-29T07:50:09.277

@Phizaz Hmmm not exactly. Close to that, I'd say "given a completely free, unrestricted choice for $S_t$, the set of possible states $S_{t+n}$ is a subset of the set of possible states $S_t$". Very often it will not be a proper subset though, very often they'll be equal. But even then, a very important other point is that the probability distributions over those sets will be different. When the choice of $S_t$ is unrestricted, you can assign full probability mass to whichever is "best". Even if that best state may still be possible for $S_{t+n}$, it may no longer have full probability mass – Dennis Soemers – 2019-08-29T08:15:20.570

@DennisSoemers I agree. However, will we not be able to prove it without the probability mass argument? – Phizaz – 2019-08-30T12:59:05.470

@Phizaz Ah yes I think you're completely right. Intuitively it didn't "feel" right to me to omit that... but I guess math doesn't care about my feelings :) – Dennis Soemers – 2019-08-30T13:24:33.780