The value $Q(s', ~\cdot~)$ should always be implemented to be exactly $0$ for any terminal state $s'$ (the dot in place of an action as the second argument indicates that this should hold for **any action**, as long as $s'$ is terminal).

It is easier to understand why this should be the case by dissecting what the different terms in the update rule mean:

$$Q(s, a) \gets \color{red}{Q(s, a)} + \alpha \left[ \color{blue}{r + \gamma Q(s', a')} - \color{red}{Q(s, a)} \right]$$

In this update, the red term $\color{red}{Q(s, a)}$ (which appears twice) is our old estimate of the value $Q(s, a)$ of being in state $s$ and executing action $a$. The blue term $\color{blue}{r + \gamma Q(s', a')}$ is a different estimate of the same quantity $Q(s, a)$. This second estimate is assumed to be slightly more accurate, because it is not "just" a prediction; it is a combination of:

- something that we really observed: $r$, plus
- a prediction: $\gamma Q(s', a')$

Here, the $r$ component is the immediate reward that we observed after executing $a$ in $s$, and $Q(s', a')$ is everything we expect to still collect afterwards (i.e., after executing $a$ in $s$ and transitioning to $s'$).
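
To make the roles of these terms concrete, here is a minimal sketch of one tabular SARSA update in Python. The setup is hypothetical (a NumPy table `Q` indexed by integer state/action ids, and the names `alpha`, `gamma`, `sarsa_update` are my own); it is only meant to show where the "red" and "blue" terms appear in code, not to be a definitive implementation:

```python
import numpy as np

# Hypothetical tabular setup: Q is indexed by (state_id, action_id).
n_states, n_actions = 10, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.99

def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """One tabular SARSA update: move Q(s, a) towards r + gamma * Q(s', a')."""
    td_target = r + gamma * Q[s_next, a_next]  # the "blue" term: observed r plus a prediction
    td_error = td_target - Q[s, a]             # blue term minus the old "red" estimate
    Q[s, a] += alpha * td_error                # nudge the old estimate towards the target
    return Q
```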

Now, suppose that $s'$ is a terminal state: what rewards do we still expect to collect in the future within that same episode? Since $s'$ is terminal and the episode has ended, there can only be one correct answer: we expect to collect exactly $0$ reward in the future.
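
In code, this typically means replacing the bootstrap term $Q(s', a')$ with $0$ whenever $s'$ is terminal. Extending the hypothetical sketch above (the `done` flag indicating a terminal $s'$ is my own assumption, mirroring common environment APIs):

```python
def sarsa_update_terminal_aware(Q, s, a, r, s_next, a_next, done, alpha, gamma):
    """Same update, but bootstrap with 0 instead of Q(s', a') when s' is terminal."""
    bootstrap = 0.0 if done else Q[s_next, a_next]  # terminal state: no future rewards
    td_target = r + gamma * bootstrap
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```

Whether you implement this with an explicit `done` check as above, or by literally storing $0$ in the table entries of terminal states, the effect on the update is the same.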