Not necessarily. It depends on the nature of the problem space for both networks in the GAN.
A real-world example: a batter's reaction time and a pitcher's maximum speed are bounded values set by genetics and physics. If a pitcher's fastest pitch leaves the batter less time than the minimum reaction time a human needs to hit effectively, the pitcher will permanently have the upper hand because of that reaction-time threshold.
We don't yet know if a maximum threshold on realistic fake image generation exists or if a threshold on detection exists.
As both approach near-perfect accuracy, it could be that the number of nodes needed to tell a nearly perfect generated image from a real one exceeds the number of atoms in the universe. Conversely, the number of nodes needed to generate a nearly perfect image could reach impossible proportions. We won't know until we keep building better and better networks that close in on the boundary between generating and detecting fake images.
Let's imagine this problem: an image contains a colored line of pixels. The adversary's goal is to hide the line by editing the image, and the student is responsible for finding the line after the adversary has changed the image. The problem can become infinitely difficult: change all pixels to the color of the line. The line is then impossible to find, and the adversary always wins, provided it finds this solution, meaning the solution lies in its reachable problem space given its hardware capabilities and its learning model.
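The thought experiment above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the image is a small grid of integers, the line is a full row of one color, and the names `LINE_COLOR`, `make_image`, `find_line` and `adversary_flood` are made up for this example. The "student" here is a hand-written rule, not a trained network.

```python
LINE_COLOR = 9  # hypothetical color value reserved for the hidden line


def make_image(height, width, line_row):
    # Deterministic background with values 0..7, so no background pixel
    # ever equals LINE_COLOR; then draw the line across one row.
    img = [[(r * 3 + c * 5) % 8 for c in range(width)] for r in range(height)]
    img[line_row] = [LINE_COLOR] * width
    return img


def find_line(img):
    # Student: a row counts as the line only if it is the single row
    # made up entirely of LINE_COLOR. Ambiguous images return None.
    candidates = [r for r, row in enumerate(img)
                  if all(p == LINE_COLOR for p in row)]
    return candidates[0] if len(candidates) == 1 else None


def adversary_flood(img):
    # Adversary: paint every pixel with the line's color.
    return [[LINE_COLOR] * len(row) for row in img]


original = make_image(6, 8, line_row=2)
print(find_line(original))                   # 2
print(find_line(adversary_flood(original)))  # None
```

After the flood, every row is an equally good candidate, so no student, however large, can recover which row held the line: the information is gone from the image.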
Deepfake detection is not bound to fail, because a generative model's effectiveness may hit a tighter limit than a discriminator's does at near-optimal performance. I have not seen any paper about this specifically, and in fact I believe the discriminator has the more difficult job in most cases. I just disagree with deciding at this moment that the detectors are doomed. Creating realistic images, motion, and sound perfectly in sync is not a trivial problem; in some scenarios it is basically impossible.