If you want to understand relativity, read Einstein [1, 2], not a book about relativity by a professor who thinks he's got it. If you want to understand Alan Turing's test for intelligence in the context of human dialog, read Turing [3]. Interpretations can be worse than worthless; they are often misleading. If the principles seem too thick, read them again until you get them.
To understand Turing's test fully, you need the background Turing assumed when he wrote, which becomes apparent if you read his 1950 article:
- How Turing's work on computability relates to Kurt Gödel's incompleteness theorems
- The strategy of a controlled test
- The difference between (a) hearing and speaking and (b) listening and wittily responding — This is particularly pertinent today because the chat-bots do (a) and could be anywhere from 5 to 500 years away from doing (b). To reach (c) deeply comprehending and responding with inspiration, AI researchers must go beyond modelling the human mind and approach the challenge of modelling the minds of people like Gödel, Einstein, and Turing. Whether that will ever occur is yet to be revealed.
The specific requirements of the Imitation Game, Alan Turing's subtitle above the description of his thought experiment, are a matter of record.
Specific Requirements [Excerpt from Actual Article]
[The imitation game] is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the other two is the man and which is the woman. He knows them by labels X and Y, and at the end of the game he says either "X is A and Y is B" or "X is B and Y is A." The interrogator is allowed to put questions to A and B thus:
C: Will X please tell me the length of his or her hair?
Now suppose X is actually A, then A must answer. It is A's object in the game to try and cause C to make the wrong identification. His answer might therefore be:
"My hair is shingled, and the longest strands are about nine inches long."
In order that tones of voice may not help the interrogator the answers should be written, or better still, typewritten. The ideal arrangement is to have a teleprinter communicating between the two rooms. Alternatively the question and answers can be repeated by an intermediary. The object of the game for the third player (B) is to help the interrogator.
The best strategy for her is probably to give truthful answers. She can add such things as "I am the woman, don't listen to him!" to her answers, but it will avail nothing as the man can make similar remarks.
We now ask the question, "What will happen when a machine takes the part of A in this game?" Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman? These questions replace our original, "Can machines think?"
There have been thousands of critiques of both Einstein's relativity and Turing's test, none of which add much value. Study the thinking of great contributors through their own words and all the refuse that follows will be interesting primarily in its lack of greatness.
Secondary Questions in This Thread
What requirements if any must the evaluator fulfill in order to be qualified to give the test?
The interrogator (C) is not an evaluator. Evaluation would be an attempt to be objective; the premise of Turing's thought experiment, however, is that the interrogator provides her or his subjective judgment. From a statistics point of view, the interrogator should be selected randomly from the population of the world that shares a language with (A) and (B).
Must there always be two participants in the conversation (one human and one computer) or can there be more?
There must be exactly two to fit the scenario described by Alan Turing. (See below for more detail.)
Are placebo tests (where there is not actually a computer involved) allowed or encouraged?
One could test all kinds of things, and researchers do, but that would be outside the scope of Turing's thought experiment [4].
Can there be multiple evaluators? If so does the decision need to be unanimous among all evaluators in order for the machine to have passed the test?
What would reveal the most to those who sponsor an actual Imitation Game is a double-blind, fully randomized test: (A), (B), and (C) are each drawn at random from samples of the men, women, or software systems of the type under test that can converse in a common language, and the test is run many times with fresh random selections from those samples.
Unanimity, evaluation, additional complexity, and communication other than that which was specified by the test would only frustrate the cause, if one sticks with Turing's original intention regarding the question, "Can machines think?"
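To make the repeated, randomized comparison concrete, here is a minimal Python sketch. It is not anything Turing specified. It assumes each game can be reduced to a single outcome (the interrogator identified the players wrongly or not) and stands in for the interrogator's subjective verdict with a made-up probability. The function names, the 0.30 error rates, and the choice of a two-proportion z statistic are all illustrative assumptions. The point is only that Turing's question, whether the interrogator decides wrongly as often with a machine playing A as with a man playing A, becomes a comparison of two observed error rates over many games.

```python
import math
import random


def run_trials(n_trials, p_wrong, rng):
    """Count wrong identifications over n_trials independent games.

    p_wrong is a hypothetical probability standing in for real
    interrogators' verdicts; an actual study would record those
    verdicts instead of simulating them.
    """
    return sum(rng.random() < p_wrong for _ in range(n_trials))


def two_proportion_z(wrong_a, n_a, wrong_b, n_b):
    """z statistic for the difference between two observed error rates."""
    pooled = (wrong_a + wrong_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # both rates are identical and degenerate (all 0 or all 1)
    return (wrong_a / n_a - wrong_b / n_b) / se


rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 1000                 # games per condition (assumed)

wrong_man = run_trials(n, 0.30, rng)      # baseline: a man plays A
wrong_machine = run_trials(n, 0.30, rng)  # test: a machine plays A

z = two_proportion_z(wrong_machine, n, wrong_man, n)
print(f"wrong with man as A:     {wrong_man / n:.3f}")
print(f"wrong with machine as A: {wrong_machine / n:.3f}")
print(f"z statistic:             {z:.2f}")
```

A z value near zero would mean the two error rates are statistically indistinguishable at this sample size, which is the outcome Turing's replacement question asks about. No single interrogator's verdict, and no unanimity among interrogators, enters into it.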
Other Views of Intelligence
Turing, like René Descartes, saw intelligence through the lens of dialog. Descartes held that machines would never pass what amounts to a less controlled version of Turing's Imitation Game. Others have considered other kinds of dialog and contexts other than dialog. I addressed this in another question:
Can a brain be intelligent without a body?
References and Footnotes
[1] Relativity: The Special and the General Theory by Albert Einstein, 1916
[2] The Principle of Relativity by Albert Einstein and Francis A. Davis, 1923
[3] A. M. Turing (1950) Computing Machinery and Intelligence. Mind 59: 433-460. https://www.csee.umbc.edu/courses/471/papers/turing.pdf
[4] Turing's 1950 article did not recommend that his thought experiment be embodied and used in commercial validation of future AI systems. Alan Turing was, however, concerned with practical computing at one specific point in his career. That was when the Nazis had overrun France, were pulverizing his homeland from the air, and had sunk a significant portion of the English Navy from below, with the help of Enigma cryptography.