The answer to your question is "In principle, yes" - in its most general form, EQ testing is just a specific case of the Turing test ("How would you feel about ... ?").
To see why meaningful EQ tests might be difficult to achieve, consider the following two possible tests:
At one extreme of complexity, the film 'Blade Runner' famously depicts the Voight-Kampff test, which distinguishes human from android on the basis of responses to emotionally charged questions.
If you tried asking these questions (or even much simpler ones) to a modern chatbot, you'd likely quickly conclude that you were not talking to a person.
The problem with assessing EQ is that the more emotionally sophisticated the test, the more general the AI system will likely have to be in order to turn the input into a meaningful internal representation.
At the other extreme from the above, suppose that an EQ test were phrased in an extremely structured way, with the structured input provided by a human. In such a case, success at an 'EQ test' is not really grounded in the real world.
In an essay entitled "The Ineradicable Eliza Effect and Its Dangers", Douglas Hofstadter gives the following example, in which the ACME program is claimed (though not by Hofstadter) to 'understand' analogy.
Here the computer learns about a fellow named Sluggo taking his wife Jane and his good buddy Buck to a bar, where things take their natural course and Jane winds up pregnant by Buck. She has the baby but doesn't want it, and so, aided by her husband, she drowns the baby in a river, thus "neatly solving the problem".
This story is presented to ACME in the following form:
q1: (neglectful-husband (Sluggo))
q2: (lonely-and-sex-starved-wife (Jane-Doe))
q3: (macho-ladykiller (Buck-Stag))
q4: (poor-innocent-little-fetus (Bambi))
q5: (takes-out-to-local-bar (Sluggo Jane-Doe Buck-Stag))
...
q11: (neatly-solves-the-problem-of (Jane-Doe Bambi))
q12: (cause (q10 q11))
Suppose the program were to be asked whether Jane Doe's behavior was moral. Complex compound emotional concepts such as 'neglectful', 'lonely' and 'innocent' are here simply predicates, not available to the AI for deeper introspective examination. They could just as easily be replaced by labels such as 'bling-blang-blong15657'.
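This point can be made concrete with a toy sketch (hypothetical code, not the actual ACME implementation): a purely structural matcher only ever tests predicate symbols for equality, so uniformly renaming every predicate to gibberish leaves everything it can "see" unchanged.

```python
# Toy illustration (NOT the real ACME): propositions are (predicate, args) pairs,
# keyed by labels like those in the Hofstadter example above.
story = {
    "q1": ("neglectful-husband", ("Sluggo",)),
    "q2": ("lonely-and-sex-starved-wife", ("Jane-Doe",)),
    "q5": ("takes-out-to-local-bar", ("Sluggo", "Jane-Doe", "Buck-Stag")),
}

def rename(story, mapping):
    """Uniformly replace each predicate symbol with an arbitrary new label."""
    return {k: (mapping[pred], args) for k, (pred, args) in story.items()}

def structure(story):
    """All a purely structural matcher can use: which propositions share a
    predicate symbol, and what their argument tuples are -- never what the
    English words in the predicate names mean."""
    return {k: (sorted(j for j, (p2, _) in story.items() if p2 == p), args)
            for k, (p, args) in story.items()}

# Swap every predicate for an opaque token, as suggested in the text.
gibberish = {p: f"bling-blang-blong{i}"
             for i, p in enumerate(pred for pred, _ in story.values())}
renamed = rename(story, gibberish)

# The structural view is identical, so any 'analogy' found before the renaming
# is also found after it: the emotional vocabulary carried no weight.
assert structure(story) == structure(renamed)
```

The assertion holds because the matcher's view is invariant under any one-to-one renaming of predicates: 'neglectful-husband' contributes nothing beyond being a symbol that is equal to itself.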
So in one sense, the lack of success at any EQ test with real depth is indicative of the general problem currently facing AI: the inability to define (or otherwise learn) meaningful representations of the subtle complexities of the human world - a far harder problem than recognizing videos of cats.