It should be possible to train an agent using some variant of DQN to beat a random agent around 100% of the time within a few thousand games.
It may require one or two more advanced techniques to bring the training time down to a few thousand games. However, if your agent is winning ~50% of games against a random agent, something has gone wrong, since that is the performance you would expect from another random agent. Even simple policies, such as always playing in the same column, will beat a random agent a significant fraction of the time.
The first thing to consider is that there are too many states in Connect 4 to use tabular Q-learning, so you have to use some variant of DQN. Since Connect 4 is a grid-based board game where the same winning patterns can occur anywhere on the board, some form of convolutional neural network (CNN) for the Q function is probably a good starting point.
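For concreteness, here is a minimal sketch (in PyTorch) of what such a convolutional Q-network might look like. The encoding and layer sizes are illustrative assumptions, not tuned values: the board is represented as 2 channels (own pieces, opponent pieces) on the 6x7 grid, and the output is one Q value per column.

```python
import torch
import torch.nn as nn

class ConnectFourQNet(nn.Module):
    """Sketch of a CNN Q-network for Connect 4 (sizes are assumptions)."""

    def __init__(self, rows=6, cols=7, n_actions=7):
        super().__init__()
        self.conv = nn.Sequential(
            # 4x4 kernels can cover a full 4-in-a-row pattern
            nn.Conv2d(2, 32, kernel_size=4, padding=2),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            # first conv grows the grid to (rows+1) x (cols+1)
            nn.Linear(64 * (rows + 1) * (cols + 1), 128),
            nn.ReLU(),
            nn.Linear(128, n_actions),  # one Q value per column
        )

    def forward(self, x):
        return self.head(self.conv(x))
```

Because convolutional filters share weights across the grid, a pattern learned in one part of the board transfers to other positions, which is exactly the structure Connect 4 has.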
I think for a first step, you should double-check that you have implemented DQN correctly. Check that the TD target formula is correct, and that you have implemented experience replay. Ideally you will also have a delayed-update target network for calculating the TD targets.
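The pieces mentioned above can be sketched as follows. This is not your implementation, just a reference to check against: a replay buffer plus the TD target $y = r + \gamma \max_{a'} Q_{\text{target}}(s', a')$, with $y = r$ on terminal transitions. Here `q_target` stands for the delayed-update target network, assumed to be any function mapping a batch of states to a 2-D array of Q values.

```python
import random
from collections import deque
import numpy as np

class ReplayBuffer:
    """Fixed-capacity experience replay; old transitions fall off the end."""

    def __init__(self, capacity=50_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
        return states, actions, rewards, next_states, dones

def td_targets(q_target, rewards, next_states, dones, gamma=0.99):
    """TD targets: bootstrap from the target network, but not past terminal states."""
    next_q = q_target(next_states).max(axis=1)   # max_a' Q_target(s', a')
    return rewards + gamma * (1.0 - dones) * next_q
```

A common bug worth checking for is bootstrapping through terminal states; the `(1.0 - dones)` factor is what zeroes out the bootstrap term when the game has ended.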
As a second step, try varying the hyper-parameters: the learning rate, exploration rate, replay table size, number of games to play before learning starts, etc. A discount factor $\gamma$ slightly below 1 can help (despite this being an episodic problem), as it makes the agent forget more of the initial bias for early time steps.
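As a starting point for that search, here is an illustrative set of hyper-parameters with a simple linear exploration decay. These are assumptions to tune from, not recommended values:

```python
# Illustrative starting hyper-parameters (assumptions, not tuned values).
config = {
    "learning_rate": 1e-3,
    "gamma": 0.95,                 # slightly below 1, as suggested above
    "replay_capacity": 50_000,
    "warmup_games": 500,           # games played before learning starts
    "epsilon_start": 1.0,
    "epsilon_end": 0.05,
    "epsilon_decay_games": 5_000,
}

def epsilon(game, cfg=config):
    """Linearly decay exploration from epsilon_start to epsilon_end."""
    frac = min(game / cfg["epsilon_decay_games"], 1.0)
    return cfg["epsilon_start"] + frac * (cfg["epsilon_end"] - cfg["epsilon_start"])
```

Decaying epsilon matters here: early on the agent needs heavy exploration to stumble onto wins, but a high final epsilon will cap its measured win rate against a random opponent.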
> Or should this not matter – if the Q Agent is playing enough games, it doesn't matter how good/bad its opponent is?
Up to a point this is true. It is hard to learn against a perfect agent in Connect 4, because a perfect player always wins when moving first, so every policy the learner tries loses, all policies look equally good, and there is nothing to learn. Other than that, if there is a way to win, a Q-learning agent with exploration should eventually find it.
Against a random agent, if your agent is correctly set up for the problem, you should see some improvement after a few thousand games. As it happens, I am currently training Connect 4 agents using variants of DQN for a Kaggle competition, and they consistently beat random agents with a 100% measured success rate after 10,000 training games. I have added a few extras to my agents in order to achieve this - there are some discussions of approaches in the forums at https://www.kaggle.com/c/connectx