I'm trying to understand what the best neural network for implementing an XOR gate would be. I consider a neural network good if it can produce all of the expected outcomes with the lowest possible error.
It looks like my initial choice of random weights has a big impact on the end result after training: the error of my trained net varies a lot depending on which random weights I start with.
I'm starting with a 2 x 2 x 1 neural net, with a bias in the input and hidden layers, using the sigmoid activation function and a learning rate of 0.5. Below is my initial setup, with weights chosen randomly:
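Since the actual weight values aren't reproduced here, this is a minimal NumPy sketch of that setup; the variable names, the weight layout, and the [-1, 1] random range are my assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Forward pass of a 2-2-1 net with sigmoid activations.

    x: input vector, shape (2,)
    W1: input-to-hidden weights, shape (2, 2); b1: hidden biases, shape (2,)
    W2: hidden-to-output weights, shape (2,); b2: output bias, scalar
    """
    h = sigmoid(W1 @ x + b1)   # hidden layer (2 units)
    y = sigmoid(W2 @ h + b2)   # single output unit
    return h, y

# random initial weights, as in the question (range is an assumption)
rng = np.random.default_rng(0)
W1 = rng.uniform(-1, 1, (2, 2))
b1 = rng.uniform(-1, 1, 2)
W2 = rng.uniform(-1, 1, 2)
b2 = rng.uniform(-1, 1)

_, y = forward(np.array([1.0, 0.0]), W1, b1, W2, b2)
```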
The initial performance is bad, as one would expect:
Input | Output | Expected | Error
(0,0) | 0.8845 | 0        | 39.117%
(1,1) | 0.1134 | 0        | 0.643%
(1,0) | 0.7057 | 1        | 4.3306%
(0,1) | 0.1757 | 1        | 33.9735%
Then I proceed to train my network through backpropagation, feeding the XOR training set 100,000 times. After training is complete, my new weights are:
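For reference, this is roughly what my training loop looks like: online (per-sample) gradient descent on the squared-error loss E = (y - t)^2 / 2, with sigmoid derivatives folded into the deltas. The seed is arbitrary, and I use 10,000 passes here instead of 100,000 just to keep the sketch quick:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR training set
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([0., 1., 1., 0.])

rng = np.random.default_rng(42)
W1 = rng.uniform(-1, 1, (2, 2))
b1 = rng.uniform(-1, 1, 2)
W2 = rng.uniform(-1, 1, 2)
b2 = rng.uniform(-1, 1)

lr = 0.5
for _ in range(10_000):            # fewer passes than the 100,000 above, for brevity
    for x, t in zip(X, T):
        # forward pass
        h = sigmoid(W1 @ x + b1)
        y = sigmoid(W2 @ h + b2)
        # backward pass for E = (y - t)^2 / 2
        d_out = (y - t) * y * (1 - y)        # dE/d(net_out)
        d_hid = d_out * W2 * h * (1 - h)     # dE/d(net_hidden), per hidden unit
        # online weight updates (one step per data point)
        W2 -= lr * d_out * h
        b2 -= lr * d_out
        W1 -= lr * np.outer(d_hid, x)
        b1 -= lr * d_hid
```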
And the performance improved to:
Input | Output | Expected | Error
(0,0) | 0.0103 | 0        | 0.0053%
(1,1) | 0.0151 | 0        | 0.0114%
(1,0) | 0.9838 | 1        | 0.0131%
(0,1) | 0.9899 | 1        | 0.0051%
So my questions are:
Has anyone figured out the best weights for an XOR neural network with that configuration (i.e. 2 x 2 x 1 with bias)?
Why does my initial choice of random weights make such a big difference to my end result? I was lucky in the example above, but depending on the initial random weights I get, after training, errors as big as 50%, which is very bad.
Am I doing anything wrong or making any wrong assumptions?
So below is an example of weights I cannot train, for reasons I don't understand. I suspect I might be doing my backpropagation training incorrectly: I'm not using batches, and I update the weights after each data point from the training set.
((-9.2782, -.4981, -9.4674, 4.4052, 2.8539, 3.395), (1.2108, -7.934, -2.7631))
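One way to probe why those weights won't train is to look at the sigmoid outputs and the output-layer delta at each data point: when y * (1 - y) is near zero the unit is saturated, the gradient almost vanishes, and per-sample gradient descent barely moves. The sketch below does that check; note that how the flat tuple maps onto the layers is my assumption (first six values as the two hidden units' weights and biases, last three as the output unit's):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The "untrainable" weights above, under an ASSUMED layout:
# hidden unit i uses row i of W1 plus b1[i]; the output unit uses W2 and b2.
W1 = np.array([[-9.2782, -0.4981],
               [-9.4674,  4.4052]])
b1 = np.array([2.8539, 3.395])
W2 = np.array([1.2108, -7.934])
b2 = -2.7631

for x, t in [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]:
    h = sigmoid(W1 @ np.array(x, dtype=float) + b1)
    y = sigmoid(W2 @ h + b2)
    # output-layer delta; a tiny |d_out| across all four points means
    # the net is stuck in a flat region (saturated sigmoids / local minimum)
    d_out = (y - t) * y * (1 - y)
    print(x, round(float(y), 4), round(float(d_out), 6))
```

If all four deltas come out tiny while the outputs are still wrong, the problem is the starting point, not the backpropagation math, and restarting from fresh random weights is the usual fix.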