Applying a neural network to a simple x^2 function for demonstration purposes



I have tried to train a neural network on a simple x^2 function.

  1. I built the training data in Excel. The first column (X) is =RANDBETWEEN(-5,5), i.e. a random integer between -5 and 5.
  2. The second column simply squares the first column.
  3. The third column, my output 'y' column, is 0 or 1: 0 if the second column is less than 12.5, else 1.

I made 850 training examples and used the first column as 'X' and the third column as 'y'.
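For readers who want to reproduce the data without Excel, here is a minimal Python sketch of the three columns described above. The function name `make_dataset` and the fixed seed are my own assumptions for illustration; the original spreadsheet is not available.

```python
import random

def make_dataset(n_examples=850, seed=0):
    """Mimic the spreadsheet: X is a random integer in [-5, 5],
    y is 0 if X^2 < 12.5 and 1 otherwise."""
    rng = random.Random(seed)  # seed is an assumption, for repeatability only
    xs = [rng.randint(-5, 5) for _ in range(n_examples)]  # like RANDBETWEEN(-5,5)
    ys = [0 if x * x < 12.5 else 1 for x in xs]
    return xs, ys

xs, ys = make_dataset()
print(len(xs), sum(ys) / len(ys))  # number of examples, share of y = 1
```

Since X only takes integer values, the label is 1 exactly when |X| is 4 or 5.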

However, I am only able to get a training accuracy of 63%!

Where could I have gone wrong? I changed input_layer to 1 and tried between 5 and 35 hidden units. I also tried regularization lambda values from 0 to 2, but accuracy stays at 63%.

My predict function is p = 1 if h2(i)>0.5 else 0.

Any help will be much appreciated! :-)

I also noticed that my neural network's output is 0.3XXX for all training examples. Is this possible?
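One check that may help here: the 63% accuracy and the roughly constant 0.3XXX output are both consistent with a network that ignores X and predicts the same value for every example. The sketch below just counts the labels over the 11 equally likely integer values of X; it assumes nothing beyond the data description above.

```python
from fractions import Fraction

values = range(-5, 6)  # the 11 integers RANDBETWEEN(-5,5) can return
labels = [0 if x * x < 12.5 else 1 for x in values]

share_zero = Fraction(labels.count(0), len(labels))  # accuracy of always predicting 0
share_one = 1 - share_zero                           # base rate of y = 1

print(share_zero, float(share_zero))  # 7/11, about 0.636
print(share_one, float(share_one))    # 4/11, about 0.364
```

So always predicting 0 already gives about 63.6% expected accuracy, and a constant output near 0.36 is what minimizes the cost when the input is effectively unused. This does not say why the network is stuck there, only that it has probably not learned anything from X yet.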


Posted 2016-03-14T14:33:11.313

Question was closed 2016-03-18T13:20:58.333

What is the architecture of your neural network? How many layers, what type of activations, how many nodes, etc.? – Armen Aghajanyan – 2016-03-14T15:03:30.207

I used input layer of 1 unit, one hidden layer of 15 units (tried up to 25 units) and output layer of 1 unit. For activation I used the sigmoid function. – Vin – 2016-03-14T15:43:00.570

Is the sigmoid activation function applied to the output layer as well? – Armen Aghajanyan – 2016-03-15T00:20:33.047

Yes for the output layer as well – Vin – 2016-03-15T00:45:45.770

Thanks Armen. I will try without the sigmoid this evening. However, in my training data set the output y is not linear; I have converted y to 0 or 1 based on whether the squared value is less than 12.5 or not. Don't you think it should work with the sigmoid function in such a case? – Vin – 2016-03-15T02:10:03.867

Can you post the code somewhere? Have you scaled the input data to the range [-1, 1]? How did you initialize the weights, and what learning rates did you try? Can you plot the learning curves? If they don't decrease, the learning rate might be too low; if they jump around a lot, it might be too high. You should definitely use the sigmoid function on the output as well, don't remove it. – stmax – 2016-03-15T07:59:12.837

I'm voting to close this question as off-topic because we generally close questions as not useful to future readers if they were ultimately due to a typo or other local error – Sean Owen – 2016-03-18T13:20:58.333



I re-implemented your set-up in Python using Keras. I used a hidden layer size of 25, and all my activations were sigmoids. I got to an accuracy of 99.88%. Try running your algorithm for more epochs. Use binary cross-entropy as the loss function and try decreasing the learning rate of your gradient descent algorithm. This should help increase your accuracy. My only explanation for the poor performance would be that you are getting stuck at a local minimum; if that is the case, different initializations of your weights should fix the problem.
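To make the suggestions above concrete without requiring Keras, here is a rough pure-Python sketch of the same kind of network: one input, one sigmoid hidden layer, a sigmoid output, binary cross-entropy loss, and plain full-batch gradient descent. It is not the Keras code used for the result above. The hidden size of 8, the learning rate, the epoch count, the seed, the division of X by 5, and all function names are assumptions chosen to keep the example small.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def init_params(n_hidden, seed=0):
    # Small random weights; a different seed gives a different starting point.
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return w1, b1, w2, 0.0

def forward(x, w1, b1, w2, b2):
    hidden = [sigmoid(w * x + b) for w, b in zip(w1, b1)]
    out = sigmoid(sum(v * h for v, h in zip(w2, hidden)) + b2)
    return hidden, out

def bce_loss(xs, ys, params):
    # Mean binary cross-entropy; eps guards against log(0).
    eps = 1e-12
    total = 0.0
    for x, y in zip(xs, ys):
        _, p = forward(x, *params)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)

def train(xs, ys, params, lr=0.5, epochs=500):
    # Copy the lists so the starting parameters are left untouched.
    w1, b1, w2, b2 = list(params[0]), list(params[1]), list(params[2]), params[3]
    n, m = len(w1), len(xs)
    for _ in range(epochs):
        gw1, gb1, gw2, gb2 = [0.0] * n, [0.0] * n, [0.0] * n, 0.0
        for x, y in zip(xs, ys):
            hidden, p = forward(x, w1, b1, w2, b2)
            d_out = p - y  # dLoss/d(output pre-activation) for sigmoid + BCE
            gb2 += d_out
            for j in range(n):
                gw2[j] += d_out * hidden[j]
                d_hid = d_out * w2[j] * hidden[j] * (1.0 - hidden[j])
                gw1[j] += d_hid * x
                gb1[j] += d_hid
        for j in range(n):
            w1[j] -= lr * gw1[j] / m
            b1[j] -= lr * gb1[j] / m
            w2[j] -= lr * gw2[j] / m
        b2 -= lr * gb2 / m
    return w1, b1, w2, b2

# One example per distinct X value, scaled to [-1, 1].
xs = [x / 5 for x in range(-5, 6)]
ys = [0 if x * x < 12.5 else 1 for x in range(-5, 6)]
start = init_params(n_hidden=8)
trained = train(xs, ys, start)
print(bce_loss(xs, ys, start), bce_loss(xs, ys, trained))
```

Printing the loss every few epochs gives the learning curve. Whether a run like this ends up separating the two classes depends on the initialization, the learning rate, and the number of epochs, which is why trying more epochs and different seeds is worth doing before concluding that the model cannot fit the data.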

Armen Aghajanyan


Thanks Armen! I implemented my code in Octave and I am not familiar with Keras or epochs. I will try to explore these concepts. – Vin – 2016-03-15T08:26:18.363

@NeilSlater Done – Armen Aghajanyan – 2016-03-15T20:30:36.730

Please find my code here; I still have not been able to solve this in Octave.

Main program @ Cost function @ Predict function @ Sigmoid function @

Please take a look and tell me where I could have gone wrong.

– Vin – 2016-03-16T16:13:02.040


Problem solved! There was a mistake in my cost formula: lambda was not multiplied with both theta components because of a missing bracket. I fixed that and things are working fine now. :-)
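For future readers, here is a hedged illustration of what this kind of bracket mistake can look like. The original Octave code is not shown in this thread, so the functions below are a guess at the shape of the bug, not the actual code. I am assuming the usual L2 penalty lambda/(2m) * (sum of squared weights), and `theta1`/`theta2` are hypothetical flat lists of the non-bias weights of the two layers.

```python
def sum_sq(weights):
    return sum(w * w for w in weights)

def penalty_buggy(theta1, theta2, lam, m):
    # Missing bracket: lambda/(2m) scales only the first sum,
    # so the second sum is added at full strength.
    return lam / (2 * m) * sum_sq(theta1) + sum_sq(theta2)

def penalty_fixed(theta1, theta2, lam, m):
    # lambda/(2m) multiplies both theta components.
    return lam / (2 * m) * (sum_sq(theta1) + sum_sq(theta2))

theta1, theta2 = [1.0, 2.0], [3.0]
print(penalty_buggy(theta1, theta2, lam=0.0, m=850))  # 9.0
print(penalty_fixed(theta1, theta2, lam=0.0, m=850))  # 0.0
```

With a form like the buggy one, the second layer's weights are penalized heavily even at lambda = 0, which would fit the observation that changing lambda between 0 and 2 made no difference.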

