Is it possible to train the neural network to solve math equations?



I'm aware that neural networks are probably not designed to do that; however, asking hypothetically: is it possible to train a deep neural network (or similar) to solve math equations?

So, given 3 inputs: the 1st number, an operator sign encoded as a number (1 = +, 2 = -, 3 = /, 4 = *, and so on), and the 2nd number, then after training the network should give me the valid result.

Examples:

  • Input 1: 2; Input 2: 1 (+); Input 3: 2; Expected output: 4
  • Input 1: 10; Input 2: 2 (-); Input 3: 10; Expected output: 0
  • Input 1: 5; Input 2: 4 (*); Input 3: 5; Expected output: 25
  • and so on

The above can be extended to more sophisticated examples.
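A training set of such triples is straightforward to generate. As a minimal sketch (the function and parameter names here are illustrative, not from any particular library):

```python
import random

# Map operator codes to functions, per the encoding above
# (1 = +, 2 = -, 3 = /, 4 = *)
OPS = {1: lambda a, b: a + b,
       2: lambda a, b: a - b,
       3: lambda a, b: a / b,
       4: lambda a, b: a * b}

def make_dataset(n, lo=1, hi=100, seed=0):
    """Generate n ((number, op_code, number), result) training pairs."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        op = rng.randint(1, 4)
        data.append(((a, op, b), OPS[op](a, b)))
    return data

dataset = make_dataset(1000)
```

Each pair can then be fed to whatever model you choose as a 3-element input vector and a scalar target.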

Is that possible? If so, what kind of network can learn/achieve that?


Posted 2016-08-02T21:37:32.420

Reputation: 9 163

A related video: NeuroSAT: An AI That Learned Solving Logic Problems.

– kenorb – 2019-08-04T12:10:36.920


This might be relevant -

– GaneshTata – 2019-08-19T13:49:04.327



Yes, it has been done!

However, the applications aren't to replace calculators or anything like that. The lab I'm associated with develops neural network models of equational reasoning to better understand how humans might solve these problems. This is a part of the field known as Mathematical Cognition. Unfortunately, our website isn't terribly informative, but here's a link to an example of such work.

Apart from that, recent work on extending neural networks to include external memory stores (e.g. Neural Turing Machines) tends to use solving math problems as a proof of concept. This is because many arithmetic problems involve long procedures with stored intermediate results. See the sections of this paper on long binary addition and multiplication.
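The long binary addition task used in that line of work is exactly this kind of procedure: at each step, one bit is read from each operand and a carry must be held as intermediate state. A plain-Python version of the algorithm (not a neural network, just the procedure such a model has to learn) makes the stored state explicit:

```python
def binary_add(a_bits, b_bits):
    """Add two binary numbers given as bit lists, least significant bit first.
    The carry is the intermediate result a memory-augmented model must track."""
    out, carry = [], 0
    for i in range(max(len(a_bits), len(b_bits))):
        a = a_bits[i] if i < len(a_bits) else 0
        b = b_bits[i] if i < len(b_bits) else 0
        total = a + b + carry
        out.append(total % 2)   # emit the current result bit
        carry = total // 2      # store state for the next step
    if carry:
        out.append(carry)
    return out

# 6 (110) + 7 (111) = 13 (1101), written least significant bit first:
binary_add([0, 1, 1], [1, 1, 1])  # [1, 0, 1, 1]
```

A feed-forward network has nowhere to keep that carry across an arbitrary number of steps, which is why external-memory architectures use this task as a benchmark.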


Posted 2016-08-02T21:37:32.420

Reputation: 336

For me it's not just neural nets but neural architectures with external memory. Architectures like the NTM and DNC can be used to solve algorithmic tasks like shortest path because they have the ability to execute an iterative process by keeping track of what was being done (no catastrophic forgetting). But for me, using just supervised learning is simply wrong, as mentioned in the second answer. – Shamane Siriwardhana – 2018-04-10T13:46:18.490


Not really.

Neural networks are good for determining non-linear relationships between inputs when there are hidden variables. In the examples above the relationships are linear, and there are no hidden variables. But even if they were non-linear, a traditional ANN design would not be well suited to accomplish this.

By carefully constructing the layers and tightly supervising the training, you could get a network to consistently produce the output 4.01, say, for the inputs: 2, 1 (+), and 2, but this is not only wrong, it's an inherently unreliable application of the technology.


Posted 2016-08-02T21:37:32.420

Reputation: 1 221

Well, that's probably because they are universal approximators. They'll never get an exact value for anything, and we actually don't want NNs to predict an exact value (that would be overfitting). So in a way, they are working the way they're designed to work. – Sarvagya Gupta – 2020-01-20T15:30:29.253


1) It is possible! In fact, it's one of the examples shipped with the popular deep learning framework Keras. Check out this link to see the source code.

2) This particular example uses a recurrent neural network (RNN) to process the problem as a sequence of characters, producing a sequence of characters which forms the answer. Note that this approach is obviously different from how humans tend to think about solving simple addition problems, and probably isn't how you would ever want a computer to solve such a problem. Mostly, this is an example of sequence-to-sequence learning using Keras. When handling sequential or time-series inputs, RNNs are a popular choice.
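The core of that character-level approach is the encoding step: a string like "535+61" is turned into a sequence of one-hot vectors before it ever reaches the RNN. Here is a NumPy-only sketch of that step (the vocabulary and function names are illustrative, not taken from the Keras example verbatim):

```python
import numpy as np

CHARS = "0123456789+ "  # digits, the operator, and a space for padding
CHAR_TO_IDX = {c: i for i, c in enumerate(CHARS)}

def encode(text, width):
    """One-hot encode a string, right-padded with spaces to a fixed width."""
    x = np.zeros((width, len(CHARS)), dtype=np.float32)
    for i, c in enumerate(text.ljust(width)):
        x[i, CHAR_TO_IDX[c]] = 1.0
    return x

def decode(x):
    """Invert encode(): argmax over the vocabulary axis, strip the padding."""
    return "".join(CHARS[i] for i in x.argmax(axis=-1)).strip()

q = encode("535+61", width=7)  # width 7 allows two 3-digit operands
print(q.shape)                 # (7, 12): 7 time steps, 12-symbol vocabulary
print(decode(q))               # 535+61
```

The Keras example then feeds such sequences to an encoder RNN and decodes the answer digits one character at a time from a decoder RNN.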


Posted 2016-08-02T21:37:32.420

Reputation: 151


Yes - it would seem that it is now possible to achieve even more than your example requires: this paper describes a DL solution to a considerably harder problem - generating the source code for a program described in natural language.

Both of these can be described as regression problems (i.e. the goal is to minimize some loss function on the validation set), but the search space in the natural language case is much bigger.


Posted 2016-08-02T21:37:32.420

Reputation: 6 685


There's the fairly well established field of automated theorem proving. This most likely encompasses solving equations, but doesn't necessarily involve AI. This post from the Cross Validated stackexchange has some more information on the topic.


Posted 2016-08-02T21:37:32.420

Reputation: 107