27 Is it possible to train a neural network to solve math equations? 2016-08-02T21:37:32.420

25 Can neural networks be used to prove conjectures? 2018-08-04T09:57:21.870

18 How does one start learning artificial intelligence? 2017-05-24T19:07:38.987

15 How to choose an activation function? 2018-07-09T00:06:57.810

14 What are the mathematical prerequisites for an AI researcher? 2018-07-30T19:37:58.987

13 How should I get started with artificial intelligence? 2017-06-27T03:58:31.947

13 Is there any scientific/mathematical argument that prevents deep learning from ever producing strong AI? 2018-06-15T11:40:24.003

12 What is the Bellman operator in reinforcement learning? 2019-03-06T14:07:16.067

11 What sort of mathematical problems are there in AI that people are working on? 2019-06-21T09:37:47.403

10 Why do activation functions need to be differentiable in the context of neural networks? 2016-12-21T23:26:55.103

9 Is the mean-squared error always convex in the context of neural networks? 2017-08-22T14:26:51.633

9 How do we prove the n-step return error reduction property? 2018-12-08T05:24:56.380

8 How does the forget layer of an LSTM work? 2019-12-23T05:17:18.317

7 How can I start learning mathematics for machine learning? 2017-05-09T04:57:47.583

7 Which areas of applied math are relevant to AI? 2018-10-15T17:55:35.607

7 Why exactly do neural networks require i.i.d. data? 2019-02-23T13:30:07.443

7 Why is the derivative of the activation functions in neural networks important? 2019-08-14T22:30:05.013

6 What are the mathematical prerequisites to be able to study general artificial intelligence? 2018-05-04T14:15:54.140

6 What makes multi-layer neural networks able to perform nonlinear operations? 2018-08-19T16:46:51.133

6 Why is the log probability replaced with the importance sampling in the loss function? 2018-08-23T07:17:42.697

6 How are local minima possible in gradient descent? 2019-04-23T19:12:53.727

6 What is the mathematical definition of an activation function? 2020-01-10T10:25:38.987

6 Can we get the inverse of the function that a neural network represents? 2020-01-18T13:41:14.617

5 Is recursion used in practice to improve performance of AI systems? 2016-08-07T03:49:20.143

5 What is a weighted average in a non-stationary k-armed bandit problem? 2018-01-18T18:32:34.333

5 How could an AI be used to improve the teaching and learning of mathematics? 2018-06-18T11:01:14.053

5 Is there a lower limit on the error for a particular training dataset in an artificial neural network? 2018-10-28T09:37:14.010

5 Are on-line backpropagation iterations perpendicular to the constraint? 2019-03-23T16:03:49.737

5 What is "conditioning" on a feature? 2019-11-14T11:57:21.740

5 Can deep learning be used to help mathematical research? 2020-05-21T15:40:18.533

4 Why does the cost function contain a 2 in the denominator? 2017-02-23T14:41:58.773

4 Why is the denominator ignored in Bayes' rule? 2017-12-02T14:55:15.743

4 How good is AI in math? 2017-12-07T02:19:36.767

4 Viola-Jones algorithm 2018-01-09T09:48:31.757

4 Defining formula for fuzzy equation 2018-06-04T05:58:41.443

4 Which functions can be activation functions? 2018-06-05T11:01:24.343

4 Why is the derivative 0 if the policy is deterministic? 2018-09-06T12:44:46.330

4 Why does the "reward to go" trick in policy gradient methods work? 2018-12-20T01:00:04.310

4 What are the main benefits of using Bayesian networks? 2019-02-18T11:53:46.140

4 How are filter weights updated for a CNN? 2019-05-20T01:40:49.383

4 Which function $(\hat{y} - y)^2$ or $(y - \hat{y})^2$ should I use to compute the gradient? 2019-05-31T12:54:50.300

4 How is G(z) related to x in GAN proof? 2019-06-15T23:16:55.513

4 How would an AI work out this question? 2019-07-23T19:08:31.140

4 Is it ok to struggle with mathematics while learning AI as a beginner? 2019-07-31T06:52:04.033

4 How can I determine the mathematical relation between the input and output variables? 2019-08-30T18:04:12.953

4 Which linear algebra book should I read to understand vectorized operations? 2019-10-28T15:52:31.180

4 Why does a Lipschitz continuous discriminator in GANs assure statistical boundedness? 2019-11-12T15:57:44.070

4 What do the subscripts mean in $N_{t,n,\sigma,L}$? 2019-11-13T08:03:14.713

4 Is there a mathematical formula that describes the learning curve in neural networks? 2020-01-09T14:55:30.020

4 Mathematical foundations of the ability to learn 2020-02-04T16:39:45.563

4 How can a single sample represent the expectation in gradient temporal difference learning? 2020-04-26T09:37:48.353

4 How is the Jacobian a generalisation of the gradient? 2020-05-13T02:18:59.550

3 What does the argmax of the expectation of the log likelihood mean? 2018-01-28T11:15:09.723

3 Understanding a few terms in Andrew Ng's definition of the cost function for linear regression 2018-02-15T16:31:48.240

3 How to calculate the gradient of a filter in a convolutional network 2018-04-13T05:50:47.720

3 AI applications of the Fibonacci series 2018-06-09T14:36:56.427

3 How does one even begin to mathematically model an AI algorithm? 2018-07-19T13:57:43.527

3 Are there any discount-factors based on branching factors? 2018-08-17T20:47:17.640

3 Is known math really enough for AI? 2018-10-04T18:07:47.730

3 Which neural network should I use to approximate a specific function? 2018-12-03T16:28:25.627

3 Can we define the AI singularity mathematically? 2019-02-17T23:18:35.250

3 What characteristics make it difficult for a Neural Network to approximate a function? 2019-03-09T05:25:36.703

3 Why is the max a non-expansive operator? 2019-03-14T20:50:46.340

3 Standard deviation of the total input to a neuron 2019-04-30T14:51:25.057

3 How can I derive the rotation matrix from the axis-angle rotation vector? 2019-08-18T17:33:51.073

3 What are the differences between stability and convergence in reinforcement learning? 2019-09-19T04:49:46.070

3 What does the Markov assumption say about the history of state sequences? 2019-11-20T14:30:06.803

3 How does the memory mechanism (reading and writing) work in a neural Turing machine? 2019-12-25T12:16:34.820

3 What does the notation sup dist mean in distributional RL? 2020-01-06T18:56:44.160

3 What is the difference between the notations $\|x\|_1, \|x\|_2$ and $|x|$? 2020-01-26T08:13:10.457

3 In the policy gradient equation, is $\pi(a_{t} | s_{t}, \theta)$ a distribution or a function? 2020-02-21T16:23:15.443

3 Interpretation of inverse matrix in mean calculation in Gaussian Process 2020-03-14T21:59:55.010

3 Why does variational auto-encoder use the reconstruction loss? 2020-03-26T05:22:29.500

3 Is maximum likelihood estimation meaningless for a dataset of only outliers? 2020-04-04T05:12:56.393

2 Are FFNN (MLP) Lipschitz functions? 2016-09-10T10:05:34.707

2 Problems getting ADADELTA to converge 2017-08-23T23:40:45.230

2 Can you help me understand how weight normalization works? 2018-10-03T11:45:23.277

2 Reward-related formulation in reinforcement learning 2018-10-19T09:34:21.753

2 Is there a way of representing the minimax algorithm mathematically? 2018-10-26T01:45:21.197

2 Solving equations using reinforcement learning 2018-11-05T00:24:30.483

2 Is there a mathematical example for Conditional Random Fields? 2018-12-13T09:44:01.557

2 Choice of fuzzification function 2018-12-22T15:46:44.227

2 Calculating the tangent vector of the curve $s(P, \alpha)$ at the given point $\alpha = 0$ 2019-02-27T13:30:04.210

2 Why is MSE used over other quadratic loss functions? 2019-03-03T08:39:20.207

2 Which matrix represents the similarity between words when using SVD? 2019-03-15T17:44:50.580

2 What does the formula $1-\sum_i(e_i-a_i)^2$ mean in this NEAT Python API? 2019-04-19T04:37:06.437

2 Should I use the hyperbolic distance loss in the case of the Poincaré disk model? 2019-05-03T14:11:27.203

2 Is the Markov property assumed in the forward algorithm? 2019-06-16T09:48:45.293

2 How can I learn tensors for deep learning? 2019-07-20T09:21:30.297

2 What is the meaning of the words 'bias' and 'variance' in RL? 2019-07-25T08:17:00.343

2 Why is the expectation calculated over a finite number of points drawn from a probability distribution? 2019-07-26T09:39:21.027

2 What is a probability distribution in machine learning? 2019-11-28T03:58:29.183

2 What is the neuron-level math behind backpropagation for a neural network? 2020-01-06T21:08:59.913

2 How does the update rule for the one-step actor-critic method work? 2020-01-22T21:10:17.997

2 What is the mean in the variational auto-encoder? 2020-02-04T09:17:01.797

2 How is the expected value in the loss function of DQN approximated? 2020-02-27T21:41:46.513

2 Expected duration in a state 2020-03-02T06:15:33.723

2 Is there a possibility that there is no relationship between some inputs and outputs? 2020-03-03T17:34:45.410

2 In a single-neuron output layer, should the output be a scalar? 2020-03-17T18:51:19.347