11 What are the state-of-the-art results on the generalization ability of deep learning methods? 2019-11-15T09:22:00.470

8 What are the learning limitations of neural networks trained with backpropagation? 2016-08-03T18:05:24.997

8 Why can neural networks generalize at all? 2019-12-31T23:12:42.903

7 Can we teach an artificial intelligence through sentences? 2018-08-16T14:57:13.340

7 Can neural networks with a sigmoid as the activation function of the output layer approximate continuous functions? 2020-03-23T08:43:19.880

6 Are there any rules of thumb for having some idea of what capacity a NN model needs to have for a given problem? 2020-02-24T20:00:46.967

5 Is there a way of converting a neural network to another one that represents the same function? 2017-10-27T12:44:01.850

5 Are PAC learning and VC dimension relevant to machine learning in practice? 2019-12-29T07:19:22.457

5 How does the size of the dataset depend on the VC dimension? 2020-04-16T22:33:56.440

4 Why does estimation error increase with $|H|$ and decrease with $m$ in PAC learning? 2019-09-16T10:51:21.577

4 In deep learning, do we learn a continuous distribution based on the training dataset? 2019-09-18T15:05:43.263

4 Are PAC learnability and the No Free Lunch theorem contradictory? 2020-02-02T18:22:27.487

4 Mathematical foundations of the ability to learn 2020-02-04T16:39:45.563

4 How to estimate the capacity of a neural network? 2020-02-06T01:16:47.363

4 What are some resources on computational learning theory? 2020-04-17T16:04:40.127

3 How can generalization error be estimated? 2016-08-02T16:16:46.797

3 What is the relation between the definition of learnability of Vapnik and Gold and learnability of neural networks? 2018-02-01T10:17:51.863

3 Batch PTA stopping condition 2019-02-06T17:48:39.643

3 How to show Sauer's Lemma when the inequalities are strict or they are equalities? 2019-10-17T00:43:11.233

3 What is the maximum number of dichotomies in a square? 2019-11-04T12:05:06.523

2 How does Dempster-Shafer theory work in AI? 2018-11-25T06:56:11.300

2 Understanding the equation of the empirical error 2019-10-11T02:47:47.147

2 Convert a PAC-learning algorithm into another one which requires no knowledge of the parameter 2020-01-16T04:57:48.913

2 How can we prove this inequality, related to the generalization error, without using the Rademacher complexity? 2020-01-16T11:01:20.123

2 A problem about the relation between the 1-oracle and 2-oracle PAC models 2020-01-16T12:44:11.397

2 How can I show that the VC dimension of the set of all closed balls in $\mathbb{R}^n$ is at most $n+3$? 2020-01-16T13:25:46.480

2 How does the number of stacked LSTM layers or units in each layer affect the model complexity? 2020-03-11T14:02:00.837

2 Are No Free Lunch theorem and Universal Approximation theorem contradictory in the context of neural networks? 2020-03-26T13:08:10.587

2 How to prove that $\mathcal{H}$ with VC dimension $d$ shatters all subsets with size less than $d-1$? 2020-03-28T21:12:24.790

2 How can neural networks approximate any continuous function but have $\mathcal{VC}$ dimension only proportional to their number of parameters? 2020-04-13T00:39:51.170

2 An infinite VC dimensional space vs using hierarchical subspaces of finite but growing VC dimensions 2020-04-16T02:57:49.810

2 A model for each sub-problem vs one model for the whole problem 2020-04-27T14:03:34.240

2 Is there any practical application of knowing whether a concept class is PAC-learnable? 2020-05-17T07:28:32.527

1 Minimum number of perceptrons for an n-bit truth table? 2018-04-12T05:12:18.157

1 What is the difference between a learning algorithm and a hypothesis? 2019-11-24T15:33:56.943

1 Why does the discrepancy measure involve a supremum over the hypothesis space? 2020-02-10T22:43:40.260

1 Can feature engineering change the selection of the model according to the minimum description length? 2020-04-09T12:25:06.790

1 Understanding relation between VC Symmetrization Lemma and Generalization Bounds 2020-04-14T18:15:05.967

1 What do we mean by saying "VC dimension gives a LOOSE, not TIGHT bound"? 2020-04-19T08:53:20.820

1 What is the relationship between PAC learning and classic parameter estimation theorems? 2020-04-26T02:31:21.613

1 Why is the probability that at least one hypothesis out of $k$ is consistent with $m$ training examples $k(1- \epsilon)^m$? 2020-04-29T03:06:50.460

1 How to estimate the minimum size of an autoencoder to overfit the training data? 2020-05-19T11:04:24.550

1 VC Dimension of Reinforcement Learning (RL) 2020-06-19T12:00:12.957

1 Does this $\max$ mean that we need to maximize the regret in this regret formula? 2020-07-28T17:56:25.200

1 What is the representational capacity of a learning algorithm? 2020-08-10T23:36:07.480

0 What are the prior beliefs in a neural network (if any)? 2020-04-12T03:44:31.903

0 How can a machine learning problem be reduced to a communication problem? 2020-04-23T02:24:37.860