What are the mathematical prerequisites needed to understand research papers on neural networks?


I know we have developed some mathematical tools to understand deep neural networks, such as gradient descent for optimization and basic calculus. Recently, I came across an arXiv paper that uses more advanced mathematics for neural networks, such as functional analysis. For example, I remember that the universal approximation theorem was proved using the Hahn-Banach theorem, but I lost the link to that article. So I need to find similar papers or articles to develop my mathematical understanding of neural networks (for example, with functional analysis; in short, I need to learn more advanced math for research). Can you suggest some books, arXiv papers, articles, or any other sources that describe the mathematics of deep neural networks?

dato nefaridze

Posted 2020-09-05T12:32:16.680

Reputation: 464

Question was closed 2020-09-05T19:45:02.860

To me, it's not really clear what you're asking. What kind of "mathematical tools" are you really looking for? Are you trying to understand why neural networks can approximate continuous functions? It's not clear what your level of understanding of neural networks is and what you want to understand that you don't understand. – nbro – 2020-09-05T19:47:01.520

@nbro that is a good point. For example, I want to learn mathematics at the research level. I see many AI labs publish brilliant papers, and I think they must be using some advanced mathematics; it can't be just random calculation, they must think ahead. So I want to learn mathematics at the research level. Is it clear? Sorry if it is not clear yet, please say so and I will go into more detail. – dato nefaridze – 2020-09-05T19:49:54.993

Can you please provide an example of a paper that you want to understand? There are so many mathematical topics that it's difficult to know them all, and so it's difficult to answer your question. However, if you share with us the specific paper or papers that you want to understand, maybe we can provide more details about the type of mathematics that you need to understand those papers. Moreover, I suggest that you edit your post to include these details and also a description of your current knowledge of NNs. What do you know? What's your level? – nbro – 2020-09-05T21:01:55.863

@nbro that is exactly my problem. I don't have a mentor to help me. I want to get into research, but before I do, I want to learn enough mathematics for it, so I was wondering which advanced mathematical topics I should know to really be able to do research by myself. I know mathematical analysis (single-variable and multivariable), as well as statistics and linear algebra, and I was wondering what I should learn next. I think researchers should know a lot of math, but I don't know which math, so I am confused. – dato nefaridze – 2020-09-05T21:05:43.353

We already have a similar question then: https://ai.stackexchange.com/q/7352/2444, although it is not specific to neural networks. I don't know the quality of the answers, but I think you can trust this answer, given that it was written by someone with a PhD in AI. Anyway, if you think that this question is a duplicate of the just mentioned one, let me know and I will mark your question as a duplicate. – nbro – 2020-09-05T21:09:05.207

I've changed the title of your post. Let me know if the question in the title is your actual question or not. – nbro – 2020-09-05T21:11:34.553

@nbro when I saw Stewart's Calculus I immediately closed the page. It will never be enough for neural networks; people in the field do functional analysis, and even that is not enough. – dato nefaridze – 2020-09-05T21:12:11.997

Please provide an example of a paper that you want to understand. Otherwise, it's really difficult to answer your question, because there are research papers that involve neural networks but use them only in a superficial way (in the sense that they are used to solve a task and no theory is involved, e.g. about their approximation capabilities). I think you're looking for the mathematical prerequisites that help you understand the universal approximation theorems, given that you mention them, but it's not clear. – nbro – 2020-09-05T21:14:05.223

@nbro I don't have a specific paper. If I had one, I would google what math is involved in it, but I don't want to learn math fragment by fragment. Suppose the question is: I want to be among the top 10 researchers in the AI field; which mathematical fields do I have to master (or at least know)? – dato nefaridze – 2020-09-05T21:17:13.353

That's difficult to answer because you don't necessarily need big math skills to be an important researcher. It depends on what you research. Also, there are many things in AI, not just neural networks. The short answer to your question is that you don't need to know all the math to be a good researcher; you need the math of your own topic. For example, if you do research on variational inference, you definitely need to know a good dose of calculus, statistics, and probability theory. So, it's difficult to say exactly what you need to learn, because there are so many topics and you will almost surely focus on only 1-2 of them. – nbro – 2020-09-05T21:21:38.400

So, please, ask a more specific question, and you will get more useful answers. We cannot answer your question if you don't tell us what you are interested in. If you say you're interested in becoming an AI researcher, that doesn't mean much, because one AI researcher could be doing research on reinforcement learning, another in expert systems, and another in variational inference. Nobody does research on "AI". Typically, people do research on specific topics or subfields of AI. – nbro – 2020-09-05T21:29:51.250



Knowing that you want to focus on the theory, I think a good choice is the Deep Learning book by Ian Goodfellow et al., which is publicly available. It has three main parts. In the first part, the authors present the math and statistics tools needed to understand the following parts. In the second part, they explain the current state of the art in deep learning, and in the last part they introduce more advanced topics. The authors also cite references throughout, which point to extra resources for diving deeper into the theory.

In addition, I strongly recommend Google Scholar, where you can find plenty of articles and papers on the current state-of-the-art techniques in the field. There you can also find papers related to what you mentioned about the Hahn-Banach theorem.

Javier TG

Posted 2020-09-05T12:32:16.680

Reputation: 11