How should I interpret the weights file of the Leela Zero neural network?

1

I am trying to understand the NN (Neural Network) architecture given at https://github.com/leela-zero/leela-zero/blob/next/training/caffe/zero.prototxt.

So, I downloaded the NN weights from https://zero.sjeng.org/. However, I am not sure how to interpret the network weight file.

What exactly are you trying to understand from this? What you are seeing are the values of the weights of the nodes in the network. Beyond that, it doesn't make much sense to try to interpret them outside of the problem and architecture. – hisairnessag3 – 2019-08-12T11:04:51.360

1

Do you have any idea how to interpret https://github.com/leela-zero/leela-zero/blob/next/src/Network.cpp#L253-L255 from the weight hash file?

– kevin – 2019-08-12T11:50:53.490

@kevin Hi Kevin! If you have another question, please ask it as a new question on the website. That said, on this website we focus more on answering theoretical and philosophical AI questions. I suggest you ask this type of question (one that involves implementation) on Data Science SE.

– nbro – 2019-08-13T21:57:09.327

1

How should I interpret the weights file of the Leela Zero neural network? ... However, I am not sure how to interpret the network weight file.

The image shown in the question has a set of network parameters listed as a series of floating-point values in ASCII or UTF-8 on the right. The equivalent hexadecimal values for those characters are shown on the left. There is no point in viewing these numbers at all. They are not in a form that can be readily interpreted by the human brain. Even for the simplest cases, one would need eidetic memory, extremely high math aptitude, and decades of training to derive any meaningful information from such a multidimensional array of floating-point numbers.

They represent the result of convergence during network training: their values were determined so that the optimal signal strengths pass from cell to cell in subsequent layers when the trained network is used for its intended purpose.
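Although the individual values are not meaningful to read, the overall shape of the file can be inspected programmatically. The following is a minimal sketch, not the project's own loader. It assumes the plain-text layout as I understand it from the Leela Zero documentation: a first line holding a format version, then one line of space-separated floats per parameter tensor, with 4 lines for the input convolution, 8 lines per residual block, and 14 lines for the policy and value heads. The function names `read_weight_lines`, `parse_weight_text`, and `summarize` are hypothetical, and the layout should be checked against Network.cpp before relying on it.

```python
import gzip


def parse_weight_text(handle):
    """Parse an open text handle into (version, rows of floats)."""
    lines = [line.strip() for line in handle if line.strip()]
    if not lines:
        raise ValueError("empty weights file")
    version = int(lines[0])  # assumed: first line is the format version
    rows = [[float(tok) for tok in line.split()] for line in lines[1:]]
    return version, rows


def read_weight_lines(path):
    """Read a weights file from disk; .gz files are assumed to be gzipped text."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="ascii") as handle:
        return parse_weight_text(handle)


def summarize(rows):
    """Infer block and channel counts under the assumed line layout."""
    # Assumed layout: 4 rows (input conv) + 8 rows per residual block + 14 rows (heads).
    extra = len(rows) - (4 + 14)
    if extra < 0 or extra % 8 != 0:
        raise ValueError("row count does not match the assumed layout")
    # Assumed: the second row holds the input convolution biases, one per channel.
    return {"residual_blocks": extra // 8, "channels": len(rows[1])}
```

Calling `summarize(read_weight_lines(path)[1])` on a downloaded file would then report how many residual blocks and channels the network has under these assumptions, which is usually more informative than the raw numbers themselves.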

I don't see anything at the URL given in the question that will help you understand the leela-zero design. The layer definitions would need to be generalized before it could become clear what approach was taken to arrive there.