There are a number of ways to do this, such as majority voting or some other rule-based algorithm. However, since you have labels for the trees, it can also be done through supervised learning.

I would make the input space of my model the normalized frequency of the categories for a tree. This means you will need a dictionary of possible categories for the nodes, usually obtained from your training set. Then you can tabulate the frequency of instances.

For example if we have a website with the following node classes:

- News: 5
- Opinions: 9
- About: 1

Then we can formulate our input vector as $[0.33, 0.6, 0.067]$.
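As a rough sketch of that tabulation in plain Python: the function name `frequency_vector` and the choice to ignore labels that are not in the category dictionary are my own assumptions, not something your setup requires.

```python
from collections import Counter


def frequency_vector(node_labels, categories):
    """Return the normalized frequency of each category, in `categories` order.

    Labels that are not in `categories` are ignored (an assumption made here
    for simplicity; you may prefer an explicit "Other" bucket).
    """
    counts = Counter(node_labels)
    total = sum(counts[c] for c in categories)
    if total == 0:
        # No known labels at all: return an all-zero vector instead of dividing by zero.
        return [0.0] * len(categories)
    return [counts[c] / total for c in categories]


# The website from the example above: 5 News, 9 Opinions and 1 About page.
categories = ["News", "Opinions", "About"]
node_labels = ["News"] * 5 + ["Opinions"] * 9 + ["About"] * 1

vector = frequency_vector(node_labels, categories)
print([round(v, 3) for v in vector])  # [0.333, 0.6, 0.067]
```

The order of `categories` has to stay fixed across all trees so that each position in the vector always means the same category.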

You can then train a classifier on these vectors using your already labeled trees. The trained model can then classify future trees in the same way.

To determine the top $K$ classes for a tree you need a model that can rank classes, which most can. With k-NN (whose number of neighbours is a separate parameter from this $K$) you can rank the classes by how many of the nearest neighbours vote for each and keep the top $K$. With Random Forests or Naive Bayes you can pick the $K$ classes with the highest predicted probabilities.
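To make the two different "K"s concrete, here is a minimal k-NN sketch in plain Python. It assumes Euclidean distance and uses each class's share of the neighbour votes as a stand-in for a probability; the training vectors, the labels and the names `knn_top_classes`, `n_neighbours` and `top_k` are all made up for illustration.

```python
import math
from collections import Counter


def knn_top_classes(query, training_vectors, training_labels, n_neighbours=3, top_k=2):
    """Rank classes by their share of votes among the nearest labelled trees.

    `n_neighbours` is the k of k-NN; `top_k` is how many classes to return.
    """
    # Pair every training vector's distance to the query with its label, nearest first.
    distances = sorted(
        (math.dist(query, vec), label)
        for vec, label in zip(training_vectors, training_labels)
    )
    nearest = distances[:n_neighbours]
    votes = Counter(label for _, label in nearest)
    # Vote share among the neighbours serves as a crude class probability.
    return [(label, count / len(nearest)) for label, count in votes.most_common(top_k)]


# Hypothetical labelled trees, each described by its (News, Opinions, About) frequencies.
training_vectors = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.1, 0.8],
]
training_labels = ["news site", "news site", "blog", "blog", "corporate"]

ranking = knn_top_classes([0.33, 0.6, 0.067], training_vectors, training_labels)
print(ranking)  # "blog" first with 2 of 3 votes, then "news site" with 1 of 3
```

For a model that outputs class probabilities directly, the same idea reduces to sorting the classes by probability and keeping the first `top_k`.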

To account for the fact that you have a ranked list of $K$ categories for each node, you can add a **weighting** when calculating the normalized frequencies. For example, let's say each node gets its top 3 categories and we have the following webpages (nodes), with categories listed from most to least likely.

- Page 1: News, Opinion, Commentary
- Page 2: News, Advertisement, Opinion
- Page 3: Commentary, News, Advertisement
- Page 4: News, Opinion, Advertisement

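Then the input vector can be calculated by awarding 3 points to the first category, 2 to the next and 1 to the last. With the categories ordered as (News, Opinion, Commentary, Advertisement), the scores are 11, 5, 4 and 4 out of a total of 24, which results in $[0.46, 0.21, 0.17, 0.17]$. Alternatively, if you have a probability for each of these classifications, you can use that as the weighting factor.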
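The rank-based weighting can be sketched like this, assuming each page comes as a list of categories ordered from most to least likely and that a list of length $K$ awards $K$ points to its first entry down to 1 for its last. The function name `weighted_frequency_vector` is just for illustration.

```python
from collections import defaultdict


def weighted_frequency_vector(pages, categories):
    """Normalized rank-weighted score of each category, in `categories` order."""
    scores = defaultdict(float)
    for ranked in pages:
        k = len(ranked)
        for rank, category in enumerate(ranked):
            # First category gets k points, the next k - 1, ..., the last gets 1.
            scores[category] += k - rank
    total = sum(scores[c] for c in categories)
    if total == 0:
        return [0.0] * len(categories)
    return [scores[c] / total for c in categories]


categories = ["News", "Opinion", "Commentary", "Advertisement"]
pages = [
    ["News", "Opinion", "Commentary"],
    ["News", "Advertisement", "Opinion"],
    ["Commentary", "News", "Advertisement"],
    ["News", "Opinion", "Advertisement"],
]

weighted = weighted_frequency_vector(pages, categories)
print([round(v, 2) for v in weighted])  # [0.46, 0.21, 0.17, 0.17]
```

To weight by classifier probabilities instead, you would add each category's probability to `scores` in place of `k - rank`.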

Do you have any websites which you or some other people have already assigned a label for the overall tree? – JahKnows – 2018-05-11T07:50:58.350

Yes I have labelled data for domains. – Samyak Jain – 2018-05-11T07:54:07.460

And you can already automatically label the nodes (webpages) of your tree? – JahKnows – 2018-05-11T07:57:24.740

Yes, I have a model that assigns categories to a node from a predefined set of categories, and I also have labeled data for the domains, where each label is a domain category. As of now I need some suggestions on how to assemble node categories to generate an overall category. – Samyak Jain – 2018-05-11T08:00:27.543