## Problem

I have the following directed tripartite graph $$G(E\cup V\cup P, A)$$, where there is a many-to-one symmetric relationship between the subsets V and E (for $$e\in E$$ and $$v\in V$$, $$[e, v]\in A \iff [v, e]\in A$$) and a many-to-many relationship between the subsets P and V. Every edge $$[x, y]\in A$$ has a weight $$w_{xy}$$, which determines the portion of the score that is propagated from node x to node y:

$$\forall x\in G:\ \sum_{y\in G}w_{xy}=1,\qquad [x, y] \notin A \iff w_{xy}=0.$$

I came up with the following naive equations, where $$SP_t(p)$$ is the score of "p" at iteration "t" (likewise for SV and SE) and $$SP_0(p)$$ is the starting score of "p":

$$\begin{aligned}
SP_{t+1}(p)&=SP_0(p)+\sum_{v\rightarrow p}SV_t(v)w_{vp},\\
SV_{t+1}(v)&=SV_0(v)+\sum_{p\rightarrow v}SP_t(p)w_{pv}+\sum_{e\rightarrow v}SE_t(e)w_{ev},\\
SE_{t+1}(e)&=SE_0(e)+\sum_{v\rightarrow e}SV_t(v)w_{ve}.
\end{aligned}\tag{1}$$

I want to compute the score of each node so that I can rank the nodes within their subset: if a node's neighbors are highly ranked, then the node is also highly ranked.

1) How can I prove that the iteration converges, in the sense that $$|SP_{t+1}(p)-SP_t(p)|\leqslant\epsilon$$ (likewise for SV and SE) for a given threshold $$\epsilon\in \mathbb{R}^+$$, in a viable time (that is, with at most linear time complexity)?
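
To make question 1 concrete, here is a minimal sketch of one way to iterate system (1) in plain Python. The toy graph, its weights, and the names `W`, `step`, and `max_change` are assumptions made up for illustration; they are not my real data.

```python
# Toy tripartite graph (hypothetical): E = {e1}, V = {v1, v2}, P = {p1}.
# W[x][y] is the weight w_xy; every row sums to 1, as system (1) assumes.
W = {
    "e1": {"v1": 0.5, "v2": 0.5},
    "v1": {"e1": 0.5, "p1": 0.5},
    "v2": {"e1": 0.5, "p1": 0.5},
    "p1": {"v1": 0.5, "v2": 0.5},
}

def step(s0, s, weights):
    """One iteration of system (1): s_{t+1}(y) = s_0(y) + sum_x s_t(x) * w_xy."""
    nxt = dict(s0)
    for x, out in weights.items():
        for y, w in out.items():
            nxt[y] += s[x] * w
    return nxt

def max_change(s, nxt):
    """Largest absolute per-node change between two consecutive iterations."""
    return max(abs(nxt[n] - s[n]) for n in s)

s0 = {node: 1.0 for node in W}   # starting scores (assumed all equal to 1)
s = dict(s0)
for t in range(3):
    nxt = step(s0, s, W)
    print(t, max_change(s, nxt), sum(nxt.values()))
    s = nxt
```

On this toy graph the per-node change stays at 1 and the total score grows by the sum of the starting scores at every step: each row of `W` sums to 1, so propagation keeps the total and the starting scores are added on top. This is only an observation on the sketch, not a proof about the general case.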

### Generalization

I think what I am trying to do here is solve a kind of random walk over G. The scores are random variables, since a node's score depends on the other nodes pointing to it. This problem can be modeled as a Markov chain: I need to find the fraction of time the random walker spends at each node in G (this is the normalized score).

2) Let "s" be the vector of scores of all nodes in $$G$$ and $$W^{|G|\times |G|}$$ the transition matrix of the Markov chain. I want to find an "s" such that $$sW=s$$. This is equivalent to finding what system (1) converges to when $$\epsilon =0$$. Am I right? If so, how can I prove that "s" exists and can be found in a viable time?
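
As a sketch of what I mean by finding "s" with $$sW=s$$, the following plain-Python power iteration repeats $$s\leftarrow sW$$ on a hypothetical 4-node chain. The matrix, the node order, and the names `left_multiply` and `power_iteration` are assumptions for illustration only.

```python
# Hypothetical 4-node chain, node order [e1, v1, v2, p1]; each row sums to 1.
W = [
    [0.0, 0.5, 0.5, 0.0],
    [0.5, 0.0, 0.0, 0.5],
    [0.5, 0.0, 0.0, 0.5],
    [0.0, 0.5, 0.5, 0.0],
]

def left_multiply(s, W):
    """Return the row vector s W."""
    n = len(W)
    return [sum(s[i] * W[i][j] for i in range(n)) for j in range(n)]

def power_iteration(s, W, eps=1e-12, max_iter=1000):
    """Repeat s <- s W until the largest change is <= eps or max_iter is hit.

    Returns (s, converged).
    """
    for _ in range(max_iter):
        nxt = left_multiply(s, W)
        if max(abs(a - b) for a, b in zip(s, nxt)) <= eps:
            return nxt, True
        s = nxt
    return s, False

uniform = [0.25] * 4
print(power_iteration(uniform, W))               # uniform start
print(power_iteration([1.0, 0.0, 0.0, 0.0], W))  # all mass on e1
```

On this toy chain the uniform vector satisfies $$sW=s$$, but starting with all mass on one node the plain iteration keeps alternating between V and the other two subsets and never meets the stopping test. So, at least in this sketch, the existence of "s" and the convergence of the iteration look like two separate questions.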

### Related algorithms

While searching, I found the algorithms PageRank and Generalized Co-HITS, which solve similar problems. PageRank was designed for unipartite graphs and Generalized Co-HITS for bipartite graphs.

The PageRank (PR) of a page "p" is given by

$$PR_{t+1}(p)=(1-a)\frac 1n+a\sum_{u\rightarrow p}PR_t(u)\frac 1{d_u^+},$$

where "p" and "u" are nodes, "n" is the number of nodes, "a" is the "damping factor", and $$d_u^+$$ is the outdegree of "u".

3) I see that I could use it, but I am not sure whether it will compute an accurate score for ranking nodes within their subsets, because PR ranks all nodes within the superset $$P\cup V\cup E$$. Am I right?
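
To illustrate question 3, here is a minimal sketch that runs the PageRank formula above on a hypothetical toy graph and then ranks each subset by filtering the global scores. The graph, the `subsets` mapping, and the function name `pagerank` are assumptions for illustration. The sketch also assumes every node has at least one out-link, and it ignores my edge weights, as the formula does.

```python
def pagerank(out_links, a=0.85, iters=100):
    """Iterate PR_{t+1}(p) = (1 - a)/n + a * sum_{u -> p} PR_t(u) / outdeg(u).

    Assumes every node has at least one out-link (no dangling nodes).
    """
    n = len(out_links)
    pr = {node: 1.0 / n for node in out_links}
    for _ in range(iters):
        nxt = {node: (1.0 - a) / n for node in out_links}
        for u, targets in out_links.items():
            share = a * pr[u] / len(targets)
            for p in targets:
                nxt[p] += share
        pr = nxt
    return pr

# Hypothetical toy graph: E = {e1}, V = {v1, v2}, P = {p1, p2}.
out_links = {
    "e1": ["v1", "v2"],
    "v1": ["e1", "p1"],
    "v2": ["e1", "p1", "p2"],
    "p1": ["v1", "v2"],
    "p2": ["v2"],
}
subsets = {"E": ["e1"], "V": ["v1", "v2"], "P": ["p1", "p2"]}

pr = pagerank(out_links)
# Rank inside each subset by sorting its members on the global score.
ranking = {name: sorted(members, key=pr.get, reverse=True)
           for name, members in subsets.items()}
print(ranking)
```

This only shows that a within-subset ordering can be read off the global scores; whether that ordering is the "accurate" one for my problem is exactly what I am unsure about.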

Generalized Co-HITS looks like PR. Consider just the two subsets P and V, which form a bipartite graph:

$$\begin{aligned}
SP_{t+1}(p)&=(1-a)SP_0(p)+a\sum_{v\in V}SV_t(v)w_{vp},\\
SV_{t+1}(v)&=(1-b)SV_0(v)+b\sum_{p\in P}SP_t(p)w_{pv}.
\end{aligned}\tag{2}$$

I tried to extend it to a tripartite graph based on system (1). Substituting SV into SP and SE, I got a weird system:

$$\begin{aligned}
SP_{t+1}(i)&=SP_0(i)a+(1-a)b\sum_{j\in V}W_{ji}^{vp}SV_0(j)+(1-a)(1-b)\left[\sum_{m\in P}\left(\sum_{j\in V}W_{mj}^{pv}W_{ji}^{vp}\right)SP_t(m)+\sum_{n\in E}\left(\sum_{j\in V}W_{nj}^{ev}W_{ji}^{vp}\right)SE_t(n)\right],\\
SV_{t+1}(j)&=SV_0(j)b+(1-b)\left(\sum_{m\in P}W_{mj}^{pv}SP_t(m)+\sum_{n\in E}W_{nj}^{ev}SE_t(n)\right),\\
SE_{t+1}(k)&=SE_0(k)c+(1-c)b\sum_{j\in V}W_{jk}^{ve}SV_0(j)+(1-c)(1-b)\left[\sum_{m\in P}\left(\sum_{j\in V}W_{mj}^{pv}W_{jk}^{ve}\right)SP_t(m)+SE_t(k)\sum_{j\in V}W_{kj}^{ev}W_{jk}^{ve}\right].
\end{aligned}\tag{3}$$

Here I swapped the positions of "a", "b", and "c" (all act like damping factors); now they multiply the starting score. $$W^{pv}$$ is the weight matrix from P to V. I also changed the end of the third equation,

$$\sum_{n\in E}\left(\sum_{j\in V}W_{nj}^{ev}W_{jk}^{ve}\right)SE_t(n) \equiv SE_t(k)\sum_{j\in V}W_{kj}^{ev}W_{jk}^{ve},$$

because, given the many-to-one symmetric relationship between V and E, there is no path $$e_1\rightarrow v_1\rightarrow e_2$$ (although one E node can still reach another through paths like $$e_1\rightarrow v_1\rightarrow p_1\rightarrow v_2\rightarrow e_2$$). I doubt whether all the properties of system (2) remain valid.
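
For reference, this is a minimal sketch of how I understand the bipartite system (2) could be iterated in plain Python. The toy graph, the starting scores, the default values of `a` and `b`, and the function name `co_hits` are assumptions for illustration, not taken from the Generalized Co-HITS paper.

```python
def co_hits(sp0, sv0, w_vp, w_pv, a=0.85, b=0.85, eps=1e-10, max_iter=10000):
    """Iterate system (2) until no score changes by more than eps.

    w_vp[v][p] and w_pv[p][v] are the propagation weights w_vp and w_pv.
    """
    sp, sv = dict(sp0), dict(sv0)
    for _ in range(max_iter):
        # Both updates read the scores of iteration t, as in system (2).
        new_sp = {p: (1 - a) * sp0[p]
                     + a * sum(sv[v] * w_vp[v].get(p, 0.0) for v in sv)
                  for p in sp}
        new_sv = {v: (1 - b) * sv0[v]
                     + b * sum(sp[p] * w_pv[p].get(v, 0.0) for p in sp)
                  for v in sv}
        change = max(max(abs(new_sp[p] - sp[p]) for p in sp),
                     max(abs(new_sv[v] - sv[v]) for v in sv))
        sp, sv = new_sp, new_sv
        if change <= eps:
            break
    return sp, sv

# Hypothetical bipartite toy graph: P = {p1}, V = {v1, v2}.
w_vp = {"v1": {"p1": 1.0}, "v2": {"p1": 1.0}}
w_pv = {"p1": {"v1": 0.5, "v2": 0.5}}
sp, sv = co_hits({"p1": 1.0}, {"v1": 1.0, "v2": 1.0}, w_vp, w_pv)
print(sp, sv)
```

On this toy graph the scores settle at the fixed point of system (2), which can be checked by hand: solving $$SP=0.15+0.85\cdot 2\,SV$$ and $$SV=0.15+0.425\,SP$$ gives $$SP=0.405/0.2775$$.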

4) Is system (3) correct, and are the properties of system (2) still valid in system (3)?

5) What is the purpose of that "damping factor"? I think it is just there to weight the contribution coming from propagation against the contribution of the starting score to the final score. So a = 0.15 means 15% of the final score comes from the starting score and 85% comes from propagation. On reflection, it is the union of two disjoint events, event 1 with probability 0.15 and event 2 with probability 0.85, where event 1 is the random walker arriving at a node through a meta-edge and event 2 is the walker arriving from one of the node's neighbors. PageRank's authors call that meta-edge a jump or teleportation: a web surfer may start browsing from any webpage. Am I right? And why do other similar algorithms still use damping factors? PageRank was modeled for the Web.
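
One way to write down the "two events" reading is the following sketch. It uses the convention of the PageRank formula above, where "a" multiplies the propagation term, so the jump has probability $$1-a$$. It assumes that $$PR_t$$ is a row vector summing to 1 and that every node has at least one out-link. The matrix $$M$$ with entries $$M_{up}=1/d_u^+$$ for $$u\rightarrow p$$ (and 0 otherwise) and the all-ones column vector $$\mathbf{1}$$ are my own notation.

$$PR_{t+1}=(1-a)\frac 1n\mathbf{1}^{\top}+a\,PR_t M=PR_t\left[(1-a)\frac 1n\mathbf{1}\mathbf{1}^{\top}+aM\right]$$

The second equality holds because $$PR_t\mathbf{1}=1$$. Read this way, each step of the walker is either a jump to a uniformly chosen node, with probability $$1-a$$, or a move along an out-edge of the current node, with probability $$a$$.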

Should I split these questions into several posts? Maybe I am missing something obvious; I am still learning. Thanks in advance.