Data visualization of frequencies of state transitions (possibly in R?)

3

4

I am working with experimental data in which each data point can be in one of three states: A, B or C. I observe the data at 5 time points, and I can see points move between states (A to B, B to C, etc.). I see such transitions for a number of independent data points, and I have the cumulative frequencies across all of them.

For example, I have:

$$
\begin{array}{c|ccc}
\text{Period} & A & B & C \\
\hline
1 & 4 & 4 & 2 \\
2 & 1 & 2 & 7 \\
3 & 0 & 1 & 9 \\
4 & 10 & 0 & 0 \\
5 & 8 & 1 & 1
\end{array}
$$

I DO know the transitions from one state to another (A->B, B->C, and so on). For example, I know that after Period 1 all the A's went to C, and of the B's that left, one went to A and the rest went to C. What would be the best way to visually represent these period-by-period transitions from one state to another? I suspect there is something better than a bare transition matrix, perhaps something that looks like a Markov chain diagram but accommodates all 5 periods of transitions in a succinct way. I work in a statistical package called STATA, which has limited graphics capabilities. Is there something in other software packages (R, maybe?) that can help me with this?

  • Sorry for the hack representation of the data matrix.

Juanito

Posted 2016-04-14T18:59:44.113

Reputation: 105

1Is the first line correct or should that also add up to 10? And is my understanding correct that for example in line 3 you don't know where the singleton B came from? – Jan van der Vegt – 2016-04-19T07:42:24.450

1I'm not clear on your data, so it's hard to suggest a solution. I understand you have 5 "snapshots in time". So do you have, say 20 items that you are observing and from the first line, 4 are in state A, 4 are in state B, 3 are in state C? Then for period 2, only 1 is in state A, 2 are in state B and 7 in state C? If this is true, do you have more granular data? Do you know the order in which states change, and is the state transition matrix well established? – Marcus D – 2016-04-19T11:16:24.633

@JanvanderVegt Yes, I will edit to make it add up to 10. Also, I DO know which transitions occurred, so I know the flow from A->B, B->C etc – Juanito – 2016-04-19T18:43:25.137

@MarcusD I have edited the question to answer what you asked. I can track each data point through the states at each point in time. There is no state transition matrix as such, as the probability of going from state A to B is not fixed a priori. Let me know if that answers your questions. If not, direct me to where you would need more clarification, and I would be glad to provide it. – Juanito – 2016-04-19T18:51:55.747

1

This post I made in stack overflow some time ago may be of interest to you: http://stackoverflow.com/questions/32633507/r-need-help-on-multi-state-markov-and-block-bootstrap-please/32694008#32694008

– tguzella – 2016-04-19T21:14:09.357

1What sort of analysis do you want to do in the end? I have an idea, but it could be completely off track, depending on what analysis you are doing. – Marcus D – 2016-04-19T21:15:45.773

I have the state transitions under 4 different treatments, and I am interested to see if the state transitions in different periods look different by treatment. (From just eyeballing the data, I think there is a difference.) I am thinking of setting up the comparison exercise by first having a nice graph or chart of the period-wise transitions by treatment. I have not thought of any statistical tests for significance yet, but I am guessing that comes later. Feel free to share what you have in mind. – Juanito – 2016-04-19T21:28:27.187

Answers

3

How about a Sankey diagram, with time on the x-axis and flow width representing state transition frequency? Here is an SO discussion on implementing Sankey diagrams in R.

One possible R package is {riverplot}... here is code showing the first transition in your data:

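library(riverplot)
# Node IDs: state letter plus period number ("A1", "B1", "C1", "A2", "B2", "C2")
nodes <- as.character(sapply(1:2, FUN = function(n){paste0(LETTERS[1:3],n)}))
# Edges: for each period-1 node, the period-2 nodes it flows into and the counts
edges <- list(A1=list(C2=4), B1=list(A2=1,C2=1,B2=2), C1=list(C2=2))
# node_xpos puts period-1 nodes in column 1 and period-2 nodes in column 2
r <- makeRiver( nodes, edges, node_xpos= c( 1,1,1 ,2,2,2),
                node_labels= c( A1= "A", B1= "B", C1= "C", A2="A",B2="B",C2="C" ))
plot( r )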

This will produce a plot of the first transition (image not included).
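To extend the same idea to all five periods, one option is to build the node and edge tables programmatically rather than by hand. The sketch below is only an illustration under stated assumptions: the counts are copied from the transition-count table posted in another answer to this question, `draw_river` is a hypothetical helper name, and it assumes that `makeRiver()` accepts data frames with columns `ID`/`x`/`labels` for nodes and `N1`/`N2`/`Value` for edges, which is how the {riverplot} documentation appears to describe it.

```r
# Transition counts copied from the table in another answer on this page.
# Rows: A->A, A->B, A->C, B->A, ..., C->C; column k: period k to period k + 1.
counts <- matrix(
  c(0, 0, 0, 8,
    0, 0, 0, 1,
    4, 1, 0, 1,
    1, 0, 1, 0,
    2, 0, 0, 0,
    1, 1, 0, 0,
    0, 0, 9, 0,
    0, 0, 0, 0,
    2, 7, 0, 0),
  ncol = 4, byrow = TRUE)

# Origin/destination pairs in the same row order as `counts`
# (expand.grid varies its first argument fastest, so `to` cycles within `from`).
states <- c("A", "B", "C")
pairs  <- expand.grid(to = states, from = states, stringsAsFactors = FALSE)

# One edge per (period, origin, destination); node IDs look like "A1", "C2", ...
edges <- do.call(rbind, lapply(seq_len(ncol(counts)), function(k) {
  data.frame(N1    = paste0(pairs$from, k),
             N2    = paste0(pairs$to, k + 1),
             Value = counts[, k],
             stringsAsFactors = FALSE)
}))
edges <- edges[edges$Value > 0, ]   # drop transitions that never occurred

# Keep only nodes that appear in some edge; x position is the period number.
ids   <- unique(c(edges$N1, edges$N2))
nodes <- data.frame(ID     = ids,
                    x      = as.integer(substring(ids, 2)),
                    labels = substring(ids, 1, 1),
                    stringsAsFactors = FALSE)

# Hypothetical helper: draws the river plot only if {riverplot} is installed.
# ASSUMPTION: makeRiver() accepts node/edge data frames with these column names.
draw_river <- function(nodes, edges) {
  if (!requireNamespace("riverplot", quietly = TRUE)) {
    stop("package 'riverplot' is not installed")
  }
  plot(riverplot::makeRiver(nodes, edges))
}
```

Calling `draw_river(nodes, edges)` should then draw one column of nodes per period, with band widths proportional to the counts. Building one such plot per treatment and placing them side by side would be one way to compare treatments visually.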

Brandon Loudermilk

Posted 2016-04-14T18:59:44.113

Reputation: 1 206

2

If you have the data in the form of a table of transition counts:

$$
\begin{array}{c|cccc}
\text{Transition} & \text{Period 1} & \text{Period 2} & \text{Period 3} & \text{Period 4} \\
\hline
A \to A & 0 & 0 & 0 & 8 \\
A \to B & 0 & 0 & 0 & 1 \\
A \to C & 4 & 1 & 0 & 1 \\
B \to A & 1 & 0 & 1 & 0 \\
B \to B & 2 & 0 & 0 & 0 \\
B \to C & 1 & 1 & 0 & 0 \\
C \to A & 0 & 0 & 9 & 0 \\
C \to B & 0 & 0 & 0 & 0 \\
C \to C & 2 & 7 & 0 & 0
\end{array}
$$

then a possible visualization is an area plot. The following chart was produced in Excel (use the Charts/Area button on the Insert ribbon). This chart accurately captures all transitions that occurred in each period. Shaded areas of different colors represent the relative frequencies of transitions by origin-destination pair.
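For readers who would rather stay in R than use Excel, a similar stacked area chart can be sketched in base graphics. This is a minimal sketch, not a reproduction of the Excel chart: it reuses the counts from the table above, converts each period's counts to relative frequencies with `prop.table()`, and stacks them with `polygon()`. The helper name `plot_transition_area` is hypothetical, as are the colour and legend choices.

```r
# Transition counts from the table above (rows: origin->destination pairs).
counts <- matrix(
  c(0, 0, 0, 8,
    0, 0, 0, 1,
    4, 1, 0, 1,
    1, 0, 1, 0,
    2, 0, 0, 0,
    1, 1, 0, 0,
    0, 0, 9, 0,
    0, 0, 0, 0,
    2, 7, 0, 0),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("A->A", "A->B", "A->C", "B->A", "B->B", "B->C", "C->A", "C->B", "C->C"),
    paste("Period", 1:4)))

# Relative frequency of each transition within its period (columns sum to 1).
shares <- prop.table(counts, margin = 2)

# Hypothetical helper: stacked area chart of the shares, one band per transition.
plot_transition_area <- function(shares) {
  upper <- apply(shares, 2, cumsum)                       # top edge of each band
  lower <- rbind(0, upper[-nrow(upper), , drop = FALSE])  # bottom edge of each band
  x     <- seq_len(ncol(shares))
  cols  <- rainbow(nrow(shares))
  plot(range(x), c(0, 1), type = "n", xaxt = "n",
       xlab = "Period", ylab = "Share of transitions")
  axis(1, at = x, labels = colnames(shares))
  for (i in seq_len(nrow(shares))) {
    polygon(c(x, rev(x)), c(lower[i, ], rev(upper[i, ])),
            col = cols[i], border = NA)
  }
  legend("topright", legend = rownames(shares), fill = cols, cex = 0.7)
}
```

Calling `plot_transition_area(shares)` draws the chart. The height of a band at a given period is the share of that period's transitions accounted for by that origin-destination pair, so a band that vanishes (for example C->B) corresponds to a transition that never occurred.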

user3605620

Posted 2016-04-14T18:59:44.113

Reputation: 121

This looks very promising! Could you kindly tell me what software+command you used to generate this, and help me in how to read the plot. For example, what does the orange/ green band mean? – Juanito – 2016-04-22T00:16:38.083

I have added more detail to my answer. – user3605620 – 2016-04-22T01:45:12.537

0

I'm not sure if this is the type of analysis you are after, but you mention that the graphics side is limited in STATA. A colleague wrote a blog post that used neo4j to read web data into a graph database, and d3js to display the data graphically.

I realise you don't have web data as such, but your data can be stored in a graph database. When I asked what type of analysis you were planning, I was asking whether you needed a qualitative or a quantitative direction, but it seems you are still in the process of working that out. The nice thing about neo4j is that you can pull the data into R and run any sort of analytics you want on it.

Marcus D

Posted 2016-04-14T18:59:44.113

Reputation: 571

I am not looking for a quantitative or qualitative direction per se. I would know what to do with the data, and what regressions to run. I am looking for a good way to show graphically how the state transitions differ across treatments. For example, Tguzella's comment on my post is the closest I have received to what I am looking for. – Juanito – 2016-04-20T18:18:45.117

I'd certainly look at my suggestion of neo4j/d3js then, as it will show graphically how your various states differ. – Marcus D – 2016-04-20T18:26:56.613