## Parsing data from a string

2

I think this is something that experienced programmers do all the time. But, given my limited programming experience, please bear with me.

I have an excel file which has particular cell entries that read

[[{"from": "4", "response": true, "value": 20}, {"from": "8", "response": true, "value": 20}, {"from": "9", "response": true, "value": 20}, {"from": "3", "response": true, "value": 20}], [{"from": "14", "response": false, "value": 20}, {"from": "15", "response": true, "value": 20}, {"from": "17", "response": false, "value": 20}, {"from": "13", "response": true, "value": 20}]]

Now, for each such entry I want to take the information in each of the curly brackets and make a row of data out of it. Each such row would have 3 columns. For example, the row formed from the first entry within curly brackets should have the entries "4" "true" and "20" respectively. The part I posted should give me 6 such rows, and for n such repetitions I should end up with a matrix of 6n rows, and 4 columns ( an identifier, plus the 3 columns mentioned).

What would be most efficient way to do this? By "doing this" I mean learning the trick, and then implementing it. I have access to quite a few software packages(Excel, Stata, Matlab, R) in my laboratory, so that should not be an issue.

1

This format is called "JSON". Curly braces form objects, while square ones represent arrays. Most programming languages have libraries for parsing JSON (e.g. for R see this question), so parsing it becomes trivial task.

– ffriend – 2014-11-14T23:46:31.913

3I think this would be more appropriate at StackOverflow. It's not related directly to data science, but just parsing a common data format. – Sean Owen – 2014-11-15T14:20:32.547

2

If you have R is quite simple

1. Copy the lines into a file, let's say: "mydata.json"

2. Be sure you have installed the rjson package

 install.packages("rjson")

3. Import your data

 library("rjson")
json_data <- fromJSON(file="mydata.json")


0

If these data are available in the actual excel spreadsheet cells (ie, before you export them to the JSON format provided in your question), you can use the following to get them into R:

1. highlight the region of interest within excel
2. copy it to the clipboard (eg. Ctrl-C)
3. At an R prompt type:

d <- read.delim('clipboard')


The data will now be available as a data.frame in R.

d
from response value
1    4     TRUE    20
2    8     TRUE    20
3    9     TRUE    20
4    3     TRUE    20
5   14    FALSE    20
6   15     TRUE    20
7   17    FALSE    20
8   13     TRUE    20