Best way to aggregate/group JSON data?

Eluros

Member
Jul 7, 2008
Hey, all,

Let's say I have some locally hosted/downloaded JSON data like the following:

Code:
 user: {
     "firstName": "John",
     "lastName": "Smith",
     "age": 25,
     "address":
     {
         "streetAddress": "21 2nd Street",
         "city": "New York",
         "state": "NY",
         "postalCode": "10021"
     },
     "phoneNumber":
     [
         {
           "type": "home",
           "number": "212 555-1234"
         },
         {
           "type": "fax",
           "number": "646 555-4567"
         }
     ]
 },
 user: {
     "firstName": "Jane",
     "lastName": "Smith",
     "age": 26,
     "address":
     {
         "streetAddress": "29 2nd Street",
         "city": "New York",
         "state": "WA",
         "postalCode": "90961"
     },
     "phoneNumber":
     [
         {
           "type": "home",
           "number": "212 555-5566"
         }
     ]
 }

Now, let's say I wanted to aggregate/group that data to answer questions like the following:
1. How many distinct phoneNumber values of type "home" OR "fax" are there across all users?
2. For each distinct user, how many phoneNumber entries do we have?
3. How many distinct user entities do we have?
4. For each distinct user.age, how many user entities have that user.age?


This isn't the exact dataset or the exact queries I have, but it hopefully captures the idea. If I have a bunch of locally hosted/downloaded JSON data that I want to aggregate and report on, what's the best way to do so? I'm not finding a clear solution after ~2 hours of research. I can try to learn D3 or something if I need to, but I'd prefer a simpler approach.

If there were a clever way to de-nest the data and get it into a standardized CSV format, that would be ideal, but I haven't been able to find a good solution so far.

Thanks, all!
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
Locally hosted as in just for you, not client-side? I think this might be a job for MongoDB. Here's an example of Mongo handling "Model Tree Structures with Nested Sets".
 

Cogman

Lifer
Sep 19, 2000
It depends on what you want to do with the data. If you want to consume it once and throw it away, then I would go with some dynamic scripting language to chew through it in whatever manner makes sense. If, on the other hand, you want to run multiple queries against it at various points in time, then Ken's suggestion of putting it into MongoDB makes a lot more sense (though you could also do this by chewing through the data once and throwing it into whatever DB you have available).

For example, in Ruby, pulling out the home phone numbers would look something like the following:
Code:
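require 'json'

# 'users.json' is a hypothetical file name; it is assumed to hold
# a valid JSON array of user objects.
data = JSON.parse(File.read('users.json'))

# Collect the "home" phone number(s) of every user.
home_numbers = data.flat_map do |user|
  phone_numbers = user.fetch('phoneNumber', [])
  phone_numbers.select { |number| number['type'].downcase == 'home' }
               .map { |number| number['number'] }
end

puts home_numbers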
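
The same approach stretches to the four questions in the original post, plus the de-nesting step. Here is a minimal sketch, assuming the data has been rewritten as a valid JSON array of user objects (the posted snippet is not strictly valid JSON). All helper names (full_name, distinct_numbers, phones_per_user, distinct_user_count, users_per_age, flatten_rows) are made up for illustration, and two users count as "distinct" here only if their whole records differ, which may not be the right rule for your real data. tally needs Ruby 2.7 or newer.

```ruby
require 'json'

# Sample data: the two users from the question, rewritten as a valid
# JSON array (an assumption about how the real file is laid out).
SAMPLE_JSON = <<~SAMPLE
  [
    {"firstName": "John", "lastName": "Smith", "age": 25,
     "address": {"city": "New York", "state": "NY"},
     "phoneNumber": [{"type": "home", "number": "212 555-1234"},
                     {"type": "fax",  "number": "646 555-4567"}]},
    {"firstName": "Jane", "lastName": "Smith", "age": 26,
     "address": {"city": "New York", "state": "WA"},
     "phoneNumber": [{"type": "home", "number": "212 555-5566"}]}
  ]
SAMPLE

# Hypothetical helper: a printable label for a user.
def full_name(user)
  "#{user['firstName']} #{user['lastName']}"
end

# 1. Distinct phone numbers whose type is home or fax.
def distinct_numbers(users, types = %w[home fax])
  users.flat_map { |user| user.fetch('phoneNumber', []) }
       .select { |phone| types.include?(phone['type'].to_s.downcase) }
       .map { |phone| phone['number'] }
       .uniq
end

# 2. Number of phone entries per user, keyed by full name
#    (users sharing a name would overwrite each other here).
def phones_per_user(users)
  users.to_h { |user| [full_name(user), user.fetch('phoneNumber', []).size] }
end

# 3. Number of distinct user records (whole-record equality).
def distinct_user_count(users)
  users.uniq.size
end

# 4. Number of users for each age.
def users_per_age(users)
  users.map { |user| user['age'] }.tally
end

# De-nest into flat rows: one row per (user, phone) pair.
# Users with no phone entries produce no rows in this sketch.
def flatten_rows(users)
  users.flat_map do |user|
    address = user.fetch('address', {})
    user.fetch('phoneNumber', []).map do |phone|
      {
        'first_name' => user['firstName'],
        'last_name' => user['lastName'],
        'age' => user['age'],
        'city' => address['city'],
        'state' => address['state'],
        'phone_type' => phone['type'],
        'phone_number' => phone['number']
      }
    end
  end
end

users = JSON.parse(SAMPLE_JSON)
puts distinct_numbers(users).size
puts distinct_user_count(users)
p phones_per_user(users)
p users_per_age(users)
flatten_rows(users).each { |row| puts row.values.join(',') }
```

The rows from flatten_rows all share the same keys, so they can be written out with Ruby's standard csv library or loaded into a spreadsheet or database from there. The naive join(',') at the end is only for eyeballing the result and does no quoting.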