• We’re currently investigating an issue related to the forum theme and styling that is impacting page layout and visual formatting. The problem has been identified, and we are actively working on a resolution. There is no impact to user data or functionality, this is strictly a front-end display issue. We’ll post an update once the fix has been deployed. Thanks for your patience while we get this sorted.

Database Representation of Undirected Graph (Peer Nodes)

telstar1

Golden Member
I'm trying to figure out the best way to represent a network of nodes in an undirected graph.
Each node can have n peers.

I'd like to be able to derive the following:

1) Count all nodes in a given network
2) Retrieve all nodes closer than X steps away from a given node
3) Retrieve all nodes further than X steps away from a given node

What's the best way to represent this?
 
database as in sql database?
a table of nodes, an intersection table from node to node.

#1: trivial
#2: a crapload of joins
#3: a crapload of joins, followed by selecting the ones not in #2

if you want something like a C struct, I'd say give each node an array of all adjacent nodes, then breadth-first search for #2 and #3. for #1 you'd need to either do a complete breadth-first search, or have the # of nodes already stored.
 
Yeah ... SQL database.

Thanks for the response. I guess I'm worried that the load on my database is going to bog down performance terribly.
 
Originally posted by: telstar1
Yeah ... SQL database.

Thanks for the response. I guess I'm worried that the load on my database is going to bog down performance terribly.

Yea... I would imagine that if you have a fairly complex network, it would be VERY resource-intensive.
 
I've been trying to figure out how Friendster and Ringo handle this problem. Seems to me that they must limit the depth of their searches to whatever their systems are able to handle.

For instance, on their system, I'm linked directly to 4 friends ... and it lists me as having 1333 people in my personal network, yet one of my friends that I'm directly linked to has almost 10,000 people in his personal network. (He's directly linked to 15 friends so his reach would be greater). Since my first step goes to him, I'd end up going one less step than he would, so I'd end up with a smaller overall list.

I just don't see how their system will scale over time though. Eventually, it seems like it'd collapse under the weight of the network and trying to maintain all of the associations.
 
This type of problem sounds like it would scale very well with clusters... and let's say you have an average of 10 friends... first time gives you 10, next gives you 100, then 1000, then 10000. Based on some of the stuff I do at work, going 4 levels deep shouldn't take long at all. Generating a full diagram of the network, on the other hand, would 😉
 
Guess I'll run some simulatons and see how slow things get with 10,000 nodes at a depth of 5 or so.....
I was talking to somebody from work today, and they said they'd keep the whole representation in memory, and just read and write the entire blob to memory whenever there's a change to the network ... don't even do the processing using SQL.

I hadn't thought about that.

 
Originally posted by: telstar1
Originally posted by: Peter
3 is (1-2). 1 is trivial, 2 is the interesting thing.

Care to elaborate?

#3 is just #1 minus #2 (-> very simple to find, if you have the first two). #1 is trivial, and #2 is the one that is hard (because once you solve 2, you solved 3)
 
Oh ... I wasn't reading 1-2 as subtraction. 🙂
I thought it was some notation I didn't understand.
(I hate it when I make things harder than they need to be)
Thanks for the clarification.
 
Back
Top