Database Representation of Undirected Graph (Peer Nodes)

telstar1 · Aug 8, 2003

I'm trying to figure out the best way to represent a network of nodes in an undirected graph.
Each node can have n peers.

I'd like to be able to derive the following:

1) Count all nodes in a given network
2) Retrieve all nodes closer than X steps away from a given node
3) Retrieve all nodes further than X steps away from a given node

What's the best way to represent this?

CTho9305 · Aug 8, 2003

database as in sql database?
a table of nodes, an intersection table from node to node.

#1: trivial
#2: a crapload of joins
#3: a crapload of joins, followed by selecting the ones not in #2

if you want something like a C struct, I'd say give each node an array of all adjacent nodes, then breadth-first search for #2 and #3. for #1 you'd need to either do a complete breadth-first search, or have the # of nodes already stored.

telstar1 · Aug 8, 2003

Yeah ... SQL database.

Thanks for the response. I guess I'm worried that the load on my database is going to bog down performance terribly.

CTho9305 · Aug 8, 2003

Originally posted by: telstar1
Yeah ... SQL database.

Thanks for the response. I guess I'm worried that the load on my database is going to bog down performance terribly.

Yea... I would imagine that if you have a fairly complex network, it would be VERY resource-intensive.

telstar1 · Aug 8, 2003

I've been trying to figure out how Friendster and Ringo handle this problem. Seems to me that they must limit the depth of their searches to whatever their systems are able to handle.

For instance, on their system, I'm linked directly to 4 friends ... and it lists me as having 1333 people in my personal network, yet one of my friends that I'm directly linked to has almost 10,000 people in his personal network. (He's directly linked to 15 friends so his reach would be greater). Since my first step goes to him, I'd end up going one less step than he would, so I'd end up with a smaller overall list.

I just don't see how their system will scale over time though. Eventually, it seems like it'd collapse under the weight of the network and trying to maintain all of the associations.

CTho9305 · Aug 8, 2003

This type of problem sounds like it would scale very well with clusters... and let's say you have an average of 10 friends... first time gives you 10, next gives you 100, then 1000, then 10000. Based on some of the stuff I do at work, going 4 levels deep shouldn't take long at all. Generating a full diagram of the network, on the other hand, would 😉

telstar1 · Aug 9, 2003

Guess I'll run some simulatons and see how slow things get with 10,000 nodes at a depth of 5 or so.....
I was talking to somebody from work today, and they said they'd keep the whole representation in memory, and just read and write the entire blob to memory whenever there's a change to the network ... don't even do the processing using SQL.

I hadn't thought about that.

Peter · Aug 9, 2003

3 is (1-2). 1 is trivial, 2 is the interesting thing.

telstar1 · Aug 9, 2003

Originally posted by: Peter
3 is (1-2). 1 is trivial, 2 is the interesting thing.

Care to elaborate?

CTho9305 · Aug 9, 2003

Originally posted by: telstar1

Originally posted by: Peter
3 is (1-2). 1 is trivial, 2 is the interesting thing.

Click to expand...

Care to elaborate?

#3 is just #1 minus #2 (-> very simple to find, if you have the first two). #1 is trivial, and #2 is the one that is hard (because once you solve 2, you solved 3)

telstar1 · Aug 9, 2003

Oh ... I wasn't reading 1-2 as subtraction. 🙂
I thought it was some notation I didn't understand.
(I hate it when I make things harder than they need to be)
Thanks for the clarification.

Database Representation of Undirected Graph (Peer Nodes)

telstar1

Golden Member

CTho9305

Elite Member

telstar1

Golden Member

CTho9305

Elite Member

telstar1

Golden Member

CTho9305

Elite Member

telstar1

Golden Member

Peter

Elite Member

telstar1

Golden Member

CTho9305

Elite Member

telstar1

Golden Member

TRENDING THREADS