Social website "friends lists". How is the data handled? Table design...

TechBoyJK · Apr 8, 2011

How would you think that busy sites like facebook and myspace manage each users' friends list. With the sites having millions if not hundreds of millions of users, and some users have thousands of friends, one giant table with all friend connections seems like a bad idea.

Each user get it's own table? Then you could potentially end up with millions of tables. That doesn't seem like good design either.

Granted, a table would only really need to be a few columns.

record id (bigINT), member_id(int), friend_id(int),

If you had 100million users and on average 100 friends per user, that's 10billion records. But it'd be 10billion small, tiny records with only integer based columns. While that'd be a huge record set, you could probably do a lot with caching, and creating different views.

Or, just do a table for each letter of the alphabet. If your last name begins with T, you go int the T table for friends. Then that table only contains the friend records of members who's last name begins with T. That would only require 26 tables, and could then cut 10billion records into tables with only 390million records.

DBA's, how would you approach this?

Crusty · Apr 8, 2011

A join table w/ the proper indexes.

Markbnj · Apr 8, 2011

Crusty said:
A join table w/ the proper indexes.

Yeah, I can't think of a better way than that. I have no idea how many friends the average Facebook user has, but I bet it is in the mid-double digits. Assuming 500m users, if it were only 10 average friends per user you'd have 5b rows. That's not a small table in anyone's world, so I'm assuming they do some sort of horizontal partitioning.

brandonb · Apr 8, 2011

One large table as you described above.

Index on member_id.

Relational databases can create binary trees to find the data quickily and will do the "table for each letter of the alphabet" stuff internally with it's leafs and nodes and branches and roots, etc.

WannaFly · Apr 8, 2011

I know SQL 2005+ can do database table partitions, but I'm no sure if performance is a reason to use them or not. Just thought I'd mention it.

KIAman · Apr 8, 2011

I would personally make 1 table with index on each column. Let the efficiency come from the queries and let the RDMS handle load balance.

Ka0t1x · Apr 8, 2011

Crusty said:
A join table w/ the proper indexes.

If you do anything else other than this... you'll run into problems coding it.

With the proper setup.. caching, indexes, etc. You'll never know the table was that large.

Also you have to realize.. the scripting could be caching your friends list as well. As you'd be the only one to modify it, and then it would modify your cache, then database.

Edit: With the example of facebook you could end up with fragmentation based on region (country code), etc. But again their coding framework could involve all of the checks for this, and adjust accordingly.

PhatoseAlpha · Apr 8, 2011

Hm. Would each 'friendship' be two entries then? A->B, & B->? Seems kind of wasteful.

beginner99 · Apr 8, 2011

http://www.developer.com/open/article.php/3894566/Inside-Facebooks-Open-Source-Infrastructure.htm

Not really an answer but interesting non the less.

It also gives you a small hint on how this could be achieved much more efficiently than with a RDBMS, namely with a graph database. But I don't know if the actually do that.

Then the hardware behind FB is ridiculous. And the DB sure somewhat split up (clustered, whatever).

http://www.mysql.com/products/cluster/

Ka0t1x · Apr 9, 2011

Actually I forgot they use a form of NoSQL.

http://cassandra.apache.org/

GaryJohnson · Apr 9, 2011

PhatoseAlpha said:
Hm. Would each 'friendship' be two entries then? A->B, & B->? Seems kind of wasteful.

I think you could keep each as a snigle entry, but you'd need two indexes.

Social website "friends lists". How is the data handled? Table design...

TechBoyJK

Lifer

Crusty

Lifer

Markbnj

Elite Member <br>Moderator Emeritus

brandonb

Diamond Member

WannaFly

Platinum Member

KIAman

Diamond Member

Ka0t1x

Golden Member

PhatoseAlpha

Platinum Member

beginner99

Diamond Member

Ka0t1x

Golden Member

GaryJohnson

Senior member

TRENDING THREADS