Teraflop computing

ATSToffen

Junior Member
Jul 16, 2011
6
0
0
How many CPUs must be connected, and in what way, to achieve 1 TFLOP of computing power, and how much RAM could be installed in such a system?

Just curious how costly that system would be and what this beast would be able to do.

I think Intel has yet to release its 1 TFLOP, 80-core processor?

Note:
I asked this just for information and curiosity.
 

Mopetar

Diamond Member
Jan 31, 2011
8,529
7,793
136
It really depends on the type of work that you want to do and how the problem can be broken down.

You could probably build a teraflop MapReduce setup with 10 CPUs. Alternatively, you could just get a single high-end GPU like the 6990 or 590 and probably be able to break 1 TF for certain jobs.
 

Kevmanw430

Senior member
Mar 11, 2011
279
0
76
Honestly, it depends on the workload and whatnot. As said, get a 6990, and at least on something like Bitcoin mining, which runs great on GPUs, you might technically break 1 TF. The number of CPUs needed would also differ between them: with Core 2-era CPUs you might need 20, but with new SB Xeons you might only need 10. (Only an example.)
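As a very rough back-of-the-envelope (the clock speeds and core counts below are just assumed round numbers, not exact product specs), peak double-precision throughput is roughly cores x clock x FLOPs per cycle, and an AVX-capable Sandy Bridge core retires about twice the DP FLOPs per cycle of a Core 2 core:

Code:
# Rough peak-FLOPS estimate: how many chips it takes to reach 1 TFLOPS (double precision).
# Clock speeds and core counts are assumed round numbers, not exact product specs.

def peak_gflops(cores, ghz, flops_per_cycle):
    # Theoretical peak for one chip, in GFLOPS.
    return cores * ghz * flops_per_cycle

core2_quad = peak_gflops(cores=4, ghz=3.0, flops_per_cycle=4)  # Core 2: 128-bit SSE, 4 DP FLOPs/cycle/core
sb_xeon = peak_gflops(cores=6, ghz=3.0, flops_per_cycle=8)     # Sandy Bridge: 256-bit AVX, 8 DP FLOPs/cycle/core

target = 1000.0  # 1 TFLOPS, expressed in GFLOPS
print("Core 2 quad:  %.0f GFLOPS/chip -> roughly %d chips for 1 TFLOPS" % (core2_quad, round(target / core2_quad)))
print("SB Xeon (6c): %.0f GFLOPS/chip -> roughly %d chips for 1 TFLOPS" % (sb_xeon, round(target / sb_xeon)))

That lines up with the 10-vs-20 ballpark above, and those are peak numbers only; real workloads get nowhere near them.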
 

ATSToffen

Junior Member
Jul 16, 2011
6
0
0
Mopetar said:
It really depends on the type of work that you want to do and how the problem can be broken down.

You could probably build a teraflop MapReduce setup with 10 CPUs. Alternatively, you could just get a single high-end GPU like the 6990 or 590 and probably be able to break 1 TF for certain jobs.

It's basically database-related tasks, which include add/edit/delete commands over more than 1 trillion records.

Could you please shed more light on this teraflop MapReduce setup?
 

GammaLaser

Member
May 31, 2011
173
0
0
ATSToffen said:
It's basically database-related tasks, which include add/edit/delete commands over more than 1 trillion records.

Could you please shed more light on this teraflop MapReduce setup?

Sounds like something that's primarily memory-/I/O-bound. Anyway, I haven't heard anything about Intel commercializing its 80-core teraflop CPU. From what I recall, it wasn't fully x86-compatible and was primarily designed as a research platform.
 

Mopetar

Diamond Member
Jan 31, 2011
8,529
7,793
136
ATSToffen said:
It's basically database-related tasks, which include add/edit/delete commands over more than 1 trillion records.

Could you please shed more light on this teraflop MapReduce setup?

For a large database you'll want a lot of RAM and fast disks, either some 15,000 RPM drives in RAID or some SSDs. You may also want to determine whether you can make the database more efficient, or whether there's anything you should be indexing that you're not.
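As a tiny illustration of what an index buys you (this uses Python's built-in sqlite3 with made-up table and column names, purely as a sketch):

Code:
# Minimal sketch of why indexing matters, using Python's built-in sqlite3.
# Table and column names are made up for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, account TEXT, amount REAL)")

# Without an index on 'account', this lookup has to scan the whole table.
print(cur.execute("EXPLAIN QUERY PLAN SELECT * FROM records WHERE account = ?", ("A1",)).fetchall())

# After adding an index, the same lookup becomes an index search instead of a full scan.
cur.execute("CREATE INDEX idx_account ON records(account)")
print(cur.execute("EXPLAIN QUERY PLAN SELECT * FROM records WHERE account = ?", ("A1",)).fetchall())

conn.close()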

MapReduce is essentially a technique where you apply the same function to an entire set of data. The process works by farming the computation out to a large number of machines, which report their results back to a central machine that hands out the work and reassigns it if any of the worker nodes fails. Here's a link to the Google publication. (PDF warning) It might be useful if you needed to process all of your records, but probably not terribly useful if you're just adding, deleting, and modifying single records.
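If it helps, here's a toy sketch of that shape in plain Python, with made-up records and a single process standing in for the cluster:

Code:
# Toy MapReduce sketch: map a function over every record, then reduce the
# emitted (key, value) pairs by key. The record format here is made up.
from collections import defaultdict

records = [
    {"region": "EU", "amount": 10.0},
    {"region": "US", "amount": 4.5},
    {"region": "EU", "amount": 2.5},
]

def map_fn(record):
    # Emit (key, value) pairs; in a real cluster this runs on many worker nodes.
    yield record["region"], record["amount"]

def reduce_fn(key, values):
    # Combine all values for one key; also farmed out to workers in practice.
    return key, sum(values)

# The "shuffle" step: group mapped values by key before reducing.
grouped = defaultdict(list)
for record in records:
    for key, value in map_fn(record):
        grouped[key].append(value)

results = [reduce_fn(key, values) for key, values in grouped.items()]
print(results)  # e.g. [('EU', 12.5), ('US', 4.5)]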
 

ATSToffen

Junior Member
Jul 16, 2011
6
0
0
Mopetar said:
For a large database you'll want a lot of RAM and fast disks, either some 15,000 RPM drives in RAID or some SSDs. You may also want to determine whether you can make the database more efficient, or whether there's anything you should be indexing that you're not.

Of course, all database-related care will be taken.

Mopetar said:
MapReduce is essentially a technique where you apply the same function to an entire set of data. The process works by farming the computation out to a large number of machines, which report their results back to a central machine that hands out the work and reassigns it if any of the worker nodes fails. Here's a link to the Google publication. (PDF warning) It might be useful if you needed to process all of your records, but probably not terribly useful if you're just adding, deleting, and modifying single records.

Tasks would mostly be on single records, based on their primary key. But there would be a trillion records, so here comes the real problem that I want to get a grip on before jumping into the deep sea.

You mentioned GPUs in the post above, but I think those are specifically for graphics-related work? (Pardon me if I'm wrong.)
 

Mopetar

Diamond Member
Jan 31, 2011
8,529
7,793
136
I haven't done any research on doing DB-related activity on a GPU, but from the way you make it sound, a GPU wouldn't help you.

Database activity is a somewhat difficult problem. Are the computations you're performing the result of user activity, or something you're constantly doing in the background? Depending on the traffic patterns you need to deal with, the solution you need could be drastically different. Trillions of records aren't a serious problem if you don't need to access or modify them beyond a given threshold. However, if the usage is driven by external users, it complicates things.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
ATSToffen said:
Tasks would mostly be on single records, based on their primary key. But there would be a trillion records, so here comes the real problem that I want to get a grip on before jumping into the deep sea.

A trillion-record database is going to be running on a pretty high-end cluster. This is not something you will design yourself; your database vendor will design it for you, along with your actual DB design.

Let's put it this way: the largest databases in the world are around three trillion records. These are the call-detail databases maintained by AT&T and Sprint. TomTom also has a three-trillion-record database of five years of worldwide traffic data.
 

ATSToffen

Junior Member
Jul 16, 2011
6
0
0
Phynaz said:
A trillion-record database is going to be running on a pretty high-end cluster. This is not something you will design yourself; your database vendor will design it for you, along with your actual DB design.

Let's put it this way: the largest databases in the world are around three trillion records. These are the call-detail databases maintained by AT&T and Sprint. TomTom also has a three-trillion-record database of five years of worldwide traffic data.

I seriously doubt that a 1-trillion-record database is considered outstandingly big these days.

See here: http://serverfault.com/questions/168247/mysql-working-with-192-trillion-records-yes-192-trillion
 

GammaLaser

Member
May 31, 2011
173
0
0
1 trillion records is certainly atypical. The link you posted only proves the point: with that many records they had a very large disk-storage requirement, and even scaling down by a factor of 200 you would still need a sizable disk array to handle it.

I saw some papers about accelerating DB operations with GPUs (yes, they can be used for things other than graphics); however, as I expected, the main limitation they found was the ability to fit the dataset entirely within the GPU's RAM. So in this case you will almost certainly be I/O-bound and won't benefit from a GPU. Plus, I'm not aware of any mature DB implementations that use GPU acceleration.
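To put rough numbers on that (the per-record size below is purely an assumption): a trillion rows at even 100 bytes each is on the order of 100 TB, versus single-digit gigabytes of on-board memory on a 2011-era GPU.

Code:
# Back-of-the-envelope dataset sizing; the 100-byte record size is purely an assumption.
records = 1e12           # 1 trillion rows
bytes_per_record = 100   # assumed average row size, including overhead

total_bytes = records * bytes_per_record
gpu_ram_bytes = 6e9      # ballpark for the largest 2011-era GPU boards

print("Dataset size: ~%.0f TB" % (total_bytes / 1e12))
print("Dataset is ~%.0fx larger than the GPU's on-board memory" % (total_bytes / gpu_ram_bytes))

So the working set lives on disk no matter what, and the disks, not the FLOPS, set the speed limit.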