putting more than 1 cpu on a chip?

MrDudeMan

Lifer
Jan 15, 2001
15,069
94
91
how does that work? and will it make it faster by a LOT?

i don't understand how it would be cooled, powered, and managed, or how much of a speed boost it would give.

can anyone point me to a GOOD article on this? i have read anand's, but it didn't give me any answers.

thanks
 

Locutus4657

Senior member
Oct 9, 2001
209
0
0
Try doing a little research on the IBM Power 4 series.

Carlo

 

capybara

Senior member
Jan 18, 2001
630
0
0
putting 2 cpus on one chip is the whole history of cpus:
2 x intel 4004 = intel 8088
2 x intel 8088=intel 80286
2 x intel 80286=intel 80386
2 x intel 80386=intel 80486, the first common 32 bit cpu.
there are already 64 bit cpus, like the sun ultra-sparc.
a 64 bit cpu would require 128 32-bit cpus in parallel.
 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
There are two ways to do this. First, you can put two separate dies (CPUs) into the same package. IBM has been doing this for quite a while, mainly because they own most of the major patents on using liquid cooling to cool these solutions. As an extreme case of this, IBM has their z900 multichip module with 20 CPUs and 32MB of L2 cache all within one package that dissipates a little over 1.4kW (that's just this package alone, not the entire computer... just the 20-CPU multichip module). They use some form of liquid cooling through the package itself so that the dies see an average temperature of 10C.

So that's one way - the expensive way.

The other way is the way that the entire industry appears to be headed over the long term, and that's multiple CPUs on the same die. If you think about it, there's room on a Pentium 4 die for a whole lot of 486s. So one way to think about it is putting a whole lot of little CPUs onto the same chunk of silicon and hooking them up together. There are, of course, plenty of problems with this approach, or else everyone would be doing it. The biggest is that you need software written to use multiple CPUs for it to work out effectively. Multiple cores make sense when you are IBM and you have one of their "eServers" running some transaction processing function on a website like eBay. It makes a lot less sense for the average computer user. How many users on AT have dual-CPU systems? And what benchmarks really show off the power of that dual-CPU system? Plus, modern compilers do not handle more than dual-CPU systems very well, nor do typical modern OSes (neither Windows nor Linux).

But multiple cores on the same die are most definitely in the cards for the future. Not the near future, but easily within the decade, I would think (no insider knowledge on this guess, by the way). The problem in the long term is that future process technologies have a lot of transistors available. The number you can fit within a given space doubles with each process generation, roughly every two years. You can imagine that with hundreds of millions being used now, when we are three process technologies down the road we will have literally billions of transistors available. CPUs are already extremely complex: you have several hundred engineers working for several years (if not longer) to create these. If you increase the transistor count by 8-16x, there are only so many engineers you can throw at the problem (and having a team of 1000 is vastly less efficient/productive than typical teams today of 200 or so due to communication and logistical issues). So either you come up with an automated way to do CPU design (they have been trying to do this, largely unsuccessfully, for practically as long as CPUs have been available), you use a lot of transistors to make a really big cache (this is already being done in current CPUs: the HP PA-RISC 8700 and 8800 and Intel's Itanium 2), or you find a way to duplicate a core and thus hopefully nearly double performance while cutting the design complexity of the chip nearly in half (you only have to create one core and then duplicate it).
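
To put rough numbers on the scaling argument above, here is a small sketch in Python. The 200 million starting figure is only an assumed stand-in for "hundreds of millions", and the function name is made up for illustration; the doubling per process generation is the figure from the post.

```python
def transistors_after(generations, start=200_000_000, growth_per_generation=2):
    """Projected transistor budget after a number of process generations.

    `start` is an assumed example value, not a figure for any real chip.
    """
    return start * growth_per_generation ** generations


for gen in range(5):
    print(f"generation +{gen}: {transistors_after(gen):,} transistors")
```

Under these assumptions, three generations give 8x the starting budget (1.6 billion) and four give 16x, which is where the 8-16x range comes from.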

The speed boost will depend entirely on the software, the application, the compiler, the available external bandwidth, the internal bandwidth, the microarchitecture, the OS, etc. It could theoretically be almost exactly double the performance of an individual CPU, but this is obviously the best case. More realistically you could expect to see a 60-80% performance improvement in applications that are compiled to take advantage of this. In the worst case, you'd be barely better off than with only one core. The issues of power and cooling are really the exact same problem. The solution to this is to make sure that each core doesn't use too much power. You set a limit on the entire die, and then you make sure that each core doesn't use so much power that all together they are over the total limit.
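
One common way to reason about the best case, typical case, and worst case described above is Amdahl's law. The post does not name it, so treat this Python sketch as a simplified model brought in for illustration: it assumes a fixed fraction of the work can be split evenly across cores and ignores bandwidth, OS, and compiler effects.

```python
def speedup(parallel_fraction, cores):
    """Amdahl's law: overall speedup when only part of the work scales with cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


for p in (0.0, 0.5, 0.75, 0.9, 1.0):
    print(f"parallel fraction {p:.2f}: {speedup(p, 2):.2f}x on 2 cores")
```

In this model, a 60% gain on two cores corresponds to about 75% of the work running in parallel, and an 80% gain to roughly 89%. Software with no parallel work sees no gain at all.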

As far as articles... there aren't that many. Most of them are fairly technical. I can suggest a couple of research papers, and some implementation papers. But for a basic overview... pretty much what I wrote above is a good start.

Patrick Mahoney
Microprocessor Design Engineer
Intel Corp.
Fort Collins, CO
 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
Originally posted by: capybara
putting 2 cpus on one chip is the whole history of cpus:
2 x intel 4004 = intel 8088
2 x intel 8088=intel 80286
2 x intel 80286=intel 80386
2 x intel 80386=intel 80486, the first common 32 bit cpu.
there are already 64 bit cpus, like the sun ultra-sparc.
a 64 bit cpu would require 128 32-bit cpus in parallel.

capybara, no offense intended, but this is utterly untrue. There are not 2 4004s in an 8088. The 8088 was a microarchitecture based on the 16-bit 8086, which had little to do with the 4004 at all. The i386 was a completely and totally revamped CPU that bore almost no resemblance at all to the i286. There aren't multiple cores inside each of these. And how on earth did you come up with a 64-bit CPU requiring 128 32-bit CPUs? This is, no offense intended again, utterly and completely wrong.

And the 486 was far from the first common 32-bit CPU. Aside from the obvious 386 itself (which was 32-bit, including the 386sx), there were earlier 32-bit CPUs that saw widespread use. The Motorola 68000 is a great example of this. It was released around 1980, certainly a lot earlier than the i386, which was introduced around 1985, or any of its clones. The 68000 was, for a long time, the best selling CPU of all time.
 

capybara

Senior member
Jan 18, 2001
630
0
0
pm: i'm repeating more or less what the teacher said in my a+ class, fwiw.
if he said "the i386 is putting together of 2 (or 4, i forget exactly) i286s" and you say it's not, i don't know what to believe, no offense to either you or my a+ teacher, of course ......<<<<me confused>>>>
as for which was the first 32 bit cpu in the intel world, you're right, it was the 386, not the 486.
 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
Originally posted by: capybara
pm: i'm repeating more or less what the teacher said in my a+ class, fwiw. if he said "the i386 is putting together of 2 (or 4, i forget exactly) i286s" and you say it's not.

Your teacher is most definitely incorrect. The i386 was a completely different architecture from the 80286. Compare for yourself. Here's a die photo of an 8088, here's one of the 80286, and here's the i386SL. I worked on the Intel Pentium and can definitely attest that the Pentium didn't contain 2 i486s, nor any other smaller subblocks of other CPU cores. It was a completely different microarchitecture from the i486, and the two had little in common besides the fact that they both used the same instruction set.

Talk to your teacher about this if you are confused. Bring in a print-out of the thread, if you like. If your teacher wants more details, have your teacher email me or call me. My work email address is in my profile. If you email me, I'll give you my telephone number for your teacher to call.

Patrick Mahoney
Microprocessor Design Engineer
Intel Corp.
Fort Collins, CO
 

MrDudeMan

Lifer
Jan 15, 2001
15,069
94
91
pm...wow

very good post(s)

thats basically everything i wanted to know guys. thanks for all of the info.
 

Den

Member
Jan 11, 2000
168
0
0
I am sure PM is 100% technically correct.
Perhaps the teacher was speaking in abstract terms about the increasing transistor count? Or about the change from 16 to 32 bit? Or perhaps even the data path? (Didn't that go 8,16,32,64 with 8088,286,386,Pentium?)

Just a thought...
 

piasabird

Lifer
Feb 6, 2002
17,168
60
91
About 2 years ago I saw an article on this at the AMD website. They were talking about placing 2 CPUs on one die. This would require a more complex operation than you might think. You have to have a control mechanism to determine what gets run on which processor. On a multiprocessor motherboard this is taken care of by special software. If you placed 2 CPUs on a single chip, they would need some kind of hardware control mechanism to handle this.

A modern CPU already incorporates what they call a "Co-Processor", which is specifically designed to run the longer, more complex instructions. On a processor many of the simple instructions are hard-wired into the circuits. However, some more complex instructions are stored in read-only memory on the CPU itself. These more complex instructions are usually a combination of several simpler instructions.

The old von Neumann type computing devices had a single pathway for the instructions to go down, and each instruction was run one at a time in the proper order. Newer CPUs can have multiple pathways, like 6 or 8 or 12, and execute instructions on all of them at once. This means the instructions are being run out of order, if possible. The CPU also incorporates special logic that guesses which instruction the program will execute next and tries to load that up ahead of time.
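
The "guessing" logic mentioned above is usually called branch prediction. As a rough illustration only, here is a Python sketch of a two-bit saturating counter, a textbook scheme. It is not claimed to be what any particular AMD or Intel chip uses, and the class and function names are made up for this example.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.state = 2  # start at "weakly taken"

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes the predictor guesses correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


# A loop branch: taken nine times, then not taken once on exit, repeated ten times.
loop_branch = ([True] * 9 + [False]) * 10
print(f"accuracy: {accuracy(loop_branch):.0%}")
```

On this loop-like pattern the counter only misses the single exit branch in each pass, because one not-taken outcome is not enough to flip its prediction.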

You might see if AMD still sends out all of its documentation on its processors to better understand this.
 

Cattlegod

Diamond Member
May 22, 2001
8,687
1
0
Originally posted by: pm
pm: i'm repeating more or less what the teacher said in my a+ class, fwiw. if he said "the i386 is putting together of 2 (or 4, i forget exactly) i286s" and you say it's not.
Your teacher is most definitely incorrect. The i386 was a completely different architecture from the 80286. Compare for yourself. Here's a die photo of an 8088, here's one of the 80286, and here's the i386SL. I worked on the Intel Pentium and can definitely attest that the Pentium didn't contain 2 i486s, nor any other smaller subblocks of other CPU cores. It was a completely different microarchitecture from the i486, and the two had little in common besides the fact that they both used the same instruction set.

Talk to your teacher about this if you are confused. Bring in a print-out of the thread, if you like. If your teacher wants more details, have your teacher email me or call me. My work email address is in my profile. If you email me, I'll give you my telephone number for your teacher to call.

Patrick Mahoney
Microprocessor Design Engineer
Intel Corp.
Fort Collins, CO

yep, he is right, we studied the 586 a bit in my computer architecture class last year.

also, determining the "bit" of a processor is just how many bits are in its registers.

 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
Originally posted by: Cattlegod
yep, he is right, we studied the 586 a bit in my computer architecture class last year.

Considering I spent two years of my life designing and validating the P54CS Pentium 133-200, I'd sure hope I'm right.

Originally posted by: Cattlegod
also, determining the "bit" of a processor is just how many bits are in its registers.

Because it's a marketing term, it is not this black and white. The FPU registers in the 486 and up are 82-bits, but the 486 is definitely a 32-bit CPU. Also, the MMX registers are 64 bits wide in the 32-bit Pentiums. You can do 128-bit logic operations in SSE2, so I don't think it's the width of the IEU operations. It's not the external data bus, although this is part of it, because the 386sx was a 32-bit CPU internally with a 16-bit bus and a lot of people said that it wasn't a true 32-bit CPU. It's probably not the addressable memory alone, because the newer members of the Pentium and Xeon family can address 36 bits and most people still call them 32-bit CPUs.

My definition of bitness of "x" is all of the following:
  • Native integer operations are x-bits.
  • Standard integer registers are x-bits.
  • The CPU architecture is capable of addressing x-bits in flat memory mode. (the CPU doesn't necessarily have to implement this function - but the architecture must be capable of it)
  • The width of the external data bus must be at least x-bits wide.
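
The four tests above can be written down as a small check. This is only a sketch in Python: the `Cpu` fields and the `is_x_bit` helper are hypothetical names invented here, and reading "capable of addressing x-bits" as "at least x" is an assumption. The bus widths are the ones mentioned in this thread (32-bit for the regular 386, 16-bit for the 386sx).

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    name: str
    int_op_bits: int        # width of native integer operations
    int_reg_bits: int       # width of the standard integer registers
    flat_address_bits: int  # address width the architecture allows in flat mode
    data_bus_bits: int      # width of the external data bus


def is_x_bit(cpu, x):
    """True only if the CPU passes all four tests for bitness x."""
    return (cpu.int_op_bits == x
            and cpu.int_reg_bits == x
            and cpu.flat_address_bits >= x
            and cpu.data_bus_bits >= x)


i386dx = Cpu("i386DX", 32, 32, 32, 32)
i386sx = Cpu("i386SX", 32, 32, 32, 16)

for cpu in (i386dx, i386sx):
    print(cpu.name, "passes the 32-bit tests:", is_x_bit(cpu, 32))
```

Read strictly, the fourth test is the one the 386sx does not pass, because of its 16-bit external data bus, even though it is 32-bit internally.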
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Originally posted by: pm
  • Native integer operations are x-bits.
  • Standard integer registers are x-bits.
  • The CPU architecture is capable of addressing x-bits in flat memory mode. (the CPU doesn't necessarily have to implement this function - but the architecture must be capable of it)
  • The width of the external data bus must be at least x-bits wide.

what do you mean by the 3rd statement? and with the 4th statement, what would you consider the 386sx?
 

PentiumIV

Member
Feb 19, 2001
56
0
0
Originally posted by: pm
Because it's a marketing term, it is not this black and white. The FPU registers in the 486 and up are 82-bits, but the 486 is definitely a 32-bit CPU.

Architecturally, x87 FP registers are 80 bits wide (IEEE 754 extended precision).
If my memory serves me well, on IA-64 the FP registers are 82 bits wide macroarchitecturally.


 

Sunner

Elite Member
Oct 9, 1999
11,641
0
76
For various technical articles, may I recommend RealWorldTech and Ace's Hardware.
Very good sites, with lots of interesting articles, and one specifically addressing the POWER4 can be found here.

Argh, I still hate the syntax of the [L] tag in these forums :|
Thanks JHU.