Let's see... hope I don't miss anything...
In the past, multi-core processor packages were reserved for the high end because of their poor cost/performance ratio. Thanks to Moore's Law, transistor budgets are now high enough to offset the cost of the extra logic. In fact, for the past several years, the growth of on-die caches has been driven by both increasing transistor budgets and the widening memory gap.
Nowadays, a lot of people view dual-core for the consumer as a band-aid. In a way, that's true because of the growing difficulties in clock scaling. It's not like nobody saw it coming; it's just that the problems showed up sooner than anyone predicted. No one's fault, really. CPU architects have to predict four to five years into the future, and process engineers look even further. Since clock scaling is hitting a brick wall, the easiest way to increase performance (and give consumers an incentive to buy) is to go dual-core. It's the same situation as the internet boom: whether or not people really need more performance, CPU companies have to make money.
Dual-cores are basically SMP systems on a shared bus. For threaded workloads, that translates to roughly a 40-80% performance increase on average. For single-threaded applications, expect something like a 1-2% decrease in performance.
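To make that concrete, here's a minimal sketch of why the second core only helps if the work is actually split into threads. The workload, sizes, and function names are made up for illustration; this isn't a benchmark.

#include <cstddef>
#include <cstdint>
#include <iostream>
#include <thread>
#include <vector>

// CPU-bound busy work over a slice of the data (placeholder workload).
static uint64_t sum_squares(const std::vector<uint64_t>& v,
                            std::size_t begin, std::size_t end) {
    uint64_t total = 0;
    for (std::size_t i = begin; i < end; ++i)
        total += v[i] * v[i];
    return total;
}

int main() {
    std::vector<uint64_t> data(10000000, 3);

    // Single-threaded: only one core ever does the work.
    uint64_t serial = sum_squares(data, 0, data.size());

    // Dual-core friendly: split the range in half, one thread per core.
    uint64_t lo = 0, hi = 0;
    std::thread t1([&] { lo = sum_squares(data, 0, data.size() / 2); });
    std::thread t2([&] { hi = sum_squares(data, data.size() / 2, data.size()); });
    t1.join();
    t2.join();

    std::cout << serial << " == " << (lo + hi) << "\n";
    return 0;
}

The two-thread version is the best case; real threaded workloads also fight over the shared bus and caches, which is part of why the average gain is well short of 2x.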
Simply widening a single-core processor will do little more than slow the whole processor down; a wider-issue core runs slower for various electrical and logical reasons. Studies on ILP put the sweet spot at around 3-4 issue slots. Few instruction streams contain runs of 4+ consecutive instructions that are independent. The majority of general purpose code averages 2, the next most common is 1, and there is still a significant presence of 3.
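A toy example of what those ILP numbers look like in code (hypothetical functions, just to show independent work versus a dependency chain):

// The compiler/CPU can overlap independent operations, but a dependency
// chain has to execute one step at a time no matter how wide the core is.
int ilp_friendly(int a, int b, int c, int d) {
    int x = a + b;   // independent of y
    int y = c + d;   // can issue in the same cycle as x on a 2-wide core
    return x * y;    // must wait for both
}

int dependency_chain(int a, int b, int c, int d) {
    int x = a + b;   // each step needs the previous result,
    int y = x + c;   // so extra issue width just sits idle
    int z = y + d;
    return z;
}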
CPU/GPU combo packages will not show up anytime soon. HT already discussed the topic to death.
The only possibility may be low-end packages with the dog-slow performance of five to ten years ago. The problem is both financial and technical. Technically, a super chip with high-end graphics and a high-end CPU is possible. However, it's going to take years to develop, around 5,000 of the best engineers in the world, and the Walton family's fortune. It would require close to 2000 or more contacts and 300-400 M transistors (NV40 is 222 M, I believe), and manufacturing yields would top out around 10% (fab yields top out at 90+% these days). Once the chip is ready, there are logistical problems with configurations and technical problems with the rest of the system, including power and memory (unless you're going with on-package memory, in which case everything gets harder by a factor of 4).
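To see why die size kills yield, here's a back-of-the-envelope sketch using the standard Poisson yield model, yield = exp(-defect density x die area). The defect density and die areas below are assumptions for illustration, not real process data.

#include <cmath>
#include <cstdio>

int main() {
    // Poisson yield model: yield = exp(-defect_density * die_area).
    // All numbers here are illustrative assumptions.
    const double defects_per_cm2 = 0.5;   // assumed defect density
    const double small_die_cm2   = 1.0;   // typical CPU-sized die (assumed)
    const double huge_die_cm2    = 4.0;   // hypothetical CPU+GPU monster die

    std::printf("small die yield: %.0f%%\n",
                100.0 * std::exp(-defects_per_cm2 * small_die_cm2)); // ~61%
    std::printf("huge die yield:  %.0f%%\n",
                100.0 * std::exp(-defects_per_cm2 * huge_die_cm2));  // ~14%
    return 0;
}

The point isn't the exact percentages; it's that yield falls off exponentially with die area, so a chip several times larger than a normal CPU gets clobbered.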
On the other hand, bus interconnects would be blazingly fast. On-die buses have much lower latency and can likely run faster than motherboard traces. This is assuming you can get around the noise.
That said, system-on-chip is coming in the future. It's just going to take a long time. Embedded systems are already there, but that's because power and simplicity are more important there than performance and flexibility. If everyone in the world used low-performance PCs with the exact same configuration, we'd have single-chip systems everywhere.
A while back, GPUs had very deep pipelines. They could afford them because they don't hit as many of the events that stall or flush a CPU's pipeline; the more familiar of those are branches and context switches. These days, I'm not sure how deep the average graphics pipeline extends. Either way, a 1 GHz single-pipeline GPU would probably still outperform a 3 GHz Pentium 4. The primary reason is customization. GPUs are designed from the ground up to render images; CPUs are designed for general-purpose tasks. What takes one pass on a GPU may require multiple, dependent instructions on a CPU. Factor in the massive parallelism within a single GPU, massive memory bandwidth, and the embarrassingly parallel nature of rendering, and you get a 300 MHz GPU that can outrender a 3.2 GHz Pentium 4.
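Here's a rough sketch of what "embarrassingly parallel" means for rendering. The shading function is a made-up placeholder, not anything a real GPU or driver does.

#include <cstdint>
#include <vector>

struct Pixel { uint8_t r, g, b; };

// Placeholder per-pixel shading; real work would sample textures, blend, etc.
static Pixel shade(int x, int y) {
    return Pixel{ static_cast<uint8_t>(x & 0xFF),
                  static_cast<uint8_t>(y & 0xFF),
                  128 };
}

int main() {
    const int width = 640, height = 480;
    std::vector<Pixel> frame(width * height);

    // Every pixel depends only on its own inputs, never on another pixel.
    // A CPU walks this loop a pixel (or a few) at a time; a GPU throws many
    // pipelines at it and shades whole batches of pixels per clock, which is
    // why a low-clocked GPU can outrender a much faster general-purpose CPU.
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            frame[y * width + x] = shade(x, y);

    return 0;
}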
That said, GPUs run rather slow relative to CPUs on the same process. There are several reasons for this, but I'm only familiar with a few. First off, GPUs have shorter product cycles (thanks to nVidia), so they are usually designed using HDLs, whereas CPUs are largely designed by hand. That has traditionally been the case, although who knows what really goes on in companies' labs these days. I believe Intel has been experimenting with HDL on the Prescott design, and I'm guessing nVidia and ATI are looking to extend product cycles and, subsequently, design cycles.
Also, GPUs are highly parallel architectures with a massive number of transistors dedicated to logic, while CPUs are only somewhat parallel, with the majority of transistors dedicated to memory. Between noise, coupling, power, and other issues, ramping up a GPU's clock is more difficult. With memory, only certain sections are active at any given time, whereas logic is always on. Of course, this is assuming no power-saving logic.
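For the power angle, here's a quick sketch using the standard dynamic power relation P ~ alpha * C * V^2 * f. The activity factors, capacitance, and voltage below are made-up numbers, just to show why a chip that is mostly always-active logic is harder to clock up than one that is mostly cache.

#include <cstdio>

// Dynamic power: P = alpha * C * V^2 * f (alpha = fraction of capacitance
// switching each cycle). Units picked so the result comes out in watts.
static double dynamic_power(double activity, double capacitance_nF,
                            double voltage, double freq_GHz) {
    return activity * capacitance_nF * 1e-9 * voltage * voltage * freq_GHz * 1e9;
}

int main() {
    const double v = 1.3;    // assumed supply voltage
    const double c = 50.0;   // assumed total switched capacitance, nF

    // Cache-heavy CPU: only a small fraction of the chip switches each cycle.
    std::printf("CPU-like (alpha=0.1) at 3 GHz: %.0f W\n",
                dynamic_power(0.1, c, v, 3.0));
    // Logic-heavy GPU: most of the chip is busy every cycle.
    std::printf("GPU-like (alpha=0.4) at 3 GHz: %.0f W\n",
                dynamic_power(0.4, c, v, 3.0));
    return 0;
}

Same process, same clock, but the logic-heavy chip burns several times the power, which is a big part of why GPUs sit at lower clocks.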