Fudzilla: Cypress yields 60-80%, Fermi yields at 20%, Fermi 20% faster than Cypress


T2k

Golden Member
Feb 24, 2004
1,665
5
81
That's a good deal for sure, and a great card, but in the case of CF and SLI cards, the 'total' memory doesn't make a difference, does it?

I mean, 1GB 4850 vs. a 2GB 4850 X2 - the game can only 'see' 1GB of VRAM, or am I wrong?

Errr, no, the game 'sees' 2x GPUs and 2x 1GB memory... that's the point of CF - it makes a lot of difference, especially when you've got twice the memory per GPU... almost any modern, heavy-tasking game appreciates larger framebuffers.
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
Errr, no, the game 'sees' 2x GPUs and 2x 1GB memory... that's the point of CF - it makes a lot of difference, especially when you've got twice the memory per GPU... almost any modern, heavy-tasking game appreciates larger framebuffers.

Right, there is 2x 1GB. But each 1GB pool is dedicated to one of the GPUs, and both have to store the same information. So essentially it's like 1GB on a single-GPU card... there is physically twice the memory, but you have to duplicate the data.
 

T2k

Golden Member
Feb 24, 2004
1,665
5
81
Well, it can go either way. The GTX 295 and 5970 are good examples of how the single-card/multi-GPU approach can potentially backfire. To maintain ATX spec compliance, the multi-GPU card has to be scaled down compared to its single-GPU counterpart. A modular high-end card is an intriguing concept, but will the ATX spec allow for the power needed for such a card to operate? Not currently.

Unless the ATX spec is revamped or ATI's GPUs all of a sudden become much more efficient, ATI may not be able to pull off a 6-series dual-GPU card next gen with two of their top GPUs. Of course, this affects NVIDIA's plans as well, but it looks to me like NVIDIA is hedging its bets a little more either way by going the large-GPU route first, doing a shrink, and then going multi. Of course (as you mentioned), they have to do this at the cost/risk of designing a huge chip to begin with, which, as NV themselves have said, is "fucking hard".

Next stop will be at 32nm, if not right at 28 - that should give you a good idea how far they can go with these chips... and MS already said that DX12 is a few years away; they want to slow down now.
 

T2k

Golden Member
Feb 24, 2004
1,665
5
81
Right, there is 2x 1GB. But each 1GB pool is dedicated to one of the GPUs, and both have to store the same information. So essentially it's like 1GB on a single-GPU card...

...which works on half the screen, half the resolution, every other frame etc, right.

there is physically twice the memory, but you have to duplicate the data.

No. While they share the input they work on separate renders, obviously.
 

Arkaign

Lifer
Oct 27, 2006
20,736
1,379
126
...which works on half the screen, half the resolution, every other frame etc, right.



No. While they share the input they work on separate renders, obviously.

I think what we're trying to say is that a fast enough single GPU with 1GB will be functionally equivalent to a 2GB X2 card. E.g.: 5870 1GB vs. 4850 X2 2GB.

EDIT : Ah, yes, here it is :

http://techreport.com/articles.x/14284

"As a result, a Radeon HD 3870 X2 paired with a Radeon HD 3850 256MB would perform like a trio of Radeon HD 3850 256MB cards. And, of course, that means the effective memory size for the entire GPU phalanx would effectively be 256MB, not 768MB, because memory isn't shared between GPUs in CrossFire (or in SLI, for that matter)."

Essentially, the extra GPUs are helpful with calculations, but due to the nature of CrossFire and SLI technology, the video memory has to be filled identically for each GPU, and the work of rendering is doled out from that point on. This is why a CrossFired or SLI'd set of decent 512MB cards does not equal a good single GPU w/ 1GB for higher resolutions/AA/etc. 512MB + 512MB doesn't equal 1GB when you Xfire or SLI as far as game settings are concerned; it only gives you more GPU power in total.
 

thilanliyan

Lifer
Jun 21, 2005
12,085
2,281
126
Errr, no, the game 'sees' 2x GPUs and 2x 1GB memory... that's the point of CF - it makes a lot of difference, especially when you've got twice the memory per GPU... almost any modern, heavy-tasking game appreciates larger framebuffers.

It's not twice the memory per GPU. Each GPU will only see 1GB.
 

Arkaign

Lifer
Oct 27, 2006
20,736
1,379
126
It's not twice the memory per GPU. Each GPU will only see 1GB.

^^ That. CF is good, but neither it nor SLI will magically give you any advantage beyond the memory that each GPU can use (which in the case of a 1GB X2 setup is 512MB, or with a 2GB X2 setup is 1GB).
 

nitromullet

Diamond Member
Jan 7, 2004
9,031
36
91
Next stop will be at 32nm, if not right at 28 - that should give you a good idea how far they can go with these chips... and MS already said that DX12 is a few years away; they want to slow down now.

I'm not quite sure what this is supposed to mean.

Regardless of DX version or process technology, the multi-GPU cards from both camps have consistently gotten more power hungry with each generation. For ATI this stands to reason because, ever since their "small die strategy" began, the chips keep getting bigger and more complex even on a smaller process.

r600 420mm² 80nm ~720M transistors (never had a dual gpu card)
rv670 192mm² 55nm ~666M transistors
rv770 256mm² 55nm ~956M transistors
Cypress 338mm² 40nm ~2.1B transistors

Now that ATI has also broken the ATX power draw barrier with dual Cypress, what's next? I'm not aware of an imminent change in the ATX spec, so it will be up to ATI/NV to figure out how to make more powerful cards that still adhere to the ATX spec. From what I can tell, doing this with a dual gpu card will be a serious challenge.

The interesting thing is that NV took the small-die approach as well from G80 to G92, but this isn't really mentioned by most people singing the praises of ATI. However, G92 really did bring G80 performance to the masses. The 8800GT launched at $199 - $249, which was less than what a G80-based 8800GTS 320MB cost at the time, and they launched it to zero competition from ATI. ATI's flagship 2900XT couldn't even keep up with NV's mid-high-end 8800GT... http://www.anandtech.com/video/showdoc.aspx?i=3140&p=10 At that time, it would have been hard to predict ATI's success with the 4800s, but we all know how that turned out.

Yes, NVIDIA has seen better days than what we've seen from Fermi so far, but there are a lot of challenges ahead for both companies and I wouldn't count either of them out.
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
Now that ATI has also broken the ATX power draw barrier with dual Cypress, what's next? I'm not aware of an imminent change in the ATX spec, so it will be up to ATI/NV to figure out how to make more powerful cards that still adhere to the ATX spec. From what I can tell, doing this with a dual gpu card will be a serious challenge.

According to AMD, a 'sandwich' card is also not in spec. So who knows, maybe they'll both ignore the spec for their highest end parts if it means being competitive with the other company.
 

nitromullet

Diamond Member
Jan 7, 2004
9,031
36
91
According to AMD, a 'sandwich' card is also not in spec. So who knows, maybe they'll both ignore the spec for their highest end parts if it means being competitive with the other company.

The 5970 and the GTX 295 show us that they won't. NV may have originally broken spec with the number of PCBs on the GTX 295, but they kept it within spec for power consumption.
 

evolucion8

Platinum Member
Jun 17, 2005
2,867
3
81
I'm not quite sure what this is supposed to mean.

r600 420mm² 80nm ~720M transistors (never had a dual gpu card)
rv670 192mm² 55nm ~666M transistors
rv770 256mm² 55nm ~686M transistors
Cypress 338mm² 40nm ~2.1B transistors

The interesting thing is that NV took the small-die approach as well from G80 to G92, but this isn't really mentioned by most people singing the praises of ATI. However, G92 really did bring G80 performance to the masses. The 8800GT launched at $199 - $249, which was less than what a G80-based 8800GTS 320MB cost at the time, and they launched it to zero competition from ATI. ATI's flagship 2900XT couldn't even keep up with NV's mid-high-end 8800GT... http://www.anandtech.com/video/showdoc.aspx?i=3140&p=10 At that time, it would have been hard to predict ATI's success with the 4800s, but we all know how that turned out.

It was because nVidia didn't tweak the G92 design to reduce its die size beyond a straight die shrink. The RV670 was an accomplishment thanks to 55nm compared to 80nm: less than half the size, with slightly fewer transistors, slightly better performance, and much better power consumption. If we compare the 90nm G80 to the 65nm G92, the chip didn't shrink that much, and even 55nm made no difference except lower power consumption in its last two incarnations.

The RV770 included twice of everything, and the chip only got about 30% bigger on the same manufacturing process! That was a nice show of engineering. Even with the latest advancements in manufacturing, chips will inevitably get bigger, but wise engineering helps a lot to minimize the growth in die size, which matters more than transistor count.

PS: By the way, the RV770 has 956M transistors, not 686M.
 

T2k

Golden Member
Feb 24, 2004
1,665
5
81
I think what we're trying to say is that a fast enough single GPU with 1GB will be functionally equivalent to a 2GB X2 card. E.g.: 5870 1GB vs. 4850 X2 2GB.

Utter nonsense otherwise we wouldn't have dual-GPUs and you know it. :)

EDIT : Ah, yes, here it is :

http://techreport.com/articles.x/14284

"As a result, a Radeon HD 3870 X2 paired with a Radeon HD 3850 256MB would perform like a trio of Radeon HD 3850 256MB cards. And, of course, that means the effective memory size for the entire GPU phalanx would effectively be 256MB, not 768MB, because memory isn't shared between GPUs in CrossFire (or in SLI, for that matter)."

Essentially, the extra GPUs are helpful with calculations, but due to the nature of CrossFire and SLI technology, the video memory has to be filled identically for each GPU, and the work of rendering is doled out from that point on. This is why a CrossFired or SLI'd set of decent 512MB cards does not equal a good single GPU w/ 1GB for higher resolutions/AA/etc. 512MB + 512MB doesn't equal 1GB when you Xfire or SLI as far as game settings are concerned; it only gives you more GPU power in total.
That's a 38xx, let's forget that horrible family quickly.

Second, it's complete nonsense anyway, because any arbitrary single-GPU card can be faster if it's not the same chip but one that's by default at least 2x faster - it doesn't make any sense to compare it to a 5870.

However, compared to a single-chip 1GB 4850 you've got twice the GPU power, and each GPU has its own 1GB of memory - ergo it's NOT equal to a single 1GB card, which is clearly demonstrated in games where it's INDEED faster than even a 4850 X2 1GB card.
 

Arkaign

Lifer
Oct 27, 2006
20,736
1,379
126
Utter nonsense otherwise we wouldn't have dual-GPUs and you know it. :)

That's a 38xx, let's forget that horrible family quickly.

Second, it's complete nonsense anyway, because any arbitrary single-GPU card can be faster if it's not the same chip but one that's by default at least 2x faster - it doesn't make any sense to compare it to a 5870.

However, compared to a single-chip 1GB 4850 you've got twice the GPU power, and each GPU has its own 1GB of memory - ergo it's NOT equal to a single 1GB card, which is clearly demonstrated in games where it's INDEED faster than even a 4850 X2 1GB card.

Huh? I think you're a bit confused by what I was trying to say.

(1)- The 38xx was decent enough, it was at least competitive, and far better than the awful 2900.

(2)- The point of comparing it to a 5870 1GB is to say that any X2 setup with 2GB is equal to running a really fast 1GB card. Ditto comparing an X2 setup with 1GB to a really fast 512MB card (this is much less likely now that 512MB cards are pretty obsolete).

(3)- Of COURSE a 4850 X2 2GB is faster than a 4850 X2 1GB. Each GPU core gets 1GB of RAM assigned to it rather than 512MB for each core on the 1GB 4850 X2.

The whole point of all of this is to clear up the misunderstanding. A 2GB X2 card is not a 2GB card as far as any game is concerned. The game will see 1GB of available memory for textures, AA, etc, and the card itself and the drivers will load identically, and then split the GPU load between the cores.

It would be awesome if there were some kind of new generation of Crossfire or SLI that wasn't so wasteful of video memory, but we're not there yet. I think the largest obstacle would be the dilution of memory bandwidth: if both GPUs were pulling from the same memory pool dynamically, it would introduce latency and delays.

When cards DO come out en masse with single GPU + 2GB of video memory, that will be a lot more capable than a 2GB X2 card which runs as if it were a 1GB card with extra GPU power. A 5870 is a great comparison to a 4850 X2, as they seem to have about the same power for 3d :)

EDIT : The whole point of Crossfire and SLI is to make a greater available performance ceiling. The 4850 has been out for a long time, it's rather impressive that a crossfire setup can still be competitive with the top single GPU out there. 5850 Crossfire, which you can do right now, will likely be competitive with the hypothetical future ATI 6870.
 

nitromullet

Diamond Member
Jan 7, 2004
9,031
36
91
Even with the latest advancements in manufacturing, chips will inevitably get bigger, but wise engineering helps a lot to minimize the growth in die size, which matters more than transistor count.

The point is that the trend is ultimately the same for both - smaller process with more complexity, which leads to bigger (and more power hungry) chips overall.

G80 484mm² 90nm ~681M transistors
G92 324mm² 65nm ~754M transistors
G92b 231mm² 55nm ~754M transistors
GT200 576mm² 65nm ~1.4B transistors
GT200b 470mm² 55nm ~1.4B transistors
Fermi 530mm²(est) 40nm ~3B transistors

r600 420mm² 80nm ~720M transistors
rv670 192mm² 55nm ~666M transistors
rv770 256mm² 55nm ~956M transistors
Cypress 338mm² 40nm ~2.1B transistors

...ATI is working its way back into the 'big chip' business.
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
The point is that the trend is ultimately the same for both - smaller process with more complexity, which leads to bigger (and more power hungry) chips overall.

G80 484mm² 90nm ~681M transistors
G92 324mm² 65nm ~754M transistors
G92b 231mm² 55nm ~754M transistors
GT200 576mm² 65nm ~1.4B transistors
GT200b 470mm² 55nm ~1.4B transistors
Fermi 530mm²(est) 40nm ~3B transistors

r600 420mm² 80nm ~720M transistors
rv670 192mmm² 55nm ~666M transistors
rv770 256mm² 55nm ~956M transistors
Cypress 338mm² 40nm ~2.1B transistors

...ATI is working its way back into the 'big chip' business.

Notice how Cypress has more than double the transistors of RV770, but the performance gain was at most only around 50%. This suggests some kind of bottleneck somewhere within the Cypress hardware. Drivers could have some impact, but I don't think we could gain another 50% from drivers alone.

This is where the next-gen GPU comes in. ATI's engineers would be aware of the bottleneck with Cypress and should be able to unlock more performance from a new more efficient architecture for the 6xxx series cards. That's why I don't see ATI adding many more transistors for their new architecture. Probably only 25-50% more transistors (2.6-3.1 billion).

Using a 28nm process would mean the die size of such a chip could (roughly speaking) be anywhere from 296 to 345mm². But I think it will be a bit smaller than Cypress, so a dual-GPU solution would be very doable. I guess we'll have to wait and see :)
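For what it's worth, that estimate can be reproduced with a quick back-of-envelope calculation. The 0.7 area-scaling factor below is an assumption chosen to land near the quoted 296-345mm² range; ideal 40nm-to-28nm scaling would be (28/40)² ≈ 0.49, but real density gains usually fall short of the ideal shrink:

```python
# Back-of-envelope die-size estimate for a hypothetical 28nm Cypress
# successor. Cypress: 338 mm^2, ~2.1B transistors at 40nm (figures from
# this thread). AREA_SCALE = 0.7 is an assumption picked to match the
# 296-345 mm^2 estimate above, more conservative than the ideal
# (28/40)^2 ~= 0.49 shrink.

CYPRESS_AREA_MM2 = 338.0
CYPRESS_TRANSISTORS_B = 2.1
AREA_SCALE = 0.7  # assumed area per transistor at 28nm relative to 40nm

def estimate_area_mm2(transistors_b):
    """Area if the new chip keeps Cypress's density, scaled by AREA_SCALE."""
    mm2_per_b = CYPRESS_AREA_MM2 / CYPRESS_TRANSISTORS_B  # mm^2 per billion transistors
    return transistors_b * mm2_per_b * AREA_SCALE

for t in (2.6, 3.1):
    print(f"{t}B transistors -> ~{estimate_area_mm2(t):.0f} mm^2")
```

That yields roughly 293-349mm² for 2.6-3.1B transistors, in line with the range above.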
 

Forumpanda

Member
Apr 8, 2009
181
0
0
Notice how Cypress has more than double the transistors of RV770, but the performance gain was at most only around 50%. This suggests some kind of bottleneck somewhere within the Cypress hardware. Drivers could have some impact, but I don't think we could gain another 50% from drivers alone.
Not everything doubled (bandwidth, among other things), nor did the rest of the computer.

The problem is that the frame rate isn't limited by a single factor: some of the time it's shader performance, other times it's bandwidth, and some of the time it's CPU limited. Thus, increasing one of these factors will increase overall performance by a smaller amount.

Doubling the GPU specs will never lead to double the FPS.
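That point - frame rate is capped by whichever stage is the current bottleneck - can be sketched with a toy frame-time model. This is a deliberate simplification (real pipelines overlap stages), and all the numbers are invented purely for illustration:

```python
# Toy frame-time model: a frame can't complete faster than its slowest
# limiting stage, so doubling shader throughput halves only the shader
# term. All numbers below are invented for illustration.

def fps(shader_ms, bandwidth_ms, cpu_ms):
    frame_ms = max(shader_ms, bandwidth_ms, cpu_ms)  # bottleneck stage dominates
    return 1000.0 / frame_ms

base = fps(shader_ms=20.0, bandwidth_ms=14.0, cpu_ms=10.0)     # shader-bound
doubled = fps(shader_ms=10.0, bandwidth_ms=14.0, cpu_ms=10.0)  # now bandwidth-bound

print(f"{base:.0f} FPS -> {doubled:.0f} FPS, a {doubled / base:.2f}x gain (not 2x)")
```

Here doubling shader throughput lifts 50 FPS only to about 71 FPS, because the bandwidth term becomes the new limiter.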
 

nitromullet

Diamond Member
Jan 7, 2004
9,031
36
91
Notice how Cypress has more than double the transistors of RV770, but the performance gain was at most only around 50%. This suggests some kind of bottleneck somewhere within the Cypress hardware. Drivers could have some impact, but I don't think we could gain another 50% from drivers alone.

This is where the next-gen GPU comes in. ATI's engineers would be aware of the bottleneck with Cypress and should be able to unlock more performance from a new more efficient architecture for the 6xxx series cards. That's why I don't see ATI adding many more transistors for their new architecture. Probably only 25-50% more transistors (2.6-3.1 billion).

Using a 28nm process would mean the die size of such a chip could (roughly speaking) be anywhere from 296 to 345mm². But I think it will be a bit smaller than Cypress, so a dual-GPU solution would be very doable. I guess we'll have to wait and see :)

If that is the case, then ATI would be truly committed to the small-die strategy. There are definitely a lot of variables with this, though... Is ATI going to do this, and if they do, will it compete against a 4-5 billion transistor NV chip (provided, of course, NV doesn't decide to go smaller next gen)? As you say, we'll have to wait and see.

For now, I'd be happy with more Fermi info, a launch, and a price war. :)
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Not everything doubled (bandwidth, among other things), nor did the rest of the computer.

The problem is that the frame rate isn't limited by a single factor: some of the time it's shader performance, other times it's bandwidth, and some of the time it's CPU limited. Thus, increasing one of these factors will increase overall performance by a smaller amount.

Doubling the GPU specs will never lead to double the FPS.

Yes, an exact doubling of FPS is hard to achieve across all software, because many variables are at work here. But it is not impossible - actually, it has happened before and still happens now. Remember when the 4870 was released? Even though it ran at a slower clock - 750MHz vs. 775MHz for the 3870 - it got about 2x the performance in some games, especially at higher resolutions and settings. And the 4870's memory bandwidth was "only" 60% higher (115GB/s vs. 72GB/s).

Also, we see that some games do get almost 2x the performance with CrossFire/SLI setups, and that is with the driver overhead of CrossFire/SLI. So we can't say it never happens.
 

SlowSpyder

Lifer
Jan 12, 2005
17,305
1,002
126
Utter nonsense otherwise we wouldn't have dual-GPUs and you know it. :)

That's a 38xx, let's forget that horrible family quickly.

Second, it's complete nonsense anyway, because any arbitrary single-GPU card can be faster if it's not the same chip but one that's by default at least 2x faster - it doesn't make any sense to compare it to a 5870.

However, compared to a single-chip 1GB 4850 you've got twice the GPU power, and each GPU has its own 1GB of memory - ergo it's NOT equal to a single 1GB card, which is clearly demonstrated in games where it's INDEED faster than even a 4850 X2 1GB card.

It's not nonsense. AMD wanted to make smaller, cheaper GPUs. I'm sure R&D cost and time to market are improved over a single larger chip. Their strategy is to make a scalable design: they can put two on a board for the highest-end parts and save a lot of money vs. designing and building a much bigger chip that gives the same performance. Either way can get you to a performance goal; AMD just feels that their way is more economical.

Each GPU having its own 1GB of memory is not the same, not even close, as having a single GPU with 2GB. While it's true you physically have 2GB of memory, you have to remember that the data has to be duplicated in each pool.

Say you have a game that uses 750MB of video memory: on your card, each pool has to hold that same 750MB of data. So even though you have 2x 1GB, you also need to store 2x 750MB of data. With a single-GPU card that has 2GB of video memory, you could have that same 750MB stored and still have 1.25GB of free space. Even though there is the same amount of physical memory, because the data has to be duplicated, with your card it's like having 1GB.

Please have a look at the 4th post. Both CF and SLI work like this...

http://forums.amd.com/game/messageview.cfm?catid=262&threadid=93311&enterthread=y

Q: Does CrossFire double the memory?
A: NO, CrossFire does not double the memory. If you have a 512MB card, adding another 512MB card won't make your total memory 1GB; it will still be 512MB.
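That Q&A boils down to a one-line rule: effective VRAM is the smallest per-GPU pool, not the sum. A minimal sketch of the behavior (a simplified model, not actual driver logic):

```python
# Simplified model of CrossFire/SLI memory: each GPU mirrors the same
# working set, so the VRAM a game can actually use is the smallest
# per-GPU pool, not the sum of all pools.

def effective_vram_mb(per_gpu_pools_mb):
    """Usable VRAM when every GPU must hold a full copy of the data."""
    return min(per_gpu_pools_mb)

# 4850 X2 2GB: two GPUs with 1GB each -> games see 1GB, not 2GB
print(effective_vram_mb([1024, 1024]))  # 1024

# The TechReport example: a 3870 X2 (2x 512MB) plus a 3850 256MB
# -> the whole trio is effectively limited to 256MB
print(effective_vram_mb([512, 512, 256]))  # 256
```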
 

evolucion8

Platinum Member
Jun 17, 2005
2,867
3
81
Utter nonsense otherwise we wouldn't have dual-GPUs and you know it. :)

That's a 38xx, let's forget that horrible family quickly.

That family isn't that bad; my brother-in-law has an HD 3870, and he can max out Resident Evil 5, Need for Speed Shift, and GRID at 1280x1024 plus 4x FSAA. Their power consumption is awesome, especially at idle.

Everything did not double (bandwidth among other things), nor did the rest of the computer.

The problem is that the frame rate isn't limited by a single factor, but some of the time its shader performance, other times it is bandwidth and some of the time it is CPU limited.
thus increasing one of these factors will increase the overall performance by a smaller amount.

Doubling the GPU specs will never lead to double the FPS.

That's usually true, but there are some scenarios where a certain architecture is bottlenecked by something, and doubling it in the next iteration can boost performance more than usual. A good example: the X800XT, which had double of everything compared to the 9800PRO and was usually between 2 and 2.5 times faster. Another example: the HD 3870, whose weakness besides anti-aliasing was its TMUs. They got reworked into 2.5x of everything, and it was called the HD 4870. It's usually twice as fast with no issues and can often be three times faster than the HD 3870.

But the HD 5870 has double of everything compared to the HD 4870 and is only twice as fast in the best-case scenario. With a very wide architecture such as RV870, it is very hard to keep all its units busy without some clever optimizations in the compiler. I just think the benefits of going wider after RV870 will shrink with each step. Hope they do something a bit different next time.
 

Painman

Diamond Member
Feb 27, 2000
3,728
29
86
Fermi is nothing but hype and FUD until it gets here, but nonetheless, here's post #100 in this worthless, hype-and-FUD-filled thread full of hype and FUD.

Click my sig. Therion is pretty cool!