R600 to be 80nm


dreddfunk

Senior member
Jun 30, 2005
358
0
0
I'm no engineer so I've no real idea just how difficult this process may be. As many have suggested, it seems unlikely that the transition will be instantaneous. Engineering revolutions happen; they just don't become profitable and make it to market overnight. I can't say how much preparation has been done in this area by any of the companies involved, so I've no real idea of how close we are to seeing such solutions.

If pressed for a guess, however, I'd say it will be a long time before we see the technology in the mainstream market. I can easily envision very low-end, integrated solutions for applications that require little memory bandwidth (designed for the corporate and mobile, non-gaming markets). I can also possibly envision very expensive solutions, with motherboards that have integrated video memory and a socket for the GPU. We're already buying $250 motherboards and $600 GPUs; for some folks, the added cost of producing a really complex motherboard with integrated memory isn't going to be an issue.

The real problem in the high end that I can see is the added cost of switching between CPU vendors that will be imposed by having such a GPU/CPU socket combination directly on the motherboard. Short term, I think this is probably cost-prohibitive for many, and it isn't ideal for consumers to be locked into a certain GPU/CPU combination. Long term, with the switch to multi-core, we'll probably be entering an entirely new market of integrated CPU/GPU chips. When new chips come out, we'll be reading both CPU and GPU benchmarks on the same chip. I think that's pretty far in the future, however, at least 5+ years, but I'm no engineer.

In the meantime, it's hard to imagine too many scenarios on the high end that make good sense. Not only would a GPU socket lock in the customer to both a particular GPU and CPU vendor, but committing to a particular socket interface imposes added design limitations. Both nVidia and ATI (AMD) will have to change how they do their GPU development quite drastically, it seems to me. I think they'll both do it at some point, but I'm not sure how soon it will actually happen.

To sum up: to me, low end solutions seem likely in the near term, with the added slight possibility of some very high-end options at some point.

On some level, just off the cuff, I wish Intel would adopt the HyperTransport system and the co-processor architecture AMD is coming up with. The reason I say this is that, long term, I don't want to see nVidia, or GPU-only vendors in general, pushed out. I'd like the industry to keep the option of having a drop-in, third-party co-processor. The best thing for the consumer would be for both major CPU vendors to adopt the same standard for co-processors, just as we have used the AGP and PCIe interfaces on the motherboard. Again, as I'm not an engineer, I don't even remotely know if this is possible. I certainly doubt it is likely.

Once the memory situation is resolved, however, I don't want to see the industry limited to just what Intel and AMD can offer in the way of chips that perform both CPU and GPU functions. I suppose that isn't much different than the situation was for several years, with nVidia and ATI the only real GPU options, or Intel and AMD as the only real CPU options.

I'm a fan of open industry standards. The great thing about AGP or PCIe, or any standardized interface like IDE, SATA, PCI, etc., is that it provides a way for the consumer to access the best work, regardless of vendor. As we move to integrated GPU solutions, I want to see as much consumer flexibility preserved as possible.

 

dreddfunk

Senior member
Jun 30, 2005
358
0
0
Actually, I posted while SexyK was posting, and his analysis makes a bit more sense to me than my own in terms of the cost involved in producing motherboards for the high-end.

SexyK - I guess the question is: how much more would a motherboard cost, given how much we're already spending on MB/GPU combinations? Would the costs really be much higher to produce a motherboard with on-board video memory (not socketed memory, but integrated) with a socketed GPU, versus the current arrangement? My guess is that doing something like that wouldn't be much more expensive for the consumer, it would just be much less flexible.

I'm actually not disagreeing with you at all. I don't really think we see high-end stuff for quite a while. I'm just not sure that it is because of cost, per se, but the lack of flexibility that such solutions would provide.
 

SexyK

Golden Member
Jul 30, 2001
1,343
4
76
Originally posted by: dreddfunk
I'm no engineer so I've no real idea just how difficult this process may be. As many have suggested, it seems unlikely that the transition will be instantaneous. Engineering revolutions happen; they just don't become profitable and make it to market overnight. I can't say how much preparation has been done in this area by any of the companies involved, so I've no real idea of how close we are to seeing such solutions.

If pressed for a guess, however, I'd say it will be a long time before we see the technology in the mainstream market. I can easily envision very low-end, integrated solutions for applications that require little memory bandwidth (designed for the corporate and mobile, non-gaming markets). I can also possibly envision very expensive solutions, with motherboards that have integrated video memory and a socket for the GPU. We're already buying $250 motherboards and $600 GPUs; for some folks, the added cost of producing a really complex motherboard with integrated memory isn't going to be an issue.

The real problem in the high end that I can see is the added cost of switching between CPU vendors that will be imposed by having such a GPU/CPU socket combination directly on the motherboard. Short term, I think this is probably cost-prohibitive for many, and it isn't ideal for consumers to be locked into a certain GPU/CPU combination. Long term, with the switch to multi-core, we'll probably be entering an entirely new market of integrated CPU/GPU chips. When new chips come out, we'll be reading both CPU and GPU benchmarks on the same chip. I think that's pretty far in the future, however, at least 5+ years, but I'm no engineer.

In the meantime, it's hard to imagine too many scenarios on the high end that make good sense. Not only would a GPU socket lock in the customer to both a particular GPU and CPU vendor, but committing to a particular socket interface imposes added design limitations. Both nVidia and ATI (AMD) will have to change how they do their GPU development quite drastically, it seems to me. I think they'll both do it at some point, but I'm not sure how soon it will actually happen.

To sum up: to me, low end solutions seem likely in the near term, with the added slight possibility of some very high-end options at some point.

On some level, just off the cuff, I wish Intel would adopt the HyperTransport system and the co-processor architecture AMD is coming up with. The reason I say this is that, long term, I don't want to see nVidia, or GPU-only vendors in general, pushed out. I'd like the industry to keep the option of having a drop-in, third-party co-processor. The best thing for the consumer would be for both major CPU vendors to adopt the same standard for co-processors, just as we have used the AGP and PCIe interfaces on the motherboard. Again, as I'm not an engineer, I don't even remotely know if this is possible. I certainly doubt it is likely.

Once the memory situation is resolved, however, I don't want to see the industry limited to just what Intel and AMD can offer in the way of chips that perform both CPU and GPU functions. I suppose that isn't much different than the situation was for several years, with nVidia and ATI the only real GPU options, or Intel and AMD as the only real CPU options.

I'm a fan of open industry standards. The great thing about AGP or PCIe, or any standardized interface like IDE, SATA, PCI, etc., is that it provides a way for the consumer to access the best work, regardless of vendor. As we move to integrated GPU solutions, I want to see as much consumer flexibility preserved as possible.

Good post dredd, I agree with many of your points :thumbsup:.

In response to your follow-up, I think you're right - the total cost may be similar, but then you'll be stuck with one speed/amount of memory for as long as you're using that motherboard. It is much, much simpler to swap out your video card than to break down your whole build and replace the motherboard. I don't see the attraction to this type of solution. Even if you somehow overcome the engineering hurdles associated with GDDR memory slots, you're still stuck upgrading your motherboard when your GPU changes from a GDDR3 to GDDR4 interface rather than just yanking your PEG card and throwing in a new one.

Also, it's probably easier to resell an old discrete card to help finance an upgrade rather than selling a motherboard/GPU combination... what if someone wants upgraded graphics but they use a CPU that is incompatible with motherboards supporting the GPU they want? It's just going to get very, very complicated. If AMD incorporates all their GPUs into their CPUs, then they will lose out on the graphics market for all people using Intel CPUs! Doesn't seem to make much sense to me.
 

dreddfunk

Senior member
Jun 30, 2005
358
0
0
I pretty much agree with what you've said SexyK. Thanks for your thoughts!

I'd just offer a couple of my own by way of reply.

First, we're still in the stage in the GPU market where every generation brings some serious, tangible benefits when compared to previous ones. I think we may have passed that stage in the CPU market for most people, including enthusiasts. I'm a believer that multi-core gaming will come, but CPU capability is ahead of the curve when it comes to multi-core gaming: we're getting the products in advance of really being able to use them.

This observation may not seem all that revolutionary, but it sets up the following point. If AMD could actually bring a fantastic price/performance option via on-board design to the mainstream and above market segments, I'm not so sure that it would be a bad thing for them. The same goes for Intel, and maybe even more so for nVidia.

As far as the 'gaming' market goes, the impact of the CPU is pretty small in most cases compared to the GPU. I wouldn't mind being locked into AMD or Intel's CPUs if it meant getting a great GPU on the cheap. nVidia is in the best position if the market edges in this particular direction, in some respects, as they could partner up to provide integrated-GPU motherboards for both AMD & Intel. Of course, if the all-in-one CPU/GPU wins the day, nVidia is in a tough spot unless there is some sort of coprocessor standard.

Mind you, I don't think it's likely that this is going to be the case anytime soon because the move to a socketed GPU is, IMHO, a ways away, let alone all-in-one CPU/GPUs.

I'm just making a personal observation that being locked in to a particular CPU wouldn't be the discouraging factor for me. My current CPU is a 3700 Clawhammer and it's more than enough for me in just about every situation imaginable. Now, there are plenty of folks who like to game out there who also do serious CPU work on their machines, but I'd guess that's a smaller part of the gaming market than folks whose main computer usage is awfully mundane except for their gaming habits.

With video memory, you'll have to bear with me, because I've got a couple of questions. I thought that the memory controller for the GPU wasn't on-die as of yet. Am I incorrect?

The reason I ask is that it makes a difference in terms of how this 'socketed-GPU' concept might develop. If the memory controller is actually a part of G80 or R600, then I think your observation about being stuck when memory standards change is definitely on target.

If the memory controller isn't on-die, it would be more just a matter of how much performance you'd be leaving on the table slapping in an R600 with only GDDR3 (as opposed to GDDR4). Similarly, if the controller isn't on-die, then the question becomes not only what speed the on-motherboard controller supports, but also its width (128-bit, 256-bit, 512-bit, etc.).

From what I've seen generally, I'd say that the memory standards themselves move slowly enough, both in terms of bus-width and speed, that memory turnover wouldn't discourage me from on-motherboard solutions. Indeed, it would be an interesting choice to have to be able to select a motherboard with a 256-bit interface and slap an 8600 core into it.

But maybe I'm just naive and didn't realize that the only way these modern GPUs got such high bandwidth was by having an on-chip memory controller.

Even in that case, however, the standards move slowly enough that it wouldn't be the real issue for me, any more than the DDR/DDR2 transition was on the CPU side.

I've got socket 754, which means I'm stuck with single-core and DDR. That happens and it isn't such a big deal right now--again mostly because I'm not really leaving that much performance on the table in terms of what I can really notice compared to an X2/DDR2 setup.

Of course, as I said before, the GPU market isn't in the same place as the CPU market right now, so all of these upgrade-ability issues are magnified. In the GPU market, you're still getting those massive, tangible benefits from one generation to the next. Until that slows down a bit, I don't foresee these integrated solutions coming to the performance or high-end.

Cheers.
 

thilanliyan

Lifer
Jun 21, 2005
12,040
2,256
126
Originally posted by: SexyK
In response to your follow-up, I think you're right - the total cost may be similar, but then you'll be stuck with one speed/amount of memory for as long as you're using that motherboard. It is much, much simpler to swap out your video card than to break down your whole build and replace the motherboard. I don't see the attraction to this type of solution. Even if you somehow overcome the engineering hurdles associated with GDDR memory slots, you're still stuck upgrading your motherboard when your GPU changes from a GDDR3 to GDDR4 interface rather than just yanking your PEG card and throwing in a new one.

On the other hand...the size and type of memory usually lasts longer than the core, so having for example 512mb of GDDR3 could last you through 2 core revisions. (ie. X1800XT 512 to X1900XT 512). Even between vendors the ram type and amount could stay the same (ie. 7900GTX and X1900XTX), so the only thing you would need to change would be the core. Only problem is they would need a common socket.

I would definitely be in favour of a common GPU socket with a set amount of GDDR3 on the motherboard (say 512mb currently), and then I can just replace the GPU core with whichever one I want in 6 months or whenever I upgrade, and then replace the motherboard for a major upgrade, which doesn't come as often as replacing a GPU. One good point I see is that you would be paying for the memory (which can be very expensive) only once, when you purchase the original motherboard. Whereas with buying discrete cards, you will pay for it each time you replace the card, even if it IS the same amount and type of ram.
 

Janooo

Golden Member
Aug 22, 2005
1,067
13
81
Originally posted by: SexyK
Originally posted by: kobymu
Originally posted by: Janooo
Originally posted by: kobymu
My point is the argument "I don't see any CPU with a high-bandwidth memory subsystem NOW" is flawed because CPUs don't NEED it.

If CPUs NEEDED high bandwidth, you would have seen high-bandwidth memory subsystems.

Saying that it doesn't exist NOW is a moot point. It isn't needed NOW.
...

CPUs NEED high bandwidth! It's just not realistic (meaning much more expensive) to get it. That's why there are tricks to work around that (cache, prefetch, ...). The ultimate ideal state would be the whole RAM in the form of a cache.

Some tasks don't need more than 1MB of memory. They run from cache and they are fast, but there are many tasks that need to go to main memory, and they would benefit from high bandwidth.

GPUs need high bandwidth because by the nature of the task at hand they need more than 1MB (2, 4, 8, ... whatever cache size would be possible) of memory.

CPUs are a little bit different. They execute different types of tasks, and many of them fit into cache; that's why they appear not to need high bandwidth. But if there were no cache they would starve to death for bandwidth.

From Anand's latest article:

http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=2963&p=6

All the Core 2 Duo varieties had the SAME bandwidth.
All the Athlon 64 X2 varieties had the SAME bandwidth.

And that is from the 3D RENDERING Performance page.

Even when you look at 2-core CPUs only, you see a delta of 50% if not more.

What does that tell you?

Did you ever program any real world application? Do you have any idea what the hell you are talking about?

For every application you can find that is bottlenecked by bandwidth I can find you 10 that are bottlenecked by other subsystems, 10!

The point he is making is that without on-die cache every application would be bandwidth limited. Thus the fact that most PC applications work on small data sets, paired with powerful prefetchers and the presence of an extremely high-speed on-die cache, lessens the impact of lower bandwidth to system memory. Try turning off your L1 and L2 cache and see if applications are limited by anything other than memory bandwidth. The vast majority will struggle mightily. That is why Janooo is arguing that CPUs do need massive bandwidth - they need it, but only for a smaller data set which can be predicted and cached in L1 and L2 most of the time.

This is in contrast to GPUs, which work with much larger data sets. 1-2MB of high-speed cache on a GPU would be insufficient to hold all the data required to render even one frame at a decent resolution. One approach to mitigating this issue with integrated GPUs is found in the Xbox 360, where there is 10MB of high-speed eDRAM integrated into the GPU die. Note however that the 360 only renders at one resolution all the time, and all 360s have the same amount of memory, so developers can target one set of specifications and tailor their applications to fit into the eDRAM. This approach would have a much harder time working on the PC platform because people expect to be able to use extreme resolutions with their high-end GPU, so the size of the eDRAM would have to increase significantly. Creating a much larger eDRAM block would make the die huge and is probably cost-prohibitive.

Note also that even with the inclusion of the eDRAM, the 360 still uses comparatively high-speed system memory (4x the bandwidth of current PC system memory) that is soldered directly to the system board to complement the cache. As others have pointed out, getting this kind of bandwidth onto a consumer-level motherboard with traditional DIMMs would increase the complexity of motherboards many times over, and would most likely increase the cost of DIMMs significantly because the DIMM interface would most likely have to be widened from 64 bits/channel to 128 or 256 bits per channel. Even then the trace length would still have to be addressed in order to allow the memory clockspeeds necessary. We are a long way off from having a Fusion GPU being the top dog. Midrange? Possible. But high-end is still a ways off.
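To put rough numbers on the "one frame at a decent resolution" point, here is a quick back-of-the-envelope sketch (assuming 4 bytes of color plus 4 bytes of depth per pixel and no AA; actual render-target layouts will differ, so treat the figures as illustrative):

```python
# Framebuffer footprint at a few resolutions, assuming 4 bytes of color and
# 4 bytes of depth per pixel (illustrative assumptions: no AA, textures excluded).

def framebuffer_mb(width, height, bytes_per_pixel=8):
    return width * height * bytes_per_pixel / (1024 ** 2)

for w, h in [(1280, 720), (1600, 1200), (2560, 1600)]:
    print(f"{w}x{h}: {framebuffer_mb(w, h):.1f} MB")

# 1280x720  -> ~7.0 MB  (fits the 360's 10MB eDRAM)
# 1600x1200 -> ~14.6 MB
# 2560x1600 -> ~31.2 MB (before counting textures, which dwarf the framebuffer)
```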

Thanks mate! I am glad that somebody understood my point.

Kobymu, 10:1? How about 9:1? :)
And then I'll run that one application in parallel with your ten and they will suddenly become bandwidth hungry :) Fair enough?
With more cores and more tasks, bandwidth will be a bigger issue.
 

josh6079

Diamond Member
Mar 17, 2006
3,261
0
0
So uh...

I guess eh.....

I guess the R600 will be 80nm's....

.........wonder what eh...

what clock speeds we'll see...
 

SexyK

Golden Member
Jul 30, 2001
1,343
4
76
Originally posted by: thilan29
Originally posted by: SexyK
In response to your follow-up, I think you're right - the total cost may be similar, but then you'll be stuck with one speed/amount of memory for as long as you're using that motherboard. It is much, much simpler to swap out your video card than to break down your whole build and replace the motherboard. I don't see the attraction to this type of solution. Even if you somehow overcome the engineering hurdles associated with GDDR memory slots, you're still stuck upgrading your motherboard when your GPU changes from a GDDR3 to GDDR4 interface rather than just yanking your PEG card and throwing in a new one.

On the other hand...the size and type of memory usually lasts longer than the core, so having for example 512mb of GDDR3 could last you through 2 core revisions. (ie. X1800XT 512 to X1900XT 512). Even between vendors the ram type and amount could stay the same (ie. 7900GTX and X1900XTX), so the only thing you would need to change would be the core. Only problem is they would need a common socket.

I would definitely be in favour of a common GPU socket with a set amount of GDDR3 on the motherboard (say 512mb currently), and then I can just replace the GPU core with whichever one I want in 6 months or whenever I upgrade, and then replace the motherboard for a major upgrade, which doesn't come as often as replacing a GPU. One good point I see is that you would be paying for the memory (which can be very expensive) only once, when you purchase the original motherboard. Whereas with buying discrete cards, you will pay for it each time you replace the card, even if it IS the same amount and type of ram.

I agree with this point and the similar point Dredd was making in the post above this, but only as it applies to lower-tier cards. I don't think this integrated memory concept would be useful in the high end. As an example, let's say you upgrade to the top-of-the-line GPU every generation (clearly this is an extreme enthusiast example, but that's who top-of-the-line cards are aimed at, right?). Looking at the memory bandwidth of nVidia and AMD/ATI top-end cards for reference:

6800 Ultra: 35.2 GB/s
7900GTX: 51.2 GB/s
8800GTX: 86.4 GB/s

9800XT: 23.36 GB/s
x850XT PE: 37.76 GB/s
x1950XTX: 64 GB/s

As you can see, people wanting to upgrade from a 7900GTX to an 8800GTX or an x850XT PE to an x1950XTX will be severely crippled if they are stuck with the same memory from one generation to the next. Historically, bandwidth increases dramatically from one generation's high-end to the next. Having a static amount and speed of memory would work well in the low end and probably the lower-mid-range; however, I don't think it's a viable option for high-end products.
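Those figures fall straight out of bus width times effective memory transfer rate. A small sketch (the bus widths and effective clocks below are my own assumptions from memory, not from the post, but they reproduce the GB/s numbers listed above):

```python
# Peak memory bandwidth = (bus width in bytes) x (effective transfer rate).
# Bus widths / transfer rates are assumptions for illustration.

def bandwidth_gb_s(bus_bits, mega_transfers_per_s):
    return (bus_bits / 8) * mega_transfers_per_s / 1000

cards = [
    ("6800 Ultra", 256, 1100),  # 256-bit GDDR3, ~1.1 GT/s effective
    ("7900GTX",    256, 1600),  # 256-bit GDDR3, ~1.6 GT/s effective
    ("8800GTX",    384, 1800),  # 384-bit GDDR3, ~1.8 GT/s effective
    ("x1950XTX",   256, 2000),  # 256-bit GDDR4, ~2.0 GT/s effective
]

for name, bits, mtps in cards:
    print(f"{name}: {bandwidth_gb_s(bits, mtps):.1f} GB/s")

# Note the 8800GTX jump comes from widening the bus to 384-bit as well as a
# faster clock, exactly the kind of change a fixed on-board memory layout
# could not absorb without a new motherboard.
```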
 

yacoub

Golden Member
May 24, 2005
1,991
14
81
Originally posted by: josh6079
So uh...

I guess eh.....

I guess the R600 will be 80nm's....

.........wonder what eh...

what clock speeds we'll see...

As long as they aren't cooking themselves at 80+C, I don't care. Just get me one that performs its best without being pushed to life-shortening temps and I'll be happy (assuming they don't just throw a loud, fast-spinning fan on it to achieve said temps).
 

thilanliyan

Lifer
Jun 21, 2005
12,040
2,256
126
Originally posted by: SexyK
As you can see, people wanting to upgrade from a 7900GTX to an 8800GTX or an x850XT PE to an x1950XTX will be severely crippled if they are stuck with the same memory from one generation to the next. Historically, bandwidth increases dramatically from one generation's high-end to the next. Having a static amount and speed of memory would work well in the low end and probably the lower-mid-range; however, I don't think it's a viable option for high-end products.

I see your point. I suppose my suggestion would only work for people who don't buy THE high end card of each generation.
 

BassBomb

Diamond Member
Nov 25, 2005
8,390
1
81
Originally posted by: yacoub
Originally posted by: josh6079
So uh...

I guess eh.....

I guess the R600 will be 80nm's....

.........wonder what eh...

what clock speeds we'll see...

As long as they aren't cooking themselves at 80+C, I don't care. Just get me one that performs its best without being pushed to life-shortening temps and I'll be happy (assuming they don't just throw a loud, fast-spinning fan on it to achieve said temps).

Knowing most manufacturers, the benefit from 80nm will be counterbalanced by the higher clockspeeds they will probably gun it at.
 

Wreckage

Banned
Jul 1, 2005
5,529
0
0
Originally posted by: josh6079
So uh...

I guess eh.....

I guess the R600 will be 80nm's....

.........wonder what eh...

what clock speeds we'll see...

Ya know, if they somehow had been able to pull off 65nm I think they could have really been back in the game in a big way.

I think the R600 will be a fine and shiny card. I also think it will be nothing new over what we got with the G80 last year.
 

Matt2

Diamond Member
Jul 28, 2001
4,762
0
0
Originally posted by: BassBomb
Originally posted by: yacoub
Originally posted by: josh6079
So uh...

I guess eh.....

I guess the R600 will be 80nm's....

.........wonder what eh...

what clock speeds we'll see...

As long as they aren't cooking themselves at 80+C, I don't care. Just get me one that performs its best without being pushed to life-shortening temps and I'll be happy (assuming they don't just throw a loud, fast-spinning fan on it to achieve said temps).

Knowing most manufacturers, the benefit from 80nm will be counterbalanced by the higher clockspeeds they will probably gun it at.

Yup. AMD (or Nvidia, for that matter) doesn't care that you don't like the temps on their card. When you're looking at the high end, they're going to achieve the fastest performance at the highest possible tolerable temp.
 

dreddfunk

Senior member
Jun 30, 2005
358
0
0
SexyK - Agreed. The only caveat I have comes in the form of two big "if's" to which I don't know the answer.

If the memory controller isn't on-die, then it could be placed on the motherboard. The bus width of the memory controller doesn't seem to change so often that it would concern me. If you want a high-end system you buy a motherboard with a wider memory bus.

Also, if memory becomes socketed (not soldered on the motherboard) then new memory with higher clock speeds could be purchased until the actual memory standard changed (a la DDR/DDR2, etc.).

Again, I agree however that there are serious limitations right now. The pace of innovation at the high end is simply too fast for it to settle down into very path-dependent practices such as socketing.
 

dreddfunk

Senior member
Jun 30, 2005
358
0
0
Matt2 - I agree with you there. Yet I also remember being shocked that the default fan speed on my x850xt was so slow. Sometimes I think that the vendors play a few games to keep the acoustics in check as well. If it was always about 100% performance, Sapphire would have set the default fan speed on my x850xt to 100% and not 10%.

As much as everyone wants performance, no one wants a leaf-blower. It can ruin a game's audio experience. Having said that, manufacturers don't want to spend a dime more on cooling solutions than they have to in order to achieve a certain performance level.

I also think you're spot on when you say that higher clocking negates much of the temperature advantage inherent in a die shrink. There is still too much innovation going on in the top-end for either nVidia or AMD to leave performance on the table in their high-end parts. At this stage, each die shrink is just an opportunity to increase performance, not thermals.
 

kobymu

Senior member
Mar 21, 2005
576
0
0
Some people here insist on not looking at the big picture. More and more common applications are much more sensitive to memory latency than to memory bandwidth.

Applications like heavy DSP workloads (such as video recognition and analysis, advanced image processing, and communications signal processing), advanced encryption, branch-heavy applications such as AI, and so forth.

"The point he is making is that without on-die cache every application would be bandwidth limited."

Well, of course with VERY LOW bandwidth applications will starve, but at the end of the day that is just a very bad argument; it is like saying that a modern GPU could still give close to the same performance with the same bandwidth but with lower clock frequencies. Even my sister and my dog need bandwidth, some kind of bandwidth.

The big point here is that CPUs don't need a fat pipe; CPUs don't need to be kept fed with a crapload of data every single moment.

It seems that some people here don't understand, in a fundamental way, what cache and prefetchers do.

When a CPU needs data, which can be of extremely small size, it needs it NOW; every clock that passes by without the data in hand is a wasted clock cycle. It is VERY COMMON in CPU applications that certain data is kept in the registers for over 100 cycles, and it is just common (not very) that certain data is kept in the registers for over 1000 cycles, and so forth. This doesn't happen in GPUs (with the exception of the geometric data); GPUs need new data every cycle!

The first paragraph in the wiki entry about CPU cache:
http://en.wikipedia.org/wiki/CPU_cache
"A CPU cache is a cache used by the central processing unit of a computer to reduce the average time to access memory.The cache is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations. As long as most memory accesses are to cached memory locations, the average latency of memory accesses will be closer to the cache latency than to the latency of main memory."

From the third paragraph in the wiki entry about Front_side_bus:
http://en.wikipedia.org/wiki/Front_side_bus
" The maximum theoretical bandwidth of the front side bus is determined by the product of its width (1), its frequency (2) and the number of data transfers it performs per clock tick (3)."

If CPUs were starving for bandwidth then ANY one of the three could have been increased (with varying difficulty and cost). And trust me on this one, cache is expensive, more so than increasing any of the three factors that make up the bandwidth; it takes up more than half of the die real estate, for crying out loud. And one of the bigger reasons the cache isn't getting enormously bigger is that making it bigger would result in bigger cache latencies.

The reason CPU engineers turn to this expensive solution is that latency, unlike bandwidth, is VERY HARD to lower. And I'm not talking about cost here; there are known technical limitations in the way memory cells operate, and they cannot be overcome without changing the way memory cells are designed and manufactured altogether.

It is much less expensive to integrate cache into the CPU (as expensive as it is) than to change the fundamental way memory operates, and that is assuming it can be technically achieved at all.
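Plugging numbers into the FSB formula quoted above makes the gap concrete. A sketch (the bus parameters are illustrative, roughly a quad-pumped "1066" FSB and dual-channel DDR2-667 of this era, not figures from the post):

```python
# Peak bandwidth = width x frequency x transfers per clock (x channels).

def peak_gb_s(width_bits, base_clock_mhz, transfers_per_clock, channels=1):
    return channels * (width_bits / 8) * base_clock_mhz * transfers_per_clock / 1000

print(peak_gb_s(64, 266, 4))     # ~8.5 GB/s  : 64-bit quad-pumped "1066" FSB
print(peak_gb_s(64, 333, 2, 2))  # ~10.7 GB/s : dual-channel DDR2-667

# Compare with the 50-86 GB/s the high-end GPUs above pull from wide
# GDDR3/GDDR4 buses: any of the three factors could be raised, it just
# hasn't been needed on the CPU side so far.
```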
 

yacoub

Golden Member
May 24, 2005
1,991
14
81
Originally posted by: Matt2
Originally posted by: BassBomb
Originally posted by: yacoub
Originally posted by: josh6079
So uh...

I guess eh.....

I guess the R600 will be 80nm's....

.........wonder what eh...

what clock speeds we'll see...

As long as they aren't cooking themselves at 80+C, I don't care. Just get me one that performs its best without being pushed to life-shortening temps and I'll be happy (assuming they don't just throw a loud, fast-spinning fan on it to achieve said temps).

Knowing most manufacturers, the benefit from 80nm will be counterbalanced by the higher clockspeeds they will probably gun it at.

Yup. AMD (or Nvidia, for that matter) doesn't care that you don't like the temps on their card. When you're looking at the high end, they're going to achieve the fastest performance at the highest possible tolerable temp.

Duh, of course they don't! And they hope it has a short life so you come back and buy another one sooner.
 

kobymu

Senior member
Mar 21, 2005
576
0
0
Originally posted by: SexyK
The point he is making is that without on-die cache every application would be bandwidth limited. Thus the fact that most PC applications work on small data sets, paired with powerful prefetchers and the presence of an extremely high-speed on-die cache, lessens the impact of lower bandwidth to system memory. Try turning off your L1 and L2 cache and see if applications are limited by anything other than memory bandwidth. The vast majority will struggle mightily. That is why Janooo is arguing that CPUs do need massive bandwidth - they need it, but only for a smaller data set which can be predicted and cached in L1 and L2 most of the time.

Small data sets <--> high bandwidth, see the problem?

Without L1 and L2 cache, the latency impact on the CPU will be so great that the CPU will stand idle most of the time, waiting for memory, even if it is just 1 byte worth of it.

According to Intel, one third (1/3) of all CPU operations are loads, which means the latency impact hits one of every three operations in a non-superscalar CPU. Assuming a total latency of 1, a non-superscalar CPU, and a CPU frequency of 1GHz, then without cache or prefetchers the CPU will stand idle for 1/3 of its cycles, being effectively a 667MHz CPU.

That assumes a total latency of 1. Total latency is the sum of all the latencies of the memory bank plus the latency of the bus; for example, good DDR will give you 2-3-3-5, which results in a total latency of 2+3+3+5 = 13 plus bus latency, so as you can see, latency is crucial for CPU performance. A total latency of 15 will make a 1GHz CPU without cache and prefetchers look like a 290MHz CPU.
http://en.wikipedia.org/wiki/RAM_latency
http://en.wikipedia.org/wiki/Memory_latency

And btw, right now there isn't a CPU that prefetches to L1 cache.
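For what it's worth, the same conclusion drops out of the standard stall-cycle arithmetic. A minimal sketch (this uses the textbook model of CPI = base CPI plus load fraction times stall cycles, which is not necessarily the exact arithmetic above; the precise figures depend on the model you pick, but the trend is the same):

```python
# Effective clock of a non-superscalar CPU when a fraction of instructions
# are loads that each stall for a given number of cycles.

def effective_ghz(base_ghz, load_fraction, stall_cycles, base_cpi=1.0):
    cpi = base_cpi + load_fraction * stall_cycles
    return base_ghz * base_cpi / cpi

print(effective_ghz(1.0, 1/3, 1))   # ~0.75 GHz if every load stalls one cycle
print(effective_ghz(1.0, 1/3, 15))  # ~0.17 GHz with a 15-cycle memory latency
print(effective_ghz(1.0, 1/3, 2))   # ~0.60 GHz with a short L1-style latency
```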
 

SexyK

Golden Member
Jul 30, 2001
1,343
4
76
kobymu - I'm not sure what your point is. Obviously you are knowledgeable about CPU memory subsystems, but the point of this discussion is that GPUs do require massive amounts of high-bandwidth memory, and currently CPUs do not offer that kind of high-capacity, high-bandwidth memory interface. You keep saying that CPUs don't need a fat pipe, but that is not the point of the discussion. The point is that if you stick a GPU onto the same die as a CPU (a la Fusion) and do not massively rearchitect the main memory subsystem feeding the CPU/GPU combination chip, the GPU portion of the core is going to be bandwidth starved. There is simply no denying that fact.
 

Regs

Lifer
Aug 9, 2002
16,665
21
81
I can summarize this thread -

Appoppin won't be able to boost details in oblivion
The high-end market holds prestige and dictates mid-range products' value.
 

apoppin

Lifer
Mar 9, 2000
34,890
1
0
alienbabeltech.com
Originally posted by: Regs
I can summarize this thread -

Appoppin won't be able to boost details in oblivion

why not?
:confused:

not sure who 'Appoppin' is ...but my rig runs Oblivion - actually Shivering Isles, right now ... with *everything maxed* at 14x9 :p

--even grass density is maxed

:lips:
 

kobymu

Senior member
Mar 21, 2005
576
0
0
Originally posted by: SexyK
kobymu - I'm not sure what your point is. Obviously you are knowledgeable about CPU memory subsystems, but the point of this discussion is that GPUs do require massive amounts of high-bandwidth memory, and currently CPUs do not offer that kind of high-capacity, high-bandwidth memory interface. You keep saying that CPUs don't need a fat pipe, but that is not the point of the discussion. The point is that if you stick a GPU onto the same die as a CPU (a la Fusion) and do not massively rearchitect the main memory subsystem feeding the CPU/GPU combination chip, the GPU portion of the core is going to be bandwidth starved. There is simply no denying that fact.
Once the GPU moves to the CPU die it is possible, even likely, that you will see new ways to handle rendering a 3D scene; CPUs have ways and means to process data that GPUs don't. I further stipulate that it is very likely that some of these new techniques will be able to do so with less bandwidth than a GPU.

A completely theoretical example - "Non ordered rendering":
-----------------------------------------------------------------------------------
Instead of "Scanline rendering"[*1] the CPU can perform some preemptive calculation on the geometric data on the 3D scene it is about to render to find A. the data and textures that are small enough to fit in to an easily accessed location (in order of ease of access to the CPU: XMM registers [*2], L1 cache, L2 cache); B. Find which texture is lightened by which light source and C.. Then find which pixels are rendered with the textures and light source data that can be fitted inside the CPU with ease, and so it can render them using the same data over and over, without the need to go in to the slow main memory.

Bear in mind that this technique would work better for 3D scenes that have the same light sources in many places and many objects that use the same texture.

[*1] http://en.wikipedia.org/wiki/Scanline_rendering
[*2] http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions
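A toy sketch of the grouping idea described above, just to make the data-reuse argument concrete (the data structures and names are mine, purely for illustration; this is not a real renderer):

```python
# Bucket primitives by (texture, light source) so each bucket's shared data
# can stay resident in registers/L1/L2 while all of its pixels are shaded.
from collections import defaultdict

def shade(pixel, texture, light):
    pass  # placeholder for the actual shading math

def render_by_bucket(primitives):
    """primitives: iterable of dicts with 'texture', 'light' and 'pixels' keys."""
    buckets = defaultdict(list)
    for prim in primitives:
        buckets[(prim["texture"], prim["light"])].append(prim)

    for (texture, light), prims in buckets.items():
        # The small shared working set (texture tile + light parameters)
        # is reused for every pixel in the bucket before moving on.
        for prim in prims:
            for pixel in prim["pixels"]:
                shade(pixel, texture, light)

render_by_bucket([
    {"texture": "brick", "light": "sun",  "pixels": [(0, 0), (0, 1)]},
    {"texture": "brick", "light": "sun",  "pixels": [(5, 5)]},
    {"texture": "grass", "light": "lamp", "pixels": [(9, 9)]},
])
```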
 

kobymu

Senior member
Mar 21, 2005
576
0
0
Originally posted by: SexyK
kobymu - I'm not sure what your point is.
I already said it but I will repeat myself: my second point is that developing a new way to give the CPU a LOT more bandwidth is well within reach RIGHT NOW (from a technological PoV); it was just never needed before. If it is needed, it will happen.
 

SexyK

Golden Member
Jul 30, 2001
1,343
4
76
Originally posted by: kobymu
Originally posted by: SexyK
kobymu - I'm not sure what your point is.
I already said it but I will repeat myself: my second point is that developing a new way to give the CPU a LOT more bandwidth is well within reach RIGHT NOW (from a technological PoV); it was just never needed before. If it is needed, it will happen.

Obviously if it is needed, it will be developed eventually. However, there are people on this board claiming that Fusion is arriving and will wipe out the discrete GPU add-in card on arrival within the next year or two. My only point in this whole discussion is that there are major hurdles to overcome and high-end graphics cores integrated into CPUs (a la Fusion) will need 5+ years of platform development before we even begin to discuss the demise of the add-in card.
 

kobymu

Senior member
Mar 21, 2005
576
0
0
Originally posted by: SexyK
Obviously if it is needed, it will be developed eventually. However, there are people on this board claiming that Fusion is arriving and will wipe out the discrete GPU add-in card on arrival within the next year or two. My only point in this whole discussion is that there are major hurdles to overcome and high-end graphics cores integrated into CPUs (a la Fusion) will need 5+ years of platform development before we even begin to discuss the demise of the add-in card.
If I were in an optimistic mood I would give it ~3 years; 5 years is a very "safe" gamble. ;)

The problem is, IMHO, that no one knows which way Intel will take it. Not a lot of decision-making people will be willing/comfortable going with an AMD-only solution; once Intel's CPU/GPU hybrid specs/docs are out (assuming that AMD reaches that point first), then we can start seeing some serious development taking place.