Will AGP8X shine with Double-DDR/RAMBUS using AGP Compressed-Texture swapping?

MadRat

Lifer
Oct 14, 1999
11,999
307
126
Surely the results of texture compression across 1x AGP are a prelim to what we can expect from 8x AGP. Intel is going to be pushing 8x AGP this next round if I remember correctly. Won't the huge bandwidth of either Double-DDR or RAMBUS make it very attractive performance-wise?
 

MadRat

Lifer
Oct 14, 1999
11,999
307
126
LeoV demonstrated that the AGP bus can be utilized to push textures very well. I'm guessing that system RAM at such a high bandwidth will be something to take advantage of in the future.
 

xtreme2k

Diamond Member
Jun 3, 2000
3,078
0
0
With AGP8x the bandwidth is still only about 2GB/s, which is still very little compared to the local memory bandwidth of today's advanced 3D cards. We are looking at 5GB/s+ on the DDR GeForces and Radeons, which are 'standard' cards today.

However, if a game developer uses a LOT of textures, say 300MB for a game, and swaps them in and out of the video card's memory during gameplay for additional realism, then AGP8x certainly helps.
 

MadRat

Lifer
Oct 14, 1999
11,999
307
126
I'd certainly think semi-realism, not FSAA, will be the future of gaming. No matter how often I see FSAA it still looks like a gimmick to me. I'd rather have curved depth perception (vs. straight lines of depth), transparency cubes, and multi-colordepth backgrounds than FSAA.

2GB/sec also sounds like more bandwidth than just about anything out a year ago.
 

xtreme2k

Diamond Member
Jun 3, 2000
3,078
0
0
FSAA is useful if you are stuck at 640 or 800 resolution. But honestly I'd rather play at 1280 or 1600 without FSAA, since at 640 or 800 you already lose SO MUCH polygon detail that FSAA won't do you crap. I would like to see games using more detailed texture maps, more polygons, and some transparency effects.

By the way, I 'thought' GeForces had the curved depth of vision, but it doesn't seem to be true in Q3. You can see the straight lines of different blendings.

One thing I really wanna see on game cards: Accelerated Anti-aliased Wireframe ;)

so that we can do some CAD at workstation performance :)
 

Leo V

Diamond Member
Dec 4, 1999
3,123
0
0
Just saw this thread...

...of course MadRat is right. When you're texturing across AGP, you're often bottlenecked by either AGP or memory bandwidth. The question isn't whether faster AGP/system memory will help, it's how much it will help.

Newer memory configurations (PC2100 DDR, dual RDRAM) roughly match or exceed AGP8X bandwidth (2133MB/sec), so we can predict ~2GB/sec to become the limit. The trick to AGP texturing (the way I see it) is to use AGP textures just complex enough to saturate the AGP bandwidth without slowing down the videocard. By complexity, I mean the number of unique visible texels (i.e. the amount of memory reads needed). However, given the increased fillrate of GeForce3, I would guess that AGP texturing will remain useful for limited-complexity textures (which really aren't that limited w/S3TC compression: even 2048x2048 mipmapped is easily feasible when it doesn't cover half the screen).
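For reference, the peak figures being compared here can be reproduced with simple arithmetic. This is a minimal sketch using the published clock rates and bus widths; the helper name `peak_mb_per_s` is made up for illustration, and real-world throughput sits well below these theoretical peaks.

```python
# Peak theoretical bandwidth in MB/s (1 MB = 10**6 bytes).

def peak_mb_per_s(clock_mhz, transfers_per_clock, bus_bytes, channels=1):
    """Peak rate = clock x transfers per clock x bus width x channels."""
    return clock_mhz * transfers_per_clock * bus_bytes * channels

AGP_CLOCK_MHZ = 200 / 3  # 66.67 MHz base clock on a 32-bit (4-byte) bus

# AGP modes differ only in the number of transfers per clock.
agp = {f"AGP {n}X": peak_mb_per_s(AGP_CLOCK_MHZ, n, 4) for n in (1, 2, 4, 8)}

memory = {
    # 133.33 MHz, 64-bit (8-byte) bus, single data rate
    "PC133 SDRAM": peak_mb_per_s(400 / 3, 1, 8),
    # 133.33 MHz, 64-bit bus, double data rate
    "PC2100 DDR": peak_mb_per_s(400 / 3, 2, 8),
    # 400 MHz, double data rate, 16-bit (2-byte) channel, two channels
    "Dual PC800 RDRAM": peak_mb_per_s(400, 2, 2, channels=2),
}

for name, rate in {**agp, **memory}.items():
    print(f"{name:18s} {rate:7.0f} MB/s")
```

On paper, PC2100 DDR lands exactly on the AGP 8X peak, while dual-channel PC800 RDRAM leaves some headroom for the CPU.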

The good news seems to be the reduced CPU idleness--the same AGP reads will take less time to complete. This means if you're operating w/limited-complexity compressed textures, you still get (almost) your full video fillrate, while reducing wastage of CPU cycles (unless you decided to use AGP textures 2x as frequently as today w/AGP4X). I definitely think it'll help, esp. for newer cards.
 

MadRat

Lifer
Oct 14, 1999
11,999
307
126
Now imagine interleaved QMR (Quad Memory Rate, aka Double-Double-Data Rate) on the motherboard. Maybe not... Wouldn't the higher bandwidth of interleaved memory be restricted by AGP8X, though, so it probably wouldn't help then? We can only hope it wouldn't.
 

sandorski

No Lifer
Oct 10, 1999
70,790
6,349
126
AGP is a failed technology that was conceived at a time when onboard vidcard ram was thought to be too expensive. This cost, it was theorized, would keep onboard ram to a low volume, so AGP was introduced to give vidcards access to the ram they needed. However, vidcards now have more ram onboard than most systems had at the time of AGP's conception.

1x, 2x, 4x, 8x, who cares? Vidcard makers are not going to rely on it, game makers are not going to implement it. If the AGP slot was removed, no one would notice the difference (unless they tried to install an AGP card in a PCI slot, of course :) ).

IMO, I'm just tired of the "promise" of marketing technologies. Where are the wonders of MMX, 3dnow (Q2 excepted), SSE/2, and AGP? I'm all hyped out! :(
 

Soccerman

Elite Member
Oct 9, 1999
6,378
0
0
"2GB/sec also sounds like more bandwidth than just about anything out a year ago."

my Voodoo 3 even beats 2 Gigabytes a second.. how many years old is it?

AGP will stay, only as a 'safeguard' for programs that absolutely have to have that last bit of memory bandwidth (doesn't matter where it comes from) for textures..

in today's games, even if you needed to use AGP texturing, you'd be competing a bit (not too much) with the CPU for use of the AGP bus. after all, AGP 2x is about the sweet spot for all games today, and pretty much no games use AGP texturing.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
sandorski-

"1x, 2x ,4x, 8x who cares? Vidcard makers are not going to rely on it, game makers are not going to implement it. If the AGP slot was removed, no one would notice the difference(unless they tried to install an AGP card in a PCI slot, of course )."

Wrong. The problem with this statement, and this line of thought (which is quite popular), is that you are only looking at one very small, hardly used advantage of AGP. The assumption that AGP texturing doesn't help in texture limited situations is absolutely false.

For evidence, look to early 5.0x Det drivers and disable texture compression. Then run Quaver at 10x7 32bit UHQ on any 32MB board. You will be pushing roughly 10FPS-20FPS. In the 5.0X drivers AGP texturing was disabled under OpenGL, causing a massive performance hit. After that, try the exact same test under 5.2X or later Dets and watch the FPS jump to the 35FPS-60FPS range, a roughly 300% improvement overall and 900%+ in worst-case FPS, the most important area of measurement.

Ignoring texturing completely, vertex data and textures must be uploaded to the gfx card. Even AGP 1X offers far more than the theoretical doubling of performance over PCI. On the PCI bus everything is fighting over the same minuscule 133MB/s of bandwidth, enough to handle about 2.5 million vertices/sec if nothing else is on the PCI bus and there are no textures being uploaded. In the real world, you have modems/NICs, sound cards and the like taking up additional bandwidth, not to mention that you do need to move texture data, so your likely peak is probably closer to 1.5 million vertices/sec.
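The vertex figures above follow from dividing bus bandwidth by the size of a vertex. A rough sketch of that arithmetic is below; the vertex size and the share of the bus left for geometry are not stated in the post, so they are back-solved here from the quoted 2.5 and 1.5 million figures and should be read as assumptions.

```python
def vertices_per_second(bus_mb_per_s, bytes_per_vertex, share_for_vertices=1.0):
    """Vertices/sec a bus can carry when only part of it is free for geometry."""
    return bus_mb_per_s * 1e6 * share_for_vertices / bytes_per_vertex

PCI_MB_PER_S = 133.0

# Vertex size implied by "about 2.5 million vertices/sec" on an idle PCI bus.
implied_bytes_per_vertex = PCI_MB_PER_S * 1e6 / 2.5e6

# Share of the bus implied by the "closer to 1.5 million" real-world figure.
implied_share = 1.5e6 / 2.5e6

print(f"implied vertex size: {implied_bytes_per_vertex:.1f} bytes")
print(f"implied bus share for vertices: {implied_share:.0%}")
print(f"PCI, contended: "
      f"{vertices_per_second(PCI_MB_PER_S, implied_bytes_per_vertex, implied_share) / 1e6:.2f} M vertices/sec")
```

Under these assumptions, the drop from 2.5 to 1.5 million corresponds to other devices and texture uploads taking roughly 40% of the PCI bus.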

Take a look at a game like Giants or something like the NV15 level for Quake3 and you are talking sub 20FPS performance no matter how fast your CPU or video card on the PCI bus. What's more, polygon and texture complexity is going up quite quickly right now, Giants was only the beginning.

The biggest problem with AGP has been system bandwidth. That 1GB/sec AGP 4X spec sounds impressive, but it relies on system memory bandwidth, which for the vast majority of AGP 4X systems only matches the peak rate of AGP 4X, so nothing else can be using any bandwidth if that transfer rate is to be reached. Quake3 paired with the PIV gives a good example of this: look at the spike in FPS when using dual channel RAMBUS over single channel, a good indicator that system memory bandwidth needs are strangling the available bandwidth of the AGP interface. DDR and RAMBUS are needed to fully exploit AGP 4X.
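The contention argument can be put as a one-line model: AGP can move no more than whatever system memory bandwidth is left after the CPU and other devices have taken theirs. This is a deliberately simplified sketch; the function name and the 500MB/s of "other traffic" are hypothetical values chosen only for illustration.

```python
def effective_agp_mb_per_s(agp_peak, memory_peak, other_traffic):
    """AGP throughput is capped by the bus itself and by leftover memory bandwidth."""
    available = max(memory_peak - other_traffic, 0.0)
    return min(agp_peak, available)

AGP_4X = 1066.0          # MB/s, theoretical peak
OTHER_TRAFFIC = 500.0    # MB/s used by CPU and devices (assumed, not measured)

# Single-channel memory that only matches AGP 4X (e.g. PC133 SDRAM at 1066 MB/s).
print(effective_agp_mb_per_s(AGP_4X, 1066.0, OTHER_TRAFFIC))

# Dual-channel PC800 RDRAM at 3200 MB/s leaves AGP 4X able to hit its own peak.
print(effective_agp_mb_per_s(AGP_4X, 3200.0, OTHER_TRAFFIC))
```

In this model the first system tops out at about half the AGP 4X rating, while the second is limited only by the AGP bus.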

AGP 1X right now is too slow for the most recent games. I'm not talking about two year old titles like Quake3, I'm speaking of actual up to date games. AGP 8X is going to be needed in the not too distant future, although increased system memory bandwidth will also be needed to properly exploit the technology. PCI gfx are no longer viable unless you want to stay playing outdated or non 3D games.
 

Soccerman

Elite Member
Oct 9, 1999
6,378
0
0
"DDR and RAMBUS are needed to fully exploit AGP 4X."

Benskywalker, the games today are limited by the memory because when the CPU sends its data to the Video card, the RAM has to see it too, subsequently, you'll be limited by RAM, correct?

and in the upcoming T&L games, the CPU has to send quite a bit less through the AGP bus correct?
 

jpprod

Platinum Member
Nov 18, 1999
2,373
0
0
"and in the upcoming T&L games, the CPU has to send quite a bit less through the AGP bus correct?"

It's true that AGP bandwidth utilization is reduced when moving from software T&L to hardware T&L, simply because the geometry transformed by the T&L accelerator can (though not necessarily) reside in local video memory instead of system memory. This bandwidth requirement can be further reduced on DX8 hardware by using vertex shaders. Complex geometry modifications no longer require uploading the entire geometry over AGP; it can reside completely in local video memory (on a side note, could this be one of the reasons why there isn't a 32MB GeForce3?).

However, as texture pools in games grow larger, filtering techniques advance (32- and 64-tap anisotropic are supported by GF3) and more developers begin to let DX8 manage the textures instead of writing their own routines, the need for AGP bandwidth will increase. AGP8X could ease the burden if it wasn't for the other bottleneck... As Benskywalker mentioned, even AGP4X can't reach its peak on today's hardware because there just isn't free system memory bandwidth available.

As we all know, neither DDR SDRAM nor DRDRAM is living up to its claimed bandwidth in real life. Perhaps this will change in the future as well: upcoming chipsets such as Micron's Mamba will sport a prefetch cache, which should greatly improve the efficiency of sequential memory operations. And texture transfers over AGP are just about as sequential as operations get :)
 

Leo V

Diamond Member
Dec 4, 1999
3,123
0
0
Uh-oh! :Q I sense another 300-post AGP monster thread coming on... I'll try to sum up the points as politically as I can:

* AGP bandwidth doesn't replace local bandwidth with DIME texturing. It's only used for transferring texture data.
* In the case of compressed textures, such data may take as little as <10% of the total used bandwidth! Comparing AGP bandwidth to Voodoo local bandwidth is completely pointless.
* As a result, it's possible to actually have faster framerates, when AGP texturing offloads stress on local memory (as in the case of GeForce2-family cards). It's not common, but it illustrates my point--AGP texturing does not necessitate poor performance at all.
* 256MB of AGP-mapped system memory is more than 64MB (actually, more like 40MB after frame & vertex buffers) of onboard texture storage. It supplements local memory for realism, but doesn't replace it for most common textures.
* The necessity of faster AGP busses for improved T&L throughput is perhaps even more urgent. Hardware T&L requires much more vertex data to be passed, especially with dynamic geometry (what we want.)
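To put the storage figures in the list above in perspective, here is a small sketch that sizes a 2048x2048 mipmapped texture with and without compression. It assumes the S3TC variant in use is DXT1 (8 bytes per 4x4 texel block) and compares it against uncompressed 32-bit RGBA; the helper names are made up for this example.

```python
def mip_chain(width, height):
    """Yield (width, height) for every mip level down to 1x1."""
    while True:
        yield width, height
        if width == 1 and height == 1:
            return
        width, height = max(width // 2, 1), max(height // 2, 1)

def rgba8_bytes(w, h):
    """Uncompressed 32-bit texels: 4 bytes each."""
    return w * h * 4

def dxt1_bytes(w, h):
    """DXT1 stores each 4x4 block in 8 bytes; partial blocks round up."""
    return ((w + 3) // 4) * ((h + 3) // 4) * 8

def texture_bytes(width, height, level_bytes, mipmapped=True):
    levels = mip_chain(width, height) if mipmapped else [(width, height)]
    return sum(level_bytes(w, h) for w, h in levels)

MIB = 2 ** 20
raw = texture_bytes(2048, 2048, rgba8_bytes)
dxt = texture_bytes(2048, 2048, dxt1_bytes)
print(f"RGBA8 with mipmaps: {raw / MIB:.2f} MiB")
print(f"DXT1 with mipmaps:  {dxt / MIB:.2f} MiB")

# How many such compressed textures fit in ~40MB local vs 256MB AGP-mapped memory.
print("fit in 40 MiB: ", (40 * MIB) // dxt)
print("fit in 256 MiB:", (256 * MIB) // dxt)
```

The 8:1 ratio between the two formats is what makes very large textures practical over AGP at all.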
 

Leo V

Diamond Member
Dec 4, 1999
3,123
0
0
PS: I see Ben and jukka already chipped in a fair share of wisdom! :)

Returning to MadRat's original point though...it makes sense to expect slightly improving results, even as system memory bandwidth begins to exceed AGP bandwidth, if only because system bandwidth is wasted in many directions and AGP contends for it. I think T&L might become a bigger "customer" of improving AGP bandwidth than texturing (though both offer a lot). Even AGP8X will theoretically max out at ~60 Mtris/sec of dynamic geometry, and much less in practice given its other uses.
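A figure in the region of 60 Mtris/sec falls out of the AGP8X peak once a vertex size is picked. The sketch below assumes a 32-byte vertex (position, normal and one texture coordinate set as 32-bit floats) and long triangle strips that add one new vertex per triangle; neither assumption comes from the post.

```python
AGP8X_MB_PER_S = 2133.0

def tris_per_second(bus_mb_per_s, bytes_per_vertex, new_vertices_per_tri):
    """Triangles/sec if the whole bus carried nothing but fresh vertex data."""
    return bus_mb_per_s * 1e6 / (bytes_per_vertex * new_vertices_per_tri)

BYTES_PER_VERTEX = 32  # assumed: 3 floats position + 3 floats normal + 2 floats UV

strips = tris_per_second(AGP8X_MB_PER_S, BYTES_PER_VERTEX, 1)   # long strips
lists = tris_per_second(AGP8X_MB_PER_S, BYTES_PER_VERTEX, 3)    # independent triangles

for label, rate in (("strips", strips), ("lists", lists)):
    print(f"{label}: {rate / 1e6:.1f} Mtris/sec, "
          f"{rate / 60 / 1e6:.2f} Mtris per frame at 60 FPS")
```

Sending independent triangles instead of strips cuts the ceiling to a third, which is one reason the practical number ends up much lower.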
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Soccerman-

"Benskywalker, the games today are limited by the memory because when the CPU sends its data to the Video card, the RAM has to see it too, subsequently, you'll be limited by RAM, correct?"

Depends on the particular implementation being used. My understanding is that LDT will not be tied in with system memory directly. Even so, the CPU needs to pull the data to compute/transfer out of system RAM in the first place so it will be a bottleneck one way or the other. Edit- Also remember that we have game code that needs to be processed so the system RAM is likely fairly well strapped even before we touch any graphics code./Edit

"and in the upcoming T&L games, the CPU has to send quite a bit less through the AGP bus correct?"

Catch22 here. You can reduce the amount of traffic over the AGP bus by using board level RAM to cache vertex data, but that gives you a bandwidth hit on the gfx card. If you utilize AGP instead of local memory, then you run the risk of flooding the bus. Which way to go? Also remember, even if you do decide to use on board memory to reduce traffic over the AGP bus, you are going to be looking at some dramatic increases in poly counts when X-Box ports start coming to the PC, so the need for increased AGP bandwidth is going up one way or the other.

Leo and jukka-

While I certainly enjoy those 300+ post monster threads, is there anyone left to argue with us???:)
 

Electric Amish

Elite Member
Oct 11, 1999
23,578
1
0
Also, won't the memory bottleneck on the Vid Card have to be solved first? Maybe the NV20 has fixed that a little?? Too soon to tell, probably...

amish
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
"Also, won't the memory bottleneck on the Vid Card have to be solved first?"

It depends on what you want to do. If you ignore FSAA then the problem is solved, the GF2U and GF3 are pushing FPS at 1600x1200 beyond the refresh rates of monitors for most games. Add FSAA and even then the GF3 is still not too far off from being able to do it. I wouldn't be surprised to see the next gen pushing 16x12 with FSAA at over 60FPS. Also, the GF3 has some really effective fillrate saving measures that will show up later on, but require developer support.

Simplified, the NV20 is designed so it can render things front to back and "throw away" anything not visible. The problem with that for current titles is that things need to be rendered in order of their position relative to the viewpoint (you), or else it could render everything you don't see before it knows you aren't going to see it.
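Why draw order matters can be shown with a toy model of a depth test at a single pixel. This is only an illustration of the general idea, not a description of how the NV20 hardware actually works; the function name and the sample depths are invented.

```python
def shaded_fragments(depths_in_draw_order):
    """Count fragments that pass a less-than depth test at one pixel.

    Each fragment that passes is shaded (costing fillrate) and becomes the
    new nearest depth; fragments behind it are thrown away unshaded.
    """
    nearest = float("inf")
    shaded = 0
    for depth in depths_in_draw_order:
        if depth < nearest:
            nearest = depth
            shaded += 1
    return shaded

# Five surfaces covering the same pixel, at arbitrary depths (smaller = closer).
layers = [5.0, 1.0, 3.0, 2.0, 4.0]

print("front to back:", shaded_fragments(sorted(layers)))
print("back to front:", shaded_fragments(sorted(layers, reverse=True)))
print("unsorted:     ", shaded_fragments(layers))
```

Sorted front to back, only the closest surface is ever shaded; sorted the other way, every surface is shaded and then overwritten, which is where the wasted fillrate goes.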

After that, we start adding features. Additional features will increase fillrate demands, most definitely, but at that point we will need to have more system bandwidth available or we could be in a bit of a rut (we will have the T&L power, the fillrate, and the general features to do incredible things).

I have thought about this quite a bit, and I felt that we should implement an AGP specific DIMM or RIMM. Throw in even a 64MB stick of RAM and have that as dedicated AGP memory; that way you free up system memory from additional bandwidth needs, and you can also match up the RAM to push the theoretical peak of the AGP slot.
 

MadRat

Lifer
Oct 14, 1999
11,999
307
126
Yes, I don't like to sync because my cat-like reflexive responses become hindered. :)
 

sandorski

No Lifer
Oct 10, 1999
70,790
6,349
126
Ok, I may be wrong. Or I "may not have all the facts" or "I may not know what I'm talking about" or "I don't play many games that utilize AGP well" or "I ain't no rocket scientist". I just prefer my ram on the vidcard, the way God intended it! :)
 

Soccerman

Elite Member
Oct 9, 1999
6,378
0
0
"Also, the GF3 has some really effective fillrate saving measures that will show up later on, but require developer support."

why am I not surprised they're using their 3dfx engineers to good use? :)

in all seriousness though, I didn't say that using the AGP at all for texturing is bad, it's great for supplementing the onboard RAM (which is overloaded as it is with all sorts of non-texture stuff).

but nonetheless, they (game developers) don't appear to even try using AGP port bandwidth, for a few reasons:
-not all video cards run on an AGP port
-it relies on the already busy-as-hell system RAM (running lines of code at the minimum)
-until Hardware T&L is implemented the AGP bus is already being used to transfer massive amounts of data..

that Tom's Hardware preview of the GF3 really helped me see more of the complexity of the situation.. though they made some quite obvious mistakes, and most likely more mistakes that I wouldn't know about.

1) they quite clearly stated that the V5 method of FSAA is an 'oddball' approach to that. they mentioned that you cannot take a screenshot of FSAA on a V5. instead you'll just get a 'normal' picture.

2) there were a few odd things about nVidia's approach to FSAA (DaveB3D pointed them out in the video card forum).

all this makes me want to go smack Tom for making stuff up, because now after reading what could have been an eye opening article for the common person in understanding how this all works, we now don't know what parts he got right.

are there more articles that go into detail like Tom's (even moreso if possible, but in layman's terms as well!) but don't have mistakes??
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
"are there more articles that go into detail like Tom's (even moreso if possible, but in layman's terms as well!) but don't have mistakes??"

Not yet. The only article that I am aware of that doesn't have mistakes in it is FS, but it doesn't go into much detail at all.

"Wait, doesn't the refresh rate only come into play if the vidsynch is turned on?"

Well, if you are refreshing at 75Hz then your monitor is only showing 75FPS. Keeping VSync off and getting higher FPS is better as far as response, but you aren't going to see anything over your refresh rate.
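The distinction between what the monitor shows and how quickly the game responds can be sketched in a few lines. This is a simplified model with made-up helper names: it treats the visible frame rate as capped by the refresh rate and uses frame time as a stand-in for how fresh the rendered input is.

```python
def displayed_fps(rendered_fps, refresh_hz):
    """The monitor cannot show more complete frames than it refreshes."""
    return min(rendered_fps, refresh_hz)

def frame_time_ms(fps):
    """Time between consecutive rendered frames, in milliseconds."""
    return 1000.0 / fps

REFRESH_HZ = 75
for rendered in (60, 75, 100):
    print(f"rendering {rendered:3d} FPS -> "
          f"showing {displayed_fps(rendered, REFRESH_HZ)} FPS, "
          f"new frame every {frame_time_ms(rendered):.1f} ms")
```

With VSync off, a card rendering above the refresh rate still samples input and produces frames more often, even though the number of frames actually shown stays at the refresh rate.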
 

MadRat

Lifer
Oct 14, 1999
11,999
307
126
There is a definite difference in response time when we are talking 75fps vs. 100fps. I notice it, don't know about others. Personally, I don't need to see it to feel it. :)