Poor console CPU performance, claim game devs


BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Minuscule? A single 7800 has over 38 GB/sec bandwidth while two of them have almost 80 GB/sec effective bandwidth. I'm no math professor but even I can see ~22 < ~38 < ~80.

You are ignoring the eDRAM.

At 1920x1080 with 4xAA you are going to be writing a lot out to system RAM.

Tiling- running 1920x1080 w/400x AA you would only be writing out 949.22MB/sec at 120FPS to system RAM the way the XBox is setup (actually, using the peak rate for the XBox360, it won't exceed 210.98MB/sec)- a real staggering amount there. Are you trying to be obtuse here? Read up on the exact functionality of the R500 and how the eDRAM plays into that.
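Since you seem to want the math spelled out, here is a rough sketch of where that write-out figure comes from (assuming a 32-bit final color buffer and that only the resolved frame ever leaves the eDRAM- the exact format is my assumption):

# Back-of-the-envelope: resolved-frame write-out for the XB360's tiled setup,
# assuming only the final 32-bit color buffer is written to system RAM.
width, height = 1920, 1080
bytes_per_pixel = 4          # 32-bit final color
fps = 120

frame_mb = width * height * bytes_per_pixel / (1024 ** 2)
print(f"{frame_mb:.2f} MB per resolved frame")        # ~7.91 MB
print(f"{frame_mb * fps:.2f} MB/sec at {fps} FPS")    # ~949.22 MB/sec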

Hypothetical example: would you like a miniOGL that needs to be updated with game-specific code every time a new game comes along or a full OpenGL ICD that is generic and robust enough to run anything that's thrown at it?

I certainly don't recall you ever singing praises about 3dfx, so why the double-standard when nVidia's past actions were effectively creating miniGL and miniD3D drivers?

It isn't an either or. What about a robust general purpose driver with specific optimizations added on top of that? That is what we are talking about here.

You mean like when nVidia were caught rendering AA on only every alternating frame?

IIRC that was simply the capture functions not being able to do framegrabs the same way that they were used to- and that needed to be worked around(this is IIRC, else why not support temporal AA now if they already have the code base for that level of cheat lying around).

If you're going to compare a game with three animated characters on the screen at once (like a typical console) you need to be looking at something like Dawn, Nalu or Dusk.

Put them in a game and I will. As far as games with comparable amounts of characters- how about DooM3 (not saying DOA looks better by any means- but it certainly isn't like PCs don't have more than their fair share of titles that lack lots of simultaneous characters)?
 

blckgrffn

Diamond Member
May 1, 2003
9,686
4,345
136
www.teamjuchems.com
Poor console CPU performance, claim game devs <--- I think that this discussion has moved pretty far out of what it was initially about.

I'll admit that both of you (Ben & BFG) know a great deal more than I do about graphics hardware. I would like to know more, though. Where do you get your information? Forums like these on a different site? I would be really grateful if either of you could help me discover more and not make embarrassing statements :eek:

There are some areas in which I have had a decent amount of education, however, and they pertain better to what this topic was originally about.

Fact:
Deeply pipelined processors w/poor branch prediction are crippled by poor efficiency and stunning amounts of time lost to branch misprediction.
End Fact.

Well, this is what both consoles have.

What doesn't help is that these processors were designed on the cheap from the get go. For such a small die size, nothing else could be true. I'll venture to say that 3.2 is about the best these processors could do. Why? Well, there is a lot of complex circuitry in everyone's favorite long-pipelined processor, the P4, to handle the fact that electricity at some point can't travel across the die fast enough, and thus certain operations are given two or more cycles to complete while the signal gets sent to a middle point on one heartbeat and to the destination on the next. Adding this to a processor is no doubt expensive, and every added transistor contributes to a larger die size. I would say that IBM kept this kind of effort to a minimum and 3.2GHz was the sweet spot. The disadvantage here is that with this kind of processor clock speed is extremely critical, and since they are sitting around trying to push stuff down the pipe a lot of the time, higher clock speed could be a way to mitigate the effects of ifs and ors. 3.2GHz isn't really slow, but hardly anyone could call this extremely fast looking at the kind of hardware it is.
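To put a rough number on that, here is a toy model- every input in it is an illustrative guess on my part, not a published spec for these chips:

# Toy effective-CPI model for a deeply pipelined in-order core.
# Every input here is an illustrative assumption, not a published spec.
base_cpi = 1.0          # ideal cycles per instruction
branch_freq = 0.15      # assume ~1 in 7 instructions is a branch
mispredict_rate = 0.10  # assume a weak predictor
penalty = 20            # assume a ~20 cycle pipeline refill

effective_cpi = base_cpi + branch_freq * mispredict_rate * penalty
print(f"effective CPI: {effective_cpi:.2f}")                   # 1.30
print(f"throughput lost: {1 - base_cpi / effective_cpi:.0%}")  # ~23%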

Another saving grace would be a big, fast cache. The ones the Xbox and PS3 are graced with certainly aren't big (1 meg and 512K of L2, respectively), and I would venture to say they aren't that fast, either. Any idea how many ways set associative they are? My guess is 2 or 4 to keep it cheaper, but if you have any information on this, that would be great. Obviously, if it were something like fully associative, I would retract most of my argument, but I highly doubt that. So, that leaves us with a smallish cache of unknown but probably lower speed SRAM.

On the xbox, we now have three cores going over a northbridge to main memory. I remember reading about how much bandwidth they have, which was impressive but not the most important factor. There are two other important things to consider - the way in which this memory is accessed by the cores and its latency. Having a northbridge at all is something of a hindrance, but obviously we have been using that tech for quite some time without issue. We are probably talking something like 80ns to access optimally. The real trick is how the cores will reach the memory. The best would be a crossbar, and I imagine this is what is being implemented. This would give all three cores access to main memory simultaneously for the most part. If, however, it looks like the Pentium D approach, well, that would be ugly, but it is cheap and simple. Does anyone have any more information on this?
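And just to show why the latency matters more to me than the headline bandwidth number, using the ~80ns guess from above:

# Core clocks wasted on a main-memory access at 3.2 GHz, using the ~80 ns
# latency guessed at above; an in-order core mostly just sits there for these.
clock_hz = 3.2e9
latency_ns = 80

stall_cycles = latency_ns * 1e-9 * clock_hz
print(f"~{stall_cycles:.0f} core cycles per trip to main memory")   # ~256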

You claim that multithreading will overcome these physical and really undeniable limitations.

Well, in some situations it will be appropriate to use multiple threads; some instances that have been mentioned are sound and physics. Yes, these are realistic tasks to split off from the main process.
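Splitting those off is the easy part- something like this minimal sketch is about all it takes for the naturally parallel bits (the sleeps are just stand-ins for real subsystem work); it's everything else that gets expensive:

# Minimal sketch: physics and audio each get a worker thread while the main
# loop keeps going. The sleeps are stand-ins for real subsystem work.
import threading
import time

def physics_worker(stop):
    while not stop.is_set():
        time.sleep(1 / 60)       # stand-in for a 60 Hz physics step

def audio_worker(stop):
    while not stop.is_set():
        time.sleep(1 / 100)      # stand-in for audio mixing

stop = threading.Event()
for worker in (physics_worker, audio_worker):
    threading.Thread(target=worker, args=(stop,), daemon=True).start()

for _ in range(300):             # main thread: pretend to render 300 frames
    time.sleep(1 / 60)
stop.set()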

Even to do that, we are going to be at the mercy of a developer's budgeting department if in fact we are to see any of this. Under the traditional SDLC, adding the sort of complexity that we are talking about (attempting to multithread an application that doesn't lend itself naturally to it) will add exponential costs in terms of development time and cost. Ben, I don't think that you are grasping this. There will be many coders you will deem lazy in the first couple of generations for their lack of optimizing their code. What it will really be is the dev house trying to put a game out on a realistic budget and timeframe telling the coders to get it all checked in, because once it is running acceptably (and that standard seems to be astonishingly low in some cases on consoles) it will be shipped out. Real world economics has to play a factor in this. Also, we don't want to see 3-4 year development times here. That is nearly the lifetime of the console. 1-2 years is pretty long as it is, but to really, really multithread a project on the scale we are talking, that is what it could potentially take to get a polished product out the door. On the console side, I guess if the game is a Halo or GTA this could be feasible, but for many titles it would not be.


Last is this minor issue that bugs me about HDTV and the future of our media. This is off topic here, but the TV you linked me to still doesn't have HDMI. That means it is useless for HD-DVD for sure, and almost assuredly for Blu-Ray as well. It would be foolish to invest in a TV that doesn't have that feature at this point. If the xbox360 wants to ever have an HD-DVD drive, it will have to worry about DRM too. Horrible stuff, really. To enjoy these consoles on a decent TV you are going to have to shell out some big bucks at this point, bottom line.

Again, I would still really appreciate access to the sites/articles/books that have given you such deep insight into the technical workings of the graphics hardware of today's consoles and PCs. :)

Nat
 

kylebisme

Diamond Member
Mar 25, 2000
9,396
0
0
Originally posted by: BenSkywalker
I played it side by side and I posted pictures for those who couldn't, and the commonly accepted conclusion was that you are wrong.

Here is GameSpot's XBox screenshot with active camo on. It doesn't look anything at all like the day-glo effect you claim your "XBox screenshots" show. I figure linking to a neutral 3rd party eliminates dishonesty and doctoring of images.

Oh please, after last time when you accused me of not even owning an xbox and I posted pictures of it and the receipt, you are going to accuse me of dishonesty again? The camo is clearly wearing off in that gamespot shot, so it is obviously not going to look the same as a fully camo'ed player on the PC. I didn't doctor the xbox shots I presented in the previous thread in any way, and I can upload new ones if needed.
 

Nickrand

Member
Sep 4, 2004
67
0
0
I have to believe that 99% of the people here believe the truth is somewhere in the middle on many of the key points. Let me take my stab:

1. the multi-core cpu's are relatively cheaply produced but for the cost are going to be quite powerful.
2. The XBOX360 and PS3 will be very, very powerful. They won't be the most powerful pieces of equipment on the planet, but I wouldn't be surprised if you needed something like 2 6800's in SLI to compete.
3. Price of the console will be ridiculously low for the power you get.
4. Programming/coding will be difficult since it is new technology that many are not efficient at. May drive game costs and lead times up. But the companies that work the hardest at it, learn it the best, fastest, will make great games in relatively short time periods.


My point is - people are playing extremes here and it likely won't be either extreme (i.e. cheap piece of crap that no one buys cuz it sucks or the death of pc gaming).
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Where do you get your information? Forums like these on a different site?

If you are really interested in learning the finer nuances of 3D then I would start with Computer Graphics: Principles and Practice (Foley and van Dam), and when you finish with that you will quickly figure out where to pick up the rest.

Deeply pipelined processors w/poor branch prediction are crippled by poor efficiency and stunning amounts of time lost to branch misprediction.

I can tell you are from the younger generation ;) That isn't meant to be a dig at all, just there are quite a few of us who cut our teeth writing code long before there was anything like a BPU on a processor- deep pipelines don't matter much if you know how to generate in-order code everywhere possible (sometimes the way you have to do things takes significantly longer on an IOE core than it would using a branch on an OoO core, but it is much faster than having a branch on an IOE).
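A trivial illustration of the mindset, in Python rather than PPC assembly (this isn't how you'd write it for either chip- it's just the branchy vs branch-free transform):

# Branchy clamp: the core has to predict (or stall on) each comparison.
def clamp_branchy(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x

# Branch-free clamp: compute both outcomes and blend with 0/1 masks- the kind
# of rewrite an in-order core with no real BPU rewards.
def clamp_branchless(x, lo, hi):
    below = int(x < lo)
    above = int(x > hi)
    keep = 1 - below - above
    return below * lo + above * hi + keep * x

for x in (-3, 5, 42):
    assert clamp_branchy(x, 0, 10) == clamp_branchless(x, 0, 10)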

What doesn't help is that these processors were designed on the cheap from the get go. For such a small die size, nothing else could be true.

Cell has a die size of 221mm² compared to the Pentium D's 206mm²- it is larger than Intel's latest and greatest dual core chip. It also has a slightly higher transistor count (234M for Cell vs 230M for the Pentium D). Cell is not 'cheap' no matter how you look at it.

Another saving grace would be a big, fast cache. The ones the Xbox and PS3 are graced with certainly aren't big (1 meg and 512K of L2, respectively), and I would venture to say they aren't that fast, either.

PS3 has a total of 2.28MB of cache (or what amounts to it) on die- 32KB of L1 for the PPE, 128KB each for the SPEs (although the SPEs' SRAM functions as memory in terms of programming interface) and 512KB of L2. In terms of speed, Cell's on-die SRAM operates at 3.2GHz- it's quite fast actually.


On the xbox, we now have three cores going over a northbridge to main memory. I remember reading about how much bandwidth they have, which was impressive but not the most important factor. There are two other important things to consider - the way in which this memory is accessed by the cores and its latency. Having a northbridge at all is something of a hindrance, but obviously we have been using that tech for quite some time without issue.

Much like a Pentium D- the latest and greatest from Intel. The XB360 has significantly more bandwidth available to it than a desktop CPU when it runs into a cache miss- although remember that these are IOE cores, so predicting what needs to be in that cache is much easier than it is on an OoO core.

Even to do that, we are going to be at the mercy of a developer's budgeting department if in fact we are to see any of this.

You need to realize the economies of scale when dealing with the consoles. Think of it this way- the XBox and PC could easily utilize a fully developed, extremely mature development platform, take their normal routines, throw them at MS's compilers and everything ran great. The PS2 OTOH needed to have the chip architecture studied, assembly level coding learned, and all of the development tools built from the ground up by developers. The PS2 still landed by far the most viable commercial games out of all the gaming platforms in its lifetime. Why? Closing in on a 100 million installed base, and all of them are gaming machines (where only a single digit percentage of PCs are primarily used for gaming).

Under the traditional SDLC, adding the sort of complexity that we are talking about (attempting to multithread an application that doesn't lend itself naturally to it) will add exponential costs in terms of development time and cost. Ben, I don't think that you are grasping this.

I fully appreciate the level of complexity we are talking about- I know how much dev time is going to be added and I know all of the additional headaches they are going to have to go through, and the consensus is already in- it's a lot easier than the PS2 was.

What it will really be is the dev house trying to put a game out on a realistic budget and timeframe telling the coders to get it all checked in, because once it is running acceptably (and that standard seems to be astonishingly low in some cases on consoles) it will be shipped out.

A game that crashes at all is considered a disgusting embarrassment to the console community. A new PC game that doesn't crash is called Carmack's last game (and that is pretty much it). PC games ship in a state that would be laughed at as an alpha by any console manufacturer (outside of MS in the odd case of Morrowind- that game was so horrifically badly coded it almost felt like a PC game after only a few patches). Console standards are leagues beyond anything PC games can approach outside of id.

Real world economics has to play a factor in this.

That's right- a game that hits 100K units on the PC is a hit, on a console it's an abject failure. Million plus selling games are commonplace on the consoles- unless they get into the ~10 million range you won't hear much about it because it isn't that big of a deal (as opposed to the PC, where only a single game has ever broken the 10 million mark- The Sims).

Also, we don't want to see 3-4 year development times here. That is nearly the lifetime of the console. 1-2 years is pretty long as it is, but to really, really multithread a project on the scale we are talking, that is what it could potentially take to get a polished product out the door. On the console side, I guess if the game is a Halo or GTA this could be feasible, but for many titles it would not be.

Devs are finding Cell much easier than the EE- it looks like the coding side is going to be simpler for them than it was on the PS2.

Last is this minor issue that bugs me about HDTV and the future of our media. This is off topic here, but the TV you linked me to still doesn't have HDMI.

I actually went into a WalMart yesterday and they had a TV comparable to the one I linked for $537 (just mentioning that).

That means it is useless for HD-DVD for sure, and almost assuredly for Blu-Ray as well. It would be foolish to invest in a TV that doesn't have that feature at this point. If the xbox360 wants to ever have an HD-DVD drive, it will have to worry about DRM too. Horrible stuff, really. To enjoy these consoles on a decent TV you are going to have to shell out some big bucks at this point, bottom line.

DRM won't impact outputting games, and Blu-Ray hasn't signed on to the DRM standard that HD-DVD has yet AFAIK (wouldn't surprise me to see Sony come up with their own for sure). HDMI equipped big screen for $1400. I was pointing out cheap WalMart displays to show that it is easy and cheap to get HDTV now, but you can get yourself an HDMI equipped decent HDTV for pretty cheap also. As far as gaming is concerned- you have no need for an HDMI equipped set- the WalMart caliber displays will do for that, but to get the most out of Blu-Ray or HD-DVD the step up in price isn't very much compared to a non-HDMI equipped TV of comparable specs (a couple hundred bucks).
 

blckgrffn

Diamond Member
May 1, 2003
9,686
4,345
136
www.teamjuchems.com
A game that crashes at all is considered a disgusting embarrassment to the console community. A new PC game that doesn't crash is called Carmack's last game (and that is pretty much it). PC games ship in a state that would be laughed at as an alpha by any console manufacturer (outside of MS in the odd case of Morrowind- that game was so horrifically badly coded it almost felt like a PC game after only a few patches). Console standards are leagues beyond anything PC games can approach outside of id.

I fully appreciate your point. I rarely pick up PC games as soon as they come out for just this reason; there is always a patch. Typically ports are much more stable than straight PC games, I appreciate that too. What I meant is that if we had a game, for the PC, that ran 15-20 FPS all the time on the latest and greatest hardware, it wouldn't be acceptable. Even 15-20 FPS on minimum requirements is unacceptable. I had a really hard time playing through Halo 2 for just this reason, and I was really stunned at how many fans thought it was the greatest thing since sliced bread.

Also, I was referring more to the small die size of the Xenon than the Cell. Sony paid plenty for that baby, so we will just have to see how it performs. I don't like that they use simple DMA and seem to share the bus, but that is definitely their call. Also, when I referred to the cache being slow, I was referring to its latency and how many ways set associative it is. For example, when the Pentium 4 went from 1 meg of L2 cache to 2 meg, the latency went up. This is because instead of deepening each set, they simply doubled the physical number of them. Inside each "set" are a number of fully associative chunks of memory that can all be looked at simultaneously, and the number of them per set is how many "ways" associative it is. Another example, one you will know quite well due to your quite extensive background, is the cache on the P3/Celeron in the Xbox. The reason it would be called a Celeron is that it only has 128K of L2 cache. The reason it is called a P3 is because that is still the more expensive 4-way set associative cache instead of the 2-way cache that a Celeron would normally enjoy. Associative memory is expensive and gets more and more so the bigger it gets. I am guessing that the caches are only 2-way set associative on the Cell and Xenon simply because while they are small for the number of cores (particularly the Xenon), they are fairly large as far as total die size is concerned. Thus, a lot of money could be dumped on the cache... although the cache might be something like 8-way, which would be nice and competitive with current PC processors.
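For anyone following along, the trade-off is easy to see with a little arithmetic. I'm just reusing the 128K/4-way figures from above here, and the 32-byte line size is my own assumption:

# A set-associative cache is carved up as: size = sets * ways * line_size.
# Using the 128K, 4-way Xbox P3 figures above; the 32-byte line is my assumption.
cache_bytes = 128 * 1024
ways = 4
line_bytes = 32

sets = cache_bytes // (ways * line_bytes)
print(f"{sets} sets x {ways} ways")                        # 1024 sets x 4 ways

# Same capacity at 2-way: each address can only live in 2 places instead of 4,
# so you save comparators (the expensive part) but eat more conflict misses.
print(f"{cache_bytes // (2 * line_bytes)} sets x 2 ways")  # 2048 sets x 2 ways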

Calling the Pentium D the latest and greatest from Intel is a bit of a misnomer, as it was more of a band-aid fix so that AMD was not the only player in the DC market. This is indicated by their poor, from an engineering standpoint, method of accessing main memory, and by how much they assumed the market would not go there (DC) for a long time and that single core performance would carry them on. Obviously, that bubble has burst. But it would be a mistake, imho, for the Xenon to do the same thing that the D does and give only one processor at a time access to MM (shared bus). Also, bandwidth really pertains to how much data can travel at any one time in and out of a processor, but the real gains come from lower latency. That much bandwidth does the processor little good if after a wrong branch it needs data or instructions from main memory. Once developers start making use of all the cores this will be especially important, as the cache will be split many ways, making L2 cache misses that much more frequent.

Also, it is important to note that in-order code can take up a lot more space than out-of-order code; loops aren't four lines and 5-6 instructions, they are 4 lines/instructions times however many times the loop must execute, if I remember in-order right. No riding a pointer around, right? Right. This may make the somewhat low L1 cache (I am sure 16K instruction/16K data) that much tighter as instruction size grows. I could write machine code for an OoO core in about, well, 6-7 lines that had a loop that would go around adding one to an integer until that integer equalled the absolute value of the desired data. That same machine code written in-order would quickly grow.... hmm.... the more I think about it, the less I like in-order altogether :) Sounds simple, but man, I can only imagine what that code would look like when you compiled/ran it and what the hardware would be doing.

Thanks for pointing out how you might simply lay out all the code rather than branch in a deeply pipelined environment - I suppose if that is what gives you the best consistent performance, that is what you would do. You're right, I am younger, it just seems like a bass-ackwards way of doing it. I can see how it would work though, and with some BP and some heavy hinting in the code you could probably eke out some decent performance, saving branches for when you had to. In console land this will be doable; I keep thinking of how this would work for PC game devs and I just shake my head in pity- they would go so long and be so over budget that if they didn't make the next The Sims they would all have to find new places to work :)

I realize that Blu-Ray hasn't gone the same way as HD-DVD; I am only hoping they come up with a solution like DVD's that allowed it to go out to a TV but look really crappy if you tried to capture it with a VCR, etc. Let's just say that I am bitter, as none of the big screens that I helped people purchase last year have HDMI and I am going to look foolish for recommending them, even though HDMI hadn't even shown up on the mainstream radar at that point. At this point, I cannot recommend spending more than $300 on a TV without DRM compliant input.

Code can be written in such a way as to minimize branching, but it is going to happen, especially in applications like physics and collision detection. I guess that is why devs are lucky to have another processor or processor array to feed that to, and that it is something that is easily threaded off.

Economics does play in, and there is much money to be made in the console business, that is for sure. For me, it is hard to imagine how consoles manage to live for so long. It is hard for me to envision this, the masses flocking to their consoles and snapping up games, but then again I went from a genesis to a decent computer and have never really looked back. Sure, I own a gamecube (just shipped it back to nintendo yesterday, actually, to get one with digital out, today I have to order the cables) to see how 540P *edit, only 480P, sadly...* looks. I always liked the Cube's graphics, especially when it was considered the underdog of the consoles by many.

If you are really interested in learning the finer nuances of 3D then I would start with Computer Graphics: Principles and Practice (Foley and van Dam), and when you finish with that you will quickly figure out where to pick up the rest.

This isn't going to fly over my head, is it? I didn't fare well in linear, and was only in the middle of the pack in calc :) They are both still painfully fresh though :p
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
You are ignoring the eDRAM.
No, you're ignoring there's only 10 MB of it and it's not enough to fit 720p + 2xAA, much less 1080p + 4xAA.

Tiling- running 1920x1080 w/400x AA you would only be writing out 949.22MB/sec at 120FPS to system RAM the way the XBox is setup
Uh, what? 1920x1080 with 32/24/8 and 0xAA requires about 1.9 GB/sec @ 120 FPS and that's assuming nothing more than a straight data transfer with no rendering whatsoever. So I have to ask where those numbers came from?
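Spelled out, since it's simple enough- raw color plus Z/stencil traffic only, exactly as stated:

# Raw buffer traffic for 1920x1080, 32-bit color + 24/8 Z/stencil, 120 FPS.
# No AA, no overdraw, no texturing- just moving the pixels once per frame.
width, height, fps = 1920, 1080, 120
bytes_per_pixel = 4 + 4      # 32-bit color + 24-bit Z / 8-bit stencil

gb_per_sec = width * height * bytes_per_pixel * fps / (1024 ** 3)
print(f"~{gb_per_sec:.2f} GB/sec")    # ~1.85, i.e. the "about 1.9" figure above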

And 400x AA? LOL! Is that going to magically fit into the 10 MB cache too is it? Along with the tiles required for it? :roll:

What about a robust general purpose driver with specific optimizations added on top of that?
But that didn't happen. If the optimizations fell over (which was quite easy to happen) the next layer was far from robust given it was abysmally slow.

IIRC that was simply the capture functions not being able to do framegrabs
IIRC that was only one part of the problem.

Put them in a game and I will.
That's not possible with the likes of 32 bots/players running around in a typical PC MP based game. Consoles never do this so they can afford to put such characters in the game. This isn't a limitation of PCs, it's the fact that PC games just have a lot more things happening on the screen.

As far as games with comparable amounts of characters- how about DooM3
Doom III is a great example of what PC games are capable of.

That isn't meant to be a dig at all, just there are quite a few of us who cut our teeth writing code long before there was anything like a BPU on a processor
Sure, back in the days of Tetris, Pacman and Space Invaders.

Back in the days with no caches or pipelining and when fetching data from memory was faster than evaluating it on a processor.

Things have changed a bit since then, I'm afraid.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
BFG

No, you're ignoring there's only 10 MB of it and it's not enough to fit 720p + 2xAA, much less 1080p + 4xAA.

They are TILING THE DATA.

Uh, what? 1920x1080 with 32/24/8 and 0xAA requires about 1.9 GB/sec @ 120 FPS and that's assuming nothing more than a straight data transfer with no rendering whatsoever. So I have to ask where those numbers came from?

Why don't you read a little tiny bit into the functionality of the R500 and exactly how it works? First off- Z data isn't going to be stored in system memory- they will utilize everything they can fit in the GPU's caches and use overflow to the eDRAM. Then they perform a Z pass and calculate out which geometry is going to fall into a given tile, and then split the rendering for a frame up into segments to be drawn out per tile. When a tile is rendered- entirely inside of the eDRAM- completely, then it is written out to main memory. ALL of your back buffer memory accesses are going to be performed directly to the eDRAM- not until rasterization is complete will anything be written to system memory. All of this has been widely talked about in every remotely decent article on the XB360- why aren't you getting it?

But that didn't happen. If the optimizations fell over (which was quite easy to happen) the next layer was far from robust given it was abysmally slow.

So it is the mobo's drivers' fault that a 486-66MHz is slow..... right.

That's not possible with the likes of 32 bots/players running around in a typical PC MP based game.

I put no restrictions on the type of game- show me any title, any genre, on the PC.

Consoles never do this so they can afford to put such characters in the game. This isn't a limitation of PCs, it's the fact that PC games just have a lot more things happening on the screen.

I've already pointed out to you that D3 doesn't have much going on at all on screen- neither do most games actually. If you want to talk about something like Rome:Total War then you are talking N64 level graphics for the individual characters. I asked you to point out anything comparable to the PR girls in a game- any game- on the PC. DOA is just a singular example of a launch title created for a last generation console.

Doom III is a great example of what PC games are capable of.

And looking quite dated compared to the upcoming console games such as RE5.

Back in the days with no caches or pipelining and when fetching data from memory was faster than evaluating it on a processor.

The PentiumPro was the first chip with a BPU- they haven't really been around long at all. Quake was mainly run on chips without a BPU as an example.

blckgrffn

What I meant is that if we had a game, for the PC, that ran 15-20 FPS all the time on the latest and greatest hardware, it wouldn't be acceptable.

It happens- I think it was Ultima IX that had trouble hitting 20FPS when it came out (on the highest end rig you could build, and that wasn't average- that was peak :p ). You are correct that the performance standards are lower for a console overall, but not entirely. CivIII runs at framerates that no console title would allow- but due to the type of game it is, that isn't a problem on the PC.

Also, I was referring more to the small die size of the Xenon than the cell.

It appears MS wanted to spend more money on the GPU than the CPU- that dual chip setup the R500 is using has to be extremely costly.

Also, when I referred to the cache being slow, I was referring to its latency and how many way set associative it is.

I understand that, but it depends on which core is doing what function- you can't really draw a direct comparison to PC processors on that front.

Calling the Pentium D the latest and greatest from Intel is a bit of a misnomer, as it was more of a band-aid fix so that AMD was not the only player in the DC market.

But it is the latest and greatest from Intel- even if it was a desperate attempt to keep up with AMD :)

Also, bandwidth really pertains to how much data can travel at any one time in and out of a processor, but the real gains come from lower latency.

My apologies- I forgot to mention that latency tends not to be as much of an issue when dealing with proper IOE code- you don't need much guesswork when you know your instruction scheduling in advance.

No riding a pointer around, right?

Depends on the operation but most of the time you would want to stay away from it. Both Xenon and Cell support AltiVec or a super set of it.

I could write machine code for an OoO core in about, well, 6-7 lines that had a loop that would go around adding one to an integer until that integer equalled the absolute value of the desired data. That same machine code written in-order would quickly grow.... hmm.... the more I think about it, the less I like in-order altogether

It goes beyond how far back you are taking it though- you try to think of another approach to get the same results entirely. Not always possible- but you need to think of it in a manner that will end up making the best usage of an IOE core.

You're right, I am younger, it just seems like a bass-ackwards way of doing it.

Compared to how it is done now you are right- a lot of us were forced to learn it this way from day one (and speaking for myself I really didn't like it, which is why I got away from it- way too freakin tedious).

It is hard for me to envision this, the masses flocking to their consoles and snapping up games, but then again I went from a genesis to a decent computer and have never really looked back.

A big part of it is the games- way too many console exclusive titles the PCs will never see. Another is the ship-it-now, finish-it-later mentality of PC publishers. This is not tolerated on the console side- when you buy a game you know it is going to work exactly as the developers intended, for you the same as for everyone else. It allows for considerable tweaking of gameplay, and obviously economies of scale allow for much larger levels of polish to be put into console titles. For me, I have been gaming on both since the 70s and have watched both platforms evolve considerably over the years. They both have had their strengths and weaknesses, but in reality PCs at this point have two genres they dominate and the rest are really leaning heavily towards the consoles.

Sure, I own a gamecube (just shipped it back to nintendo yesterday, actually, to get one with digital out, today I have to order the cables) to see how 540P *edit, only 480P, sadly...* looks. I always liked the Cube's graphics, especially when it was considered the underdog of the consoles by many.

If you are a Cube owner- have you played Eternal Darkness? A rather unfortunate title- extremely well done, but considered a horrible failure on the Cube with numbers that would have been considered a smash hit on the PC. A shame really, as I thought SiliconKnights did an exceptional job with it- my favorite Cube title this gen. The Cube is the underdog in terms of it being Nintendo- but the rasterizer in the PS2 is so weak that the Cube's GPU is enough to give it a clear lead in the visual department.

This isn't going to fly over my head, is it?

I doubt it, although I always aced every mathematics course I took(took Calc as a freshman in HS) so I may not be the best one to ask on that. Really though, when you can see exactly what it is and why it is you are doing what you are doing I think it would come fairly easily to you(they start off with the very basic fundamentals- not complex by any means).
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
they will utilize everything they can fit in the GPU's caches and use overflow to the eDRAM.
But at 720p x 2AA there'll be overflow to main memory, much less at 1080p x 4AA

When a tile is rendered- entirely inside of the eDRAM-
Given there's not enough room for 720p x 2AA where are these tiles going to go at 1080p x 4AA?

So it is the mobo's drivers' fault that a 486-66MHz is slow..... right.
Huh?

I put no restrictions on the type of game- show me any title, any genre, on the PC.
I asked you to point out anything comparable to the PR girls in a game- any game- on the PC.
Doom III.

And looking quite dated compared to the upcoming console games such as RE5.
Let's wait until the games are actually available, shall we?

The PentiumPro was the first chip with a BPU- they haven't really been around long at all. Quake was mainly run on chips without a BPU as an example.
I'm sorry, but that's just nonsense.

The Pentium processor 75/90/100/120/133/150/166/200 superscalar architecture can execute two instructions per clock cycle. Branch prediction and separate caches also increase performance.
As for Quake, that required a Pentium processor as a minimum and once they moved to GLQuake the assembler rendering code became obsolete.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
But at 720p x 2AA there'll be overflow to main memory, much less at 1080p x 4AA

What part of this are you not getting?

Given there's not enough room for 720p x 2AA where are these tiles going to go at 1080p x 4AA?

So when you tile a 15'x15' room- every tile you use is going to be 15' x15' or larger, right? What you are saying at this point is nonsensical- I have no idea what part it is you aren't grasping so I'll give the whole thing a shot I guess-

Scene data arrives at the GPU
GPU processes a Z-only pass (which at 1080p would only be 7.91MB; at 720p, 3.51MB) to determine what geometry falls within a given tile
GPU divides up the scene into tiles it can fit into eDRAM
A tile is rendered to eDRAM, performing all of the rasterization for the segment of the screen it covers, then that segment- the final render- is written to system RAM
The next tile is rendered the same as the first with the final rendered output written to system RAM
Repeat until frame complete- flip buffer- start next frame
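If it helps, here is the same flow as a runnable little Python sketch- the helper names are stubs I made up, not the actual SDK, purely to show the control flow:

# Stub sketch of the predicated-tiling flow listed above; none of these helper
# names are the real API, they just mark where each step happens.
def z_only_pass(scene):             return {"scene": scene}        # cheap Z prepass
def partition_screen(z, edram_mb):  return ["tile 0", "tile 1"]    # split to fit eDRAM
def geometry_for(scene, tile, z):   return f"geometry hitting {tile}"
def rasterize_to_edram(geom, tile): print(f"rasterize {tile} entirely in eDRAM")
def resolve_to_system_ram(tile):    print(f"write finished {tile} to system RAM")

def render_frame(scene, edram_mb=10):
    z_data = z_only_pass(scene)
    for tile in partition_screen(z_data, edram_mb):
        geom = geometry_for(scene, tile, z_data)
        rasterize_to_edram(geom, tile)     # all color/Z/AA traffic stays on-die
        resolve_to_system_ram(tile)        # only the final pixels leave the chip
    print("flip buffers, start next frame")

render_frame("frame 0")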


It is also the drivers' fault that the Radeon 9000 isn't faster than SLI'd 7800 GTXs, using your logic. Your statement was that removing optimizations that hurt performance considerably indicated less than robust drivers- if the optimizations are working around a performance limitation of the hardware, how are robust drivers going to help out?

Doom III.

D3 looks very good- but it isn't remotely close to the PR girls- that is a laughable assertion at best.

Let's wait until the games are actually available, shall we?

Sure. What do you think is going to change? You will still be in denial then as you always have been.

I'm sorry, but that's just nonsense.

My mistake, the PPro was the first out-of-order chip from Intel- not sure why I mixed those up.

As for Quake, that required a Pentium processor as a minimum and once they moved to GLQuake the assembler rendering code became obsolete.

GLQuake upped the system requirements all around though(understandably).
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
What you are saying at this point is nonsensical
The fact that the scene has to be split into multiple tiles means you're going to get a performance hit not associated with a scene that fits into one tile. Not to mention texturing operations still continue to be done from system RAM which means they'll be done on multiple tiles.

If ATi claims 95% performance between two and three tiles I'd like to see their claims for eight or nine tiles which is what 1920x1080p 4xAA would require.
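For the record, the buffer arithmetic behind that- the exact tile count depends on what you assume per sample (32-bit color and 32-bit Z/stencil here; fatter HDR formats need even more):

# How much sample data 1080p with 4xAA needs versus the 10 MB of eDRAM.
# Assumes 32-bit color + 32-bit Z/stencil per sample; wider formats need more.
width, height, samples = 1920, 1080, 4
bytes_per_sample = 4 + 4
edram_mb = 10

buffer_mb = width * height * samples * bytes_per_sample / (1024 ** 2)
min_tiles = -(-buffer_mb // edram_mb)       # ceiling division
print(f"{buffer_mb:.1f} MB of samples -> at least {min_tiles:.0f} tiles")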

In any case a 7800 GTX SLI setup can run pretty much any game at that setting far faster than 30 FPS, that's for sure.

Your statement was that removing optimizations that hurt performance considerably indicated less than robust drivers- if the optimizations are working around a performance limitation of the hardware, how are robust drivers going to help out?
I really don't see where you're going with this.

D3 looks very good- but it isn't remotely close to the PR girls
Actually in many ways it looks better due to the lighting and normal mapping.

You will still be in denial then as you always have been.
The fact that many PC games don't match console games' character quality is nothing to do with the platform, it's the fact that PC games have much more happening on the screen at once.

GLQuake upped the system requirements all around though(understandably).
But it ran ten times faster with far higher IQ. They moved away from assembly rendering and things got better.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
The fact that the scene has to be split into multiple tiles means you're going to get a performance hit not associated with a scene that fits into one tile. Not to mention texturing operations still continue to be done from system RAM which means they'll be done on multiple tiles.

If ATi claims 95% performance between two and three tiles I'd like to see their claims for eight or nine tiles which is what 1920x1080p 4xAA would require.

Vertex shader overhead is likely going to be your limiting factor there as you have to rehandle all geometry based operations.

In any case a 7800 GTX SLI setup can run pretty much any game at that setting far faster than 30 FPS, that's for sure.

So what console devs do is utilize more and more power towards increasing the quality of on-screen visuals- something a fixed platform gives you the benefit of.

I really don't see where you're going with this.

You stated it was better to have a robust all around driver than one with optimizations- I stated it is better to have a robust all around driver with optimizations- you stated that isn't what we were looking at as without optimizations the NV3x parts fell over- I'm stating that they had much slower shader hardware than the R3x0 parts- it had nothing at all to do with their drivers lacking robustness.

Actually in many ways it looks better due to the lighting and normal mapping.

Have you seen Dawn and Nalu running? They are easily vastly superior to the characters in D3 in pretty much every aspect(detail, skinning, shaders, complexity- Nalu's hair in particular).

The fact that many PC games don't match console games' character quality is nothing to do with the platform, it's the fact that PC games have much more happening on the screen at once.

Where are you getting this from? I'm really totally baffled by why you think PC games have more going on on screen than console titles- which games are you comparing them to?

But it ran ten times faster with far higher IQ. They moved away from assembly rendering and things got better.

Which is why I said it was understandable. Also- they moved away from assembly based software rasterization to hardware rasterization. The PS3/XB360 are certainly not lacking in terms of hardware rasterization(the PS2 OTOH most certainly is though).
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
So what console devs do is utilize more and more power towards increasing the quality of on-screen visuals.
Theoretical discussion aside, it appears a 7800 SLI setup can already best these consoles.

You stated it was better to have a robust all around driver than one with optimizations
No, I stated that if you're going to put optimizations into a driver they should be robust and benefit a wide range of situations.

Have you seen Dawn and Nalu running? They are easily vastly superior to the characters in D3 in pretty much every aspect(detail, skinning, shaders, complexity- Nalu's hair in particular).
Yes and I'd have to disagree. The quality of Doom III's lighting and normal mapping is not reproduced in any of the tech demos. The cinematics of Betruger and Swann for example are simply amazing, as is the detail seen in the Hell areas.

I'm really totally baffled by why you think PC games have more going on on screen than console titles- which games are you comparing them to?
What console game has the likes of UT2004 or BF1942 with dozens of players and vehicles on the screen at once? Most console games are similar to Doom III with respect to number of characters on the screen, except on the PC Doom III is not the norm in this regard.
 

Drayvn

Golden Member
Jun 23, 2004
1,008
0
0
Well you can be pretty sure that you're not gonna get much on screen with the Xbox Doom 3 :p
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Theoretical discussion aside, it appears a 7800 SLI setup can already best these consoles.

It isn't theory- it is reality. Carmack states that given a fixed hardware platform a 100% performance increase is reasonable over the PC- the RSX is considerably faster than a GTX. Remember all the BS about PS 2.0 being so important years ago? Still is a minority feature on games, SM 3.0 even more so, and even those titles that do use them use them extremely sparingly for the most part (D3 is the only game I've seen that uses what I would consider a heavy shader load, and that is mainly a PS1.1 level shader). Console devs don't have to deal with writing code for POS Dell systems that most gamers run- an anchor PC gaming will always have around its neck.

No, I stated that if you're going to put optimizations into a driver they should be robust and benefit a wide range of situations.

If it isn't possible to place an optimization that is capable of benefiting a wide range of situations on a particular piece of hardware, but it IS possible to put in optimizations that can speed up the game you are playing, you don't want them.....?

Yes and I'd have to disagree. The quality of Doom III's lighting and normal mapping is not reproduced in any of the tech demos.

And the skinning, polygon complexity, and animation is vastly inferior in D3.

What console game has the likes of UT2004 or BF1942 with dozens of players and vehicles on the screen at once?

UnrealChampionship2 and Battlefield2 perhaps?

Most console games are similar to Doom III with respect to number of characters on the screen, except on the PC Doom III is not the norm in this regard.

You very clearly don't have a clue in the least about console gaming. You know what the two biggest franchises are on the consoles for this generation- GTA and Madden both of which have 20-30 characters on screen regularly (between the two franchises they have sold in the 50Million unit range this gen). Take a title like Kessen where you are dealing with closer to 100 (I actually believe it peaks much higher, but I'm not sure) on screen characters for the less common element (although still not an extremely rare case), or you can get into something like Shenmue where you have loads of characters and busy city streets everywhere you look, or something like Pikmin where you are looking at, if you are playing the game properly, in excess of 100 characters on screen almost constantly, RogueSquadron which has in excess of fifty ships on screen at once, Jak&Daxter which has enemy counts comparable to SeriousSam (well short of the real big hitters for the PC or consoles)- pretty much every game that isn't a corridor shooter, a fighting game or something that doesn't lend itself to a lot of characters on screen at once has them for consoles. It is a given that you will be looking at between a dozen and two dozen characters on screen playing through most console titles- very far removed from D3.

PCs have the absolute edge in terms of on screen characters- Rome:Total War surpasses anything currently on the consoles- but to imply that D3 is in any way remotely indicative of what console titles have for character count is delusional.
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
Carmack states that given a fixed hardware platform a 100% performance increase is reasonable over the PC
Which PC?

the RSX is considerably faster than a GTX
Perhaps but I doubt it's faster than two.

Remember all the BS about PS 2.0 being so important years ago? Still is a minority feature on games,
That is completely false. Almost every game that has shipped in the last twelve months has an SM 2.0 path and almost every game shipped in the last six months requires at least 1.1 with BF1942 requiring 1.4.

but it IS possible to put in optimizations that can speed up the game you are playing, you don't want them.....?
Sure, but how does the IHV know which games I'm playing? And how do I know the IHV is putting optimizations for the games I'm playing?

You can't trust the benchmark bars for this very reason and that's my point. With "optimizations" like these you don't even know what you're getting.

You know what the two biggest franchises are on the consoles for this generation- GTA and Madden both of which have 20-30 characters on screen regularly
You can't claim that those games' characters have the level of detail matching something like Doom III or Dawn, which brings us back to my original comment of numbers of characters on screen.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Which PC?

Comparable spec hardware.

Perhaps but I doubt it's faster than two.

2x 24-ALU parts at 450MHz vs a 32-ALU part @ 550MHz- given SLI overhead, it certainly wouldn't shock me to see it quite comparable, even faster in certain situations.
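Multiplying out just those numbers- a blunt metric that ignores per-ALU differences and how well SLI actually scales:

# Crude ALU-throughput comparison using only the counts and clocks above;
# it says nothing about per-ALU capability or real SLI scaling.
sli_7800 = 2 * 24 * 450     # two 24-ALU parts at 450 MHz
rsx      = 1 * 32 * 550     # one 32-ALU part at 550 MHz

print(sli_7800, "vs", rsx)                        # 21600 vs 17600
print(f"raw SLI edge: {sli_7800 / rsx:.2f}x")     # ~1.23x before overhead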

Almost every game that has shipped in the last twelve months has an SM 2.0 path and almost every game shipped in the last six months requires at least 1.1 with BF1942 requiring 1.4.

By 'almost every' you in fact mean 'almost every high profile first person shooter that I personally happen to be interested in' of course. The majority of PC games coming out today certainly don't support a PS 2.0 path.

Sure, but how does the IHV know which games I'm playing?

Sales charts tend to give a good indicator.

And how do I know the IHV is putting optimizations for the games I'm playing?

You won't.

You can't trust the benchmark bars for this very reason and that's my point.

And why do you think you can anyway? When we saw benches of the R9700Pro v the NV25 parts it looked like the R9700Pro slaughtered it- albeit with enormous reductions in IQ that you couldn't work around- a trend that has continued from both ATi and nV with both of them offering inaccurate parts that can't render basic elements properly. At the very least we should expect decent performance to go along with it.

With "optimizations" like these you don't even know what you're getting.

The hardware "optimizations" that we are seeing are a much larger factor in reduced IQ than any driver level ones we have been seeing.

You can't claim that those games' characters have the level of detail matching something like Doom III or Dawn, which brings us back to my original comment of numbers of characters on screen.

What original comment? You have made several erroneous claims on this discussion arc- which one are you returning to?
 

bry223

Junior Member
Jul 2, 2005
4
0
0
To anyone that doubts the Xbox360's performance, please watch the real time videos of Gears Of War and Project Gotham Racing 3, which are running at 1080i. Gears Of War on the U3 engine is smooth as silk, while Epic can barely get it to run smoothly on a 7800GTX @ 1280x1024.

What's even funnier is that Gears Of War was running on an alpha kit (dual G5 2.5GHz w/ X800).

As for the article, someone is saying the CPU performance in games is weak because they didn't program it for multi-threading to take advantage of all 3 cores? That's like me complaining I can't get a square box into a circular hole.
 

BFG10K

Lifer
Aug 14, 2000
22,709
3,003
126
Comparable spec hardware.
It's fortunate then that PCs are always getting faster while consoles are expected to have a shelf life of years on end.

The majority of PC games coming out today certainly don't support a PS 2.0 path.
Likewise, not all console games look like DOA:XBV.

Sales charts tend to give a good indicator.
Not to me they don't. I play the games I like, not what sales charts tell me.

You won't.
That doesn't help me then does it?

When we saw benches of the R9700Pro v the NV25 parts it looked like the R9700Pro slaughtered it
Except in that case you could pick just any non-standard shader based game and see similar trends. Or to put it another way, ATi wasn't just winning in the high profile titles.

What original comment? You have made several erroneous claims on this discussion arc- which one are you returning to?
Do you think a PC can manage 32 Dawns on the screen at once? No? Then why do you think a console can?
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Likewise, not all console games look like DOA:XBV.

That's true, but in the case of DOA3 it was a launch title for the XBox; nothing on the PC was close at the time despite the PC having a very decisive edge in processor power and a close to comparable GPU.

Not to me they don't. I play the games I like, not what sales charts tell me.

I realize that, but in the last few years the games that do well on the sales charts have also tended to be good games. Most of the games you have been talking about lately are the big hits on the PC side for sales too.

That doesn't help me then does it?

In the case of D3 the ATi optimizations benefited you- are you saying that didn't help you? We didn't know about it for a while in terms of them publicly admitting it, but it was there.

Except in that case you could pick just any non-standard shader based game and see similar trends. Or to put it another way, ATi wasn't just winning in the high profile titles.

That's true- but it was losing in IQ in the high profile and low level titles alike.

Do you think a PC can manage 32 Dawns on the screen at once? No? Then why do you think a console can?

Fully optimized code path. I think if you took one exact PC, say an A64 X2 4400 paired with 7800GTX SLI on an NF4U mobo with the fastest RAM settings you could manage, and wrote it for exactly that singular piece of hardware, then the PC could handle it. But PC code can't be written that way- it needs to work on thousands of different setups and must use general purpose code in order to do this. Writing to an exact piece of hardware allows you tolerances that are not possible on flexible platforms- the reason why Carmack states that you can expect a ~100% performance improvement for like hardware when one is a fixed platform. Also, on a console you don't have OS overhead to deal with, you have significantly greater system level bandwidth, and it also appears that both of the upcoming consoles should have more effective shader power than a 7800GTX SLI setup.

Hacp-

Really shouldn't this be in highly technical?

This isn't highly technical- if it was in that forum it would have been locked by now. If BFG and I were to really start debating the finer nuances of how these systems work on a lower level then it would merit being placed there, but right now we are mainly discussing higher level functionality and generalities. Really, these are relatively simplistic elements.
 

Drayvn

Golden Member
Jun 23, 2004
1,008
0
0
Ben, in terms of PC games, there have been many great games- at least half of them, or maybe less (probably exaggerating a little there :p ). These great games have not actually sold all that well for some reason; even if you hear tons and tons of praise for them, not a lot of them sell much.

Sacrifice, Giants: Citizen Kabuto, Beyond Good and Evil and many more over the years have gotten tons of praise but never really sold much.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Ben, in terms of PC games, there have been many great games- at least half of them, or maybe less (probably exaggerating a little there ). These great games have not actually sold all that well for some reason; even if you hear tons and tons of praise for them, not a lot of them sell much.

And by statistical analysis that means not very many people are playing them- so odds are that optimizations for them will benefit fewer people.

Sacrifice, Giants: Citizen Kabuto, Beyond Good and Evil and many more over the years have gotten tons of praise but never really sold much.

None of which are titles BFG got into- all of which are titles I have played extensively. In the case of Sacrifice in particular I'm still p!ssed off that ATi refuses to fix the game in their current drivers. As an aside- as far as Sacrifice and Giants are concerned, both of them were in fact heavily optimized in driver profiles by nVidia at least- nVidia even financed a special DX8 build of Giants which added more advanced shader techniques to the game's already extremely impressive (for the time) DX7 visuals. They also convinced Shiny to add Dot3 bump mapping to Sacrifice in the third major patch for the game IIRC (can't recall if that was the particular patch revision or not, but it was an nV pushed initiative that Shiny went along with).