Originally posted by: 5150Joker
Originally posted by: Acanthus
Originally posted by: 5150Joker
Originally posted by: Acanthus
Originally posted by: 5150Joker
Originally posted by: Acanthus
You're beating branching to death. Evidence?
Next time do your own research:
R520's batch size, being only 4x4 pixels large, should be very efficient for large batch sizes, at least in relation to NVIDIA's G70 which is described as having batch sizes of 64x16 (1024) pixels. R520's Pixel Shader architecture also has a specific Branch Execution Unit which means that ALU cycles aren't burned just calculating the branch alone for each pixel.
Source: http://www.beyond3d.com/reviews/ati/r520/index.php?p=04
I'm not the one spouting off in a thread about it. Don't bitch about needing to back up your claims.
Do you know what branching is?
Dynamic branching is clearly spelled out in the article or do you need someone to translate it for you as well? Are you mentally disabled? You don't have to be a programmer to understand the article.
I know exactly what branching is and how it works. The problem here is that the performance hit for doing it the way NVIDIA does it is not large enough to matter. There is no evidence that says this hinders performance at all.
And you start with personal attacks, surprise surprise.
Quick, pull up all of those awesome effects in games that occupy 16 pixels of the screen.
If you know what it is, then why ask for a link? Or were you just trying to troll as usual?
Actually I'm calling out the ATi troll who takes any part of an article that he thinks is beneficial to his argument and beats it to death in every single video thread he comes across.
Since you seem to think I'm trolling, we can go into detail.
You have effect X and effect Y
ATi's buffer for these effects is 16 pixels (4x4)
NV's buffer for these effects is 1024 pixels (64x16)
Now, let's say we go with Beyond3D's worst-case scenario and say that the particular effect we are putting together is relatively small: X can be 10x10, and Y can be 10x10.
But we can make it even worse for NVIDIA and make X overlap 4 Ys, so the calculation must be repeated 4 times.
So we have to calculate X effect + Y effect = output 4 times individually on NV hardware.
On ATi, without going into insane logistics and just giving a best-case scenario, let's assume they perfectly overlap.
ATi would have to make 9 combination calculations on the exact same effect in an ideal situation, since a 10x10 effect spans 3x3 buffers of 4x4 pixels.
So while the buffer is smaller individually, more calculations have to be made. It's a benefit and a drawback at the same time.
Potentially, by ATi's design more effects could be combined in total on the same size total buffer, but by NVIDIA's design fewer total calculations are made.
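If anyone wants to check the counts, here is a rough sketch of the arithmetic. It assumes the buffers (batches) are fixed tiles aligned to the screen grid and that an effect is a plain axis-aligned rectangle. The function name and the sample positions are made up for illustration and do not come from ATi or NVIDIA documentation.

```python
def batches_touched(x, y, w, h, bw, bh):
    """Count the bw x bh screen-aligned batches overlapped by a
    w x h rectangle whose top-left pixel is at (x, y).
    Hypothetical helper: assumes batches tile the screen from (0, 0)."""
    cols = (x + w - 1) // bw - x // bw + 1
    rows = (y + h - 1) // bh - y // bh + 1
    return cols * rows

# A 10x10 effect on 4x4 batches (the R520 figure quoted above)
ati_best = batches_touched(0, 0, 10, 10, 4, 4)     # aligned to the grid
ati_worst = batches_touched(3, 3, 10, 10, 4, 4)    # straddling tile edges

# The same effect on 64x16 batches (the G70 figure quoted above)
nv_best = batches_touched(0, 0, 10, 10, 64, 16)    # fits inside one batch
nv_worst = batches_touched(60, 12, 10, 10, 64, 16) # straddling a corner

print(ati_best, ati_worst, nv_best, nv_worst)
```

Under those assumptions the aligned 4x4 case gives the 9 from the example, and a 10x10 effect sitting on a corner of the 64x16 grid gives the 4. This only counts how many batches are touched; it says nothing about how much each batch costs to process.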