railven
Isn't that still 4?
2 HW * 2 ACE each = 4.
+ 4 ACEs == 8 total.
Another post covered it in more detail.
This has me wondering how benchmarks are made, at least the dedicated benchmark apps. What was their goal? Best case on each arch? Worst case? What level of optimization? How did they choose what to put in? Why does it run like crap and look lame?
Why should it matter if they choose to make it in a way that won't fit most games? I kinda see why some sites ignore 3DMark.
IMO they should use the maximum feature support each card offers to achieve the same visuals. That might be worth something.
It also has me wondering what prompts you to ask this at this time.
Yup. The RX 480 gains a decent amount, but note that it has 32 ROPs and lower bandwidth, which may or may not be a bottleneck depending on the game or synthetic.
Under the hood, the engine only makes use of FL 11_0 features, which means it can run on video cards as far back as GeForce GTX 680 and Radeon HD 7970. At the same time it doesn't use any of the features from the newer feature levels, so while it ensures a consistent test between all cards, it doesn't push the very newest graphics features such as conservative rasterization.
That said, Futuremark has definitely set out to make full use of FL 11_0. Futuremark has published an excellent technical guide for the benchmark, which should go live at the same time as this article, so I won't recap it verbatim. But in brief, everything from asynchronous compute to resource heaps gets used. In the case of async compute, Futuremark is using it to overlap rendering passes, though they do note that "the asynchronous compute workload per frame varies between 10-20%." On the work submission front, they're making full use of multi-threaded command queue submission, noting that every logical core in a system is used to submit work.
Both cards pick up 300-400 points in score. On a relative basis this is a 10.8% gain for the RX 480, and a 5.4% gain for the GTX 1070. Though whenever working with async, I should note that the primary performance benefit as implemented in Time Spy is via concurrency, so everything here is dependent on a game having additional work to submit and a GPU having execution bubbles to fill.
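To make the percentages above easier to reason about, here is a rough sketch of the arithmetic. `relative_gain` is just the usual percentage-change formula. `max_overlap_gain` is a toy model of my own, not anything from Futuremark's guide: it assumes the score scales with frame rate and that the whole async share of the frame gets hidden perfectly behind the graphics work, which real hardware won't manage. Both function names are made up for illustration.

```python
def relative_gain(score_off: float, score_on: float) -> float:
    """Percentage gain of the async-on score over the async-off score."""
    return (score_on - score_off) / score_off * 100.0


def max_overlap_gain(async_fraction: float) -> float:
    """Idealised upper bound on the gain, in percent.

    ASSUMPTION (toy model): a fraction `async_fraction` of the frame's
    work is overlapped completely with the rest, so frame time shrinks
    from 1 to (1 - async_fraction) and the score rises by the inverse.
    """
    if not 0.0 <= async_fraction < 1.0:
        raise ValueError("async_fraction must be in [0, 1)")
    return (1.0 / (1.0 - async_fraction) - 1.0) * 100.0
```

Under that toy model, the quoted 10-20% async workload would cap the gain at roughly 11% to 25%, and only if the GPU actually has idle bubbles to fill. Feed `relative_gain` the real before/after scores from a review to get the measured figure for comparison.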
Futuremark is using it to overlap rendering passes, though they do note that "the asynchronous compute workload per frame varies between 10-20%."
@Azix
That's a terrible technical guide, it does not even go into the technical aspects.
They don't specify further, just that they use Async Compute to increase GPU utilization.
id Software uses Async Compute to both increase shader utilization with post effects, and to actually run Rasterizers & DMAs in parallel with Shaders via Shadow Maps & Megatexture streaming.
If it's just filling out gaps in shader usage, then Fiji should have much bigger gains than Tahiti, Tonga or RX 480, due to the scheduler : shader ratio being so shader heavy.
Yeah, I was disappointed. It adds nothing much.
They probably held off on some things for the same reason they use FL 11_0. But then that brings up the question of the relevance of the benchmark. If it doesn't reflect what we see in games then using it would give inaccurate information.
Maxwell 2 chips should be able to do some form of async compute, iirc. But not Maxwell 1.
http://wccftech.com/nvidia-amd-directx-12-graphic-card-list-features-explained/4/