Far Cry 5 and Wolfenstein to use FP16

Elixer

Lifer
May 7, 2002
10,371
762
126
Yeah, I saw that as well... but I don't think people care what it's using, as long as it's a good game. :)

I would also like to see if they will allow you to toggle that on/off so we can get a performance metric.
 
  • Like
Reactions: Phynaz

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Well it's good to see that AMD isn't resting on its laurels, as Vega will certainly require specific optimizations in order to compete with big Pascal and Volta. I wonder if this will impact visuals though, since it's a lower precision?
 

dogen1

Senior member
Oct 14, 2014
739
40
91
Well it's good to see that AMD isn't resting on its laurels, as Vega will certainly require specific optimizations in order to compete with big Pascal and Volta. I wonder if this will impact visuals though, since it's a lower precision?

They probably won't use half precision floats where it would make a noticeable difference in quality.
 

Spjut

Senior member
Apr 9, 2011
932
162
106
I guess we can expect FP16 for Maxwell too?

I don't think Nvidia or id ever said they brought Shader Intrinsics support to any Nvidia GPU. I'm all for AMD and id sharing the viable optimizations between the consoles and all GCN generations, but I really hope Nvidia gets more specific optimizations for their GPUs in Wolfenstein.

Speaking of DOOM, I really want to see some recent benchmarks. I did some quick comparisons between Vulkan and OpenGL on the original GTX Titan, and Vulkan had a few FPS higher minimums in those scenes (I forgot about vsync, so the max was capped at 60^^). But the old tests showed terrible Vulkan performance on Kepler.
 

EXCellR8

Diamond Member
Sep 1, 2010
4,070
905
136
Was anyone able to get the current-gen Wolfenstein to run on a Polaris card? Doom ran fine, but The New Order wouldn't even launch on my Ryzen+Radeon PC.
 

EXCellR8

Diamond Member
Sep 1, 2010
4,070
905
136
Nah, that's why I bring it up... tried a few different drivers and scoured the Steam forums, but nothing led me to any concrete conclusion. Ended up just playing it with a GTX 980 and it was fine, but I would have liked to use the RX 480. A few users hinted that a more recent update broke the GFX support, but people cry wolf on there like it's their job.

The game doesn't even launch; just a brief black screen and then nothing. CTD perhaps, but within a matter of milliseconds.
 

Krteq

Golden Member
May 22, 2015
1,009
729
136
FP16 is widely used in previous- and current-gen consoles.

Anyway, NV30 needed its own graphics language/compiler to utilize that, if I remember correctly.
 

zlatan

Senior member
Mar 15, 2011
580
291
136
I guess we can expect FP16 for Maxwell too?

Nope. This is not really FP16, this is packing. It relies on a packing strategy. The theoretical performance uplift is also different between the strategies. And GCN3/4 will get a much smaller boost compared to Vega.
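
Roughly, the packing idea looks like this. A minimal sketch using CUDA's half2 intrinsics purely as a stand-in (an assumption for illustration, not what the id Tech shaders actually use), since they also pack two FP16 values into one 32-bit register and let a single instruction operate on both lanes:

```
// Sketch only: CUDA half2 used as an analogy for packed FP16 math.
// Needs a GPU with native FP16 arithmetic (sm_53 or newer).
#include <cstdio>
#include <cuda_fp16.h>

__global__ void packed_fp16_demo()
{
    // Two FP16 values packed into each 32-bit register.
    __half2 a = __floats2half2_rn(1.5f, 2.5f);
    __half2 b = __floats2half2_rn(0.5f, 4.0f);

    // One packed add processes both 16-bit lanes at once; this is where
    // the doubled FP16 throughput comes from on hardware that has it.
    __half2 c = __hadd2(a, b);

    printf("lane0 = %f, lane1 = %f\n",
           __low2float(c), __high2float(c)); // 2.0, 6.5
}

int main()
{
    packed_fp16_demo<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

As far as I know, Vega exposes packed instructions such as v_pk_add_f16 that do this two-lanes-per-register trick at full rate, while GCN3/4 can store FP16 in half a register but still executes the math at FP32 rate, which lines up with the smaller boost mentioned above.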

I don't think Nvidia or ID ever said they brought support for Shader Intrinsics on any Nvidia GPU?
They just port the PS4 shaders to PC, so these optimizations only work well on GCN.
While in theory it is possible to do the same kind of optimization for NV, it would require an awful lot of work, mostly because they don't really know how the hardware works, while GCN is an open book for everyone, now with a shiny new profiler.
If NV wants the same kind of "treatment" from the devs, they need a GPUOpen (Green) initiative.
 

Azix

Golden Member
Apr 18, 2014
1,438
67
91
do they really need intrinsics on nvidia hardware? doesn't the driver just do whatever it wants with the shaders already?
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
nvidia = fast FP16 and slow FP32
ATI = FP24

:D

Forgot that ATi used to use FP24 up until the R520, and in fact it might have been a good choice at the time given performance/IQ/power consumption. Miss those days when ATi used to churn out great architectures and features.

Can't believe this was more than a decade ago though!
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
Forgot that ATi used to use FP24 up until the R520, and in fact it might have been a good choice at the time given performance/IQ/power consumption. Miss those days when ATi used to churn out great architectures and features.

Can't believe this was more than a decade ago though!

That depends on the software. What good are architectures and hardware features without software to show for them? (I guess that's why AMD is banking so much on consoles such as the PS4 Pro to get games optimized for their hardware. AMD should also try to get AAA developers to take advantage of their GPU's resource binding model by using constant buffers for fetching every resource (the lowest-latency method on AMD hardware), since Nvidia seemingly lifted their 64KB constant buffer limit by upgrading to D3D12 resource binding tier 3, even if it might mean more CPU overhead on Nvidia hardware.)
 

Cookie Monster

Diamond Member
May 7, 2005
5,161
32
86
That depends on the software. What good are architectures and hardware features without software to show for them? (I guess that's why AMD is banking so much on consoles such as the PS4 Pro to get games optimized for their hardware. AMD should also try to get AAA developers to take advantage of their GPU's resource binding model by using constant buffers for fetching every resource (the lowest-latency method on AMD hardware), since Nvidia seemingly lifted their 64KB constant buffer limit by upgrading to D3D12 resource binding tier 3, even if it might mean more CPU overhead on Nvidia hardware.)

It does depend on the software, but back in those days R300 still managed to perform well across a whole slew of games regardless of game engine. It's just like the G80, where you let your hardware do the talking. Sure, more performance can be extracted from certain architectures for specific loads, but if the performance is already good to begin with across the board... even better!

By features I mean things like tile-based rasterization, nVIDIA's DCC technique and so forth, where you don't have to rely on software to extract the extra performance (by using instructions specific to the game, or in a way that favors one GPU architecture, for example).

I always believed that AMD's strategy ever since the introduction of GCN has been to extract performance via software (with incremental architectural updates) due to their financial situation. For gamers, this resulted in longevity. They can't let their hardware do most of the talking because they cannot afford to churn out new architectures like nVIDIA. Instead they managed to convince the industry (probably because they sell things cheap or free via open source) to adopt lower-level APIs due to their own shortcomings (e.g. driver overhead, and overcoming bottlenecks like under-utilisation problems), and took on consoles because developers will be coding for their GPUs.
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
It does depend on the software, but back in those days R300 still managed to perform well across a whole slew of games regardless of game engine. It's just like the G80, where you let your hardware do the talking. Sure, more performance can be extracted from certain architectures for specific loads, but if the performance is already good to begin with across the board... even better!

By features I mean things like tile-based rasterization, nVIDIA's DCC technique and so forth, where you don't have to rely on software to extract the extra performance (by using instructions specific to the game, or in a way that favors one GPU architecture, for example).

I always believed that AMD's strategy ever since the introduction of GCN has been to extract performance via software (with incremental architectural updates) due to their financial situation. For gamers, this resulted in longevity. They can't let their hardware do most of the talking because they cannot afford to churn out new architectures like nVIDIA. Instead they managed to convince the industry (probably because they sell things cheap or free via open source) to adopt lower-level APIs due to their own shortcomings (e.g. driver overhead, and overcoming bottlenecks like under-utilisation problems), and took on consoles because developers will be coding for their GPUs.

Back then, hardware design was arguably more divergent than it is now. With D3D12, Microsoft and the developers have come to expect IHVs to converge on a few hardware features, so there are fewer profound ways to experiment with new hardware designs. AMD obviously realized their error and very much wants to capitalize on their unused hardware features as much as possible so that they can be competitive again... (That means standardizing more GCN features as official D3D12 extensions and leveraging game console programming.)
 
  • Like
Reactions: CatMerc
May 11, 2008
22,566
1,472
126
Was anyone able to get the current-gen Wolfenstein to run on a Polaris card? Doom ran fine, but The New Order wouldn't even launch on my Ryzen+Radeon PC.

Try a clean GPU driver install with driver 16.9.2.
I had success with an RX 480 8GB.

Both The New Order and The Old Blood ran OK for me, averaging 44 fps with medium to high settings.
What kind of CPU do you have?
I have a 4c/4t Piledriver and I think that held me back.
In the near future, when I own a Ryzen CPU, I will play the games again; I am sure I will have higher fps.

Link for the RX series, Windows 10 64-bit:

http://support.amd.com/en-us/download/desktop/previous?os=Windows 10 - 64
 

Muhammed

Senior member
Jul 8, 2009
453
199
116
Speaking of DOOM, I really want to see some recent benchmarks. I did some quick comparisons between Vulkan and OpenGL on the original GTX Titan, and Vulkan had a few FPS higher minimums in those scenes (I forgot about vsync so max was topped at 60^^). But the old tests show terrible Vulkan performance on Kepler.
Kepler is a lost cause, as it's severely limited by VRAM size, and Vulkan needs more VRAM than OpenGL to function properly.


As for Pascal and Maxwell vs Fiji, these are the most recent results: the 1070 is slightly ahead of the Fury X, and the 980 Ti is close as well.



[Chart: Doom average FPS under Vulkan]

http://techreport.com/review/31562/nvidia-geforce-gtx-1080-ti-graphics-card-reviewed/5
http://www.techspot.com/review/1352-nvidia-geforce-gtx-1080-ti/page2.html
http://www.tomshardware.com/reviews/nvidia-geforce-gtx-1080-ti,4972-3.html
https://www.purepc.pl/karty_graficzne/test_msi_geforce_gtx_1080_ti_gaming_x_pascal_bardzo_wypasiony?page=0,8

As for the GTX 1060 vs the RX 580, they are very close, especially at 1440p, where they are almost neck and neck.

[Chart: Doom at 1440p]

https://www.purepc.pl/karty_graficz...eforce_gtx_1060_9_gbps_msi_gaming_x?page=0,13
http://pclab.pl/art74695-6.html
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
I remember how Doom was seen as a win for AMD when the Vulkan patch was first introduced, and to be fair, it was, because NVidia's Vulkan drivers at the time were quite inferior to AMD's (they had an annoying stutter issue) and lacked shader intrinsics support and asynchronous compute for Pascal. Now the tables have turned, though. NVidia has been optimizing the "Hell" out of Vulkan over the past several months, and now Doom runs really fast and smooth. No idea what runtime library version AMD's latest drivers are running, but NVidia's Vulkan libraries bundled with the 385.xx drivers show version 1.0.42.1.
 
  • Like
Reactions: Arachnotronic

greatnoob

Senior member
Jan 6, 2014
968
395
136
Store 2 FP16 values by bit-shifting them into an FP32 value and then retrieve them by bit-shifting and masking. What has changed exactly?
 
May 11, 2008
22,566
1,472
126
Store 2 FP16 values by bit-shifting them into an FP32 value and then retrieve them by bit-shifting and masking. What has changed exactly?

It does not work like that. The two shifted-together FP16 values would be treated as one FP32 value in your example, which would give a wrong answer; also, all that adding and shifting needs separate instructions, which costs clock cycles. Even if you could add and shift at the same time (which can be done with a barrel shifter, such as in the ARM CPU's ALU), it would not work: the result would still be treated as a single FP32 value.

Packed FP16 is more like a SIMD instruction, if I am not mistaken (I did not look it up). It is as if there are two independent FP16 units that both execute (I think) the same instruction, but on two different FP16 values.
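
A minimal sketch of that difference, using CUDA's half intrinsics purely as a stand-in (an assumption for illustration; the GCN packed instructions being discussed work on the same principle):

```
// Sketch only: shows why bit-shifting two FP16 values into a 32-bit word
// is not the same as packed FP16 math. Needs sm_53 or newer for __hadd2.
#include <cstdio>
#include <cuda_fp16.h>

__global__ void packing_vs_bitshift()
{
    __half lo = __float2half(1.5f);
    __half hi = __float2half(2.5f);

    // Manually bit-shift the two halves into one 32-bit word.
    unsigned int word = ((unsigned int)__half_as_ushort(hi) << 16)
                      |  (unsigned int)__half_as_ushort(lo);

    // Reading that word back as a single FP32 value is meaningless,
    // and an ordinary 32-bit integer add would carry across the lane boundary.
    printf("word read as one float: %f (garbage)\n", __uint_as_float(word));

    // A packed-FP16 instruction instead keeps the two 16-bit lanes independent.
    __half2 v = __halves2half2(lo, hi);
    __half2 r = __hadd2(v, v); // doubles each lane separately
    printf("lane-wise result: %f %f\n",
           __low2float(r), __high2float(r)); // 3.0, 5.0
}

int main()
{
    packing_vs_bitshift<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

The bit layout is the same either way; the difference is that a packed instruction treats the 32-bit word as two independent 16-bit lanes instead of one FP32 value, which is why it needs hardware support rather than a couple of shifts and masks.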
 
  • Like
Reactions: CatMerc and NTMBK
May 11, 2008
22,566
1,472
126
I remember how Doom was seen as a win for AMD when the Vulkan patch was first introduced, and to be fair, it was, because NVidia's Vulkan drivers at the time were quite inferior to AMD's (they had an annoying stutter issue) and lacked shader intrinsics support and asynchronous compute for Pascal. Now the tables have turned, though. NVidia has been optimizing the "Hell" out of Vulkan over the past several months, and now Doom runs really fast and smooth. No idea what runtime library version AMD's latest drivers are running, but NVidia's Vulkan libraries bundled with the 385.xx drivers show version 1.0.42.1.

Yeah, I sure hope AMD will make enough money to give the R&D department bigger budgets, mainly for the driver software. That would allow more software engineers to work with less stress, and that would lead to more creativity.
The reality is that the theoretical maximum FLOPS numbers are rarely reached, be it Nvidia or AMD. So the trick is to take the data that the game supplies and process it in such a way that the GPU is utilized as much as possible. In mining it is possible to use all the ALUs, but in games it is more difficult, and that is where the driver magic is. Nvidia knows that very well and has that problem well under control. "Mo money", less problems.

I always wonder how a graphics scene is built up in general. Since the software is written abstractly, it is not written (optimized) to match the exact number of compute units a given GPU has for every task. That kind of exact matching could only be done if there were just one type of GPU to program for; otherwise every manufacturer would only have one GPU card to sell. This is where the consoles have an advantage, being fixed hardware, though that takes more effort from the software designers in one way and eases their lives in another.
So the driver has to take all these computations and batch them to approach the maximum theoretical TFLOPS throughput of the GPU. But that is not always possible: if that number cannot be reached within a given frame, the driver would have to stall the computations to extract maximum throughput from the GPU, and that could lead to delays.
Engineers who write GPU drivers do not have it easy. They have to examine every scenario, create a solution, and also modify the driver to detect that scenario and apply the solution.
I have come to understand that the GPU driver tracks the programs being executed, or is notified by Windows that a game has started (I do not know how this is done exactly), and loads the optimized GPU profile. But I am sure the driver in general also does run-time optimizations on the fly.
In the end, though, the GPU driver writers are often just like the virus scanner writers: they are solving the problems after the fact.

And that is where the various initiatives like GameWorks and Mantle come from: they let the GPU drivers stay ahead of the situation and create an environment where the hardware is fully utilized.
Of course, how the initiatives are implemented is a different matter, because the implementation is based on making money and company strategy.
 
May 11, 2008
22,566
1,472
126

https://www.golem.de/news/id-software-wolfenstein-2-unterstuetzt-fp16-und-vulkan-1707-129201.html

Wolfenstein will also support Vulkan and shader intrinsics, but that should already be known. It should run well, like Doom.


I do have to say that he has a very thick accent.

Imagine working with different people who all have a different native language and a very thick accent.
When discussing how to solve a problem, one would be busier trying to understand what the other person is saying than understanding the problem being discussed.
That would be very exhausting.
This is totally the "Tower of Babel" effect at its maximum.