AMD Radeon Pro graphics cards announced at Capsaicin Siggraph event

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
http://wccftech.com/amd-radeon-pro-wx-7100-workstation-card/

Three new Radeon Pro graphics cards announced tonight. Apparently, this is the new branding for professional graphics, taking the place of FirePro.

The Radeon Pro WX 7100, despite being the top SKU, is based on a cut-down version of P10. It looks to be basically the professional version of the RX 470. It's specced out for 150W (1x6-pin) and advertised as delivering >5 TFlops. Given 32 CUs (2048 shaders), that would mean a clock speed of at least 1220 MHz.

Perhaps the most impressive entry is the Radeon Pro WX 5100. It's based on a further cut-down P10, with 28 CUs (1792 shaders). The billed >4 TFlops of compute performance means a clock speed of at least 1116 MHz. The impressive part is the fact that it doesn't have an external power connector, meaning it's under 75W. Now that's more like the power efficiency we were promised with Polaris.

Finally, the Radeon Pro WX 4100 will deliver the full P11 chip (16 CUs, 1024 shaders). Going by the advertised >2 TFlops, this means a clock rate of over 976 MHz (I'm guessing probably an even 1000). Power consumption is not specified, but should be very low, considering the modest clock rate.
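
For anyone who wants to check the clock estimates above, here is a quick sketch. It assumes the usual peak FP32 formula for GCN (shaders x 2 FLOPs per clock for a fused multiply-add x clock speed). The function name min_clock_mhz is made up for illustration, and the 2 TFlops figure for the WX 4100 is inferred from the 976 MHz estimate rather than taken from a spec sheet.

```python
def min_clock_mhz(tflops: float, shaders: int, flops_per_clock: int = 2) -> float:
    """Clock (in MHz) needed to reach `tflops` of peak FP32 throughput.

    Assumes each shader retires one FMA (2 FLOPs) per clock.
    """
    return tflops * 1e12 / (shaders * flops_per_clock) / 1e6


cards = [
    ("WX 7100", 5.0, 2048),
    ("WX 5100", 4.0, 1792),
    ("WX 4100", 2.0, 1024),  # assumption: 2 TFlops inferred from the 976 MHz estimate
]

for name, tflops, shaders in cards:
    print(f"{name}: at least {min_clock_mhz(tflops, shaders):.1f} MHz")
```

Since the advertised numbers are lower bounds (">5 TFlops" and so on), the results are lower bounds on the clock as well.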

Thoughts?
 
Feb 19, 2009
10,457
10
76
Expected. Though that <75W Polaris 10 is interesting. At this point I am more interested in Vega news. :)

The SSG is a smart move, big dataset workloads will benefit tremendously.
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
The SSG is a smart move, big dataset workloads will benefit tremendously.

As a poster in the comments said: Is that really true? Is local NAND flash storage really faster than going to system RAM?

RAM has a latency of what? 100ns? Compared to an SSD with like 100 microseconds? That's 1000 times slower. I'm having a hard time believing that going to system RAM is slower than a local NAND flash SSD. In fact, if you need to go to solid-state storage, going over PCIe doesn't add much to the total time.

Latency from PCIe is actually hard to find. Going by this it's around 200 to 1000 ns and here it's around 500 ns but the original source is gone.

However, going by a thread in the CUDA forums, it can be much higher: 10-35 microseconds. Hard to tell what is actually true.

Still, even with the worst figure of 35us, the onboard SSD would save you about 26% of total latency (35us out of 135us, taking the SSD at 100us) vs going to system RAM and then SSD. That's the best-case scenario, and it still won't mean the end process will be 26% faster. If PCIe latency is 1us, then you save maybe 1%. What am I missing?
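
A rough sketch of that arithmetic, assuming a simple additive model where total access time is just SSD latency plus one PCIe hop, and the on-card SSD skips the hop entirely. The 100us SSD figure and the PCIe range are the round numbers from this post; the function name is hypothetical.

```python
def latency_saved_fraction(ssd_us: float, pcie_us: float) -> float:
    """Fraction of total access latency removed by skipping the PCIe hop.

    Additive model (an assumption): total = SSD latency + PCIe latency.
    """
    return pcie_us / (ssd_us + pcie_us)


SSD_US = 100.0  # rough NAND SSD latency quoted above

for pcie_us in (0.5, 1.0, 10.0, 35.0):
    saved = latency_saved_fraction(SSD_US, pcie_us)
    print(f"PCIe {pcie_us:>4} us -> saves {saved:.1%} of total latency")
```

With the SSD at 100us, the 35us worst case comes out at about a quarter of the total, and anything near 1us is lost in the noise. This only models latency of a single access, not throughput or CPU overhead.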

This sounds like a good use-case for non-volatile RAM.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Latency from PCIe is actually hard to find. Going by this it's around 200 to 1000 ns and here it's around 500 ns but the original source is gone.

However, going by a thread in the CUDA forums, it can be much higher: 10-35 microseconds. Hard to tell what is actually true.

The Intel Optane comparison showed the NVMe competition at 58us. The physical interface itself is somewhat better, but you can't discount software and NAND latency.

According to AnandTech's bench, NAND-based drives are at hundreds of us at high load and closer to the 60us figure at low load: http://www.anandtech.com/show/8104/...ew-the-pcie-ssd-transition-begins-with-nvme/3
 
Feb 19, 2009
10,457
10
76
If you looked at the demo of 8K video content creation AMD listed, they already showed the metric that's vital and why SSG makes a huge difference. Look at their MB/s throughput.

If you're doing computing that exceeds the FirePro's VRAM, it will spill into system RAM, and if it exceeds that, it's going to spill onto HDD/SSD, via SATA. This adds layers of latency and CPU/OS overhead, and it will be bottlenecked by SATA performance.

By having 2 x SSDs on the Graphics Card itself, the combined throughput is over 4500MB/s.

This is the next step up for big data computing, at least until RAM itself becomes dense enough to support TBs easily and cheaply.
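
To put the throughput argument in numbers, here is a minimal sketch comparing how long it takes to stream a working set at each rate. The 1 TB dataset size and the 550 MB/s SATA figure are assumptions picked for illustration (roughly what a single SATA III SSD sustains); only the 4500 MB/s figure comes from the demo.

```python
def transfer_seconds(dataset_gb: float, throughput_mb_s: float) -> float:
    """Seconds to stream `dataset_gb` (decimal GB) at a sustained MB/s rate."""
    return dataset_gb * 1000 / throughput_mb_s


DATASET_GB = 1000     # hypothetical 1 TB working set
SATA_MB_S = 550       # assumed practical ceiling for one SATA III SSD
ONBOARD_MB_S = 4500   # combined figure quoted for the two on-card SSDs

for label, rate in [("SATA SSD", SATA_MB_S), ("on-card SSDs", ONBOARD_MB_S)]:
    minutes = transfer_seconds(DATASET_GB, rate) / 60
    print(f"{label}: {minutes:.1f} min per full pass")
```

This looks at sustained throughput only and ignores the latency and CPU/OS overhead mentioned above, so it likely understates the gap for small random reads.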
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
Well, in such a system you would not have a SATA SSD but a PCIe one, giving you the same bandwidth? Xeon E7 supports up to 6 TB of RAM, so yeah, it's possible but of course expensive.
 
Feb 19, 2009
10,457
10
76
A PCIe SSD has the added latency and CPU overhead of OS/CPU/GPU interactions. If a GPU wants to address data on that PCIe SSD, it has to go through those hoops first.

And I know server-grade Intel CPUs support a lot of RAM. However, that build is not feasible for a cheap workstation.
 

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
Interesting. Fully enabled P10 is consumer gaming card only; heavily cut down P10 is professional only. Fully enabled P11 is professional only; lightly cut down P11 is consumer gaming card only.

For now.

~1100MHz 1792SP with no-pin sounds pretty good for a potential RX 465 (should be comparable to stock 380X - 970MHz 2048SPs). RX 460 with 896SPs seems so weak that it will lose to a GTX 950 (which does have no-pin variants).
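
A quick sanity check on the 380X comparison, counting raw peak FP32 only (shaders x 2 FLOPs per clock x clock). That formula is an assumption about how to compare the two, and it ignores architectural differences between Polaris and the older chip. The function name is made up.

```python
def peak_tflops(shaders: int, clock_mhz: float) -> float:
    """Peak FP32 TFlops, assuming one FMA (2 FLOPs) per shader per clock."""
    return shaders * 2 * clock_mhz * 1e6 / 1e12


configs = [
    ("1792SP @ 1100MHz (no-pin P10)", 1792, 1100),
    ("2048SP @ 970MHz (stock 380X)", 2048, 970),
]

for label, shaders, mhz in configs:
    print(f"{label}: {peak_tflops(shaders, mhz):.2f} TFlops")
```

Both land just under 4 TFlops, so on paper the two are within about 1% of each other.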

I think it's a mistake not to get a no-pin 1792SP card in the market before Nvidia unleashes a beast here (it could be as simple as a GTX 1060 Green edition that maxes out at ~1500MHz).