Why doesn't Nvidia have a Radeon Pro SSG challenger?

piesquared

Golden Member
Oct 16, 2006
1,651
473
136
The Pro SSG looks to be quite valuable to those working with large data sets, especially combined with Vega's HBC and HBCC. Why doesn't Nvidia design something similar? Or is AMD's early investment in HBM and associated tech paying off here, while Nvidia's products just aren't able to address this part of the market?
 

piesquared

Golden Member
Oct 16, 2006
1,651
473
136
Real-time 4K workflows are commonplace and content creators are already working with 8K. This market certainly exists.
Photorealistic and interactive real time rendering of very large data sets.

https://www.youtube.com/watch?v=7QJVRdMYvXY

https://www.youtube.com/watch?v=KvdGTCFEqhg

https://www.youtube.com/watch?v=fjlsSyStSIg

Some examples there of how a Pro SSG is able to help developers create content faster and more easily. Sure, Nvidia can address this market, but not with anything near as capable as this Pro SSG.

So does Nvidia have any plans to provide access to on-package high-speed storage and caching? I haven't been able to find anything on their roadmaps, so I'm guessing not, but maybe I missed an announcement or slide show somewhere. The reason I'm wondering is that if enabling this functionality requires IP built into the chip itself, it could be a while before Nv is able to respond in kind, which means it's a market AMD has all to themselves for the foreseeable future.
 

Wall Street

Senior member
Mar 28, 2012
691
44
91
I think that you are looking at the SSG as the only possible solution. While the on-card storage is a novel idea, I believe that nVidia can compete with this in systems where GPUs are paired with the PCIe x8 and PCIe x16 SSDs which have recently started to appear. Also, NVLink is nVidia's technology for getting more information to the GPU, so I doubt they would scuttle this effort by using on-card storage. There are multiple ways to skin a cat, and if/when 10 GB/s workflows become more common, I am sure that all of the major hardware vendors will figure out how to play a part.

Of course the marketing for the card is going to come up with examples of when you need to stream huge bit rates. The marketing materials say that it is 8x better in 8K than an 850 Pro, but the 960 Pro is over 5x faster than an 850 Pro, so the gap isn't too huge.
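To put a number on "the gap isn't too huge", the two ratios quoted above can simply be divided. This is a minimal sketch using only the figures from this post (8x and 5x, both relative to an 850 Pro); they are quoted claims, not measurements, and the function name is made up for illustration.

```python
# Both figures come from the post above and are relative to the same
# baseline drive (850 Pro). They are quoted claims, not benchmarks.
ssg_vs_850 = 8.0    # marketing claim for the SSG
nvme_vs_850 = 5.0   # poster's estimate for the 960 Pro


def remaining_gap(claimed_speedup: float, alternative_speedup: float) -> float:
    """Return how much faster the first device is than the second,
    given that both speedups share the same baseline."""
    return claimed_speedup / alternative_speedup


gap = remaining_gap(ssg_vs_850, nvme_vs_850)
print(f"SSG vs. 960 Pro: about {gap:.1f}x")
```

Under those assumptions the SSG's lead over a fast NVMe drive is well under 2x rather than 8x.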
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
1,120
260
136
NVLink 2 I imagine. 150 GB/sec each way isn't too bad. Course you have to use Power though.

Which is a fairly big showstopper when we consider that most content creation applications are built with x86 microcode infrastructure with some specialization in GPU centric APIs ...

Content creation tool developers, unlike AAA game developers, hardly have the resources to refit their code base in a reasonable timeframe, so an extra miniature API for the SSG seems like a safer bet than adding support for a new ISA. But like Phynaz says, there's hardly a market for faster access to large localized data sets. Even if that weren't the case, AMD still has to compete with Intel Xeon CPUs and the Core-X series, where you can get nearly comparable memory densities (some server motherboards support up to 1.5TB of DRAM!) while delivering upwards of 80GB/s (IO/bandwidth), compared to the measly 5GB/s you can get with the SSG ...
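As a rough illustration of what that bandwidth difference means, here is a minimal sketch that estimates one full pass over a 1.5TB data set at each rate. It assumes decimal units, sustained throughput at the quoted figures, and no other bottleneck; the function name is hypothetical.

```python
def transfer_seconds(dataset_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time for one sequential pass over the data set, assuming the
    quoted bandwidth is sustained for the whole transfer."""
    return dataset_gb / bandwidth_gb_per_s


dataset_gb = 1500.0  # 1.5TB, the DRAM capacity mentioned above

dram_seconds = transfer_seconds(dataset_gb, 80.0)  # quoted DRAM bandwidth
ssg_seconds = transfer_seconds(dataset_gb, 5.0)    # quoted SSG bandwidth

print(f"DRAM at 80GB/s: {dram_seconds:.2f} s")
print(f"SSG at 5GB/s:   {ssg_seconds:.2f} s")
print(f"Ratio:          {ssg_seconds / dram_seconds:.0f}x")
```

With these inputs the pass takes under 20 seconds from DRAM and about five minutes from the SSG's flash, a 16x difference.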

Even better, if you're performance limited, Skylake-X can deliver more computational power with AVX-512!

If your workload is fixed function or bandwidth bound (octa-channel CPUs might release in the near future to alleviate this) then you might be better served with a Radeon Pro SSG after all ...

You would need a fairly pathological workload, one hitting all of those bottlenecks at once, to justify a solution as specialized as the one AMD is currently providing ...
 

zlatan

Senior member
Mar 15, 2011
580
291
136
NV didn't have endpoint-to-endpoint communication support in their PCIe IP, so they need a lot of research before they can bring an SSG-like solution to the market. They probably have some plans now, but they need time to create the hardware.

As for HBCC ... Volta will support a similar kind of solution with a Power9 host CPU. For the PC they need an x86 license before they can build a Vega-like solution.
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
Are you saying that there aren't workloads where a PCI-E interface-based GPU needs to access large locally stored data sets?
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Not really. Large video datasets are kept on shared storage (a SAN such as Avid NEXIS or Grass Valley K2) so that workgroups can access them, and for reasons like reliability. I think you are underestimating the amount of storage required for professional video production. As an example, a professional 8K video camera can record at up to 300MB/s.

While home users use PC HDDs, the pros don't.
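For scale, the camera figure above can be turned into storage per hour of footage. This is a minimal sketch that assumes decimal units (1TB = 1,000,000MB) and continuous recording at the quoted peak rate; the function name is made up for illustration.

```python
def storage_tb(rate_mb_per_s: float, hours: float) -> float:
    """Storage consumed by continuous recording, in decimal terabytes."""
    seconds = hours * 3600
    return rate_mb_per_s * seconds / 1_000_000  # MB -> TB


one_hour = storage_tb(300, 1)   # one camera, one hour
one_day = storage_tb(300, 8)    # one camera, an eight-hour shoot

print(f"1 hour at 300MB/s:  {one_hour:.2f} TB")
print(f"8 hours at 300MB/s: {one_day:.2f} TB")
```

That is roughly 1TB per camera per hour before any backups or intermediate files.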
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
I think the point of the Radeon Pro SSG is having fast access to the immediate data set that you're working with. Even with Thunderbolt-enabled storage you will have to stream in the video data at one fifth of the speed that you get with an on-board SSD.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
In non-linear editing the raw footage isn't what's edited. The raw footage is never touched because that would be destructive. The video data is directly accessed at the frame level and isn't streamed. While the video is being worked on, all editing is done off-line.

The way it works is that the software outputs an edit decision list (EDL), a semi-standardized command list in XML format that makes video editing systems compatible with each other. After the editing is complete, the EDL is used to combine all the individual video files (there can be thousands making up a single project) and perform all the editing. Again, the original files aren't touched during this process; the edited project is basically a copy.
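The non-destructive workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: frames are stood in for by strings, the EDL is an in-memory list rather than a real EDL file, and the names `EditDecision` and `conform` are hypothetical, not taken from any editing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditDecision:
    """One entry of a (hypothetical, simplified) edit decision list."""
    source: str        # which source clip to read from
    first_frame: int   # first frame to use (inclusive)
    last_frame: int    # frame to stop at (exclusive)


def conform(sources: dict[str, tuple[str, ...]],
            edl: list[EditDecision]) -> list[str]:
    """Assemble a new frame sequence by reading the sources as the EDL
    directs. The sources are only read, never modified."""
    output: list[str] = []
    for cut in edl:
        frames = sources[cut.source]
        output.extend(frames[cut.first_frame:cut.last_frame])
    return output


# Stand-in source clips; tuples are immutable, mirroring untouched raw footage.
sources = {
    "clip_a": ("a0", "a1", "a2", "a3"),
    "clip_b": ("b0", "b1", "b2"),
}

# The edit itself is just a list of decisions, not changed video data.
edl = [EditDecision("clip_b", 0, 2), EditDecision("clip_a", 1, 3)]

project = conform(sources, edl)
print(project)
```

The editing session only produces the small `edl` list; the heavy frame reads happen once, when the final copy is assembled.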

Basically Pie's idea of what real-time editing is isn't how it works. He's thinking that studio level video production is like editing a single home video.

Am I saying there's not a use case for what AMD is trying to do? No, of course not. We'll have to see if the video industry sees any benefit from what AMD is proposing, but I don't see a current use case for it.

Edit:
For reference, here is the Grass Valley catalog; you can get an idea of how video production works from reviewing it. http://wwwapps.grassvalley.com/docs/Catalogs/GVB-1-0263J-EN-GV_QuickView.pdf
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,730
136
Okay, so is there a possibility of the on-board SSDs being used as a buffer? The way you describe it, the process seems to be: query a frame, apply transformations, then collect those frames together into an output container. That seems I/O intensive.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Possibly. One issue is that most of the current systems that support GPUs (there aren't many) use CUDA. Systems that do use GPU acceleration only use it for very specific tasks and not for overall process throughput improvement. For example, since according to Pie AMD is talking about the rendering pipeline (I haven't watched the marketing videos), RenderMan uses the GPU only for denoising. Final Cut will use OpenCL-enabled GPUs to speed up final output, but I'm not familiar with which functions the GPU is used to accelerate.

We'll see if AMD can gain traction with this, but they will have to make a very compelling argument to get the industry to spend the dollars to implement it.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Video is just the first use case, hardly the last. Anything that needs lots of data fast and local to a GPU (e.g. lots of HPC tasks) can potentially benefit.
 