Speculation: SYCL will replace CUDA

Vattila · Jun 19, 2021

I've had my eye on SYCL for some time now. I don't think it is widely recognised that this is going to be the standard for heterogeneous programming models going forward (eventually to be merged into the ISO C++ standard, it seems). If you read any discussion on programming, tech and investor forums about Nvidia vs AMD in the AI space, you rarely see SYCL mentioned at all. The discussion is usually about CUDA vs ROCm/HIP — about how poor and difficult to install and use the latter is, and how good, easy and dominant the former is.

However, in the supercomputer space they are looking for a vendor-neutral programming model for AI and HPC. SYCL is just that:

"SYCL (pronounced “sickle”) is an open standard programming model that enables heterogeneous programming based on standard ISO C++. Heterogeneous programming is the basis for today’s growing HPC, AI, and machine learning applications. SYCL has been gaining momentum as supercomputing communities look for a nonproprietary programming model. Maintained under the Khronos Group and initially released in 2014, SYCL is a royalty-free, cross-platform open standard abstraction layer that enables code for heterogeneous processors to be written with the host and kernel code for an application contained in the same source file. SYCL has been closely aligned to OpenCL, but over time has evolved into its own completely distinct programming model. The latest revision SYCL 2020 can decouple completely from OpenCL and therefore eases deployment support on multiple backends."

hpcwire.com

So, in my view, CUDA is not long for this world — in the supercomputer space, at least. I think the transition to SYCL is going to trickle down, probably pretty rapidly. I fear a lot of Nvidia's valuation (currently over 4 times AMD and 2 times Intel in market capitalisation) is based on the idea that Nvidia is going to be "the Wintel of AI". I don't think so. I think AI is soon going to be wide open. On the other hand, I hope that Nvidia's valuation is just a prelude to the amount of investment dollars that are going to be poured into the high performance compute space, as AI takes off and the demand for ever more processing performance keeps growing.

PS. To their credit, Intel chose the right path with oneAPI. Their open Data Parallel C++ compiler (DPC++) is becoming the de facto open compiler for SYCL, it seems. There is also hipSYCL from Heidelberg University, providing a layered solution on top of CUDA and HIP/ROCm toolchains. Heidelberg is also working with Intel on compatibility with the oneAPI/DPC++ toolchain (link). It is also cool that Xilinx and AMD have been involved with SYCL from the beginning through the experimental triSYCL project. With the merger, I expect great things on the software side.

SYCL.tech - Find out the latest SYCL news, videos, learning materials and projects.

Please do not post the same thread in multiple sub-forums here.
I will leave this thread open for now.

Iron Woode
Super Moderator

NTMBK · Jun 19, 2021

Nah. SYCL has been limping along for years. In my former job we considered targeting it, as cross platform support would be fantastic... But until it's supported by Nvidia, it's going nowhere fast. Yet another dead end like OpenCL 2.x.

Vattila · Jun 19, 2021

NTMBK said:
Nah. SYCL has been limping along for years. In my former job we considered targeting it, as cross platform support would be fantastic... But until it's supported by Nvidia, it's going nowhere fast.

Interesting — thanks for sharing your experience. Was it before or after hipSYCL and oneAPI entered the scene? Did the preference for CUDA come down to vendor-specific features and support libraries, perhaps? Performance?

As I understand it (as a C++ programmer without any GPU programming experience), hipSYCL was a big step forward for SYCL, since it layers on top of the native CUDA and ROCm toolchains, using AMD's HIP as the backend abstraction layer (HIP is just a CUDA-like programming model that maps CUDA features pretty much one-to-one), while allowing vendor-specific features to be used if needed or desired. Intel has partnered with Heidelberg to improve compatibility between oneAPI/DPC++ and hipSYCL (see author's comments at ArsTechnica). There have been performance studies that show there is not much performance penalty, if any, for the SYCL abstraction layer. Contrast this with the old layering on top of poorly supported OpenCL drivers with subpar performance.

Intel's oneAPI was another big leap forward for SYCL, with Intel putting their weight behind the standard, contributing improvements for SYCL 2020, as well as the breadth of support libraries in their oneAPI framework. Apparently, Intel's DPC++ compiler, a Clang/LLVM derivative, is being upstreamed into mainline Clang/LLVM, signalling wide SYCL support and simplifying availability in the general C++ community.

Meanwhile, there is a big push from Codeplay, led by their R&D VP Michael Wong, a long-term C++ standard committee member, to align SYCL with the ISO C++ standard, making it likely that the standards will merge at some point. The C++ community is eager to have standard support for programming heterogeneous systems.

These developments seem significant to me. But let me know what you think.

PS. In supercomputers, SYCL is obviously already established, with the USA national labs investing in it, in part to provide a good open programming model for the upcoming exascale supercomputers which are built on AMD/Intel CPU+GPU hardware, but even Nvidia-based supercomputer users are being transitioned to the SYCL programming model. For example, see the recent announcement from Codeplay:

"Codeplay is playing a huge part in helping developers migrate away from proprietary CUDA to SYCL through our contribution of DPC++ for CUDA. Steffen Larsen along with our partners from Lawrence Berkeley National Laboratory have contributed a poster session entitled "Bringing SYCL to Ampere Architecture." This describes the work Codeplay is doing to bring SYCL programming to the Nvidia A100 based Perlmutter supercomputer which will be used by the US National Laboratories for running crucial research applications."

codeplay.com

NTMBK · Jun 20, 2021

Vattila said:
Interesting — thanks for sharing your experience. Was it before or after hipSYCL and oneAPI entered the scene? Did the preference for CUDA come down to vendor-specific features and support libraries, perhaps? Performance?

As I understand it (as a C++ programmer without any GPU programming experience), hipSYCL was a big step forward for SYCL, since it layers on top of the native CUDA and ROCm toolchains, using AMD's HIP as the backend abstraction layer (HIP is just a CUDA-like programming model that maps CUDA features pretty much one-to-one), while allowing vendor-specific features to be used if needed or desired. Intel has partnered with Heidelberg to make the oneAPI toolchain another backend option. There have been performance studies that show there is not much performance penalty, if any, for the SYCL abstraction layer. Contrast this with the old layering on top of poorly supported OpenCL drivers with subpar performance.

Intel's oneAPI was another big leap forward for SYCL, with Intel putting their weight behind the standard, contributing improvements for SYCL 2020, as well as the breadth of support libraries in their oneAPI framework. Apparently, Intel's DPC++ compiler, a Clang/LLVM derivative, is being upstreamed into mainline Clang/LLVM, signalling wide SYCL support and simplifying availability in the general C++ community.

Meanwhile, there is a big push from Codeplay, led by Michael Wong, a long-term C++ standard committee member, to align SYCL with the ISO C++ standard, making it likely that the standards will merge at some point. The C++ community is eager to have standard support for programming heterogeneous systems.

These developments seem significant to me. But let me know what you think.

PS. In supercomputers, SYCL is obviously already established, with the USA national labs investing in it, in part to provide a good open programming model for the upcoming exascale supercomputers which are built on AMD/Intel CPU+GPU hardware, but even Nvidia-based supercomputer users are being transitioned to the SYCL programming model. For example, see the recent announcement from Codeplay:

"Codeplay is playing a huge part in helping developers migrate away from proprietary CUDA to SYCL through our contribution of DPC++ for CUDA. Steffen Larsen along with our partners from Lawrence Berkeley National Laboratory have contributed a poster session entitled "Bringing SYCL to Ampere Architecture." This describes the work Codeplay is doing to bring SYCL programming to the Nvidia A100 based Perlmutter supercomputer which will be used by the US National Laboratories for running crucial research applications."

codeplay.com

My experience was long before hipSYCL was a thing. This was with early versions of Codeplay's stack.

One of the key issues was that we wanted formal support from Nvidia- not just hacks and workarounds to get it working on their hardware, without their support. Without that support, you don't know if the next generation of hardware or the next driver iteration will break support in critical ways. Obviously deployed devices would run on fixed hardware and software platforms- you don't just dump a new graphics driver on a certified medical device!- but it could be a serious impediment to development. There was a real risk that months or years of software development could be rendered useless because Nvidia locked down their software stack.

We needed to make a decision that would put us in a good position for the next 5-10 years, so we went with CUDA. We had the confidence that it would still be well supported for at least that long.

ThatBuzzkiller · Jun 20, 2021

Response continued ...

Vattila said:
I am not sure what Codeplay plans to do for DPC++, but SYCL 2020 is fully independent of the backend. In fact, hipSYCL uses either the CUDA or ROCm toolchain (by compiling down to HIP, which is in turn compiled by the backend toolchain).

Head over to the Programming discussion thread.

Sure but only if you accept that there will be no portability/interoperability between the different backends which means you can't really reuse the same SYCL code on different hardware vendors!

So basically, it's not very different to OpenCL's model and features much of the similar drawbacks ...

Ultimately, SPIR-V was supposed to be the solution to breaking the deadlock on the portability issues, but only Intel drivers have accepted this ingestion format. Without a SPIR-V compiler on AMD/NV GPUs, we remain oblivious to the fact that they don't have to guarantee identical behaviour between each other and can diverge ...

Vattila · Jun 20, 2021

ThatBuzzkiller said:
Sure but only if you accept that there will be no portability/interoperability between the different backends which means you can't really reuse the same SYCL code on different hardware vendors!

Good point — this is where heterogeneity comes back to bite you, I guess. It will always be a problem when there is a lot of innovation and change, like there is in the AI/ML space. But that said, I envisage there will be high demand for a lot of standardised functionality, accessed through a common programming model. For example, how to deal with different memory types — coherency, NUMA, etc. — will be standardised with industry-agreed memory models, much like C++11 finally got a formal memory model, mappable onto the models used by the hardware ISA (x86, ARM, etc.).

My guess is that, for those who care about standards and open non-proprietary solutions, we'll have SYCL with vendor-specific code on the side, as needed for optimisation, custom features, etc.

Vattila · Jun 20, 2021

NTMBK said:
One of the key issues was that we wanted formal support from Nvidia

Another good point: you need goodwill from the major players and market influencers for standards to become trusted, effective and widely used. Nvidia's reluctance to support anything other than CUDA may have lost them supercomputer wins, though. And the pressure on Nvidia to participate in an open standard will remain high. Ultimately, if ISO C++ subsumes the SYCL standard, then I think there is no option for Nvidia but to comply. Of course, as long as they have a competitive lead, they could drag their feet by providing subpar performance and support for the standard solution, while continuing to promote their proprietary solutions. But as soon as they have proper competition (which is already emerging), that strategy will fail fast. Being second best with poor support for the industry standard is a bad place to be.

Vattila · Jun 20, 2021

tamz_msc said:
CUDA got to be where it is today because for ten years and counting NVIDIA invested in supporting developers, making tie-ups with universities, getting top people in unis to create courses for CUDA programming, organizing workshops for students and conferences for researchers and so on.

diediealldie said:
Making people to believe that 'something will work and last forever' takes tremendous efforts. Capital and developers support. CUDA have been there for 14 years with only single company GPU(NVIDIA).

You are both right in pointing out the immense momentum that CUDA has built up, through Nvidia's many years of strategic investment and pioneering work on GPGPU. So CUDA is unlikely to disappear. Written code has a tendency to last a long time. Fortran and Cobol code written more than half a century ago is still maintained today. But the preferred solution may change, and CUDA could soon find itself further down on the preference list — as is already happening in the supercomputer space.

Regarding momentum, I think a good comparison is C# and Java — proprietary languages with huge commercial backing — which had enormous success from the late 1990s to late 2000s, particularly due to ease of use, a large number of convenient libraries provided out of the box, large user communities, university courses, job training, etc. In this period, C++ usage was declining, and many considered the language to be poor, outdated and on its deathbed. C++ programmers like myself were viewed as behind the times. Since then non-proprietary ISO C++ has made a fantastic comeback, not just because it is non-proprietary, but because it is addressing an industry need (efficient code with low-level control, combined with powerful abstractions for large scale systems).

Likewise, there is an industry need for heterogeneous programming models that CUDA alone does not address. SYCL, built on nothing but standard ISO C++, seems to me to have a great chance at becoming the standard that can garner industry wide support.

But, of course, just as there is an immense amount of Java and C# code out there, never to be replaced by C++, there will be CUDA code written and maintained for decades, I'm sure.

diediealldie said:
If I have to, I'd bet on Intel's OneAPI.

Intel's weight behind SYCL increases its likelihood of success immensely, and their decision to make their implementation (DPC++) open-source is commendable and wise. Interestingly, AMD has been quiet about SYCL. Perhaps their analysis is similar to most here — CUDA is irreplaceable — so the best we can do is to chase near compatibility (which is what ROCm/HIP basically is trying to do). It has worked so far, though, judging by their wins in the supercomputer space, with USA national departments and labs cooperating with AMD on the software support for their machines (but apparently looking to SYCL as the long-term programming model).

DrMrLordX · Jun 20, 2021

Java is not proprietary. It's also still very popular.

moinmoin · Jun 20, 2021

More support for open standards is always good. But SYCL isn't replacing CUDA until Nvidia does that or Nvidia's market share becomes negligible. Before that point support for CUDA will always be good to have for better or worse, if only to be able to move away from it and not be locked to it.

Vattila · Jun 20, 2021

DrMrLordX said:
Java is not proprietary.

Thanks for pointing that out. I think it started out proprietary at Sun, though, didn't it?

"Java was originally developed by James Gosling at Sun Microsystems (which has since been acquired by Oracle) and released in 1995 as a core component of Sun Microsystems' Java platform. The original and reference implementation Java compilers, virtual machines, and class libraries were originally released by Sun under proprietary licenses. As of May 2007, in compliance with the specifications of the Java Community Process, Sun had relicensed most of its Java technologies under the GNU General Public License."

Java (programming language) - Wikipedia

Anyway, my example was just a rehash of the story C++ inventor Bjarne Stroustrup often tells about the predicted doom-and-gloom for C++ up against the then fashionable languages backed by large corporations, and the struggle the C++ language faced in the languishing years from the C++98 standard to the C++11 standard. Not a perfect analogy by any means, but I think Nvidia's promotion of CUDA looks similar to other languages backed by large corporate interest. In the end, the languages and standards that meet the industry needs will win out, I guess.

It's also still very popular.

True. But very few describe Java as a "C++ killer" anymore.

"I have commented (negatively) about Java hype and ascribed much of Java's success to marketing. For example, see my HOPL-3 paper. Today (2010), the claims made about Java are more reality based and less gratuitously derogative about alternatives. This was not always so."

Stroustrup: FAQ

To get back on track on the CUDA vs SYCL topic, I would love to see Nvidia open up CUDA. But I think SYCL is a cleaner and more forward-looking programming model, applicable to more devices, and more aligned with the direction of ISO C++. Nvidia should get ahead of the curve on this: Support SYCL with the best implementation and tools in the industry. CUDA is already a lost battle in supercomputers, and I bet the trend will continue with hyperscalers and data centres, as many alternative GPU, FPGA and ASIC solutions will emerge, requiring vendor-independent programming solutions.

Vattila · Jun 20, 2021

GodisanAtheist said:
Do we have any concrete examples of any sort of open source code (especially in the context of AMD) A: having longevity, and B: being commercially viable or even more so successful?

There is a lot of successful open-source code, on which many companies rely — Linux and Red Hat being the obvious example. In the context of AMD, you can argue that their open approach to their GPU software has helped them get game developers aboard, supporting their hardware and technologies (e.g. Mantle, which turned into Vulkan and influenced DX12), and maybe even contributing to the console wins. As another example in hardware, AMD's open approach to dynamic refresh rate with FreeSync enabled support from monitor vendors, and ultimately, support from Nvidia, as well (although they still like to call it something else, I think).

In the programming language space, more closely related to this topic, the non-proprietary ISO C++ language has shown incredible resilience, and there are countless open-source code projects using this language, supporting its broad application around the world in all areas, including research, industry and commerce.

Hopefully, we'll see SYCL becoming the open standard for programming heterogeneous systems, building on the success of C++. I'm optimistic!

Vattila · Jun 20, 2021

darkswordsman17 said:
Based on that diagram I'm not really following how you're reaching that conclusion

The thread title reflects the linked news release and the trend in supercomputers (SYCL being selected as the programming model, replacing CUDA). The diagram explains the current implementation space, technologies and contributors involved.

For an example of CUDA being replaced by SYCL, see:

"Codeplay is playing a huge part in helping developers migrate away from proprietary CUDA to SYCL through our contribution of DPC++ for CUDA. Steffen Larsen along with our partners from Lawrence Berkeley National Laboratory have contributed a poster session entitled "Bringing SYCL to Ampere Architecture." This describes the work Codeplay is doing to bring SYCL programming to the Nvidia A100 based Perlmutter supercomputer which will be used by the US National Laboratories for running crucial research applications."

codeplay.com

And there are many more projects on non-proprietary heterogeneous programming models going on in the supercomputer space.

DrMrLordX · Jun 20, 2021

Vattila said:
Thanks for pointing that out. I think it started out proprietary at Sun, though, didn't it?

Yes.

True. But very few describe it as a "C++ killer" anymore.

It has a different niche. The things Java does now are things you would never do with C++ unless you were criminally insane (or very patient). Java will never fully encroach on C++'s turf because:

1). runtime overhead
2). excessive verbosity

There are some other reasons, but those are the big ones.

There are plenty of people pushing for Rust to become the "C++ killer".

To get back on track on the CUDA vs SYCL topic, I would love to see Nvidia open up CUDA. But I think SYCL is a cleaner and more forward-looking programming model, applicable to more devices, and more aligned with the direction of ISO C++. Nvidia should get ahead of the curve on this: Support SYCL with the best implementation and tools in the industry. CUDA is already a lost battle in supercomputers, and I bet the trend will continue with hyperscalers and data centres, as many alternative GPU, FPGA and ASIC solutions will emerge, requiring vendor-independent programming solutions.

I would like for NV to have opened up CUDA ages ago, hopefully so that they could integrate the better bits of their platform into various standards. NV thinks they can make more money by keeping everything proprietary. And you have to give them credit for creating a unified hardware/protocol/software stack where someone who is trained in using CUDA with NV's toolset can get work done reliably on NV hardware of just about any generation.

moinmoin · Jun 20, 2021

GodisanAtheist said:
Do we have any concrete examples of any sort of open source code (especially in the context of AMD) A: having longevity, and B: being commercially viable or even more so successful?

Somehow missed this. The answer is: The internet.

Without open source the internet as we know it wouldn't exist, and even today pretty much all of the software stack making it work is available as open source whereas closed source solutions do exist but are the exception.

This also affects hardware in data centers used on the internet: since open source is predominant, the barriers to entry for hardware manufacturers are lower as soon as basic support is in e.g. Linux, and thanks to open source anybody can add that support. This helps AMD, but also ARM, RISC-V etc.

semiman · Jun 21, 2021

Vattila said:
Likewise, there is an industry need for heterogeneous programming models that CUDA alone does not address. SYCL, built on nothing but standard ISO C++, seems to me to have a great chance at becoming the standard that can garner industry wide support.

Yeah, actually that's a good point. There are some workloads which aren't really suitable for GPU + CPU systems. That's one of the reasons why Intel was looking into network infrastructure, where FPGAs clearly have the upper hand.
But what worries me is whether it's really possible to create one-unified-f**kin-great programming model which

- Ties CPU, GPU and FPGA together
- Guarantees usable performance with almost any combination of hardware, like the IBM PC did

For me, SYCL is too ambitious. Not everyone can meet the requirements to use SYCL, especially mobile phone manufacturers. There are too many types of GPUs (and NPUs, etc.). This will scare people (programmers) off by making them think "Oh, they said it works, but it's not working".

The first thing SYCL should do is look for tasks which are substantially faster with SYCL than with a CPU+CUDA pair. Gains from SYCL hybrid code should be something like 10 times faster, I guess, because projects with hybrid computing models will take much longer to mature, since they are written by 'early adopters' by nature.
They need to compete against 2023 NVIDIA cards with 2021 devices, assuming that it takes 2 more years to mature. It's like fighting a "well-trained soldier with regular arms" using "recruits with a perhaps-super-weapon which no one has ever tried".

Anyway, if they somehow forge a path, then maybe Intel can regain some of its former glory.

Vattila · Jun 21, 2021

diediealldie said:
But what worries me is whether it's really possible to create one-unified-f**kin-great programming model [...] For me, SYCL is too ambitious.

I see the concern, but I think your worry is misplaced here. Don't confuse SYCL with a high-level programming model abstracting the hardware to oblivion. In terms of programming languages, SYCL is not like MATLAB (a numeric computing language), which is oblivious to the underlying hardware. SYCL is simply a clever way to use plain ISO C++ (a system programming language) to efficiently program underlying heterogeneous hardware.

The purpose of SYCL is to standardise a programming model (i.e. build on ISO C++, without any non-standard or proprietary syntax extensions), as well as standardise methods and technologies, where it makes sense, for how to access and control the underlying hardware, such as memory models, coherency, buffers and data movement, direct memory access, issuing of workloads, device enumeration and characterisation, topology query and configuration, device initialisation and state, device interrupts, processes/threads and other low-level details like that. You should expect SYCL code to query the underlying system for the resources available and issue workloads accordingly, just as an efficient C++ program today is designed to run well on the underlying CPU and operating system (e.g. issue the optimal number of threads for the number of cores available, use SIMD vectorisation if available, use buffer sizes that fit in cache, use RAM drives and memory mapped files to utilise available memory, and so on).

Look at SYCL simply as a way of talking to heterogeneous devices in plain C++. Hopefully, the standard will be subsumed by ISO C++ at some point.

PS. Libraries will of course be built using SYCL, e.g. for doing math on various accelerators. These may abstract the underlying details of the hardware. This is already going on in the standard C++ library (parallel algorithms).

The first thing SYCL should do is look for tasks which are substantially faster with SYCL than with a CPU+CUDA pair.

A killer application would be cool, but a programming model doesn't correlate with performance (unless it is a poor model that restricts performance, of course). It is simply a language and a set of protocols and conventions for how to use that language to program some system (in this case, a heterogeneous system).

The killer feature of SYCL, I think, is its openness, as well as the fact that it is built on top of pure ISO C++, which itself is looking towards adding support for programming heterogeneous systems. Hence, it is a natural evolution of ISO C++.

soresu · Jun 21, 2021

Vattila said:
As I understand it (as a C++ programmer without any GPU programming experience), hipSYCL was a big step forward for SYCL, since it layers on top of the native CUDA and ROCm toolchains, using AMD's HIP as the backend abstraction layer (HIP is just a CUDA-like programming model that maps CUDA features pretty much one-to-one), while allowing vendor-specific features to be used if needed or desired. Intel has partnered with Heidelberg to make the oneAPI toolchain another backend option. There have been performance studies that show there is not much performance penalty, if any, for the SYCL abstraction layer. Contrast this with the old layering on top of poorly supported OpenCL drivers with subpar performance.

I'm all for hipSYCL, but I'm wary enough trying out client software with a v0.9 tag, so I can imagine the hesitation in a software dev when looking at it for the moment at least.

Unfortunately, even now with AMD in a far better position than it has been for years, we are still lacking ROCm support in Windows, and therefore hipSYCL in Windows for AMD GPUs.

As far as I'm aware it's also still lacking official support for RDNA1/2 GPUs too, which is far from optimal even if they have switched to CDNA for enterprise compute customers.

Do we know if there is any possibility of hipSYCL getting a backend on top of Vulkan as well as ROCm?

Or is the compute part of Vulkan still too restrictive for running large, complex programs on the GPU?

It would certainly open up the platform to many more devices if it was possible.

Vattila · Jun 21, 2021

soresu said:
Unfortunately, even now with AMD in a far better position than it has been for years, we are still lacking ROCm support in Windows, and therefore hipSYCL in Windows for AMD GPUs.

True. AMD has a long way to go on software support. They are concentrating their focus in the supercomputer space, it seems, where CUDA has been completely displaced in the upcoming exascale supercomputers in the USA: Aurora (Intel CPU+GPU, using SYCL/oneAPI), Frontier and El Capitan (both AMD CPU+GPU, using SYCL/HIP/ROCm). As mentioned before in this discussion thread, even the Perlmutter supercomputer, based on Nvidia Ampere hardware, is being transitioned to SYCL (see NERSC announcement).

However, AMD has been completely silent on SYCL so far, as far as I know. They should just concede that Intel made the right choice to support SYCL in their oneAPI, and get aboard sooner rather than later. Despite being open, I doubt there is any prospect of the industry embracing HIP as a standard, mainly because HIP is just chasing Nvidia's latest CUDA for automated one-to-one translation.

Do we know if there is any possibility of hipSYCL getting a backend on top of Vulkan [...]?

I doubt you'll see it in hipSYCL, which is based on using AMD's HIP as a backend abstraction layer for ROCm and CUDA. As far as I understand, Heidelberg has partnered with Intel to add support for their oneAPI as another backend, but I guess that is more or less just pass-through to the DPC++ compiler (part of oneAPI), which itself is a SYCL implementation.

That said, there are other projects working on implementing SYCL with Vulkan as the backend. See Sylkan:

"In this paper, we discuss the opportunities and challenges of mapping SYCL to Vulkan, a low-level explicit programming model for GPUs. This includes an analysis of the potential semantic mismatch between each respective standard, as well as approaches to work around some of these issues."

Sylkan: Towards a Vulkan Compute Target Platform for SYCL - SYCL.tech

Here is an interesting presentation comparing the programming models, including Vulkan:

Evaluation of modern GPGPU technologies for image processing (PowerPoint Presentation at SYCLcon 2020)

beginner99 · Jun 21, 2021

CUDA still has the huge advantage for your "average Joe" use-case, where you don't have the budget that supercomputers do to reinvent the software stack, nor do you need that much compute. It also has the advantage that you can just prototype on your company laptop with an NV GPU running Windows. Heck, Windows is the default where I work; even basic web apps run on Windows. It is not trivial to actually get access to a Linux machine with a GPU, and cloud is out of the question for IP reasons. Hence CUDA is the only option, really.

Vattila · Jun 21, 2021

beginner99 said:
CUDA still has the huge advantage for your "average Joe" use-case

Unquestionably, CUDA currently has an immense advantage when it comes to ease of adoption in the mainstream.

Hopefully, AMD, Intel and others can get themselves into gear and collaborate on SYCL toolchain support, availability and ease-of-use on all platforms.

I guess it will happen eventually. The question is just how long it will take.

Here is what I would like to see: AMD, Intel, Nvidia and Xilinx (and others) should collaborate with Microsoft on getting SYCL support in Visual Studio (replacing the old C++ AMP extension). AMD and Nvidia should collaborate with Codeplay on getting good backends for DPC++ for their respective GPU architectures (as mentioned before, Codeplay just announced they have a contract to work on AMD GPU support in DPC++, and they have previously done work on Nvidia GPU support). This should cover Windows and Linux pretty well. Google should invest in support on Android, and Apple should do the right thing for their platform.

Cogman · Jun 21, 2021

Vattila said:
Unquestionably, CUDA currently has an immense advantage when it comes to ease of adoption in the mainstream.

Hopefully, AMD, Intel and others can get themselves into gear and collaborate on SYCL toolchain support, availability and ease-of-use on all platforms.

I guess it will happen eventually. The question is just how long it will take.

Here is what I would like to see: AMD, Intel, Nvidia and Xilinx (and others) should collaborate with Microsoft on getting SYCL support in Visual Studio (replacing the old C++ AMP extension). AMD and Nvidia should collaborate with Codeplay on getting good backends for DPC++ for their respective GPU architectures (as mentioned before, Codeplay just announced they have a contract to work on AMD GPU support in DPC++, and they have previously done work on Nvidia GPU support). This should cover Windows and Linux pretty well. Google should invest in support on Android, and Apple should do the right thing for their platform.

What needs to happen first is that others break into the GPGPU market in server farms. AMD's "Instinct" has been a flop and Intel's "Phi" didn't really go anywhere.

It's a chicken-and-egg problem. In order for people to adopt OpenCL and SPIR-V, there needs to be market competition. In order for that to happen, OpenCL needs better support and there need to be more cards on the market supporting it.

Sadly, all this points to Nvidia and CUDA being the de facto standard for GPGPU programming for anything other than games.

Vattila · Jun 21, 2021

Cogman said:
AMD's "Instinct" has been a flop

Let's wait until MI200 is revealed. Apparently, AMD has capable GPGPUs coming, judging by the many CPU+GPU supercomputer wins.

Cogman said:
In order for people to adopt OpenCL and SPIR-V, there needs to be market competition.

Note that SYCL 2020 is now completely independent of OpenCL and SPIR-V.

"SYCL has been closely aligned to OpenCL, but over time has evolved into its own completely distinct programming model. The latest revision SYCL 2020 can decouple completely from OpenCL and therefore eases deployment support on multiple backends."

Argonne, ORNL Award Codeplay Contract to Strengthen SYCL Support for AMD GPUs (hpcwire.com)

moinmoin · Jun 21, 2021

soresu said:
we are still lacking ROCm support in Windows

We are also still lacking an open source Windows. Microsoft is perfectly aware that Windows itself is a detriment to open source support on the platform, which is exactly why they work on WSL.

soresu · Jun 21, 2021

Vattila said:
Here is what I would like to see: AMD, Intel, Nvidia and Xilinx

Xilinx is soon to be just AMD anyways.

The merger was announced late Oct '20 and approved in April.

HIP is apparently to be extended for FPGA use and integrated with Xilinx's current stack (yay, more demand on AMD's stretched SW dev teams).

Also here is the full PDF for that Sylkan page.

I do wonder if there is some possibility of a 'headless' subset of Vulkan that could be created as an ultimate cross-platform, compute-only backend for non-graphics devices, à la NPUs/IPUs/TPUs, FPGAs, DSPs and other assorted accelerators.

Making it a subset of Vulkan could reduce API maintenance efforts for Khronos.

What does the future hold for heterogeneous programming models?

CUDA is going to lead for the foreseeable future, due to installed base and support.

SYCL will rapidly replace CUDA, due to being an open standard with wide backing.
