Speculation: SYCL will replace CUDA


What does the future hold for heterogeneous programming models?


  • Total voters
    21

Vattila

Senior member
Oct 22, 2004
679
973
136
It looks more like SYCL is an extra abstraction layer on top of lower level GPU APIs like CUDA or OpenCL, than it is a replacement for either.
Although some implementations may rest upon other programming models and supporting frameworks — such as hipSYCL using HIP (AMD's cross-platform CUDA dialect) to provide cross-platform support on top of AMD ROCm and Nvidia CUDA frameworks and toolchains — there is nothing in the SYCL 2020 standard that mandates a particular implementation. You can make a SYCL 2020 compliant implementation using pure machine code, if you should want to.

With regard to abstraction level, the SYCL programming model is similar to CUDA and HIP. All three provide a single-source programming model built on similar fundamental concepts (kernels, scheduling, memory access, etc.), and all are based on C++, compiling down to high-performance native machine code for the targeted system architecture. Porting between them is feasible and, with the help of automation, can be relatively straightforward, as AMD has shown with its numerous recent supercomputer wins.

For example, here is a blog post on porting from CUDA to HIP for the LUMI supercomputer:

"As mentioned, HIP is AMD’s answer to CUDA, however, whereas CUDA code can only run on Nvidia GPUs, programs using HIP can run on both AMD and Nvidia GPUs. The HIP API syntax is very similar to the CUDA API, and the abstraction level is the same meaning that porting between the two is easy and we will cover the practical ways this can be done below. [...] In the end converting CUDA code to HIP is usually quite straightforward, with the catch being that the most bleeding edge CUDA features are not supported but may be supported in the future. The AMD GPU software stack comes with tools that will significantly speed up the conversion process compared to doing it manually."


Intel provides similar tools in their oneAPI framework for automating the translation of CUDA code to SYCL.
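To make the "same abstraction level" point a bit more concrete, here is a minimal host-only C++ sketch of the index arithmetic that has to line up when a one-dimensional CUDA kernel launch is rewritten as a SYCL nd_range launch. It needs no GPU or SYCL toolchain. The struct and function names (CudaLaunch, SyclNdRange, to_nd_range, global_id, blocks_for) are made up for illustration and are not part of CUDA, HIP, SYCL, or any migration tool; only the correspondences named in the comments come from the two programming models.

```cpp
#include <cstddef>

// Hypothetical helper types for illustration only.
struct CudaLaunch {       // mirrors kernel<<<grid, block>>> in one dimension
    std::size_t grid;     // number of thread blocks (gridDim.x)
    std::size_t block;    // threads per block (blockDim.x)
};

struct SyclNdRange {      // mirrors sycl::nd_range<1>(global, local)
    std::size_t global;   // total number of work-items
    std::size_t local;    // work-items per work-group
};

// CUDA specifies how many blocks to launch; SYCL specifies the total
// number of work-items, so the global size is grid * block.
constexpr SyclNdRange to_nd_range(CudaLaunch launch) {
    return SyclNdRange{launch.grid * launch.block, launch.block};
}

// CUDA: blockIdx.x * blockDim.x + threadIdx.x
// SYCL: item.get_global_id(0) on a sycl::nd_item<1>
constexpr std::size_t global_id(std::size_t block_idx,
                                std::size_t block_dim,
                                std::size_t thread_idx) {
    return block_idx * block_dim + thread_idx;
}

// Blocks needed to cover n elements, rounding up (the usual launch idiom).
constexpr std::size_t blocks_for(std::size_t n, std::size_t block) {
    return (n + block - 1) / block;
}
```

For instance, covering 1000 elements with 256 threads per block takes blocks_for(1000, 256) = 4 blocks, which corresponds to a SYCL global size of 1024 with a local size of 256. In both models the kernel still needs its own bounds check for the 24 surplus work-items.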
 

Vattila

A recent Khronos SYCL webinar, with presentations from all the major SYCL implementers (oneAPI/DPC++, ComputeCpp, triSYCL, hipSYCL and neoSYCL):


Below is a short presentation on SYCL in GROMACS, a heavily used library in the supercomputer world (apparently ~4-5% of the world's total supercomputing time is spent running GROMACS!). Kudos to Intel for doing the right thing here!

 

Vattila

Here are a few recent Intel articles on their support for SYCL:
Good to see Intel talk more about SYCL explicitly (not obscured by DPC++/oneAPI).

Here are your current options for deploying SYCL code:

[Attached image: diagram of the current options for deploying SYCL code]

PS. Note that support for AMD GPUs in DPC++/oneAPI will arrive as well, as Codeplay Software has been contracted by US national labs to implement such support (see announcement). Also note that the diagram above does not show the full list of existing SYCL implementations; for example, Huawei has an implementation of SYCL for their recently announced Beiming architecture.
 

moinmoin

Diamond Member
Jun 1, 2017
3,306
4,561
136
Does Nvidia support it yet?
No, but you can use hipSYCL or oneAPI's DPC++ to use SYCL on CUDA. The latter is being supported by Lawrence Berkeley National Laboratory and Argonne National Laboratory:


ANL also did a performance comparison with CUDA and hipSYCL:


 

DrMrLordX

Lifer
Apr 27, 2000
19,169
7,928
136
SYCL performance looks pretty good except on N-body. Interesting that it actually speeds up a few of those kernels from the first performance suite considerably (and slows down on some others). Shows that CUDA has some room for improvement.
 

Vattila

Intel is working on a SYCL-based oneAPI back-end for Blender's Cycles renderer:

"Opened up at the end of March is the work-in-progress Intel oneAPI back-end for Blender's Cycles renderer. This Intel GPU back-end focused for supporting the company's forthcoming Intel Arc graphics cards is targeting the open-source oneAPI Base Toolkit and making use of SYCL. There still is more code work needed, but it's good to see this coming together to complement Blender's NVIDIA CUDA and AMD HIP support. [...] With Blender 3.0 having removed OpenCL acceleration, at least until there is any viable Vulkan back-end developed it's been up to vendor-specific rendering back-ends with the NVIDIA CUDA/OptiX code leading the way followed by AMD HIP. (With Blender 3.2 this summer is where the AMD HIP acceleration on Linux will finally be in place.) Intel engineers recently sent out for review their code adding a Cycles back-end for Intel GPUs via oneAPI and the SYCL API. Via the industry standard SYCL, this back-end could potentially be used with other driver stacks in the future."

Blender Cycles Rendering Support For Intel Arc Via oneAPI + SYCL Under Review - Phoronix
 

Vattila

GROMACS 2022 ported to SYCL:

"As part of the oneAPI optimization work, Lindahl’s team ported GROMACS’ CUDA code, which only runs on Nvidia hardware, to SYCL using the Intel DPC++ Compatibility Tool (part of the Intel oneAPI Base Toolkit), which typically automates 90-95% of the code migration. This allowed the team to create a new, single portable codebase that is cross-architecture-ready, greatly streamlining development and providing flexibility for deployment in multiarchitecture environments."

GROMACS 2022 Advances Open Source Drug Discovery with oneAPI (hpcwire.com)
 

Vattila

Intel makes CUDA-to-SYCL porting easier with SYCLomatic, an open-source migration tool:

"There are many things Intel needs to get right with OneAPI if the chipmaker wants its multi-pronged compute strategy to work the same kind of magic Nvidia has conjured with CUDA, and the first thing it needs is an easy way for developers to port code from CUDA. Intel upped its CUDA migration efforts this month by open sourcing the technologies powering the Intel DPC++ Compatibility Tool, which is used for moving code from CUDA to OneAPI’s Data Parallel C++ language. But rather than herding developers into OneAPI, the new open source tool, called SYCLomatic, focuses on simply helping move that code to SYCL, the royalty-free, cross-architecture programming abstraction layer that underpins Intel’s parallel-friendly C++ implementation."

Intel Takes The SYCL To Nvidia’s CUDA With Migration Tool (nextplatform.com)

 