Official SYCL 1.2 Provisional feedback thread

    March 19, 2014 – San Francisco, Game Developer’s Conference – The Khronos Group today announced the release of SYCL™ 1.2 as a provisional specification to enable community feedback. SYCL is a royalty-free, cross-platform abstraction layer that enables the development of applications and frameworks that build on the underlying concepts, portability and efficiency of OpenCL™, while adding the ease-of-use and flexibility of C++. For example, SYCL can provide single source development where C++ template functions can contain both host and device code to construct complex algorithms that use OpenCL acceleration - and then enable re-use of those templates throughout the source code of an application to operate on different types of data.

    The SYCL 1.2 provisional specification supports OpenCL 1.2 and has been released to enable the growing community of OpenCL developers to provide feedback before the specification is finalized. The specification and links to feedback forums are available at:

    While SYCL is one possible solution for high-level parallel programming that leverages C++ programming techniques, the OpenCL group encourages innovation in diverse programming models for heterogeneous systems, including building on top of the SPIR™ low-level intermediate representation, using the open source CLU libraries for prototyping, or through custom techniques.

    “Developers have been requesting C++ for OpenCL to help them build large applications quickly and efficiently and there are lots of useful C++ libraries that want to port to OpenCL,” said Andrew Richards, CEO at Codeplay and chair of the SYCL working group. “SYCL makes this possible and we are looking forward to the community feedback to help drive the final release and future roadmap. We are especially keen to work with C++ library developers who want to accelerate their libraries using the performance of OpenCL devices.”

    SYCL 1.2 Features
    SYCL 1.2 will enable industry innovation in OpenCL-based programming frameworks:

    • API specifications for creating C++ template libraries and compilers using the C++11 standard;
    • Easy to use, production grade API that can be built on top of OpenCL and SPIR;
    • Compatible with standard CPU C++ compilers across multiple platforms, as well as enabling new SYCL-based device compilers to target OpenCL devices;
    • Asynchronous, low-level access to OpenCL features for high performance and low-latency – while retaining ease of use;
    • Khronos open royalty-free standard - to guarantee ongoing support and reciprocal IP coverage;
    • OpenGL® Integration to enable sharing of images and textures with SYCL as well as OpenCL;
    • Development in parallel with OpenCL – future releases are expected to support upcoming OpenCL 2.0 implementations and track future OpenCL releases.

    SYCL Homepage
    An Overview of SYCL 1.2
    OpenCL DevU at GDC 2014
    Last edited by khronos; 03-19-2014 at 06:49 AM.

    Going through the specs slowly. Very high level feedback is that we need more examples. I already mentioned this to Andrew on twitter.
    Also, compilation workflow is really unclear.
    For example:

    1. Let us say I have my favourite C++11 compiler installed: GCC, VS2013, whatever. Let's say I do NOT have any other compilers installed, nor any OpenCL drivers, and just want to compile SYCL code to (parallel) native code using my native compiler. I guess this will require you to release some header files so that classes such as cl::sycl::buffer are understood by the C++ compiler, allowing it to generate CPU code. This will be useful for development at least, and for porting code to platforms where OpenCL drivers are not available (e.g. WinRT).
    Will this be supported? If so, how do things work? Should we expect a royalty-free solution for this?

    2. Single-source SYCL compilers are easy to understand. These will take all your source files, including both regular C++ and SYCL, and generate a single binary containing both host and device code. What about the multi-compiler solutions mentioned? Are those solutions likely to look like, say, nvcc? I.e. compiling the device code itself, inserting any required glue code for the host, and then passing all the original host code as well as the generated host code to an available C++ compiler such as gcc, VS, etc.?

    Good questions, thank you.

    Any SYCL implementation is required to support execution of any code on the host CPU using just the host compiler, as well as execution of device code on one or more OpenCL devices. A host-only implementation would not be conformant, but you could use a conformant implementation of SYCL to run code only on the host.

    SYCL is a royalty-free standard. Whether a specific implementation has licensing terms requiring payment or royalties is up to individual implementers.

    How SYCL is compiled is not actually defined in the spec. This was a deliberate decision to allow implementers freedom. However, an implementation could operate like this:

    You compile your source file with a SYCL device compiler and it produces a header file containing the compiled kernel and implementation-specific glue code to invoke the kernel on an OpenCL device. E.g. mysyclcompiler mysourcefile.cpp -omysyclheader.h

    Then you could compile the same source file with your host compiler and tell it where the compiled kernel header is. E.g. gcc -c -DSYCLHEADER="mysyclheader.h" mysourcefile.cpp

    The sycl header files and runtime sort out the rest.

    Alternative approaches would still be valid.
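
    The two-pass workflow described above can be sketched in plain C++. This is only an illustration of the include-by-macro pattern, not something defined by the SYCL specification: HAVE_DEVICE_KERNEL and launch_scale_on_device() are hypothetical names that a generated header might provide, and scale() is an invented example function. Without the generated header, the same source falls back to running on the host. Note that for #include SYCLHEADER to work, the macro has to expand to a quoted string, so from a shell the flag would typically need to be written as -DSYCLHEADER='"mysyclheader.h"'.

```cpp
#include <cassert>
#include <vector>

// If the host compiler was told where the device compiler's output is,
// pull it in. SYCLHEADER must expand to a quoted file name.
#ifdef SYCLHEADER
#include SYCLHEADER
#endif

// Scales every element of `data` by `factor`.
// Returns true if the (hypothetical) device kernel ran, false if the
// host fallback ran. HAVE_DEVICE_KERNEL and launch_scale_on_device()
// are assumptions for illustration; a real implementation would use
// its own glue code.
inline bool scale(std::vector<float>& data, float factor) {
#ifdef HAVE_DEVICE_KERNEL
    launch_scale_on_device(data.data(), data.size(), factor);
    return true;
#else
    for (float& x : data) x *= factor;  // host-only path
    return false;
#endif
}

// Small usage example: compiled without the generated header, the
// host path is taken and the result is checked.
inline void demo() {
    std::vector<float> v{1.0f, 2.0f, 3.0f};
    const bool on_device = scale(v, 2.0f);
    assert(v[0] == 2.0f && v[1] == 4.0f && v[2] == 6.0f);
    (void)on_device;
}
```

    Compiled with only a host compiler and no SYCLHEADER definition, this takes the host path, which mirrors the requirement that code can always run on the host CPU.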

    Thank you for your feedback.

    More examples would definitely help in describing the features of the SYCL specification, and this is something that we are currently looking into. We will shortly be posting a series of blogs on the Codeplay website aimed at describing the SYCL programming model and the available workflow solutions, as well as providing more practical examples.

    A couple of typographical issues:

    p.14: "For a kernel to access local memory on a device, the user can either create a dynamically-sized local accessor object to the kernel as a parameter." -- typically "either" is followed by an "or", and also "to the kernel as a parameter" seems like it is missing something ahead of it.

    p.75: "the device." is hanging there by itself with blank space above it. It seems like something is missing prior to it.

    Thanks, we will have a look at these.

    I am happy to see something like SYCL develop. The question of how we best program accelerators is still unanswered. I doubt there is a universal answer at all. The more things we try, the closer we will get to a satisfying solution. So I applaud your efforts. I read the provisional specification and I have a few questions and comments:

    * SYCL is not something that can be implemented as a standard C++ library but is a compiler extension or an additional compiler, not unlike C++ AMP. Is that correct?

    * command_group: this concept seems to try to fuse memory transfer and compute together; with command_groups are things like pipelines and double-buffering still possible? How would one go about implementing overlapping copy and compute using the command_group concept?

    * The accessor seems interesting - its actual usefulness can best be assessed once we can implement code using SYCL: when can we expect a working prototype? I really dislike the name "accessor" though. C++ AMP calls this an array_view, which is a lot nicer.

    * I dislike the name of the queue concept; it is too generic and usually means something completely different. I know there is the namespace but still. I obsessed about the exact name for the thing that is a stream or a command_queue and came up with the concept of a 'feed' in my GPU library Aura.


    Thank you for your feedback.

    SYCL defines two components: a C++ runtime library and a device compiler. The SYCL runtime library uses C++11 features; however, as it includes OpenCL language extensions, no new language extensions are required.

    SYCL is asynchronous: all commands defined within a command_group are enqueued asynchronously. Double buffering can therefore be achieved automatically by the runtime, provided that the command_groups or the individual commands are defined such that they can be executed in parallel.
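
    As a rough analogy in standard C++ (not the SYCL API), overlapping copy and compute amounts to launching the transfer of the next chunk asynchronously while the current chunk is being processed. In the sketch below, load_chunk() stands in for a copy, sum_chunk() stands in for a kernel, and pipelined_sum() is the double-buffered loop; all three names are invented for illustration. In SYCL the expectation is that the runtime performs this kind of scheduling itself when the work is independent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Stand-in for a copy of one chunk (hypothetical helper).
inline std::vector<int> load_chunk(const std::vector<int>& src,
                                   std::size_t begin, std::size_t end) {
    return std::vector<int>(src.begin() + static_cast<std::ptrdiff_t>(begin),
                            src.begin() + static_cast<std::ptrdiff_t>(end));
}

// Stand-in for a kernel: sums one chunk (hypothetical helper).
inline long long sum_chunk(const std::vector<int>& chunk) {
    return std::accumulate(chunk.begin(), chunk.end(), 0LL);
}

// Double-buffered pipeline: while chunk i is being summed, the load of
// chunk i+1 is already running on another thread.
inline long long pipelined_sum(const std::vector<int>& src,
                               std::size_t chunk_size) {
    assert(chunk_size > 0);
    long long total = 0;
    if (src.empty()) return total;

    auto end_of = [&](std::size_t b) {
        return std::min(b + chunk_size, src.size());
    };
    auto start_load = [&](std::size_t b) {
        const std::size_t e = end_of(b);
        return std::async(std::launch::async,
                          [&src, b, e] { return load_chunk(src, b, e); });
    };

    std::size_t begin = 0;
    std::future<std::vector<int>> next = start_load(begin);
    while (begin < src.size()) {
        std::vector<int> current = next.get();  // wait for this chunk's copy
        begin = end_of(begin);
        if (begin < src.size()) {
            next = start_load(begin);           // start copying the next chunk
        }
        total += sum_chunk(current);            // compute overlaps that copy
    }
    return total;
}
```

    For example, pipelined_sum applied to the values 1 through 10 with a chunk size of 3 processes four chunks and returns their combined sum.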

    There is currently no implementation of SYCL available; the only announced implementer is Codeplay. If you are interested in more practical examples of SYCL, a series of blogs will be posted on the Codeplay website. The first of these can be found here.

    SYCL is still in the provisional stage of specification and is therefore still subject to change based on feedback from potential developers and implementers. Any feedback regarding naming and the programming model is appreciated and will be discussed within the Khronos working group.

    At the moment I very much like using cl.hpp and was wondering whether it will continue to be available for future versions of OpenCL, or whether SYCL will supersede it?

    The SYCL header sycl.hpp includes cl.h, but does not include cl.hpp. SYCL is an alternative to the C++ wrappers and does not conflict with them; therefore they will continue to exist as part of OpenCL as long as the OpenCL working group maintains them.

