
Thread: Official SYCL 1.2 Provisional feedback thread

  1. #11
    Junior Member
    Join Date: Feb 2012
    Posts: 11
    Do I understand correctly from the spec that structs or classes can now be passed to kernels? And what exactly is the difference between capturing variables in the lambda and passing them as lambda parameters?

  2. #12
    Yes, SYCL allows you to pass any struct that is POD and doesn't contain pointers.

    Variables that are captured by the lambda are kernel arguments; these can be accessors, samplers and POD data types that don't contain pointers. The lambda parameters, by contrast, are specific types constructed within the kernel that give host/device-compatible access to the current work-item's ID information. For example, the parallel_for API passes an item object as the lambda's parameter.
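
    For example (a rough sketch; exact spellings may differ slightly from the provisional headers):

    #include <CL/sycl.hpp>
    using namespace cl::sycl;

    void scale(queue &q, buffer<float, 1> &buf, float factor, size_t n) {
        command_group(q, [&]() {
            // Captured by the kernel lambda, so passed as kernel arguments:
            // an accessor and the POD value 'factor'.
            auto data = buf.get_access<access::read_write>();

            parallel_for<class scale_kernel>(range<1>(n), [=](item<1> it) {
                // 'it' is the lambda parameter: it is constructed inside the
                // kernel and describes the current work-item; it is not passed
                // in from the host.
                data[it.get_global_id()] *= factor;
            });
        });
    }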

  3. #13
    Junior Member
    Join Date: Feb 2012
    Posts: 11
    Is there any way to get early access to the reference implementation done by Codeplay? Or at least an estimate as to when it will be available? (I would be glad to give it a spin.)

  4. #14
    The Codeplay implementation of SYCL is not a reference implementation and is still in development; however, anyone looking to get involved with the development of SYCL can contact us to discuss it further. Additionally, there is an open-source implementation of SYCL called triSYCL, which is still in the early stages of development; it can be found here.

  5. #15
    Junior Member
    Join Date: Oct 2014
    Posts: 4
    What are the restrictions on how device functions and kernels can be declared and defined? It wasn't clear to me from reading the spec.

    For example, do you allow forward declarations? E.g.

    // a.cpp
    int f(); // declaration

    // ...
    single_task(/* ... */ [=]() {
        f(); // kernel call
    });

    int f() {
        // definition
        return 0;
    }

    I imagine you couldn't allow calling a device function that's defined in another translation unit.

    By comparison, C++ AMP, whose "restrict(auto)" seems to match how SYCL wants to treat all functions implicitly, only allows "restrict(auto)" on functions whose definition also serves as their declaration. See section 13.1.1 of their specification.

    blogs.msdn.com/b/nativeconcurrency/archive/2012/02/03/c-amp-open-spec-published.aspx

    I would recommend also adding attributes that you can use if you want to define device functions in a different translation unit from where they're called. Something like

    // a.h
    int f() [[sycl::device_function]]; // declaration

    // b.cpp
    int f() [[sycl::device_function]]
    {
        // definition
        // ...
    }

  6. #16
    Junior Member
    Join Date: Oct 2014
    Posts: 4
    Why require variables shared between host and device to be PODs? This is much more restrictive than either CUDA or C++ AMP. It would seriously suck, in my opinion, if you could not pass an object like

    class A {
    public:
        A(int a) : _a(a) {}
    private:
        int _a;
    };

    from a host to a device kernel simply because it has a constructor and is therefore not a POD.

    A common technique in C++ numerical computing is to build up small expression-template objects that represent, for example, arithmetic with matrices. One would want to pass such objects to a kernel to evaluate them, and it would be inconvenient if such objects weren't allowed to have constructors.
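
    For example, even a minimal expression-template node like the following (names made up for illustration) has a user-provided constructor and so falls outside a POD-only restriction:

    // Hypothetical expression-template node for element-wise matrix addition.
    // It is a small, trivially copyable object, but the user-provided
    // constructor alone already disqualifies it as a POD.
    template <class L, class R>
    class MatSum {
    public:
        MatSum(const L &lhs, const R &rhs) : _lhs(lhs), _rhs(rhs) {}
        float operator()(int i, int j) const { return _lhs(i, j) + _rhs(i, j); }
    private:
        L _lhs;
        R _rhs;
    };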

    I would instead recommend mirroring CUDA's restrictions, which are more permissive (see docs.nvidia.com/cuda/cuda-c-programming-guide/#classes).

  7. #17
    Junior Member
    Join Date: Oct 2014
    Posts: 4
    Could you please clarify this clause:

    "If the lambda function relies on template arguments, then the name of the lambda function must contain those template arguments"

    If I take this kernel as an example:

    template<int N, class A>
    void f() {
        A a;
        // ...
        single_task(kernel_lambda< [name] >([=]() {
            int x[N];
            do_something(a);
        }));
        // ...
    }

    does this mean that the name has to be something like

    template<int N, class T>
    struct MyLambdaName {};

    [name] = MyLambdaName<N, A>

    Why are such requirements on the name necessary? C++ AMP doesn't have anything comparable, and there's already a project that implements C++ AMP using OpenCL:

    bitbucket.org/multicoreware/cppamp-driver-ng/wiki/Home

    so I would think SYCL could also be implemented without them.

  8. #18
    Thank you for your responses. The provisional specification was released so that developers could comment on it, and these comments are very helpful to us. We will use all of this feedback to develop the specification further.

    In response to comment #15:

    There is no requirement in SYCL to declare functions as being "host" or "device". That is auto-deduced by SYCL compilers. There is no equivalent in SYCL of C++ AMP's "restrict(auto)" as there is no requirement to annotate functions in this way.

    In the SYCL programming model, kernel functions can be defined by a functor object or a lambda expression. The definition of the kernel function must be within a command_group scope, because that is where it obtains the accessors for the data accessed within the kernel. This restriction only applies to kernel functions themselves, not to functions called from within a kernel.

    It is not possible to forward-declare the kernel functions themselves in the normal C++ sense, as they are functor objects or lambda expressions; however, you can forward-declare the functor type, or define the functor object before it is executed.
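
    For example, a kernel functor type can be defined ahead of time (e.g. in a header) and only constructed and invoked inside the command group (a rough sketch; type spellings follow the provisional API only approximately):

    #include <CL/sycl.hpp>
    using namespace cl::sycl;

    // The functor type can be declared and defined ahead of time.
    class DoubleKernel {
    public:
        // Accessor type spelling is approximated for this sketch.
        typedef accessor<float, 1, access::read_write> rw_acc;
        DoubleKernel(rw_acc acc) : _acc(acc) {}
        void operator()(item<1> it) { _acc[it.get_global_id()] *= 2.0f; }
    private:
        rw_acc _acc;
    };

    void run(queue &q, buffer<float, 1> &buf, size_t n) {
        // The kernel invocation itself still sits inside a command group,
        // because that is where the accessors are obtained.
        command_group(q, [&]() {
            DoubleKernel k(buf.get_access<access::read_write>());
            parallel_for(range<1>(n), k);  // the functor type also acts as the kernel name
        });
    }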

    Because SYCL allows different build-system configurations, i.e. a single compiler or separate host and SYCL device compilers, the specifics of a SYCL build system are implementation-defined. This means that the method of calling SYCL functions defined outside the translation unit containing the kernel function is not defined in the SYCL specification and may vary from one implementation to another. This is complicated by the fact that in OpenCL, linking is done at runtime, not at compile time.

    In response to comment #16:

    There are a couple of restrictions that SYCL needs in order to ensure that data can be copied between host and device. For OpenCL 1.x generation devices, which are what SYCL supports in its current provisional specification, data must be able to be copied between host and device memory. The copy is potentially performed by hardware, which means that calling constructors during the copy cannot be guaranteed.

    Also, SYCL is a shared-source programming model, meaning that the source files are compiled with both a SYCL device compiler and a host compiler. This allows users of SYCL to use the host compiler of their choice, for example Visual C++, GCC or Clang, making the programming model more flexible and portable and giving better integration with existing build systems. However, in order for SYCL to provide this programming model, a SYCL runtime implementation must be able to make certain assumptions about the data layout of the functor or lambda that defines a SYCL kernel function.

    The restriction in the current provisional specification is that shared data types must be POD, and the definition of POD has changed somewhat between C++ versions. We are considering alternative restrictions that provide the guarantees needed for data movement without over-restricting what can be shared in SYCL. We are also considering updating the specification to a more recent C++ standard than the one used in the SYCL 1.2 provisional specification.
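
    To illustrate the kind of distinction involved (this is not a statement of what the final restriction will be):

    #include <type_traits>

    struct Plain    { int a; float b; };                   // a POD in both C++03 and C++11 terms
    struct WithCtor { int a; WithCtor(int x) : a(x) {} };  // not a POD: user-provided constructor

    static_assert(std::is_pod<Plain>::value,     "Plain is a POD");
    static_assert(!std::is_pod<WithCtor>::value, "WithCtor is not a POD");

    // C++11 splits the property into finer-grained traits: WithCtor can still
    // be copied bit-for-bit and has a predictable layout, which is closer to
    // what a host-to-device copy actually needs.
    static_assert(std::is_trivially_copyable<WithCtor>::value, "bitwise-copyable");
    static_assert(std::is_standard_layout<WithCtor>::value,    "predictable layout");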

    In response to comment #17:

    As SYCL is a shared source programming model, every kernel function requires a unique name, so that the host side of a SYCL runtime implementation is able to identify the binary and kernel argument information that is output from the SYCL device compiler. This allows the kernels to be compiled with a SYCL device compiler and the host code to be compiled with a CPU compiler of the user's choice. It enables maximum flexibility for users because it is possible to choose different host and device compilers according to the devices and CPUs that you want to support. This is appropriate for SYCL because it is targeting a very wide range of vendors, CPUs, operating systems and devices. However, we do need to add in a naming system to allow lambda functions compiled with one compiler to be linked with CPU code compiled with another compiler.

    The reason that the template arguments are required in the name of a kernel function defined by a lambda expression is that the C++ specification does not define a naming convention for lambda expressions, so a SYCL runtime implementation would otherwise be unable to make any assumptions about the name of the kernel function.

    When a kernel function depends on template parameters, i.e. it is defined within a template function or class, each instantiation of the surrounding context generates a separate kernel function definition, and every SYCL kernel function must have a uniquely identifiable name that a SYCL runtime implementation can use to differentiate the kernel functions. As a result, the name of a templated kernel function must contain those same template arguments, either in the lambda expression's name or in the functor type's name, in order to avoid ambiguities between different instantiations.
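
    Concretely, for the example in comment #17, the name supplied to the kernel invocation must itself carry the template arguments, along the lines of:

    // Mirrors the example from comment #17: the name is a class template
    // instantiated with the same template arguments as the enclosing function,
    // so each <N, A> combination yields a distinct kernel name.
    template <int N, class T>
    struct MyLambdaName {};

    template <int N, class A>
    void f() {
        A a;
        // ... inside a command group ...
        single_task(kernel_lambda<MyLambdaName<N, A>>([=]() {
            int x[N];
            do_something(a);  // do_something as in the original example
        }));
    }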

    This does create a little extra effort for the user, but allows far greater flexibility for supporting different CPUs, operating systems and devices. So we think that the effort is worth it. We are considering ways of making the naming of lambda functions easier for the user in future versions of the specification.

    I hope this answers your questions.

  9. #19
    Junior Member
    Join Date: Oct 2014
    Posts: 4
    The specification says that kernel functions can call non-kernel functions. If they do, then of course the LLVM IR for those functions needs to be included in the SPIR binary as well. From section 5.2:

    SYCL device compiler must compile only kernels for the device, as well as any functions that the kernels call.

    Which means you absolutely do have issues with things like forward declarations. For this example:

    // a.cpp
    void f();

    class MyKernel {
        // ...
        void operator()(cl::sycl::id<2> myId) {
            f();
        }
        // ...
    };

    // b.cpp
    void f() {
    }

    How does the device compiler know that it needs to generate SPIR code for f when it is compiling the file b.cpp? And for that matter, even if it did, how would the linker know to link the SPIR code for f into the SPIR code containing MyKernel?

    And if these functions include code that would generate illegal SPIR instructions (e.g. exceptions, RTTI, virtual function calls, etc.), how will the compiler know to generate an error? Is it supposed to traverse the entire call graph of every kernel to figure out which functions it calls, and then check each one?

    There's a reason other frameworks (C++ AMP and CUDA) require you to explicitly specify that such functions are going to be called on the device: implicitly determining this introduces a lot of complexity. I'm not saying there's no solution (C++ AMP's requirements on restrict(auto) might point to one), but I don't see any of these potential problems addressed, or even acknowledged, in the specification.

  10. #20
    Junior Member
    Join Date: Feb 2012
    Posts: 11
    The updated specs for SYCL and OpenCL 2.0 have been announced by Neil Trevett, "president of the Khronos Group, chair of the OpenCL working group and vice president of mobile ecosystem at NVIDIA". May I ask whether SYCL will be portable if it builds atop SPIR? NVIDIA is known for evading all questions concerning OpenCL, and their OpenCL implementation has been pretty much abandoned. Both C++ AMP on Linux (named Clamp, developed by MulticoreWare) and SYCL suffer greatly from this fact.

    Could we get an official statement from NVIDIA about what their plans are for OpenCL going forward?

    It would be great to see it written in plain black and white that they WILL or WILL NOT support OpenCL in the near future. This could help all current SYCL/C++ AMP projects decide which tools to use, and whether they should take on the aggravation of porting their apps to CUDA if they want to remain portable.
