Thread: memory alignment for struct members

  1. #1
    Junior Member
    Join Date
    Apr 2011
    Posts
    13

    memory alignment for struct members

    I had some trouble using a structure in constant memory to pass kernel parameters. I posted my code on NVIDIA's forum at

    http://forums.nvidia.com/index.php?showtopic=197734

    but so far nobody has replied. I did quite a bit of searching in the meantime, including reading the specification, but I was unable to find an example of how to use a struct in a kernel.

    Can someone take a look at my example and tell me what I should do in this case? I tried replacing all member types with cl_float4/cl_float etc. in my host code, but it still does not work on an NVIDIA card.

    Your comments on this are very much appreciated.

  2. #2
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: memory alignment for struct members

    I don't see anything wrong with your code and you already said that the number of constant kernel arguments is 4 so that's not an issue either.

    Between that and the fact that it works on ATI, it looks quite clearly like a bug in NVidia's OpenCL drivers.

    I'm sorry I don't have any advice on how to work around the issue. You could try commenting out some of the struct fields and see if at some point the problem goes away.
    Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer.

  3. #3
    Junior Member
    Join Date
    Apr 2011
    Posts
    13

    Re: memory alignment for struct members

    Great, thank you for confirming this. I feel a lot better now.

  4. #4
    Junior Member
    Join Date
    Apr 2011
    Posts
    13

    Re: memory alignment for struct members

    Hi David,

    I printed sizeof(KParam) on both the host and the device and found that the two sizes differ for the code I posted on NVIDIA's forum: it is 180 in the host code and 192 in the CL kernel. I prefixed all type names with cl_ in the host definition, and now the sizes are the same.
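    To illustrate how such a size gap can arise, here is a minimal host-side sketch in plain C. It does not use the KParam struct from the linked post, which is not reproduced in this thread. The types plain_float4 and aligned_float4 and the two Params structs are hypothetical stand-ins: aligned_float4 mimics the 16-byte alignment that OpenCL C requires for float4, while plain_float4 is what an ordinary host declaration would give you.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical plain host type: four floats with ordinary float alignment. */
typedef struct { float s[4]; } plain_float4;

/* Hypothetical stand-in for cl_float4: the same four floats, but 16-byte
 * aligned, which is the alignment OpenCL C requires for float4. */
typedef struct { alignas(16) float s[4]; } aligned_float4;

/* Host struct declared with plain types. */
typedef struct {
    float        scale;
    plain_float4 origin;   /* may start right after scale */
} ParamsPlain;

/* Same fields declared with the aligned stand-in type. */
typedef struct {
    float          scale;
    aligned_float4 origin; /* must start on a 16-byte boundary */
} ParamsAligned;

/* Number of bytes by which the two layouts disagree. */
size_t layout_size_gap(void)
{
    return sizeof(ParamsAligned) - sizeof(ParamsPlain);
}

/* Compile-time check that the aligned member really is 16-byte aligned. */
static_assert(offsetof(ParamsAligned, origin) % 16 == 0,
              "origin must sit on a 16-byte boundary");
```

    Printing sizeof(ParamsPlain) and sizeof(ParamsAligned) with printf should show the plain layout being smaller on common desktop compilers, because the aligned member forces padding after scale. Real host code would include CL/cl.h and use cl_float4 instead of the stand-in.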

    In your opinion, if I don't prefix the types with cl_, will there be a misalignment when passing the 180-byte host struct to the 192-byte device struct? Where does the padding go? Is it only at the very end of the struct, or can it also appear between two members?

    I also found out that the segfault may not be caused solely by the constant parameter, but also by a bug in NVIDIA's compiler in handling nested if-statements. I am still investigating this.

  5. #5
    Senior Member
    Join Date
    May 2010
    Location
    Toronto, Canada
    Posts
    845

    Re: memory alignment for struct members

    In your opinion, if I don't prepend cl_ in the types, will there be misalignment when passing the 180-byte host struct to the 192-byte device struct? where the paddings happen? are they at the very end of the struct or can be in between two elements?
    Ah, I missed that. Yes, you must always use cl_xxx types on the API side, as they are guaranteed to match the size of their counterparts in OpenCL C (except for size_t and bool). For example, cl_long on the API side is equivalent to long in OpenCL C (a 64-bit signed integer), and cl_ulong is equivalent to ulong.
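    As a rough illustration of why the plain host types are risky, the sketch below compares a host long with a fixed-width 64-bit type. The demo_long, demo_ulong and demo_int typedefs are hypothetical stand-ins built on stdint.h so that the snippet compiles without the OpenCL headers; in real host code you would include CL/cl.h and use cl_long, cl_ulong and cl_int directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for cl_long, cl_ulong and cl_int. They are fixed
 * width, like the OpenCL C types long, ulong and int on the device. */
typedef int64_t  demo_long;
typedef uint64_t demo_ulong;
typedef int32_t  demo_int;

/* These sizes hold on every host, which is the point of fixed-width types. */
static_assert(sizeof(demo_long)  == 8, "64-bit signed");
static_assert(sizeof(demo_ulong) == 8, "64-bit unsigned");
static_assert(sizeof(demo_int)   == 4, "32-bit signed");

/* Returns 1 if the host's plain long happens to be 64 bits wide, else 0.
 * This depends on the platform ABI, so it must not be relied upon. */
int host_long_is_64_bit(void)
{
    return sizeof(long) == sizeof(demo_long);
}
```

    If host_long_is_64_bit() returns 0 on your platform, a struct declared with plain long on the host cannot match the same struct declared with long in the kernel.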

    Padding can happen either between struct members or at the end of the struct. This comes from C99 actually.
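    A small sketch of how to see both kinds of padding on the host with offsetof. The Sample struct and the two helper functions are made up for illustration; the actual byte counts depend on your compiler and ABI, so none are stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct chosen so that padding is likely in two places. */
typedef struct {
    char tag;    /* first member, always at offset 0 */
    int  count;  /* usually needs padding in front of it */
    char flag;   /* usually followed by padding at the end */
} Sample;

/* Bytes of padding inserted between tag and count. */
size_t interior_padding(void)
{
    return offsetof(Sample, count) - sizeof(char);
}

/* Bytes of padding appended after flag, the last member. */
size_t tail_padding(void)
{
    return sizeof(Sample) - (offsetof(Sample, flag) + sizeof(char));
}
```

    Running the same offsetof checks on the host struct and comparing them with the offsets the kernel sees is one way to find where two layouts start to diverge.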
