subbuffer understanding (alignment / usefulness)
I just tried to create two cl::Buffer sub-buffer objects, and the second creation returned an error:
cl_buffer_region bufReg;
cl_int error;
// first sub-buffer: starts at byte 0 of someBuf
bufReg.origin = 0;
bufReg.size = x;
cl::Buffer subBuf1 = someBuf.createSubBuffer(CL_MEM_READ_WRITE, CL_BUFFER_CREATE_TYPE_REGION, &bufReg, &error);
// second sub-buffer: starts at byte 100 of someBuf
bufReg.origin = 100;
bufReg.size = x;
cl::Buffer subBuf2 = someBuf.createSubBuffer(CL_MEM_READ_WRITE, CL_BUFFER_CREATE_TYPE_REGION, &bufReg, &error);
// error: code -13, CL_MISALIGNED_SUB_BUFFER_OFFSET
Some research revealed that on my platform (Intel SDK) the origin field has to be a multiple of 128 bytes to avoid the alignment error. Is it really the case that I can create sub-buffer objects only if they fit this very restrictive alignment boundary? Or is there some misunderstanding on my side?
I suppose the alignment is very much platform specific, but in general I cannot guarantee any specific alignment in my code. So in short, does it mean that I had better stop trying to use sub-buffers at all, and just use ordinary buffers and appropriate offset / cb arguments? What are useful applications for sub-buffers given these tight restrictions?
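One possible way to live with the restriction is to round the wanted byte offset down to the nearest aligned origin and hand the leftover bytes to the kernel as an extra offset. The following is only a minimal sketch of that arithmetic; SplitOffset and splitOffset are hypothetical names, and the alignment in bytes is assumed to have been obtained from the device beforehand.

```cpp
#include <cstddef>

// Hypothetical helper type: an aligned sub-buffer origin plus the leftover
// bytes that the kernel would have to skip on its own.
struct SplitOffset {
    std::size_t origin;    // multiple of alignBytes, usable as bufReg.origin
    std::size_t remainder; // bytes between origin and the wanted offset
};

// Splits a wanted byte offset into an aligned origin and a remainder.
// Assumption: alignBytes is nonzero and is the device's sub-buffer
// alignment expressed in bytes.
constexpr SplitOffset splitOffset(std::size_t wantedBytes, std::size_t alignBytes) {
    return SplitOffset{wantedBytes - wantedBytes % alignBytes,
                       wantedBytes % alignBytes};
}
```

With a 128-byte alignment, a wanted offset of 100 gives origin 0 and remainder 100, while 300 gives origin 256 and remainder 44. The sub-buffer size would have to grow by the remainder, so this only helps when the kernel can take an additional offset argument.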
The offset is in bytes, not in number of elements, I think. Do you really want to skip the first 100 bytes? Or the first 100 elements (i.e. 100 * sizeof(TYPE) bytes)?
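To make the bytes-versus-elements distinction concrete, here is a small sketch. byteOffset is a hypothetical helper, not part of the OpenCL API, and std::uint32_t merely stands in for whatever element type the buffer actually holds.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: byte offset of the element at index elementIndex
// in a tightly packed array of T. This is the kind of value that
// bufReg.origin expects, since the origin is given in bytes.
template <typename T>
constexpr std::size_t byteOffset(std::size_t elementIndex) {
    return elementIndex * sizeof(T);
}
```

For 4-byte elements, skipping 100 elements means an origin of 400 bytes, not 100.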
Yes, the offset is in bytes, I know. What puzzles me is why the alignment requirement:
i) is there at all (I could live with a performance penalty in the case of misalignment, but an error?)
ii) is so extremely restrictive [for that implementation at least]
I don't see much (any?) use for sub-buffers any more. Also note that textbook examples illustrating sub-buffers simply would not work unless they happen to fit the alignment requirement. I haven't seen much discussion of the issue though (e.g. warnings in textbooks that, unless you are very lucky and/or careful, you will run into problems), and I am still wondering if I am overlooking something.
Really strange indeed. I looked it up myself and saw that my GTX 580 has an alignment of 4096 bits, or 512 bytes. Please keep me up to date if you find a hint on this topic.
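As far as I can tell, the value in question is CL_DEVICE_MEM_BASE_ADDR_ALIGN, which the device reports in bits, whereas the sub-buffer origin is given in bytes. A minimal sketch of the conversion and of a check that could be run before calling createSubBuffer follows. alignBitsToBytes and isOriginAligned are hypothetical helper names, and the bit value is assumed to have been queried beforehand, for example with device.getInfo<CL_DEVICE_MEM_BASE_ADDR_ALIGN>().

```cpp
#include <cstddef>

// Converts an alignment reported in bits (as CL_DEVICE_MEM_BASE_ADDR_ALIGN is)
// into bytes, the unit used for the sub-buffer origin.
constexpr std::size_t alignBitsToBytes(std::size_t alignBits) {
    return alignBits / 8;
}

// Returns true if originBytes is a multiple of the device alignment.
// Assumption: alignBits is at least 8, so the byte alignment is nonzero.
constexpr bool isOriginAligned(std::size_t originBytes, std::size_t alignBits) {
    return originBytes % alignBitsToBytes(alignBits) == 0;
}
```

With 4096 bits this gives 512 bytes, and with 1024 bits it gives the 128 bytes mentioned for the Intel SDK, so an origin of 100 fails the check in both cases.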
I have the same problem. I ran a test that works perfectly with Nvidia OpenCL (Nvidia C2050, with origin = 0 and origin != 0), but with AMD OpenCL (AMD 6970, with origin != 0) I get error -13.