I'm currently trying to improve the "culling" algorithm of an existing program (called Celestia).
Right now I am trying to increase the performance of the code I have so far.
The thing that keeps bugging me is that I can't seem to find a way to read only a specific part of a memory object. Being able to do so would speed up my algorithm considerably.
Is it possible to read only specific parts of a buffer?
If I call enqueueReadBuffer() with a "size" argument that is smaller than the buffer I am reading from, I get a CL_INVALID_VALUE error.
(By the way, I am using the C++ bindings for OpenCL.)
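For reference, a minimal sketch of the byte range that such a partial read has to describe. It assumes the enqueueReadBuffer(buffer, blocking, offset, size, ptr) form of the C++ bindings, where offset and size are given in bytes rather than in elements, and it relies on the specification reporting CL_INVALID_VALUE when the (offset, size) region runs past the end of the buffer or when the host pointer is null. The names ReadRange and elementRange are hypothetical and exist only for this illustration; the check runs purely on the host and does not call OpenCL.

```cpp
#include <cstddef>

// Hypothetical helper type: a byte range inside a device buffer.
struct ReadRange {
    bool valid;          // false if the range cannot be read
    std::size_t offset;  // byte offset of the first element to read
    std::size_t size;    // number of bytes to read
};

// Computes the byte range covering `count` elements starting at element
// `first`. The result is marked invalid when there is nothing to read or
// when the range would leave the buffer (the out-of-bounds case that the
// OpenCL specification reports as CL_INVALID_VALUE).
ReadRange elementRange(std::size_t first, std::size_t count,
                       std::size_t elementSize, std::size_t bufferBytes) {
    if (elementSize == 0 || count == 0) return ReadRange{false, 0, 0};
    const std::size_t capacity = bufferBytes / elementSize;  // whole elements
    if (first > capacity || count > capacity - first) {
        return ReadRange{false, 0, 0};
    }
    return ReadRange{true, first * elementSize, count * elementSize};
}

// Assumed usage with the C++ bindings (not compiled in this sketch):
//   ReadRange r = elementRange(0, visibleCount, 32, bufferBytes);
//   if (r.valid)
//       queue.enqueueReadBuffer(resultBuffer, CL_TRUE, r.offset, r.size, hostPtr);
```

With 32 bytes per star, the first 1000 results would correspond to offset 0 and size 32000. Passing an element count where a byte count is expected, or the other way round, is one way to end up with a range that does not fit the buffer.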
Here is an explanation of what I'm trying to do:
I have a list of stars, and one of my kernels checks their attributes against the render settings and the viewer position to figure out whether they are visible or not.
This kernel also calculates a few values for each visible star that are needed for later rendering.
Here's the problem: star catalogs can get pretty huge (easily several 100,000 up to a few million stars), so reading the buffer object with the results (about 32 bytes per star) takes a massive amount of time. Because I don't know in advance how many stars will be visible, I have to allocate memory to store results for all stars as a worst-case scenario. I now run a second kernel afterwards, which moves all non-zero results (i.e. the visible stars) to the beginning of a new list. Again, in the worst case that list needs to be able to hold results for every star. Therefore the buffer object is initialized at program startup with a size of resultStructure_size * numberOfStars.
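To make the compaction step concrete, here is a host-side reference sketch of what such a kernel computes. It is not the actual kernel: the StarResult layout, the separate visibility flag list, and the function name compactVisible are assumptions made for illustration, and the prefix-sum-then-scatter structure is just one common way to express compaction.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 32-byte result record; the real layout is not shown here.
struct StarResult {
    float position[3];
    float brightness;
    std::uint32_t index;  // index of the star in the catalog
    float padding[3];
};
static_assert(sizeof(StarResult) == 32, "sketch assumes 32 bytes per star");

// Moves the results of all visible stars to the front of `compacted` and
// returns how many there are. `compacted` keeps the worst-case size, so
// only its first N entries (N = return value) carry meaningful data.
// Returns 0 without touching `compacted` if the two inputs differ in size.
std::size_t compactVisible(const std::vector<StarResult>& results,
                           const std::vector<std::uint8_t>& visible,
                           std::vector<StarResult>& compacted) {
    const std::size_t n = results.size();
    if (visible.size() != n) return 0;
    compacted.assign(n, StarResult{});

    // Exclusive prefix sum over the flags: slot[i] is the output position
    // of star i in case it is visible.
    std::vector<std::size_t> slot(n);
    std::size_t running = 0;
    for (std::size_t i = 0; i < n; ++i) {
        slot[i] = running;
        running += visible[i] ? 1 : 0;
    }

    // Scatter step: every visible star writes its result to its slot.
    for (std::size_t i = 0; i < n; ++i) {
        if (visible[i]) compacted[slot[i]] = results[i];
    }
    return running;
}
```

The returned count is the number of entries at the front of the compacted list that are worth transferring; everything behind it is unused worst-case space.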
After the first kernel is done, I also read another list, which tells me which stars are visible and thus how many stars need to be rendered. So in theory I would only need the first X entries of the buffer object containing the results, but OpenCL won't let me read just those.
The way I see it, I have to allocate a new buffer object every frame, which I dislike. I would much rather allocate too much memory at the start of the program than have to create a fitting buffer object (including passing the buffer to the device) every frame :/