
Re: [Public WebGL] WEBGL_get_buffer_sub_data_async



It's true that it is possible to optimize the current synchronous GetBufferSubData API so that in the ideal case it runs much more quickly. Jeff demonstrated on the working group's internal mailing list how this could work:

1) The browser maintains a CPU-side shadow copy of all buffers that were allocated with GL_*_READ usage.

2) The user calls FenceSync after any operation that modifies one of these buffers. For example, ReadPixels calls targeting a pixel buffer object (PBO), or draw calls performing transform feedback into one or more buffers.

3) The browser uses that FenceSync call, combined with the usage parameter of that buffer, as a hint that it should begin asynchronously polling for the completion of that fence. Once completed, it internally calls MapBuffer, memcpys the result into the shadow copy, and then unmaps the buffer. At that point it signals the user-visible fence as completed.

4) The user polls that fence until it's completed. At that point, the user calls GetBufferSubData, which memcpys from the shadow copy to user-visible memory without blocking.
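From the application's side, steps 2 and 4 might look roughly like the sketch below. The `ReadbackGL` interface is just a hand-written subset of WebGL2RenderingContext so the example type-checks without DOM typings (a real WebGL 2 context should satisfy it structurally); `beginReadback` and `pollReadback` are hypothetical helper names, not part of any API, and the PIXEL_PACK_BUFFER binding point is an assumption about how the buffer is used.

```typescript
// Hand-written subset of WebGL2RenderingContext (assumption: a real
// WebGL 2 context can be passed wherever a ReadbackGL is expected).
interface ReadbackGL {
  readonly SYNC_GPU_COMMANDS_COMPLETE: number;
  readonly SYNC_STATUS: number;
  readonly SIGNALED: number;
  readonly PIXEL_PACK_BUFFER: number;
  fenceSync(condition: number, flags: number): unknown;
  getSyncParameter(sync: unknown, pname: number): unknown;
  deleteSync(sync: unknown): void;
  bindBuffer(target: number, buffer: unknown): void;
  getBufferSubData(target: number, srcByteOffset: number, dst: ArrayBufferView): void;
  flush(): void;
}

// Hypothetical bookkeeping for one in-flight readback.
interface PendingReadback {
  sync: unknown;
  buffer: unknown;
}

// Step 2: call right after the command that writes into `buffer`
// (e.g. readPixels into a PBO, or a transform feedback draw).
function beginReadback(gl: ReadbackGL, buffer: unknown): PendingReadback {
  const sync = gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
  gl.flush(); // submit the fence so that it can eventually signal
  return { sync, buffer };
}

// Step 4: call once per frame. Returns true once `dst` has been filled,
// false while the fence is still pending. Never waits on the GPU itself.
function pollReadback(gl: ReadbackGL, pending: PendingReadback, dst: ArrayBufferView): boolean {
  if (gl.getSyncParameter(pending.sync, gl.SYNC_STATUS) !== gl.SIGNALED) {
    return false;
  }
  gl.deleteSync(pending.sync);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, pending.buffer);
  gl.getBufferSubData(gl.PIXEL_PACK_BUFFER, 0, dst);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
  return true;
}
```

Whether the final getBufferSubData call is actually non-blocking depends on the browser having done step 3 behind the scenes; nothing in this code can guarantee that.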

It's an excellent point that this synchronous, blocking API can be made much faster at all; Jeff, thanks for showing how. The advantage of optimizing the current API is that carefully written code will get good performance while using only the existing OpenGL ES 3.0 APIs, with no additions.

There are however some pitfalls with this approach.

A) The user doesn't demonstrate their intent to read back from the buffer until they call GetBufferSubData. In order to make that call fast, the entire contents of these GL_*_READ buffers have to be mirrored back to the CPU any time the buffer is modified and a FenceSync is inserted afterward. Depending on how the user allocates buffers and how much they read back from those buffers, this may significantly increase memory traffic and slow down applications.

B) A lot of tracking has to be added in order to invalidate the shadow copy if the user modifies the buffer between their FenceSync and their GetBufferSubData call. Such a modification will at most result in a warning and degraded performance. In Kai's extension proposal, it's an error to modify the buffer while an async readback is pending.

C) Because the shadow copy is hidden in the WebGL implementation, it's not possible to bypass it and eliminate one copy. Kai's extension proposal actually supports this, because the asynchronous intent is expressed directly by the user, as is the destination buffer, in the form of a SharedArrayBuffer.
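To make the tracking in pitfall B a bit more concrete, here is a toy model of a shadowed buffer. This is purely illustrative: the `ShadowedBuffer` class, its version counters, and the "fast"/"slow" return values are all invented for this sketch and are not taken from any browser's implementation. It only shows why every write has to be tracked, and how a write after the fence silently turns the fast path into the slow one.

```typescript
// Hypothetical model of a browser-side shadow copy (not real browser code).
class ShadowedBuffer {
  private gpuData: Uint8Array;  // stands in for the GPU-side storage
  private shadow: Uint8Array;   // CPU-side mirror
  private gpuVersion = 0;       // bumped on every write to the buffer
  private shadowVersion = 0;    // version that the mirror corresponds to

  constructor(size: number) {
    this.gpuData = new Uint8Array(size);
    this.shadow = new Uint8Array(size);
  }

  // Every command that writes the buffer has to bump the version.
  write(offset: number, bytes: Uint8Array): void {
    this.gpuData.set(bytes, offset);
    this.gpuVersion++;
  }

  // Models FenceSync: returns a callback to run when the fence completes.
  insertFence(): () => void {
    const versionAtFence = this.gpuVersion;
    return () => {
      // Mirror only if nothing wrote to the buffer after the fence;
      // otherwise the mirror stays stale and reads fall back to "slow".
      if (versionAtFence === this.gpuVersion) {
        this.shadow.set(this.gpuData);
        this.shadowVersion = versionAtFence;
      }
    };
  }

  // Models GetBufferSubData: "fast" is a memcpy from the mirror,
  // "slow" stands in for a blocking readback from the GPU.
  read(dst: Uint8Array): "fast" | "slow" {
    if (this.shadowVersion === this.gpuVersion) {
      dst.set(this.shadow.subarray(0, dst.length));
      return "fast";
    }
    dst.set(this.gpuData.subarray(0, dst.length));
    return "slow";
  }
}
```

In this model the user gets correct data either way, which matches the point above: a late write costs performance, not correctness, and the most the implementation can do is warn.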

Fundamentally, readback from the GPU needs to be asynchronous at some level in order to be efficient. I think I speak for the Chrome team in saying that we think it's best to express the asynchronous primitive directly, rather than try to optimize the existing synchronous primitive using asynchronous ones under the hood. We recognize that it'll add complexity to introduce new APIs and are concerned about this, too. Still, Kai's latest proposal is pretty minimal, and directly expresses the user's intent.



On Wed, Dec 20, 2017 at 4:03 PM, Jeff Gilbert <jgilbert@mozilla.com> wrote:

With the mechanisms we already have in WebGL 2[1], we can support
no-stall polled-async readback from the GPU. Even in the case of
poorly written content, this cannot incur pipeline stalls any worse
than we already allow for in WebGL via non-PBO ReadPixels and
getBufferSubData. Note also that checking for stall-less behavior is
fairly (though not entirely) deterministic, since apps must explicitly
poll/wait on a fence before accessing the potentially-in-flight data.
This is what I am implementing in Firefox, since it applies to all
implementations, regardless of whether the implementation remotes
calls.

The key to this is the understanding that buffers with usage=GL_*_READ
are directly mappable client-side buffers, into which (primarily)
copyBufferSubData and readPixels enqueue writes. After the writes are
known to be complete (via FenceSync(GPU_COMMANDS_COMPLETE)), since
these are client-side buffers, they may be immediately mapped and read
from.
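As a rough sketch of the enqueue side described above: allocate a buffer with a GL_*_READ usage, have readPixels write into it, then insert the fence. The `PboGL` interface is a hand-written subset of WebGL2RenderingContext (an assumption made so the example stands alone), `enqueuePixelReadback` is a hypothetical helper name, and the RGBA/UNSIGNED_BYTE format and STREAM_READ usage are just one plausible choice.

```typescript
// Hand-written subset of WebGL2RenderingContext (assumption: a real
// WebGL 2 context can be passed wherever a PboGL is expected).
interface PboGL {
  readonly PIXEL_PACK_BUFFER: number;
  readonly STREAM_READ: number;
  readonly RGBA: number;
  readonly UNSIGNED_BYTE: number;
  readonly SYNC_GPU_COMMANDS_COMPLETE: number;
  createBuffer(): unknown;
  bindBuffer(target: number, buffer: unknown): void;
  bufferData(target: number, size: number, usage: number): void;
  readPixels(x: number, y: number, width: number, height: number,
             format: number, type: number, offset: number): void;
  fenceSync(condition: number, flags: number): unknown;
  flush(): void;
}

// Hypothetical helper: enqueue a readback of a width x height RGBA8 region
// into a freshly allocated pixel buffer object, and fence it.
function enqueuePixelReadback(gl: PboGL, width: number, height: number) {
  const byteLength = width * height * 4; // 4 bytes per RGBA8 pixel
  const buffer = gl.createBuffer();
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, buffer);
  // The *_READ usage is the hint that this buffer will be read by the CPU.
  gl.bufferData(gl.PIXEL_PACK_BUFFER, byteLength, gl.STREAM_READ);
  // With a buffer bound to PIXEL_PACK_BUFFER, the last argument is a byte
  // offset into that buffer, and the call enqueues the write and returns.
  gl.readPixels(0, 0, width, height, gl.RGBA, gl.UNSIGNED_BYTE, 0);
  gl.bindBuffer(gl.PIXEL_PACK_BUFFER, null);
  const sync = gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
  gl.flush(); // submit the fence so that it can eventually signal
  return { buffer, sync, byteLength };
}
```

The caller would then poll `sync` on later frames and only call getBufferSubData on `buffer` once the fence reports completion.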

I do think there is room for a more ergonomic helper for handling this
behavior, though it's not that complicated for a library to implement
it.

There is room to investigate a solution to eliminating a copy.
MapBufferRange does this, but the naive implementation does create
garbage ArrayBuffer wrappers. Note however, that if you want to copy
the data into some existing ArrayBuffer (like the wasm heap),
getBufferSubData is already copy-optimal. There is only room for
improvement if you want to process the data in-place, which may be
able to save a copy with MapBufferRange or similar in both types of
implementation.

[1]: Since this is only available in WebGL 2, I have proposed
extensions to expose these mechanisms from WebGL 2 to WebGL 1:
https://github.com/KhronosGroup/WebGL/pull/2563

On Wed, Dec 20, 2017 at 11:06 AM, Kai Ninomiya <kainino@google.com> wrote:
> Good point Mark, I'll add that.
>
>
> On Wed, Dec 20, 2017, 10:57 AM Mark Callow <khronos@callow.im> wrote:
>>
>>
>>
>> On Dec 19, 2017, at 17:38, Kai Ninomiya <kainino@google.com> wrote:
>>
>> All,
>>
>> Our new proposal for WEBGL_get_buffer_sub_data_async is finally ready.
>> Please take a look and send along your comments and suggestions. Feel free
>> to request comment access if you want to comment on the doc itself.
>>
>> Note this is a design doc and not a spec, so it will hopefully be easier
>> to read but may not be explicit about every edge case yet.
>>
>> https://docs.google.com/document/d/1f65cGlfLHbKLOuvRSqTvrakNi60Swk6GCyS54v1ImKo/edit?usp=sharing
>>
>>
>> This doc should mention the core reason for this extension: the inability
>> of some WebGL implementations to support glMapBufferRange. And describe how
>> that led to gl.getBufferSubData() and then this proposal. As far as I can
>> see all the use cases listed would be solved if glMapBufferRange was
>> supported.
>>
>> Regards
>>
>>
>>     -Mark
