[Public WebGL] Re: the commit API on the OffscreenCanvas

What is known is that some way of committing frames from a spin-loop worker is required in the spec, in order to support multithreaded rendering from WebAssembly applications. commit() has been tested in small standalone test cases. Several groups are collaborating to make multithreaded rendering work in a real-world WebAssembly application.

It's a fair point that this should be made to fully work before shipping, so we plan to put commit() back behind a flag in Chrome for the time being.


On Thu, Jul 5, 2018 at 6:29 PM Gregg Tavares <khronos@greggman.com> wrote:

On Fri, Jul 6, 2018 at 5:16 AM Ken Russell <kbr@google.com> wrote:
On Wed, Jul 4, 2018 at 11:33 PM Gregg Tavares <khronos@greggman.com> wrote:
I'm not sure where to bring this up but I've been trying for a couple of weeks in other places and getting zero feedback sooo I am hoping you guys in charge of things will take a few minutes and read this and take some time to thoughtfully respond.

It's possible I don't understand how OffscreenCanvas is supposed to work. I've read the spec and written several tests and a couple of short examples and this is my understanding.

There are basically 2 ways to use it. They are documented at MDN.

One is listed as "Synchronous display of frames produced by an OffscreenCanvas". It involves using "offscreenCanvas.transferToImageBitmap" inside the worker, transferring that bitmap back to the main thread, and calling bitmapContext.transferFromImageBitmap. This API makes sense to me. If you want to synchronize DOM updates with WebGL updates then you need to make sure both get updated at the same time, for example when you have an HTML label over a moving 3D object.
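The synchronous path described above can be sketched roughly as follows. This is only an illustration, not the MDN sample: `produceFrame`, `presentFrame`, `drawScene`, `post` and `updateDom` are hypothetical names introduced here, and `transferFromImageBitmap` is the method name used by `ImageBitmapRenderingContext` in the current HTML specification.

```javascript
// Worker side: render into the OffscreenCanvas, then hand the finished
// frame to the main thread as a transferable ImageBitmap.
// `drawScene` and `post` are hypothetical callbacks for this sketch;
// in a real worker `post` would wrap self.postMessage.
function produceFrame(offscreenCanvas, drawScene, post) {
  drawScene();
  const bitmap = offscreenCanvas.transferToImageBitmap();
  // Listing the bitmap as a transferable moves it instead of copying it.
  post({ type: 'frame', bitmap }, [bitmap]);
}

// Main-thread side: update the DOM and present the frame in the same task,
// so an HTML label and the 3D object underneath it change together.
// `bitmapContext` would come from displayCanvas.getContext('bitmaprenderer').
function presentFrame(bitmapContext, bitmap, updateDom) {
  updateDom();
  bitmapContext.transferFromImageBitmap(bitmap);
}
```

In a worker this might be wired up as `produceFrame(canvas, draw, (msg, transfer) => self.postMessage(msg, transfer))`, with the main thread calling `presentFrame` from its `message` handler.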

The other is listed as "Asynchronous display of frames produced by an OffscreenCanvas". In that case you just call `gl.commit` inside the worker and the canvas back on the page will be updated. This is arguably the more common use case. The majority of WebGL and three.js apps etc would use this method. The example on MDN shows sending a message to the worker each time you want it to render. Testing that on Chrome seems to work but it currently has a significant performance penalty.

Recently 2 more things were added. One is that requestAnimationFrame was added to workers. The other is that commit has been changed to be a synchronous function: the worker freezes until the frame has been displayed.

Hi Gregg,

requestAnimationFrame on workers was added as a result of feedback from the W3C TAG. It provides a way to animate and implicitly commit frames in the same way as with HTMLCanvasElement on the main thread. It replaces the use of setTimeout() on web workers for animating OffscreenCanvases, and provides a unified mechanism to allow VR headsets to animate at higher framerates than typical monitors.

It's these last 2 things I don't understand.

First: given that rAF is now available in workers I would think this is valid code

      // in worker
      function loop() {

          // get messages related to say camera position or
          // window size or mouse position etc to affect rendering

Unfortunately, when I test this in Chrome it doesn't work. The `onmessage` callback is never called regardless of how many messages are sent. I filed a bug and was told "WONTFIX: working as intended".

In the current semantics it's an error to call commit() from inside a requestAnimationFrame callback on a worker. The spec and implementations should be changed to throw an exception from commit() in this case.

I updated your samples in your Chromium bug report http://crbug.com/859275 to remove the call to commit() from within the rAF callback and they work very well. No flickering, and work exactly as you intended. Also replied to your same questions on https://github.com/w3ctag/design-reviews/issues/141 .
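A minimal sketch of the pattern described in this reply, with commit() removed from the rAF callback. It is not the code from the Chromium bug report: `createRenderLoop`, `scheduleFrame`, `render` and the `camera` message shape are hypothetical names chosen for illustration, and the sketch assumes that returning from the callback presents the frame implicitly, as stated earlier in the thread.

```javascript
// Builds a render loop for a worker. `scheduleFrame` stands in for the
// worker's requestAnimationFrame and `render` for the WebGL drawing code;
// both are injected so the sketch does not depend on worker globals.
function createRenderLoop(scheduleFrame, render) {
  const state = { camera: [0, 0, 0] };

  // Messages from the main thread (camera position in this sketch)
  // only update state; the next frame picks the new values up.
  function handleMessage(event) {
    if (event.data.type === 'camera') {
      state.camera = event.data.position;
    }
  }

  function loop(time) {
    render(state, time);
    // No commit() here: the callback simply returns, so the worker goes
    // back to its event loop and queued messages can be delivered.
    scheduleFrame(loop);
  }

  return { handleMessage, start: () => scheduleFrame(loop), state };
}

// Possible wiring inside a worker (assumed, not taken from the thread):
//   const ctl = createRenderLoop((cb) => self.requestAnimationFrame(cb), draw);
//   self.onmessage = ctl.handleMessage;
//   ctl.start();
```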

Really? Is that really the intent of the spec? Apple? Mozilla? Microsoft? Do you agree that the code above is not a supported use case and is working as intended?

Second: other events and callbacks don't work

      // in worker
      fetch('someimage.png', {mode:'cors'}).then(function(response) {
         return response.blob();
      }).then(function(blob) {
         return createImageBitmap(blob);
      }).then(function(bitmap) {
         gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, bitmap);
      });
      function loop() {

This also does not work. The fetch response never comes. My guess is that this is because in Chrome commit blocks and the rAF event gets pushed to the top of the event queue, so no other events ever get processed. The spec has nothing to say about this. Is this supposed to work? It seems like a valid use case. Note that switching the end of loop to


also does not work.

Is it correct that it should not work? I guess I don't really understand the point of having rAF in a worker if these use cases are not supposed to work. Are they? If they are not supposed to work, can someone please explain rAF's use case in a worker?

rAF in a worker replaces the use of commit(). Another alternative to animating in a worker would be setTimeout(), but now that rAF is present in workers, it's the best alternative.

Third, according to various comments around the specs one use case is a spin loop on gl.commit for webassembly ports. Effectively this is supposed to work

      while(true) {

But I don't understand how this is useful given that no events come in if you do that. You can't communicate with the worker. The worker can't load files or call fetch or get a websocket message or receive input passed in from the main thread or do anything except render.

Maybe people are thinking SharedArrayBuffers are a way to pass data into such a loop, but really? How would you pass in an image? As it is, you'd have to write your own decoder, since you can't get the raw data losslessly out of an image from any web API and you can't transfer images into the worker (it's not listening for messages). You'd then need to somehow parse the image yourself and copy it into a SharedArrayBuffer. That would be a very slow, jank-inducing process on the main thread, so now it seems like the spec is saying that to use a gl.commit spin loop you need 2 workers, one for rendering and one for loading images and other things, plus 1 or more SharedArrayBuffers, and you have to implement a bunch of synchronization stuff just so you can use WebGL in a worker using this pattern mentioned in the spec?
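To give a sense of the synchronization code involved, here is a rough sketch of passing even a small value (a camera position) into a loop that never returns to the event loop. It uses a single-writer sequence lock over a SharedArrayBuffer. The buffer layout and the names `createCameraChannel`, `writeCamera` and `readCamera` are assumptions made for this illustration; nothing here comes from the spec or from Chrome.

```javascript
// Layout (an assumption for this sketch): one Int32 sequence counter,
// padded to 8 bytes, followed by three Float64 camera coordinates.
const HEADER_BYTES = 8;

function createCameraChannel() {
  return new SharedArrayBuffer(HEADER_BYTES + 3 * Float64Array.BYTES_PER_ELEMENT);
}

function viewsOf(sab) {
  return {
    seq: new Int32Array(sab, 0, 1),
    pos: new Float64Array(sab, HEADER_BYTES, 3),
  };
}

// Writer (main thread or a helper worker). Only one writer is supported.
function writeCamera(sab, x, y, z) {
  const { seq, pos } = viewsOf(sab);
  Atomics.add(seq, 0, 1); // counter becomes odd: write in progress
  pos[0] = x;
  pos[1] = y;
  pos[2] = z;
  Atomics.add(seq, 0, 1); // counter becomes even: write complete
}

// Reader (the rendering worker). Retries until it sees a snapshot that
// was not modified while it was being copied.
function readCamera(sab) {
  const { seq, pos } = viewsOf(sab);
  for (;;) {
    const before = Atomics.load(seq, 0);
    if (before & 1) continue; // writer is mid-update, try again
    const snapshot = [pos[0], pos[1], pos[2]];
    if (Atomics.load(seq, 0) === before) return snapshot;
  }
}
```

A spin loop would call `readCamera(sab)` once per frame before drawing. Larger payloads such as decoded image data would need the same kind of protocol plus their own buffer management, which is the cost being questioned here.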

Is that really the intent? Is there something I'm missing? This seems like a platform breaking API. Use it and the entire rest of the platform becomes unusable without major amounts of code.

If I'm wrong I'm happy to be corrected.

commit() is mainly intended to support compiling multithreaded programs to WebAssembly. The C language's threading model is that threads start up from a start function and only return from it when the thread exits. We are trying to get real use cases working which transfer all data in to these rendering threads via the C heap from other threads. commit() and its blocking behavior are required in order to reach parity with how native platforms work in this scenario.

Supporting porting native C apps seems like a huge rabbit hole. Are there going to be sync image loading APIs next? Blocking `select` sockets? Reading the clipboard without an event? This is making commit a huge gate. The moment you use it you throw away the entire rest of the platform. Really? It's hard to believe that's being signed off on.

If commit's only use case is native apps, it seems like it should not ship and should stay behind a flag until these other issues are worked out. I pointed out several above, such as there being no way to use the browser's native image loading with commit, even through SharedArrayBuffers. It seems very premature to ship such an API (commit) without actually knowing how those issues will be resolved. Tests can be made behind a flag.

Are there tests and ports running now with this feature behind a flag that show all these issues can be solved in reasonable ways?

Fourth: non-front tabs. rAF is currently not delivered if the page is not the front tab, which is great, but rAF is an event, so even when rAF stops firing because the page is not on the front tab, other events still arrive (fetch, onmessage, XHR, websockets, etc.). This means that even though your page doesn't get a rAF callback it can still process incoming data (like your chat app's messages).

How is that supposed to work with `gl.commit` loops? It's not the front tab, so you want to block the commit so the worker doesn't spin and waste time. If the worker locks, then that seems to have implications for all other associated workers and the main thread. If you're using Atomics to sync things up, suddenly they'll fail indefinitely, further complicating all the code you have to write to use this feature.

I think that ideally commit() would block until the tab comes back to the foreground, to minimize CPU usage. However, if that turns out to be suboptimal for some use cases, we could consider throttling commit(), to essentially block for some time period and then return control to the worker.

Shouldn't this be figured out before shipping?


Chrome has already committed to shipping the API. The code has been committed, so if nothing changes it will ship automatically in a few weeks with all the issues mentioned above, not behind a flag but live. So it seems important to understand how to use this, whether all these issues were considered, and what their solutions are.