
Re: [Public WebGL] async shader compilation



Some user stories to help highlight different ways this issue affects users:

1. Simple WebGL app: "Load and Show". In this case all content is loaded and shaders are usually compiled in advance. In this scenario the developer is limited in shader complexity. Additionally, the developer currently cannot start compiling shaders while the browser is downloading assets.

2. Async WebGL app: "Load, show, continue loading, show more". In this case some content is loaded and some shaders are compiled up front, but more content is loaded as the user progresses through the application, and more shaders need to be compiled. Currently the developer is forced to worry about which shaders will be needed, and dynamic shader composition becomes very expensive. This forces a rethink of the async approach to web development, which is one of the strengths of the web but is restricted by shader compilation stalls.

3. Procedural app. Shadertoy is a good example. It is possible to create resources procedurally, and it is a very attractive idea: not much to download, just generate textures and geometry on the GPU, and you can have rich applications with loads of content. But this is not viable. It turns out that downloading assets is usually faster than generating them, and the generation itself is not the slow part: it is the compilation of complex shaders that leads to large stalls, making the procedural approach very limited as well.


We've encountered major issues with shader compilation at PlayCanvas, and it is a constant blocker for many things, which leads to a degraded user experience.
We believe it is currently the biggest blocker for WebGL when it comes to delivering rich content to web users in a good manner.


I can't express how important solving this is for the whole WebGL platform.

Kind Regards,
Max

On 21 February 2017 at 11:59, Florian Bösch <pyalot@gmail.com> wrote:
The current state of affairs with regard to shader compilation is unacceptable for the following reasons:
  1. Shader compilation can take considerable time
  2. Shader compilation freezes the tab/js thread
  3. Shader compilation freezes scene rendering
The reasons for why shader compilation is slow can be:
  1. The D3D compiler takes up a lot of CPU time
  2. The GL driver takes up a lot of CPU time
In the first case of the D3D compiler, neither the js thread nor the rendering would need to be paused, because it is purely CPU bound and independent of the GPU command queue. The worst shader compile times are usually observed on the ANGLE backend anyway, so addressing this satisfactorily would improve things a lot.

In the second case of the GL driver taking up a lot of CPU time, the situation is more complicated. It would not need to block the JS thread, but it would need to block rendering, as the GL driver's compile operation is inside the GL command queue.

There are therefore two interpretations of asynchronous compilation:
  1. Compilation is independent of the command queue and shaders may become available independently of command queue completion. There is currently no API that can satisfy this situation.
  2. Compilation is coupled to the command queue and shaders block rendering (but not the js thread). For this case there is currently an API available to deal with the situation, and that is wait/fence.
To cover all use cases, including those that can upgrade to better behavior than compilation coupled to the GL driver's command queue, it would therefore be preferable to come up with a completion primitive that can interact with WebGL independently of the command queue, but would not require managing micro-state completion (a callback for every shader), as shader compilation would still be in some sort of queue, just not one coupled to the command queue.
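
For reference, the command-queue-coupled case (interpretation 2) can be approximated today with the WebGL 2 sync objects. The following is only a rough sketch in TypeScript: fenceSync, clientWaitSync and deleteSync are existing WebGL 2 calls, but FenceContext, fenceAfter and pollFence are names made up for this illustration, and whether a signaled fence really means the compile has finished depends on the driver actually running the compile inside the command queue.

```typescript
// Minimal subset of WebGL2RenderingContext used by this sketch. Typing against
// a subset (a made-up interface) lets the helpers run against a fake context
// outside a browser; a real WebGL 2 context has the same members.
interface FenceContext<S> {
  readonly SYNC_GPU_COMMANDS_COMPLETE: number;
  readonly ALREADY_SIGNALED: number;
  readonly CONDITION_SATISFIED: number;
  readonly WAIT_FAILED: number;
  fenceSync(condition: number, flags: number): S | null;
  clientWaitSync(sync: S, flags: number, timeout: number): number;
  deleteSync(sync: S | null): void;
}

type FenceState = "pending" | "done" | "failed";

// Issue some GL commands (for example compileShader/linkProgram calls), then
// insert a fence behind them in the command queue.
function fenceAfter<S>(gl: FenceContext<S>, issueCommands: () => void): S | null {
  issueCommands();
  return gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
}

// Non-blocking check: a timeout of 0 asks for the fence status and returns
// immediately, so the wait itself never stalls the JS thread.
function pollFence<S>(gl: FenceContext<S>, sync: S): FenceState {
  const status = gl.clientWaitSync(sync, 0, 0);
  if (status === gl.WAIT_FAILED) return "failed";
  if (status === gl.ALREADY_SIGNALED || status === gl.CONDITION_SATISFIED) {
    gl.deleteSync(sync); // the fence is no longer needed once it has signaled
    return "done";
  }
  return "pending"; // TIMEOUT_EXPIRED: check again on a later frame
}
```

In a page one would call pollFence once per frame (from a requestAnimationFrame callback, say) and only start using the program once it reports "done". This still cannot express the first interpretation, where compilation completes independently of the command queue.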

I'd propose we introduce a WebGL-specific wait/fence that follows the semantics of the traditional wait/fence, but only applies to *async entry points that are not scheduled into the usual command queue. In effect it would be a wait/fence for a second, independent command queue of asynchronous commands that the UA can complete (possibly independently of the GL command queue). In the case that it can't, its behavior would be identical to the GL-specific wait/fence; in the case that it can, it wouldn't be identical and could complete before or after the GL-specific wait/fence completes.
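
To make the intended semantics a bit more concrete, here is a toy model in TypeScript of such a second queue and its fence. It is purely hypothetical: AsyncCommandQueue, AsyncFence and all of their methods are invented for this illustration and exist in no WebGL specification or browser. The only point it shows is that a fence covers everything submitted to the async queue before it, with no per-shader callbacks.

```typescript
// Hypothetical model only: none of these names exist in WebGL.
class AsyncCommandQueue {
  private submitted = 0; // number of async commands submitted so far
  private completed = 0; // number of async commands finished so far, in order

  // Submit an async command (e.g. a shader compile); returns its sequence number.
  submit(): number {
    this.submitted += 1;
    return this.submitted;
  }

  // Stand-in for the UA finishing the oldest outstanding async command.
  completeNext(): void {
    if (this.completed < this.submitted) this.completed += 1;
  }

  // A fence captures "everything submitted to this queue so far".
  fence(): AsyncFence {
    return new AsyncFence(this, this.submitted);
  }

  completedCount(): number {
    return this.completed;
  }
}

class AsyncFence {
  private queue: AsyncCommandQueue;
  private target: number;

  constructor(queue: AsyncCommandQueue, target: number) {
    this.queue = queue;
    this.target = target;
  }

  // Signaled once every command submitted before the fence has completed.
  // Commands submitted after the fence do not affect it.
  isSignaled(): boolean {
    return this.queue.completedCount() >= this.target;
  }
}
```

An application would submit a batch of compiles, take one fence, and poll that single fence, regardless of whether the UA completes the batch on a worker thread or falls back to the GL command queue.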