
Re: [Public WebGL] async shader compilation

Thanks for the link, Jukka.

On a cold start, on Ubuntu in Chrome 58, it took ~15 seconds to compile the shaders.
After that the shaders were cached, presumably by the system, since later runs took only ~2 seconds even after clearing the cache and using an incognito window.

Nice demo! But these loading times are just not acceptable for the web. WebAssembly compilation still stalls the browser for about 15 seconds.
And everything in the demo seems sped up :)

On warm starts it takes 30+ seconds to start. It seems the data.gz file (86Mb) is not getting cached, and neither is wasm.gz (13.6Mb).

It is good to see this becoming more and more real.
Regarding the shaders in this particular demo, is there anything complicated in them? I ask because the graphics look very mobile-like, with very little dynamic content (any PBR?).


On 7 March 2017 at 18:07, Jukka Jylänki <jujjyl@gmail.com> wrote:
Firefox 52 today releases WebAssembly support, and if you are on a browser that supports it, you can visit Epic's Unreal Engine 4 Zen Garden demo ported to WebAssembly and WebGL 2 at


That site precompiles all built shaders before launching and shows a visual progress meter while doing it. If you want to capture timing data from the run in a deterministically reproducible manner, visit


which runs a timedemo style prerecorded rendering and shows result numbers at the end.
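
The precompile-with-progress approach described above can be sketched roughly as follows. This is a minimal illustration, not the demo's actual code: createPrecompiler, CompileFn and Progress are hypothetical names, and the compile callback stands in for whatever compiles one shader and reports success (in a browser, presumably gl.compileShader followed by a gl.COMPILE_STATUS query).

```typescript
// Hypothetical names throughout; none of this is taken from the demo itself.
type CompileFn = (source: string) => boolean;

interface Progress {
  done: number;     // shaders processed so far
  total: number;    // shaders to process overall
  failed: number[]; // indices of sources whose compile reported failure
}

// Returns an object whose step() compiles at most one shader per call,
// so the caller can redraw a progress meter between steps.
function createPrecompiler(sources: string[], compile: CompileFn) {
  let next = 0;
  const failed: number[] = [];
  return {
    step(): Progress {
      if (next < sources.length) {
        if (!compile(sources[next])) {
          failed.push(next);
        }
        next++;
      }
      // Copy the failure list so callers cannot mutate internal state.
      return { done: next, total: sources.length, failed: failed.slice() };
    },
  };
}
```

In a page, step() could be called once per requestAnimationFrame callback until done equals total. Each individual compile still blocks the main thread; the sketch only spreads the work out so that progress can be shown.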

2017-03-02 19:46 GMT+02:00 Zhenyao Mo <zmo@chromium.org>:
On Thu, Mar 2, 2017 at 2:13 AM, Florian Bösch <pyalot@gmail.com> wrote:
> On Wed, Mar 1, 2017 at 11:04 AM, Mark Callow <khronos@callow.im> wrote:
>> Throughout the time I’ve been working with OpenGL ES 2+, ubershaders have
>> generally been regarded as something to avoid.
> The problem is in no way specific to ubershaders.
> On Wed, Mar 1, 2017 at 11:58 AM, Maksims Mihejevs <max@playcanvas.com>
> wrote:
>> We do need async shader compilation, but what we need more is faster shader
>> compilation in the first place.
> Faster would be nice, but for technical reasons cannot be achieved quickly
> or comprehensively. Async can be achieved to some degree and work
> comprehensibly within expectations, without too much difficulty. That's why
> we need Async more, because it's a solvable problem.
> On Thu, Mar 2, 2017 at 5:04 AM, Zhenyao Mo <zmo@chromium.org> wrote:
>> Thinking more on the implementation side, currently Chrome uses
>> virtual contexts on many GPUs in Android and also on Linux NVidia due
>> to driver bugs (for example, flush order isn't guaranteed) and
>> performance (MakeCurrent is very slow). This kills the possibility of
>> implementing an efficient async shader compile.
>> Of course if we can justify the need, then we can push driver vendors
>> to fix the issues, but that's proven to be an uneasy task.
>> I am not saying I don't support async shader compile, just want to
>> point out some unpleasant reality.
> The efficiency of shader compilation, and the asynchronicity of it are two
> separate, mostly unrelated things. One is solvable to some degree
> (asynchronicity) comprehensibly, and the other is difficult/impossible to
> solve (speed). Let's focus on the one we can solve, not on objections or
> "realities" to the problem we can't, as a justification to not solve the
> problem we can solve.

You miss the point.  I am not talking about efficiency of shader
compilation, but efficiency of context switching.

But Ken's right, there is a way to make it work on bad drivers if
resource sharing is still working on them.

> On Thu, Mar 2, 2017 at 8:30 AM, Kenneth Russell <kbr@google.com> wrote:
>> Good points Mo. I do wonder though whether even on these platforms we
>> could still spawn a background thread in Chrome's GPU process dedicated to
>> shader compilation and program linking. Resource sharing still works even on
>> these badly behaved platforms, and the background thread would have its own
>> dedicated OpenGL context, sharing the compile and link results with the main
>> thread.
> Many platforms only do any actual work at the linking stage. Separate
> contexts aren't going to get you out of the clinch to implement fence/wait
> on a CPU bound process that should not block the tab compositor and
> JS-thread.
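
The fence/wait idea in the last paragraph can be sketched as a polling loop: start the link, then check for completion once per frame rather than querying the link status immediately, which would block. This is only an illustration under stated assumptions. waitForLink, isComplete and schedule are hypothetical names, and the sketch assumes some non-blocking completion query exists, which WebGL did not offer at the time of this thread. The later KHR_parallel_shader_compile extension added such a query (COMPLETION_STATUS_KHR).

```typescript
// Hypothetical helper; the callbacks are injected so the sketch does not
// depend on a real WebGL context.
function waitForLink(
  isComplete: () => boolean,         // assumed non-blocking completion query
  onDone: (polls: number) => void,   // called once with the number of polls made
  schedule: (cb: () => void) => void // defers cb, e.g. to the next frame
): void {
  let polls = 0;
  const poll = (): void => {
    polls++;
    if (isComplete()) {
      onDone(polls);
    } else {
      // Not finished: yield to the compositor and JS thread, then retry.
      schedule(poll);
    }
  };
  poll();
}
```

In a browser, schedule would presumably wrap requestAnimationFrame and isComplete would wrap the completion query for the program being linked. Only after onDone fires would the caller read the link status and info log.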