Shader compilation performance is something that comes up often when working with game developers who use Emscripten and asm.js to port their engines. Firefox has an about:config flag to enable or disable the use of ANGLE at runtime on Windows. Disabling ANGLE speeds up shader compilation by a factor of 3x-10x.
Doing async/background thread compilation of shaders would be cool, but it's not an ultimate solution: it is only a latency-hiding technique and won't reduce power consumption on mobile and low-end desktop. Here are some other suggestions (not mutually exclusive):
1. Microsoft has recently open sourced their HLSL shader compiler codebase, which is available at https://github.com/Microsoft/DirectXShaderCompiler. One effect of it getting open sourced is that a few months back I noticed I started getting .pdb files of the shader compiler served when doing geckoprofiler/CodeXL/VTune profiles of WebAssembly applications, which gives visibility into the hotspots in d3dcompiler_xx.dll. This compiler was not originally intended to be an online compiler, so it's uncertain how much of it has been optimized for speed. Perhaps the problem could be tackled directly at the source and D3Dcompiler optimized to improve compilation times? I don't think this has ever been looked at from an online compiler perspective, so perhaps there is some low-hanging fruit. Any % of benefit here would be a direct win on top of whatever other techniques are used.
2. Let's enable binary compiled shaders on the web by leveraging https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_get_program_binary.txt, which is in core OpenGL 4.1 and OpenGL ES 3.0. Allow one to pull out an opaque blob (nontransferable between PCs, invalidatable by the browser between page visits) of compiled shader programs and stick those into IndexedDB. This way at least warm page visits will be fast. On cold page visits one might be able to rearchitect shaders to be compiled in parallel with downloading other page assets (textures, geometry, WebAssembly code) to hide most of the impact.
3. Let's do binary SPIR-V shaders on the web. Can SPIR-V binary shaders be standardized for WebGL? There exists an extension to consume SPIR-V in desktop OpenGL at https://www.khronos.org/registry/OpenGL/extensions/ARB/ARB_gl_spirv.txt. Can Khronos work to standardize this on OpenGL ES as well so that it could then be translated over to WebGL? This would allow compiling all needed shaders fully offline.
4. Cache compiled shaders in browsers internally for warm page loads to be fast. Not as ideal as 2. or 3. but could work.
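The caching idea behind suggestions 2 and 4 above can be sketched roughly as follows. This is only an illustration under stated assumptions: WebGL exposes no program-binary API today, so `CompileFn`, `ProgramBinaryCache`, and `platformTag` are hypothetical names, and a plain `Map` stands in for an IndexedDB object store. The platform tag is part of the key because the blob is not transferable between machines or driver versions.

```typescript
// Sketch only: WebGL has no program-binary API today, so `CompileFn` and the
// blob it returns are hypothetical stand-ins for a getProgramBinary-style call.
type CompileFn = (vertexSrc: string, fragmentSrc: string) => Uint8Array;

class ProgramBinaryCache {
  // Assumed string identifying GPU, driver and browser build; a change in any
  // of them must invalidate previously stored blobs.
  private platformTag: string;
  // In-memory stand-in for a persistent store such as IndexedDB.
  private store: Map<string, Uint8Array>;

  constructor(
    platformTag: string,
    store: Map<string, Uint8Array> = new Map<string, Uint8Array>()
  ) {
    this.platformTag = platformTag;
    this.store = store;
  }

  // The key covers the platform tag and both shader sources, so a change to
  // any of the three results in a cache miss.
  private keyFor(vertexSrc: string, fragmentSrc: string): string {
    return JSON.stringify([this.platformTag, vertexSrc, fragmentSrc]);
  }

  // Returns the stored binary on a warm visit; otherwise compiles once and
  // stores the result for the next visit.
  getOrCompile(
    vertexSrc: string,
    fragmentSrc: string,
    compile: CompileFn
  ): Uint8Array {
    const key = this.keyFor(vertexSrc, fragmentSrc);
    const cached = this.store.get(key);
    if (cached !== undefined) {
      return cached;
    }
    const binary = compile(vertexSrc, fragmentSrc);
    this.store.set(key, binary);
    return binary;
  }
}
```

A real implementation would need an asynchronous store and a fallback to source compilation whenever the driver rejects a stored blob.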
To me it feels that #1 and #2 could be the best near term prospects for WebGL.

2017-02-21 17:59 GMT+02:00 Maksims Mihejevs <email@example.com>:

Here is one of the real-world examples that we've worked on in collaboration with Mozilla and their recent WebGL 2.0 launch.

After the Flood

We've been making WebGL content and high-end demos for many years now, and are aware of many tricks and issues that we have to deal with when creating such content. That experience is not available to most current WebGL developers, which leaves them struggling the way we did, finding out the caveats along the way. We happily share our experience all the time and implement best practices in our engine.

Initially we had one large stall for shader compilation, and were forced to think smarter there: to at least compile only a number of pre-cached shader programs within a time budget, then skip to the next animation frame and continue compilation. Firefox shows an "unresponsive" warning after the tab freezes while compiling shaders synchronously within a single animation frame. And that is on GL platforms.

On ANGLE this is of course way worse, not even mentioning mobile.

So on a GTX 1080, Windows, Chrome/Firefox, ANGLE, with fiber optic internet (very high quality) and servers within 10ms latency, it downloads 19.1Mb of assets (very quickly), and compilation takes even longer than downloading the 19.1Mb under nearly perfect conditions.

At least async compilation in this case would allow us to initiate shader compilation right before loading most of the assets, parallelizing asset loading with shader compilation. It could potentially halve the loading times in such a case.

But 19.1Mb is actually a lot for the initial download of a WebGL app, so in more common cases shader compilation will take 50-95% of loading time.

And we are talking not milliseconds here, but actual seconds.

We have profiled the complexity of our shaders and their variations very carefully. There are only a few complex shader cases that are widely reused, but generally all shaders, even the simplest, contribute a lot to compilation times.

What is funny is that simply inlining and minimizing string size by rewriting a shader by hand while preserving the same logic (so the compilation result would be the same) did lead to some performance improvements: in some tests compilation was up to 50% faster than for the same shader not inlined and not minified.

Kind Regards,
Max

On 21 February 2017 at 14:29, Florian Bösch <firstname.lastname@example.org> wrote:

P.S. @vendors, please solve the problems we have right now (making WebGL usable without reservations for all usecases, including low latency ones and complex ones), not the problem we wish we had, but haven't gotten to yet (WebGPU, WebNXT, WebGL 2.1, etc.)

On Tue, Feb 21, 2017 at 3:23 PM, Florian Bösch <email@example.com> wrote:

On Tue, Feb 21, 2017 at 1:50 PM, Maksims Mihejevs <firstname.lastname@example.org> wrote:

Can't express how important solving this is for the whole WebGL platform.

I'm currently engaged with an architectural visualization startup and the rendering pipeline is of considerable complexity (though it's all up-front loaded). It generally works fine on GL backends (it might pause for maybe a few hundred milliseconds). But on the ANGLE backend, it completely freezes the tab for 15 seconds on boot. This is unacceptable. </tales from the real world>
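
The workaround Maksims describes in the thread above (compile a few programs within a time budget, then yield to the next animation frame) might look roughly like the sketch below. It is not taken from any engine: `BudgetedCompileQueue`, `CompileTask`, and the injectable clock are hypothetical names, and each task is assumed to wrap one synchronous compile-and-link of a shader program.

```typescript
// Each task is assumed to compile and link exactly one shader program.
type CompileTask = () => void;

class BudgetedCompileQueue {
  private pending: CompileTask[] = [];
  private budgetMs: number;
  private now: () => number;

  // `now` is injectable so the queue can be driven by a fake clock;
  // in a browser one could pass () => performance.now() instead.
  constructor(budgetMs: number, now: () => number = () => Date.now()) {
    this.budgetMs = budgetMs;
    this.now = now;
  }

  add(task: CompileTask): void {
    this.pending.push(task);
  }

  // Runs queued tasks until the per-frame budget is spent and returns how
  // many tasks are still waiting. At least one task runs per call, so the
  // queue always makes progress even if a single compile exceeds the budget.
  runFrame(): number {
    const start = this.now();
    while (this.pending.length > 0) {
      const task = this.pending.shift();
      if (task) {
        task();
      }
      if (this.now() - start >= this.budgetMs) {
        break;
      }
    }
    return this.pending.length;
  }
}
```

In a page, `runFrame()` would be called once per `requestAnimationFrame` callback until it returns 0. This only spreads the stall across frames. The total compile time is unchanged, which is why the thread asks for faster or asynchronous compilation.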