On Thu, Sep 27, 2012 at 7:05 AM, Vladimir Vukicevic <firstname.lastname@example.org> wrote:
Benoit and I had some conversations about this yesterday; doing perf tests here is tricky.
> 2) It's not clear that setTimeout(0) has any use. If one wants to
> measure stuff unrelated to compositing, postMessage should be at least
> as good as setTimeout(0) and often better in current browsers (not
> throttled). So we should be able to do with only RAF and postMessage.
> 3) When using postMessage, one should measure only the time taken by the
> payload, instead of measuring the lapse of time between two
> consecutive callbacks.
It depends on what you're timing. If you want a benchmark that just times how long WebGL calls take, then something like:
var t0 = window.performance.now();
// ... issue the WebGL calls being measured ...
var elapsed = window.performance.now() - t0;
should be enough. Do enough of those, triggered via setTimeout(0) to avoid blocking the browser, and you should have a good time recording. But that will only measure the actual WebGL calls + GPU execution time; if you want to measure the entire rendering pipeline, that'll be trickier.
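That sampling loop might look something like the sketch below. The names (`makeSampler`, `now`, `schedule`, `workload`) are illustrative, and the timing/scheduling functions are injected so the same logic runs in a browser by passing `window.performance.now` and a `setTimeout(f, 0)` wrapper:

```javascript
// Hypothetical sampling harness for the approach above. In a browser:
//   makeSampler(function () { return window.performance.now(); },
//               function (f) { setTimeout(f, 0); },
//               drawCalls)
function makeSampler(now, schedule, workload) {
  return function run(samplesLeft, results, done) {
    if (samplesLeft === 0) {
      done(results);
      return;
    }
    var t0 = now();
    workload();                // the WebGL calls being measured
    results.push(now() - t0);  // elapsed time for this sample
    // Yield between samples so the browser is not blocked.
    schedule(function () { run(samplesLeft - 1, results, done); });
  };
}
```

Collecting many samples this way gives a distribution rather than a single number, which helps smooth over scheduler noise.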
I originally suggested using postMessage instead of rAF because rAF will aim to give you 60fps (or whatever) and won't give you anything faster, whereas postMessage is unthrottled [for now]... but there's no guarantee that a full composite run will happen in between postMessage message events, so it doesn't actually help. Ideally you'd have:
|--- Frame 1 --------| |--- Frame 2 --------| ...
[callback] [composite] [callback] [composite]
and you could measure the start of frame 2 minus the start of frame 1 as the time, but if you're using postMessage, you could easily get:
[callback] [callback] [composite] [callback] [composite] [callback] [callback] [callback] [composite] ... etc.
since there's no guarantee that you'll get a full frame per callback.
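For reference, the unthrottled-postMessage scheduling mentioned above is usually done with the classic "zero timeout" pattern, sketched below. `makeZeroTimeout` and the generic `target` parameter are illustrative (in a browser you would pass `window`); unlike setTimeout(0), postMessage delivery is not clamped to the browser's minimum timer interval:

```javascript
// Hypothetical "zero timeout" built on postMessage. `target` is anything with
// postMessage/addEventListener, e.g. `window` in a browser.
function makeZeroTimeout(target) {
  var queue = [];
  target.addEventListener("message", function (e) {
    if (e.data === "zero-timeout") {
      var fn = queue.shift();
      if (fn) fn();
    }
  });
  return function setZeroTimeout(fn) {
    queue.push(fn);
    target.postMessage("zero-timeout");
  };
}
```

But as the diagram above shows, none of this guarantees a composite between callbacks, so it measures callback throughput, not frames.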
Benoit came up with an interesting approach to actually measure this, though. We should use requestAnimationFrame, but then adjust the payload until we -just- hit 60fps (or whatever the target cap is). Pick some GPU workload that you're measuring (perhaps a really simple one if you want to measure compositing overhead) and run it in a loop in the frame callback; keep increasing the number of loop iterations until you start dropping under 60fps. The result of the test is the number of iterations of the workload you can run during a frame and still maintain 60fps. If the browser's compositing overhead increases, then the number of iterations you can do decreases; similarly, if the time it takes to execute an iteration goes up, the final score decreases as well.
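A simplified, offline sketch of that search is below. `frameTime(n)` is a hypothetical probe returning the measured frame time in milliseconds for n iterations of the workload, and `targetMs` would be ~16.7 for a 60fps cap; the real harness has to adjust frame by frame inside requestAnimationFrame rather than probing at will:

```javascript
// Find the largest iteration count whose frame time still fits in the budget.
function calibrate(frameTime, targetMs) {
  if (frameTime(1) > targetMs) return 0; // even one iteration misses the budget
  var lo = 1, hi = 1;
  // Grow geometrically until the budget is exceeded.
  while (frameTime(hi) <= targetMs) {
    lo = hi;
    hi *= 2;
  }
  // Binary search for the largest count that still fits.
  while (lo + 1 < hi) {
    var mid = Math.floor((lo + hi) / 2);
    if (frameTime(mid) <= targetMs) lo = mid; else hi = mid;
  }
  return lo; // the score: workload iterations per frame at the target rate
}
```

The score is deliberately a count, not a time, which matches the point below about not trying to read GPU time directly.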
So just FYI, I have a harness that does this, and a test that uses that harness. Unfortunately I've had lots of problems using it because of timing variations depending on the platform.

Issue #1: Frame averaging

My first attempt used an average of the frame rates across N frames. Say N is 16 frames. If the average was below the target (60fps) then I'd double the number of things to draw. If it was above the target I'd halve the number of things to draw. Basically I'd get an average frame rate that is high, and the harness would keep adding more things to draw even though the instantaneous frame rate was too slow. So I'd end up increasing the number of things to draw over several frames; finally the average would be low enough to start decreasing the number, but now I'd have the problem in the opposite direction: it would keep decreasing the number for several frames because the average was too low, even though the instantaneous rate was high.

Issue #2: Inconsistent frame rates

My second attempt (the current perf harness) uses instantaneous frame rates. It uses a "velocity" for how much more or less to draw each frame. It doubles the velocity if the fps is below the target. The moment it goes above the target it reverses the velocity and divides it by 4. That seemed like it would quickly ping-pong to the max draw count. It does on some platforms, but on others (for example on my Linux box) the frame rate I get ping-pongs between 30 and 60 fps (at least according to the timers), so that even when drawing only 10-50 things it's never increasing the count. You can manually lower the target frame rate to, say, 25fps and the count will then jump to 2k-5k; then move the target back up to 60fps and it will stay there.

Maybe you guys have some ideas on how to fix it.
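For concreteness, one possible reading of that velocity controller, as a standalone sketch (the names and exact doubling/reversal rules are illustrative, not the harness's actual code): while the measured fps stays on the same side of the target, the step keeps doubling; when it crosses the target, the step reverses direction and shrinks by 4x, so the draw count should ping-pong toward the largest value that still holds the target rate.

```javascript
function makeVelocityController(targetFps) {
  var velocity = 1; // positive velocity raises the draw count
  return function adjust(count, measuredFps) {
    var wantMore = measuredFps >= targetFps; // hitting target: room to draw more
    var goingUp = velocity > 0;
    if (wantMore === goingUp) {
      velocity *= 2;            // still on the same side: accelerate
    } else {
      velocity = -velocity / 4; // crossed the target: reverse and damp
    }
    return Math.max(1, Math.round(count + velocity));
  };
}
```

Note that a controller like this cannot fix the 30/60 fps ping-ponging described above: if vsync quantizes the measured rate so it flips sides of the target every frame regardless of draw count, the step size collapses and the count never grows.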
Benoit suggested that we try to express the result of that as a time, but I don't think we should; I'm not sure time has any specific meaning there (because unless you call glFinish you don't really know who's going to be paying the full GPU cost); better to keep the two separate.
So, that said, I think there are really only two types of tests that we need:
1 - tests that measure WebGL call speed; these can be run from setTimeout(0) and will just measure raw elapsed time for a set of calls.
2 - tests that measure full compositing performance; these should run from requestAnimationFrame, and should use the approach bjacob came up with above.
This -should- give us pretty solid perf coverage; it should be very useful as a performance regression test for WebGL implementations. In theory, all tests could be run as #2, though that will only really give us 16ms precision for time, and for some things (texture conversion speed on upload, for example) smaller time differences matter.