
Re: [Public WebGL] Typed Array setter for partial arrays (and typed array performance)

A few different people have mailed me off-list about the little benchmark, so I thought I'd share some notes here:
As a commenter on the linked blog post pointed out, V8 only optimizes typed array access inside a function if that function is only ever called with typed arrays and never with JS arrays. This matters for benchmarks, since most reuse the same code across their tests (as the microbenchmark I linked did).
Here's a test showing this behavior:
On Chrome 12, keeping the two code paths separate takes typed arrays from appearing 2x slower than native arrays to 2x faster.
Hopefully this gets fixed in the future, but for now this is important to keep in mind when profiling.
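
A minimal sketch of how a benchmark might keep the two kinds of array apart, assuming the effect comes from sharing one function between typed and JS arrays. The names fillTyped, fillPlain and time are made up for illustration; the bodies are deliberately duplicated so that each function only ever sees one kind of array. Timings will vary by engine and version, so none are shown.

```javascript
// Only ever called with typed arrays.
function fillTyped(arr) {
  for (var i = 0; i < arr.length; i++) {
    arr[i] = i & 0xff;
  }
  return arr;
}

// Same body, but only ever called with plain JS arrays.
function fillPlain(arr) {
  for (var i = 0; i < arr.length; i++) {
    arr[i] = i & 0xff;
  }
  return arr;
}

// Hypothetical timing helper: runs fn(arr) repeatedly and returns elapsed ms.
function time(fn, arr, iterations) {
  var start = Date.now();
  for (var n = 0; n < iterations; n++) {
    fn(arr);
  }
  return Date.now() - start;
}

var SIZE = 100000;
var typed = new Uint8Array(SIZE);
var plain = new Array(SIZE);

var typedMs = time(fillTyped, typed, 50);
var plainMs = time(fillPlain, plain, 50);

console.log('Uint8Array: ' + typedMs + 'ms, JS array: ' + plainMs + 'ms');
```

Passing both arrays through a single shared fill function instead is the pattern that, per the comment above, hides the typed array speedup.
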
An updated version of the microbenchmark I linked earlier:
Now, Uint8Array is significantly faster than js native arrays (and interestingly, faster than CanvasPixelArray).

Also, these seem to crash certain versions of Chrome on Linux and OSX. That's... scary.

Ben Vanik

On Thu, Apr 21, 2011 at 11:57 AM, Ben Vanik <ben.vanik@gmail.com> wrote:
I agree this method would be really helpful. If the offset/count values were in bytes, it would allow for some awesome uses of typed arrays.

The primary use case that comes to mind is packed data transfer. I've been experimenting with loading geometry/textures/etc from data blobs, and being able to subset the data efficiently (without using views, as I want an actual copy for modification) would make things much nicer. The other side of this is using it to construct data blobs - saving off content from client->server that contains large typed array chunks would greatly benefit from the speed boost.
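
To illustrate the view-versus-copy distinction here, a small sketch using only the existing typed array constructors. The blob contents and offsets are invented for the example. Note that the offsets passed to the Uint8Array constructor are in bytes.

```javascript
// A stand-in for a loaded data blob: 16 bytes holding 0..15.
var blob = new ArrayBuffer(16);
var bytes = new Uint8Array(blob);
for (var i = 0; i < bytes.length; i++) {
  bytes[i] = i;
}

// A view over 4 bytes starting at byte offset 4. It shares memory with blob.
var view = new Uint8Array(blob, 4, 4);

// Constructing a typed array from another typed array allocates a new
// buffer and copies the elements, so this is an independent copy.
var copy = new Uint8Array(view);

copy[0] = 255;   // modifies only the copy
view[1] = 200;   // writes through to the blob

console.log(bytes[4], bytes[5], copy[0], copy[1]);
```

This gives a real copy, but at the cost of creating an intermediate view object first, which is the overhead a partial set would avoid.
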

And as a nodejs user, I would find this tremendously useful when working with protocol buffers and other binary message formats where perf really matters, or when doing large file system manipulations.

As for the microbenchmarks, I've noticed the same thing. I just whipped this one up yesterday for testing out some common patterns I need for image processing:
The time for creating and initializing a JS array is two orders of magnitude longer than for typed arrays, but all other operations are 2x+ faster on JS arrays than on typed arrays.
I threw in CanvasPixelArray just to see if there were any special optimizations there - it's pretty much the same as Uint8Array (which makes sense).
Here's a great blog post on perf comparison that just came out: http://blog.n01se.net/?p=248

Ben Vanik

2011/4/21 Mark Callow <callow_mark@hicorp.co.jp>


Because I keep needing to do it, I have become irritated by the lack of a function in the typed array specification to copy part of the contents of a JS source array, something like:

     void set(type[] array, optional unsigned long offset, unsigned long srcOffset, unsigned long count)

The obvious answer is a wrapper, but I suspected that a loop of, e.g., i16array[i] = src[i] in JS would be slower than something internal to the typed array implementation. I wrote a short test for this. The result on FF4 is as I expected:

Int16Array.set(2000 byte array) x 100 times took 1ms (400000000 bytes/second).

Int16Array[j] = data[j] 2000 x 100 times took 4ms (100000000 bytes/second).

Int16Array.set(200000 byte array) x 1 times took 1ms (400000000 bytes/second).

Int16Array[j] = data[j] 200000 x 1 times took 4ms (100000000 bytes/second).

To ensure the result is not influenced by smart optimizers, the test is repeated with a longer array and a single iteration. The number of bytes copied is the same in each case.

The "wrapper" runs at one quarter the speed of a native implementation so I think the above described set function is a badly needed addition to typed arrays.
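
For reference, a sketch of what such a wrapper might look like. The name setPartial is hypothetical, and the argument order simply follows the signature proposed above; this is not part of any typed array specification.

```javascript
// Hypothetical wrapper: copies `count` elements from src, starting at
// srcOffset, into dest, starting at offset. src may be a JS array.
function setPartial(dest, src, offset, srcOffset, count) {
  if (srcOffset + count > src.length || offset + count > dest.length) {
    throw new RangeError('copy range out of bounds');
  }
  // The element-by-element loop that a native implementation would replace.
  for (var i = 0; i < count; i++) {
    dest[offset + i] = src[srcOffset + i];
  }
}

var data = [10, 20, 30, 40, 50];
var i16array = new Int16Array(4);
setPartial(i16array, data, 1, 2, 2);
// i16array now holds [0, 30, 40, 0]
```
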

If the source is a typed array, one can always create another view or subarray so it is not so important to add a new function for this case, though there is the issue of the garbage that must then be collected.
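
A short sketch of the typed array case, using the existing subarray and set methods. The array contents are invented for the example.

```javascript
var src = new Int16Array([1, 2, 3, 4, 5, 6]);
var dest = new Int16Array(6);

// subarray(begin, end) creates a new view on the same buffer. No element
// data is copied, but the view object itself is garbage once discarded.
var part = src.subarray(2, 5);   // views elements 3, 4, 5

// set(array, offset) copies the viewed elements into dest at index 1.
dest.set(part, 1);
// dest now holds [0, 3, 4, 5, 0, 0]

// The view aliases src, while dest holds its own copy.
part[0] = 99;
console.log(src[2], dest[1]);
```
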

When I ran the same test on Chromium 12.0.717.0 (79525) I got a very surprising result:

Int16Array.set(2000 byte array) x 100 times took 49ms (8163265.306122449 bytes/second).

Int16Array[j] = data[j] 2000 x 100 times took 6ms (66666666.666666664 bytes/second).

Int16Array.set(200000 byte array) x 1 times took 38ms (10526315.789473685 bytes/second).

Int16Array[j] = data[j] 200000 x 1 times took 24ms (16666666.666666666 bytes/second).

It is surprising for 3 reasons:
  • the overall poor performance
  • the fact that the Int16Array.set takes longer than a JS loop setting individual elements
  • the fact that a single loop of 200,000 setting individual elements took 4 times as long as a double loop of 100 x 2000.
The test is attached.
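
A rough sketch of this kind of measurement, not the attached test itself. The function names are made up, and the lengths are interpreted as element counts, which is what the reported byte rates imply (2000 x 100 Int16 elements = 400000 bytes). Timings will differ by browser and machine.

```javascript
function copyWithSet(dest, src) {
  dest.set(src);
}

function copyWithLoop(dest, src) {
  for (var j = 0; j < src.length; j++) {
    dest[j] = src[j];
  }
}

// Bytes per second from a byte count and an elapsed time in milliseconds.
function throughput(bytes, ms) {
  return bytes * 1000 / ms;
}

// Copies a JS array of `length` elements into an Int16Array `iterations`
// times, logs the elapsed time, and returns the destination array.
function measure(label, copyFn, length, iterations) {
  var src = new Array(length);
  for (var i = 0; i < length; i++) {
    src[i] = i % 1000;
  }
  var dest = new Int16Array(length);

  var start = Date.now();
  for (var n = 0; n < iterations; n++) {
    copyFn(dest, src);
  }
  var ms = Date.now() - start;

  var bytes = length * Int16Array.BYTES_PER_ELEMENT * iterations;
  // Date.now() has 1ms resolution, so skip the rate when nothing was measured.
  var rate = ms > 0 ? ' (' + throughput(bytes, ms) + ' bytes/second)' : '';
  console.log(label + ' ' + length + ' x ' + iterations + ' times took ' + ms + 'ms' + rate);
  return dest;
}

var setShort = measure('Int16Array.set', copyWithSet, 2000, 100);
var loopShort = measure('Int16Array[j] = data[j]', copyWithLoop, 2000, 100);
var setLong = measure('Int16Array.set', copyWithSet, 200000, 1);
var loopLong = measure('Int16Array[j] = data[j]', copyWithLoop, 200000, 1);
```
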