Re: [Public WebGL] A different approach to interleaved typed arrays
- To: Chris Marrin <firstname.lastname@example.org>
- Subject: Re: [Public WebGL] A different approach to interleaved typed arrays
- From: Kenneth Russell <email@example.com>
- Date: Fri, 4 Jun 2010 12:12:50 -0700
- Cc: public webgl <firstname.lastname@example.org>
- In-reply-to: <2BC44ECB-EE51-40D6-A631-3BD51E711E46@apple.com>
- References: <2BC44ECB-EE51-40D6-A631-3BD51E711E46@apple.com>
- Sender: email@example.com
On Fri, Jun 4, 2010 at 10:12 AM, Chris Marrin <firstname.lastname@example.org> wrote:
> I'd like to propose changes to the current Typed Array proposal which would solve the problem of exposing endianness, but would still be efficient:
> We've discussed in the past the idea of making the views non-overlapping. Mapping a range of an ArrayBuffer to one view would make it inaccessible to any other view. Furthermore, when mapping a range of an ArrayBuffer, that range is cleared, preventing the author from, for instance:
> 1) Mapping a range to bytes
> 2) Filling that range with data, which happens to be floating point values in little endian order
> 3) Removing that mapping
> 4) Mapping that same range to floats to access the previously loaded data
> This would all be well and good, and the implementation isn't really that complex. It would simply require remembering which views are mapped to which buffer and doing a validation whenever a new mapping is made.
> But it has a problem because it would not be practical to do interleaved mapping. You'd essentially need a new view for each component of each element. For instance, if you had 1000 vertices, made up of 3 floats followed by 4 bytes, you'd need 1000 Float32Arrays and 1000 Int8Arrays.
> But we could add an explicit interleaved mapping capability to the Typed Arrays. It would require something like this:
> var buf = new ArrayBuffer(1000 * 16 /* 3 floats + 4 bytes */);
> var floats = new Float32Array(buf, 0, 1000, 3, 4);
> var colors = new Uint8Array(buf, 12, 1000, 4, 12);
> The extra 2 parameters added to the Typed Array constructors are:
> elementsPerGroup - number of elements in each group of elements
> bytesToNextGroup - number of bytes to skip to get from the end of a group of elements to the start of the next
> There are other ways to phrase those parameters. For instance you might want the second one to be 'stride', which is the number of bytes from the start of one group to the next. But the result is the same. In the above example the views don't overlap because the extra parameters define which parts of the buffer belong to which view. Checking for validity is essentially the same as for the non-stride case, although perhaps a bit more complex. But I don't believe creating views into a buffer is likely to happen frequently, so it can be a relatively expensive operation without hurting performance.
> var values = [ ... array of 3000 floating point values ... ];
> We also discussed the notion of separating ArrayBuffers that are used for incoming data from those that are used to prepare data for uploading to the GPU. I think we should repurpose the DataView concept for this functionality. We can call it a DataBuffer, or NetworkBuffer, or whatever. It would have the APIs from the DataView, but it would actually be backed by an internal buffer. This object would be returned from XMLHttpRequest, a Blob object or any other object that accesses data in some external (not machine-specific) endianness. When getting data from this buffer, you specify the endianness you know the data to be in and it is returned to you in the native machine endianness. When writing data to the buffer (for eventual output to an external destination) you specify endianness and the data is converted from internal endianness to the one specified.
> We should probably add methods to this object which can deal with arrays of data and we might even want to make it aware of interleaved Typed Arrays. For instance, if you have a DataBuffer 'inData' with floats and bytes interleaved the same as in the above example, but in little endian order, you might say:
> inData.setArray(floats, 0, 1000, true);
> inData.setArray(colors, 12, 1000, true);
> These calls would handle the byte swapping for matching endianness (if needed) and the interleaving of the values into groups according to the layout of the Typed Arrays.
> This solves the problem of undesired endian errors and provides a fast API for interleaving data.
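For readers following along, here is a rough sketch of the kind of work the proposed bulk calls would have to do, written against the DataView API as it exists today rather than the proposed DataBuffer. The function name decodeInterleaved and the fixed 16-byte record layout (3 floats followed by 4 bytes, as in the example above) are assumptions made for illustration only; nothing here is part of the proposal.

```javascript
// Hypothetical helper, not part of any proposal: copy interleaved
// little-endian records (3 x float32 + 4 x uint8 = 16 bytes each)
// from `source` into a new native-endian ArrayBuffer with the same layout.
function decodeInterleaved(source, vertexCount) {
  const STRIDE = 16; // bytes per vertex record (assumed layout)
  const input = new DataView(source);
  const out = new ArrayBuffer(vertexCount * STRIDE);
  // Two overlapping views over one buffer, as current Typed Arrays allow.
  const floats = new Float32Array(out);
  const bytes = new Uint8Array(out);
  for (let v = 0; v < vertexCount; v++) {
    const base = v * STRIDE;
    for (let k = 0; k < 3; k++) {
      // getFloat32(byteOffset, littleEndian) decodes a little-endian value;
      // storing it through the Float32Array writes it in native byte order.
      floats[(base >> 2) + k] = input.getFloat32(base + 4 * k, true);
    }
    for (let k = 0; k < 4; k++) {
      // Single bytes have no endianness, so they are copied unchanged.
      bytes[base + 12 + k] = input.getUint8(base + 12 + k);
    }
  }
  return out;
}
```

The per-element loop is exactly the cost that a native bulk operation would be meant to hide.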
We have discussed similar proposals earlier in the working group, and
during the TC-39 meeting a similar scatter/gather idea was discussed.
The problem with this proposal is that setting and getting individual
elements is too slow, which is why you've worked around that by adding
bulk setters and getters. This does not solve the basic problem of
efficiently manipulating large numbers of vertices. Supporting this capability is
essential for interesting applications to be developed with WebGL.
NVIDIA's vertex_buffer_object demo is a reasonable example, and with
Chromium nightly builds on my laptop it generates roughly 4 million
vertices per second. This is not bad, but is still roughly a factor of
seven slower than the Java HotSpot server compiler on the same demo. I
[...] but there is much work to be done.
It is absolutely essential that random access reads and writes of
single elements be efficiently supported. Having one array reference
multiple contiguous elements with gaps in between groups of elements
will perform too poorly. We could consider having each array reference
only one element per group, with a fixed stride between elements. This
will still add at least a load and if-test, or load and shift, to each
array load and store, to fetch the stride and either compare it to
zero for the fastest case, or to always shift the index to compute the
byte offset. I am not in favor of making this change. I personally
don't believe that the data aliasing issues exposed by Typed Arrays
are as severe as some do, and I am not willing to sacrifice any of the
performance gains we have already achieved with Typed Arrays because
we still have a long way to go. For reference, the
vertex_buffer_object demo is running over seven times faster than it used
to in Chromium before any optimizations to the Typed Array
implementation were done.
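
To make the per-access cost concrete, here is a minimal sketch of a view that references one float per group with a fixed stride. StridedFloat32 is a hypothetical name, not anything in the Typed Array specification, and the sketch assumes the byte offset and stride are both multiples of 4. Every get and set has to load the stride and do extra index arithmetic before touching the buffer, which a plain contiguous Float32Array access does not.

```javascript
// Hypothetical strided view: one float32 per group, fixed byte stride.
// Assumes byteOffset and strideBytes are multiples of 4.
function StridedFloat32(buffer, byteOffset, strideBytes) {
  this.data = new Float32Array(buffer);
  this.base = byteOffset >> 2;  // first element, in float units
  this.step = strideBytes >> 2; // distance between elements, in float units
}

// Each access loads base and step, then multiplies and adds,
// before the underlying array load or store can happen.
StridedFloat32.prototype.get = function (i) {
  return this.data[this.base + i * this.step];
};
StridedFloat32.prototype.set = function (i, value) {
  this.data[this.base + i * this.step] = value;
};

// Usage: the y component of each 16-byte vertex record.
var buf = new ArrayBuffer(1000 * 16);
var ys = new StridedFloat32(buf, 4, 16);
ys.set(2, 9.5); // writes float index 1 + 2 * 4 = 9 of the buffer
```

In a real engine this arithmetic would sit inside the compiled array load and store paths, which is where the extra load and shift mentioned above would show up.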