Re: [Public WebGL] A different approach to interleaved typed arrays



----- "Kenneth Russell" <kbr@google.com> wrote:

> On Fri, Jun 4, 2010 at 10:12 AM, Chris Marrin <cmarrin@apple.com>
> wrote:
> >
> > The TC39 folks are making a valiant effort at defining structured
> data in JavaScript. But I don't think their efforts will be able to
> bear fruit in a timeframe that would be convenient for us. But still I
> feel their pain. Their main complaint of our current Typed Arrays
> proposal is that it exposes endian issues to the author. Not only
> that, but it makes it easy to accidentally write endian dependent
> code. Since the majority of machines today are little endian, this is
> a ticking time bomb waiting for the day that someone tries it on a big
> endian machine.
> >
> > I'd like to propose changes to the current Typed Array proposal
> which would solve the problem of exposing endianness, but would still
> be efficient:
> >
> > ==========
> >
> > We've discussed in the past the idea of making the views
> non-overlapping. Mapping a range of an ArrayBuffer to one view would
> make it inaccessible to any other view. Furthermore, when mapping a
> range of an ArrayBuffer, that range is cleared, preventing the author
> from, for instance:
> >
> > 1) Mapping a range to bytes
> > 2) Filling that range with data, which happens to be floating point
> values in little endian order
> > 3) removing that mapping
> > 4) Mapping that same range to floats to access the previously loaded
> data
> >
> > This would all be well and good, and the implementation isn't really
> that complex. It would simply require remembering which views are
> mapped to which buffer and doing a validation whenever a new mapping
> is made.
> >
> > But it has a problem because it would not be practical to do
> interleaved mapping. You'd essentially need a new view for each
> component of each element. For instance, if you had 1000 vertices,
> made up of 3 floats followed by 4 bytes, you'd need 1000 Float32Arrays
> and 1000 Int8Arrays.
> >
> > But we could add an explicit interleaved mapping capability to the
> Typed Arrays. It would require something like this:
> >
> >        var buf  = new ArrayBuffer(1000 * 16 /* 3 floats + 4 bytes */);
> >        var floats = new Float32Array(buf, 0, 1000, 3, 4);
> >        var colors = new Uint8Array(buf, 12, 1000, 4, 12);
> >
> > The extra 2 parameters added to the Typed Array constructors are:
> >
> >        elementsPerGroup - number of elements in each group of
> elements
> >        bytesToNextGroup - number of bytes to skip to get from the
> end of a group of elements to the start of the next
> >
> > There are other ways to phrase those parameters. For instance you
> might want the second one to be 'stride', which is the number of bytes
> from the start of one group to the next. But the result is the same.
> In the above example the views don't overlap because the extra
> parameters define which parts of the buffer belong to which view.
> Checking for validity is essentially the same as for the non-stride
> case, although perhaps a bit more complex. But I don't believe
> creating views into a buffer is likely to happen frequently, so it can
> be a relatively expensive operation without hurting performance.
> >
> > The next question is, how does access change? In the current scheme,
> writing to an interleaved buffer involves computing offsets to each
> group in JavaScript. With this scheme, each view would appear to be a
> contiguous array of the given type. Loading a view is simply a matter
> of:
> >
> >        var values = [ ... array of 3000 floating point values ... ];
> >        floats.set(values);
> >
> > and have the values "scattered" to the appropriate groups of
> elements. Doing this in native code would be significantly faster than
> doing a JavaScript loop, computing offsets and loading single values
> at a time.
> >
> > How would you change mappings? How would you disassociate a Typed
> Array with a range in an ArrayBuffer and then associate a different
> one? One simple answer is that you can't. Once an association is made,
> no other association with those bytes of the ArrayBuffer can ever be
> made. This seems reasonable given the automatic garbage collection
> that occurs in JavaScript. ArrayBuffers are not precious resources
> that need to be shared. It would be just as easy to create a new
> ArrayBuffer when new mappings are needed. This is especially true if
> the data in the ArrayBuffer doesn't survive across mappings. It also
> makes it unnecessary to clear the data in a newly mapped range. If you
> assume the ArrayBuffer was cleared on creation, the data in the newly
> mapped range is guaranteed to be valid.
> >
> > We also discussed the notion of separating ArrayBuffers that are
> used for incoming data from those that are used to prepare data for
> uploading to the GPU. I think we should repurpose the DataView concept
> for this functionality. We can call it a DataBuffer, or NetworkBuffer,
> or whatever. It would have the APIs from the DataView, but it would
> actually be backed by an internal buffer. This object would be
> returned from XMLHttpRequest, a Blob object or any other object that
> accesses data in some external (not machine specific) endianness. When
> getting data from this buffer, you specify the endianness you know the
> data to be in and it is returned to you in the native machine
> endianness. When writing data to the buffer (for eventual output to an
> external destination) you specify endianness and the data is converted
> from internal endianness to the one specified.
> >
> > We should probably add methods to this object which can deal with
> arrays of data and we might even want to make it aware of interleaved
> Typed Arrays. For instance, if you have a DataBuffer 'inData' with
> floats and bytes interleaved the same as in the above example, but in
> little endian order, you might say:
> >
> >        inData.setArray(floats, 0, 1000, true);
> >        inData.setArray(colors, 12, 1000, true);
> >
> > These calls would handle the byte swapping for matching endianness
> (if needed) and interleave the values into groups according to
> the layout of the Typed Arrays.
> >
> > This solves the problem of undesired endian errors and provides a
> fast API for interleaving data.
> >
> > Comments?
> 
> We have discussed similar proposals earlier in the working group, and
> during the TC-39 meeting a similar scatter/gather idea was discussed.
> The problem with this proposal is that setting and getting individual
> elements is too slow, which is why you've worked around that by adding
> bulk setters and getters. This does not solve the basic problem that
> arrays in JavaScript are tremendously inefficient for storing and
> manipulating large numbers of vertices. Supporting this capability is
> essential for interesting applications to be developed with WebGL.
> NVIDIA's vertex_buffer_object demo [1] is a reasonable example and
> benchmark of dynamic vertex generation in JavaScript. In Firefox and
> Chromium nightly builds on my laptop it generates roughly 4 million
> vertices per second. This is not bad, but is still roughly a factor of
> seven slower than the Java HotSpot server compiler on the same demo. I
> am convinced that better performance can be attained in JavaScript,
> but there is much work to be done.
> 
> It is absolutely essential that random access reads and writes of
> single elements be efficiently supported. Having one array reference
> multiple contiguous elements with gaps in between groups of elements
> will perform too poorly. We could consider having each array reference
> only one element per group, with a fixed stride between elements.
> This will still add at least a load and if-test, or load and shift, to
> each array load and store, to fetch the stride and either compare it to
> zero for the fastest case, or to always shift the index to compute the
> byte offset. I am not in favor of making this change. I personally
> don't believe that the data aliasing issues exposed by Typed Arrays
> are as severe as some do, and I am not willing to sacrifice any of the
> performance gains we have already achieved with Typed Arrays because
> we still have a long way to go. For reference, the
> vertex_buffer_object demo is running over seven times faster than it
> used to in Chromium before any optimizations to the Typed Array
> implementation were done.

I'm in general agreement with Ken here -- I just don't see the aliasing/endianness issues as a significant showstopper problem.  I'm interested in the TC-39 struct proposal purely from a developer convenience point of view, because being able to access interleaved array data in a more natural form (like foo[i].vertex.x, foo[i].normal.y, foo[i].color.r, etc.) would be nice, though only if the implementations can get that indexing to be close to typed array speed.  Having it fix the endianness exposure issues is, to me, only a nice side benefit.

Current typed array indexing is:

  base_ptr + (index << elemsize_shift)
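
To make that arithmetic concrete, here is a rough sketch in script form. It only illustrates the address computation and is not how any engine actually implements it; `byteAddressOf` is a made-up helper, and deriving the shift from BYTES_PER_ELEMENT is my assumption.

```javascript
// Hypothetical helper (not part of any API): byte offset, within the
// underlying ArrayBuffer, of element `index` of a typed array view.
function byteAddressOf(view, index) {
  // Element sizes are powers of two (1, 2, 4, 8), so the multiply
  // by the element size can be done as a shift.
  const elemsizeShift = Math.log2(view.BYTES_PER_ELEMENT);
  return view.byteOffset + (index << elemsizeShift);
}

const indexingBuffer = new ArrayBuffer(64);
// A view of 4 floats starting 16 bytes into the buffer.
const indexingFloats = new Float32Array(indexingBuffer, 16, 4);
const thirdFloatOffset = byteAddressOf(indexingFloats, 2); // 16 + (2 << 2)
console.log(thirdFloatOffset);
```

Note that the shift has to be applied to the index alone before the base is added, which is why the parentheses matter.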

The struct proposal could be, with enough parser work:

  (base_ptr + structsize*index) + member_offset

(Though base_ptr + member_offset will be constant for each member, so maybe it can be computed once and hoisted out of the per-element indexing.)
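
As a sketch of what I mean, assuming the 16-byte vertex from Chris's example (3 floats followed by 4 bytes); the member names and offsets are my own invention and not taken from the TC-39 proposal:

```javascript
// Assumed layout: x, y, z as 32-bit floats, then r, g, b, a as bytes.
const STRUCT_SIZE = 16;
const MEMBER_OFFSET = { x: 0, y: 4, z: 8, r: 12, g: 13, b: 14, a: 15 };

// Straightforward form: (base_ptr + structsize*index) + member_offset
function memberAddress(basePtr, index, member) {
  return (basePtr + STRUCT_SIZE * index) + MEMBER_OFFSET[member];
}

// Hoisted form: fold the member offset into the base once per member,
// leaving only base + structsize*index per access.
function hoistedMemberBase(basePtr, member) {
  return basePtr + MEMBER_OFFSET[member];
}

const directAddress = memberAddress(1024, 5, "y");
const yBase = hoistedMemberBase(1024, "y");
const hoistedAddress = yBase + STRUCT_SIZE * 5;
console.log(directAddress, hoistedAddress); // both forms give the same address
```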

The interleaved proposal above would be, I believe:

  stride = elementsPerGroup*elemsize + bytesToNextGroup
  (ptr + stride*(index/elementsPerGroup)) + elemsize*(index % elementsPerGroup)
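
Spelled out as a function, using the parameter names from Chris's mail; `interleavedByteOffset` is just an illustrative name. The division is an integer division, and the position within the group has to be scaled by the element size to get a byte offset:

```javascript
// Hypothetical helper: byte offset of logical element `index` in an
// interleaved view, following the per-access arithmetic described above.
function interleavedByteOffset(ptr, index, elementsPerGroup, elemsize, bytesToNextGroup) {
  const stride = elementsPerGroup * elemsize + bytesToNextGroup;
  const group = Math.floor(index / elementsPerGroup); // integer divide
  const withinGroup = index % elementsPerGroup;       // modulo
  return ptr + stride * group + withinGroup * elemsize;
}

// floats view from the example: base 0, 3 floats per group, 4 bytes skipped.
const floatFourOffset = interleavedByteOffset(0, 4, 3, 4, 4);
// colors view from the example: base 12, 4 bytes per group, 12 bytes skipped.
const colorFiveOffset = interleavedByteOffset(12, 5, 4, 1, 12);
console.log(floatFourOffset, colorFiveOffset);
```

That is a divide, a modulo, and two multiplies on every element access, unless the implementation can special-case power-of-two sizes.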

I think that's way too much overhead.  One-element access with a fixed stride would be better (Ken, not sure what the if/shift you're referring to above would be -- correct me if I'm wrong here):

  ptr + stride*index

But it would be a good bit harder to use in some ways:

  VertexArrayX[0] = 1.0;
  VertexArrayY[0] = 1.0;
  VertexArrayZ[0] = 1.0;
  ColorArrayR[0] = 255; 
  ... etc.
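
For illustration only, a one-element-per-group view can be emulated in script on top of DataView. `StridedFloat32` is a hypothetical class, not a proposed API, and it pins the byte order to little endian, whereas a real typed array would use the machine's native order:

```javascript
// Hypothetical strided view: one 32-bit float per group, with a fixed
// byte stride between consecutive elements (address = ptr + stride*index).
class StridedFloat32 {
  constructor(buffer, byteOffset, stride) {
    this.view = new DataView(buffer);
    this.byteOffset = byteOffset;
    this.stride = stride;
  }
  get(index) {
    return this.view.getFloat32(this.byteOffset + this.stride * index, true);
  }
  set(index, value) {
    this.view.setFloat32(this.byteOffset + this.stride * index, value, true);
  }
}

// Four 16-byte vertices (3 floats + 4 bytes each), as in the example.
const vertexBuffer = new ArrayBuffer(4 * 16);
const vertexX = new StridedFloat32(vertexBuffer, 0, 16);
const vertexY = new StridedFloat32(vertexBuffer, 4, 16);
const vertexZ = new StridedFloat32(vertexBuffer, 8, 16);

vertexX.set(1, 1.0);
vertexY.set(1, 2.0);
vertexZ.set(1, 3.0);
console.log(vertexX.get(1), vertexY.get(1), vertexZ.get(1));
```

One view object per component is exactly the usability cost shown above: a position plus an RGBA color already needs seven of them.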

Actually, now that I look at it, what this is basically doing is implementing the struct proposal "backwards" -- the base pointer for VertexArrayY is just the base_ptr+offset, and structsize==stride.  Either of the interleaved array approaches also has significant bookkeeping costs associated with the clearing out/disallowing additional views if they overlap existing ones.

So, I suggest that we keep things as they are, but work with our JS folks (and TC-39) to figure out what would make the current typed array proposal a better fit for the language, setting aside the aliasing and endianness issues.  Not with an eye towards getting it into ES5, but towards improving it as is.  Nothing in the current typed array/WebGL proposals requires the developer to do anything that's endian-specific, with the exception of loading external data.  That case could be resolved, at the expense of a copy when loading from an external data source, by decoupling the DataView/DataBuffer from ArrayBuffer and adding some streaming copy methods to it (copy 30 floats into this Float32Array).
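
A minimal sketch of what such a streaming copy could look like, written in script against DataView. `copyFloatsInto` is a hypothetical helper and its signature is only my guess at a shape; a real version would live in native code:

```javascript
// Hypothetical streaming copy: read `count` 32-bit floats of a known
// endianness from external data and store them into a native-order
// Float32Array. The caller states the source endianness explicitly.
function copyFloatsInto(source, sourceByteOffset, dest, count, littleEndian) {
  for (let i = 0; i < count; i++) {
    dest[i] = source.getFloat32(sourceByteOffset + 4 * i, littleEndian);
  }
  return dest;
}

// Stand-in for data that arrived from the network in big-endian order.
const wireData = new DataView(new ArrayBuffer(30 * 4));
for (let i = 0; i < 30; i++) {
  wireData.setFloat32(4 * i, i * 0.5, false);
}

// "Copy 30 floats into this Float32Array."
const loadedFloats = copyFloatsInto(wireData, 0, new Float32Array(30), 30, false);
console.log(loadedFloats[3], loadedFloats[29]);
```

The script never inspects the machine's byte order; it only states the order of the external data, which is the one place endianness legitimately shows up.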

    - Vlad
