Re: [Public WebGL] String support in DataView

On Fri, Nov 4, 2011 at 5:48 PM, John Tamplin <jat@google.com> wrote:
It doesn't need to decode twice; it only needs to pass over the data twice (strnlen or wcsnlen, effectively), which is much cheaper and more easily optimized than a full decode.

For most encodings, you are saving at most a single shift over actually doing the decoding (plus of course whatever work is involved in doing something with the decoded value), so I am not sure I buy that this is fundamentally cheaper.

No, because decoding a non-Unicode encoding requires table lookups.  UTF-8 requires multiple branches per byte.  strnlen can be optimized to less than one branch per byte (typically; glibc's optimization is somewhat more complex than that), and no memory access aside from linear access to the string itself.
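To make the difference concrete, here is a rough sketch of the two passes over a NUL-terminated UTF-8 string in a Uint8Array. The function names (byteLengthUntilNul, decodeUtf8) are made up for illustration and are not part of any proposed DataView API. The decoder is deliberately simplified (it does not reject overlong forms or encoded surrogates), and a naive JavaScript loop like the first one does not get the word-at-a-time tricks a native strnlen uses; the point is only how much less work per byte the length scan has to do.

```typescript
// Pass 1: strnlen-style scan (hypothetical helper). Finds the NUL
// terminator, or stops at maxLen, with one comparison per byte and
// strictly linear access to the data.
function byteLengthUntilNul(bytes: Uint8Array, start: number, maxLen: number): number {
  const end = Math.min(bytes.length, start + maxLen);
  let i = start;
  while (i < end && bytes[i] !== 0) {
    i++;
  }
  return i - start;
}

// Pass 2: simplified UTF-8 decode of exactly `length` bytes (hypothetical
// helper). Each lead byte goes through several branches, and every
// continuation byte is checked as well. Malformed input becomes U+FFFD.
function decodeUtf8(bytes: Uint8Array, start: number, length: number): string {
  const end = Math.min(bytes.length, start + length);
  let out = "";
  let i = start;
  while (i < end) {
    const b0 = bytes[i++];
    let cp: number;
    let extra: number;
    if (b0 < 0x80) { cp = b0; extra = 0; }
    else if ((b0 & 0xe0) === 0xc0) { cp = b0 & 0x1f; extra = 1; }
    else if ((b0 & 0xf0) === 0xe0) { cp = b0 & 0x0f; extra = 2; }
    else if ((b0 & 0xf8) === 0xf0) { cp = b0 & 0x07; extra = 3; }
    else { out += "\uFFFD"; continue; }

    let ok = true;
    for (let k = 0; k < extra; k++) {
      if (i >= end || (bytes[i] & 0xc0) !== 0x80) { ok = false; break; }
      cp = (cp << 6) | (bytes[i++] & 0x3f);
    }
    if (!ok || cp > 0x10ffff) {
      out += "\uFFFD";
    } else if (cp > 0xffff) {
      // Encode as a UTF-16 surrogate pair.
      const v = cp - 0x10000;
      out += String.fromCharCode(0xd800 + (v >> 10), 0xdc00 + (v & 0x3ff));
    } else {
      out += String.fromCharCode(cp);
    }
  }
  return out;
}

// Usage: the bytes for "hé", a NUL terminator, then unrelated data.
const sample = new Uint8Array([0x68, 0xc3, 0xa9, 0x00, 0x7a]);
const sampleLength = byteLengthUntilNul(sample, 0, sample.length); // 3
const sampleText = decodeUtf8(sample, 0, sampleLength);            // "hé"
```

The first pass never interprets the bytes, so it stays the same for any single-byte-terminated encoding; only the second pass depends on the encoding.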

FYI, the two fundamental differences between Blob and ArrayBuffer are that Blob is asynchronous (in order to allow very large blocks of data, which may be swapped to disk), and immutable.  ArrayBuffer is synchronous, which is more convenient but can't scratch data to disk, and mutable.

So it would seem that having to convert data to a Blob in order to use its character encoding support would mean giving up all the reasons that ArrayBuffer exists in the first place, right?

The basic reasons for ArrayBuffer existing aren't related to strings at all; as far as I understand, it exists for things like allowing JavaScript engines to more easily optimize algorithmic code.  So the original reason for ArrayBuffer isn't really affected by this one way or the other.

Anyway, I think I agree that having direct string encoding and decoding for ArrayBuffer makes sense, despite it duplicating File API functionality, as long as the two are consistent with each other.

Glenn Maynard