Re: [Public WebGL] String support in DataView

On Fri, Nov 4, 2011 at 7:23 PM, John Tamplin <jat@google.com> wrote:
> On Fri, Nov 4, 2011 at 6:22 PM, Glenn Maynard <glenn@zewt.org> wrote:
>> No, because decoding a non-Unicode encoding requires table lookups.  UTF-8 requires multiple branches per byte.  strnlen can be optimized to less than one branch per byte (typically; glibc's optimization is somewhat more complex than that), and no memory access aside from linear access to the string itself.
>
> The way I usually write it is one switch of the byte anded with 0xF8, so there is one branch (aside from validation checks, which you still have to do when computing the length).  Single-byte encodings can obviously be done in constant time, but I wouldn't expect them to be used much anyway.
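
For readers following along, the "one switch of the byte anded with 0xF8" approach might look roughly like the sketch below. It is not code from the thread: the function name utf8_seq_len is hypothetical, and the sketch only classifies the lead byte. It does not check continuation bytes, overlong forms, or lead bytes 0xF5 to 0xF7, which is the separate validation work mentioned above.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): number of bytes in the
 * UTF-8 sequence introduced by lead byte b, or 0 if b cannot start a
 * sequence. Only the top five bits are inspected, so this is a single
 * switch per lead byte; full validation is deliberately left out. */
static int utf8_seq_len(unsigned char b)
{
    switch (b & 0xF8) {
    case 0xC0: case 0xC8: case 0xD0: case 0xD8:
        return 2;               /* 110xxxxx */
    case 0xE0: case 0xE8:
        return 3;               /* 1110xxxx */
    case 0xF0:
        return 4;               /* 11110xxx (0xF5..0xF7 still need rejecting) */
    case 0x80: case 0x88: case 0x90: case 0x98:
    case 0xA0: case 0xA8: case 0xB0: case 0xB8:
        return 0;               /* 10xxxxxx: continuation byte, not a lead */
    case 0xF8:
        return 0;               /* 11111xxx: never valid in UTF-8 */
    default:
        return 1;               /* 0xxxxxxx: ASCII, including NUL */
    }
}
```

A decoder built on this would still need a loop with its own bounds and terminator checks around each call.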

You need an exit branch, too.  You don't need any of that, or validation, or codepoint reconstruction, just to find the first null terminator.

Anyhow, the point was that strnlen is extremely fast.  strnlen() processes about 2GB/sec on my system on data in cache.  16-bit equivalents for UTF-16 are also fast.  I don't think having a separate call to find the null terminator is an actual performance issue.
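
As a minimal sketch of what such a separate terminator-finding call could look like, assuming a POSIX system where strnlen is available: the helper names field_len8 and field_len16 are made up for illustration, and the 16-bit loop is a plain scalar scan rather than anything tuned like glibc's strnlen.

```c
#define _POSIX_C_SOURCE 200809L  /* strnlen is POSIX.1-2008, not ISO C */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: length in bytes of the string stored in a
 * fixed-size field of at most max bytes. Returns max if no NUL
 * terminator is found within the field. */
static size_t field_len8(const char *buf, size_t max)
{
    return strnlen(buf, max);
}

/* Hypothetical 16-bit counterpart for UTF-16 fields: length in
 * 16-bit code units up to the first zero unit, or max_units if no
 * terminator is found. No decoding or surrogate handling is needed
 * just to locate the terminator. */
static size_t field_len16(const uint16_t *buf, size_t max_units)
{
    size_t i = 0;
    while (i < max_units && buf[i] != 0)
        i++;
    return i;
}
```

Neither function looks at the encoding beyond the code-unit width; decoding would then run on exactly the returned number of units.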

> Not wanting to offer a synchronous API for the spooled-to-disk case complicates that, of course.

The Blob API is the base class for the File API, too.  I agree it would have been nice for them to have been merged somehow, but I think that ship has sailed.  I sure hope ECMAScript doesn't NIH yet another buffer type.

Glenn Maynard