Re: [Public WebGL] String support in DataView

On Wed, Nov 9, 2011 at 12:42 PM, Joshua Bell <jsbell@chromium.org> wrote:
I've updated the doc - http://wiki.whatwg.org/wiki/StringEncoding - to reflect the discussion on this thread, most notably:

* Removed detectEncoding
* Added stringLength, removed special-case null termination
* Reordered arguments so input is first
* Updated JS "shim" implementation
* Sprinkled a few more "ISSUES" in where the doc still needs updating and/or a decision made

Further feedback is appreciated.

Looks better.  My major concern is about using U+0000 as a terminator -- it still means I can't send strings that might contain U+0000 unmolested; I must either do my own quoting or filter it out.  I would prefer one of two alternatives: say the terminator is encoding-specific (for example, in UTF-8 you can use 0xFF and be sure it won't collide with any real data) and provide a flag to terminate the string according to the encoding when writing, or provide a way to specify a terminating byte sequence.
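To illustrate the encoding-specific terminator idea, here is a minimal sketch for the UTF-8 case only.  It relies on the fact that the byte 0xFF never occurs in well-formed UTF-8, so a string containing U+0000 round-trips untouched.  The helper names writeTerminatedUtf8 and readTerminatedUtf8 are hypothetical and are not part of the StringEncoding proposal; the sketch also assumes a runtime that provides TextEncoder and TextDecoder, used here as a stand-in for the API under discussion.

```javascript
// Hypothetical helpers, not part of the StringEncoding proposal.
// 0xFF is never produced by a UTF-8 encoder, so it is safe as a terminator
// for UTF-8 only; other encodings would need a different choice.
const UTF8_TERMINATOR = 0xFF;

// Encode str as UTF-8 and append the terminator byte.
function writeTerminatedUtf8(str) {
  const encoded = new TextEncoder().encode(str);
  const out = new Uint8Array(encoded.length + 1);
  out.set(encoded, 0);
  out[encoded.length] = UTF8_TERMINATOR;
  return out;
}

// Read one terminated string starting at offset.
// Returns the decoded value and the offset just past the terminator.
function readTerminatedUtf8(bytes, offset = 0) {
  const end = bytes.indexOf(UTF8_TERMINATOR, offset);
  if (end === -1) {
    throw new RangeError("missing 0xFF terminator");
  }
  // fatal: reject malformed input; ignoreBOM: keep a leading U+FEFF as data.
  const decoder = new TextDecoder("utf-8", { fatal: true, ignoreBOM: true });
  return { value: decoder.decode(bytes.subarray(offset, end)), next: end + 1 };
}

// A string with an embedded U+0000 survives the round trip.
const sample = "a\u0000b\u00e9";
const framed = writeTerminatedUtf8(sample);
const parsed = readTerminatedUtf8(framed);
```

The same approach does not carry over to encodings such as UTF-16, where 0xFF can appear as an ordinary data byte, which is why the terminator would have to be chosen per encoding or supplied by the caller.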

Other minor issues:
- do we want to say anything about canonical forms?  For example, are over-long UTF-8 sequences allowed?  How are combining marks, etc. represented?
- decode should specify the behavior if byteLength stops inside a multi-byte sequence for a character
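
To make the two UTF-8 cases above concrete, here is a small sketch of an over-long sequence and of a byte range that stops inside a multi-byte character.  It uses TextDecoder purely as a stand-in to show the possible behaviors (reject, substitute U+FFFD, or hold the partial bytes for a later call); it does not describe what the StringEncoding doc specifies.  The helper name tryStrictDecode is hypothetical.

```javascript
// Hypothetical helper: strict decode that reports failure instead of throwing.
function tryStrictDecode(bytes) {
  try {
    const decoder = new TextDecoder("utf-8", { fatal: true });
    return { ok: true, value: decoder.decode(bytes) };
  } catch (err) {
    return { ok: false, value: null };
  }
}

// Over-long encoding of U+0000: two bytes where one (0x00) would do.
const overlongNul = Uint8Array.of(0xC0, 0x80);
const strictOverlong = tryStrictDecode(overlongNul); // rejected

// U+00E9 U+20AC encodes as C3 A9 E2 82 AC; cutting at 4 bytes
// stops inside the three-byte sequence for U+20AC.
const full = new TextEncoder().encode("\u00e9\u20ac");
const cut = full.subarray(0, 4);

// Option 1: treat the truncated sequence as an error.
const strictCut = tryStrictDecode(cut);

// Option 2: substitute U+FFFD for the incomplete sequence.
const lenientCut = new TextDecoder("utf-8").decode(cut);

// Option 3: hold the partial bytes and finish when the rest arrives.
const streaming = new TextDecoder("utf-8", { fatal: true });
const firstPart = streaming.decode(cut, { stream: true });
const secondPart = streaming.decode(full.subarray(4));
```

Whichever option is chosen, the point is that decode needs one documented answer, since all three are defensible and callers will otherwise see different results across implementations.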

John A. Tamplin
Software Engineer (GWT), Google