On Fri, Nov 4, 2011 at 2:53 PM, Joshua Bell <firstname.lastname@example.org> wrote:
I absolutely agree that for new protocols that is what should be done. However, some potential users have expressed the desire to use this API for parsing data files that have poorly defined string encodings, and they make use of encoding guessing. The example given is ID3 tags in MP3 files, which are often in, e.g., Big5 encoding. Current applications rely on native libraries to do encoding detection.
Guessing makes me queasy, but it's been pointed out that browsers already implement encoding detection logic, although this is usually under the end-user's control. I'm not averse to removing this from the spec, but there is demand for it.
OK, though I would hope the docs would carry admonitions against using it unless you really need to.
Can you point to where you see in the spec that U+0000 / NULL characters can't be sent? I'll reword it.
Sorry, I missed the "If bytelength is a negative number..." prefix to the statement.
BTW, in re-reading the decode description, it mentions throwing an exception when writing past the end of the buffer -- cut and paste error?
Fixed length is definitely supported. Padding isn't, but Typed Array buffers are born zero-filled. Is that adequate?
A couple of pieces that seem missing:
- if I have a fixed 20 byte area to contain a UTF8 string, how do I decode it? If I pass the length as 20, I will get all the padding bytes and have to manually remove them. If I give length as -1, then if the 20 bytes is full it will overflow past it. That is why it seems useful to have a padded vs unpadded mode separate from the buffer length.
- when writing to a fixed buffer, how do I truncate the string? Given this API, it looks like I would have to loop through encodedLength, chopping the string until it fits.
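
To make the two cases above concrete, here is a rough sketch of the behavior being asked for. It does not use the API under discussion: it assumes a runtime that provides the WHATWG Encoding Standard's TextEncoder and TextDecoder, and the helper names decodePadded and encodeTruncated are hypothetical, made up for illustration only.

```javascript
// Sketch only: TextEncoder/TextDecoder stand in for the proposed API, and
// both helper names are hypothetical.

// Decode a fixed-size, zero-padded UTF-8 field without reading past its end.
function decodePadded(bytes, offset, fieldLength) {
  const field = bytes.subarray(offset, offset + fieldLength);
  const nul = field.indexOf(0); // first padding byte, or -1 if the field is full
  const used = nul === -1 ? field : field.subarray(0, nul);
  return new TextDecoder("utf-8").decode(used);
}

// Encode into a fixed-size field, zero-filling it first. encodeInto() stops
// before any code point that would not fit, so a multi-byte sequence is never
// split. Returns the number of bytes actually written.
function encodeTruncated(str, bytes, offset, fieldLength) {
  const field = bytes.subarray(offset, offset + fieldLength);
  field.fill(0);
  const { written } = new TextEncoder().encodeInto(str, field);
  return written;
}
```

With a 20 byte field, encodeTruncated(str, buf, 0, 20) followed by decodePadded(buf, 0, 20) round-trips any string that fits, and a full field is never read past its end. Two caveats under these assumptions: a string that itself contains U+0000 would be cut short on decode, and truncation happens at a code point boundary, not at a user-perceived character boundary.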