
Re: [Public WebGL] Typed Arrays in W3C Specifications | Fwd: Updates to File API



So in thinking about this more, here are a few comments/problems.  A number of these are really comments about the File API itself; let me know if you want me to forward this elsewhere (or feel free to do so).

I don't think Blob makes sense as a base class for a File -- a Blob isn't a File, especially once we can talk about slicing blobs and whatnot.  But continuing that thought, then it doesn't make sense for a Blob to have a type or url -- what does it mean to have a "type" for a fragment of data read from the network?  Or a URL?  I think as Chris was getting at on the call this morning, we're really just talking about a bare ArrayBuffer when we talk about such a chunk.  But, a Blob can be useful when allowing access using different data types.

In my thinking, a blob would look like this:

interface Blob {
   readonly attribute long offset;  // offset from the start of the stream, -1 if not available
   readonly attribute unsigned long size;

   // if the implementation has Typed Arrays:
   readonly attribute ArrayBuffer buffer;

   // for all implementations
   readonly attribute DOMString binaryString;
   readonly attribute DOMString text;
   readonly attribute DOMString dataURL;
   DOMString getAsTextWithEncoding(in DOMString encoding);

   // the above could also be "ArrayBuffer asArrayBuffer();"  "DOMString asBinaryString();" etc. so
   // that we don't need the encoding bit as a separate function

   Blob slice(in unsigned long startIndex, in unsigned long endIndex);
};

Now -- /if/ we were to make Typed Arrays a requirement for File API (which I don't think we can), then we could consider adding ways to convert from an ArrayBuffer to a binary string, data URL, etc. and not need any of the above, though even then having the offset would be handy when you have a lot of reads in flight.
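To make the shape of that Blob a bit more concrete, here is a rough TypeScript sketch.  The class name SketchBlob is made up for illustration and is not any browser's API; it covers only offset, size, buffer, binaryString and slice, and it assumes that endIndex is exclusive and that a slice of a blob with a known offset keeps an offset relative to the original stream.

```typescript
// Hypothetical stand-in for the Blob proposed above; not a real browser API.
class SketchBlob {
  readonly offset: number; // offset from the start of the stream, -1 if not available
  readonly buffer: ArrayBuffer;

  constructor(buffer: ArrayBuffer, offset: number = -1) {
    this.buffer = buffer;
    this.offset = offset;
  }

  get size(): number {
    return this.buffer.byteLength;
  }

  // One character per byte, each with a code in the range 0..255.
  get binaryString(): string {
    const bytes = new Uint8Array(this.buffer);
    let out = "";
    for (let i = 0; i < bytes.length; i++) {
      out += String.fromCharCode(bytes[i]);
    }
    return out;
  }

  // Assumption: endIndex is exclusive, and both indices are clamped to the blob.
  slice(startIndex: number, endIndex: number): SketchBlob {
    const start = Math.max(0, Math.min(startIndex, this.size));
    const end = Math.max(start, Math.min(endIndex, this.size));
    // Assumption: a slice stays addressable relative to the original stream.
    const childOffset = this.offset < 0 ? -1 : this.offset + start;
    return new SketchBlob(this.buffer.slice(start, end), childOffset);
  }
}

// Usage: five bytes ("Hello") that were read starting at stream offset 1024.
const raw = new ArrayBuffer(5);
new Uint8Array(raw).set([72, 101, 108, 108, 111]);
const blob = new SketchBlob(raw, 1024);
const tail = blob.slice(1, 4);
console.log(tail.binaryString, tail.offset, tail.size);
```

Note that nothing here needs a type or a url; the blob only knows where its bytes came from and how to hand them back.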

A File would look like:

interface File {
   readonly attribute DOMString type;
   readonly attribute DOMString uri;
   readonly attribute DOMString name;
   readonly attribute unsigned long long length;
};

with no attachment to a Blob; just an object that represents a File, obtained from an <input> or other element.

Then there's FileReader (which, I'll be honest, doesn't really make much sense to me -- why do we need a separate object to read from Files, as opposed to reading using the File directly?  But, ok, that's not relevant here, and I guess it does isolate reading):

interface FileReader {
   void read(in File file, [optional] in unsigned long long startOffset, [optional] in unsigned long long length);

   readonly attribute Blob result;
};

interface FileReaderSync {
    Blob read(in File file, [optional] in unsigned long long startOffset, [optional] in unsigned long long length);
};

Note that the above explicitly takes File elements as input -- it's a FileReader after all -- and has Blobs as the result from the read operation.
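As a rough illustration of why the offset on the result is handy, here is a TypeScript sketch of a synchronous ranged read.  Everything in it is hypothetical: FileSketch, ChunkBlob, the in-memory contents map and readSync are stand-ins for the File, Blob and FileReaderSync.read above, not real APIs.  It reads a file in chunks, lets the chunks come back in a different order, and puts them back together using only each chunk's offset.

```typescript
// Hypothetical in-memory stand-ins; none of these names are real browser APIs.
interface FileSketch {
  readonly name: string;
  readonly type: string;
  readonly length: number;
}

interface ChunkBlob {
  readonly offset: number; // where in the file this chunk was read from
  readonly size: number;
  readonly buffer: ArrayBuffer;
}

// Assumed backing store: maps a file object to its bytes.
const contents = new Map<FileSketch, Uint8Array>();

// Sketch of FileReaderSync.read(file, startOffset, length).
function readSync(
  file: FileSketch,
  startOffset: number = 0,
  length: number = file.length - startOffset
): ChunkBlob {
  const data = contents.get(file);
  if (data === undefined) throw new Error("unknown file");
  const start = Math.min(startOffset, file.length);
  const end = Math.min(start + length, file.length);
  const buffer = new ArrayBuffer(end - start);
  new Uint8Array(buffer).set(data.subarray(start, end));
  return { offset: start, size: end - start, buffer };
}

// Reassemble chunks in whatever order they arrive, using only their offsets.
function reassemble(chunks: ChunkBlob[], total: number): Uint8Array {
  const out = new Uint8Array(total);
  for (const chunk of chunks) {
    out.set(new Uint8Array(chunk.buffer), chunk.offset);
  }
  return out;
}

// Usage: a 10-byte file read in 4-byte chunks, delivered in reverse order.
const file: FileSketch = { name: "demo.bin", type: "application/octet-stream", length: 10 };
contents.set(file, new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]));

const chunks: ChunkBlob[] = [];
for (let pos = 0; pos < file.length; pos += 4) {
  chunks.push(readSync(file, pos, 4));
}
chunks.reverse();
const rebuilt = reassemble(chunks, file.length);
console.log(Array.from(rebuilt).join(","));
```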

That seems like a much cleaner separation to me -- you have File objects that have associated name/uri/type/etc.  You use a FileReader to read a Blob from a file; then you can ask that Blob to give you the data that was read in one of a number of representations.  No need to decide up front how you want to read a file -- with the current API it would be hard to do something like charset detection... you wouldn't be able to read something as a binary string first, try to guess a charset, and then ask for it again with an encoding without doing another FileReader, even though you already have the data.
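The charset case could look roughly like the TypeScript sketch below.  DecodableBlob and guessCharset are made-up names for illustration; the guess only looks for a UTF-8 or UTF-16LE byte order mark and otherwise falls back to UTF-8, which is an assumption for the sketch and not a real detection algorithm.  Decoding uses the standard TextDecoder.  The point is just that the same bytes are inspected as a binary string and then decoded, with no second read.

```typescript
// Hypothetical stand-in for the result of a single read; not a real browser API.
class DecodableBlob {
  readonly buffer: ArrayBuffer;

  constructor(buffer: ArrayBuffer) {
    this.buffer = buffer;
  }

  // One character per byte, each with a code in the range 0..255.
  get binaryString(): string {
    const bytes = new Uint8Array(this.buffer);
    let out = "";
    for (let i = 0; i < bytes.length; i++) {
      out += String.fromCharCode(bytes[i]);
    }
    return out;
  }

  getAsTextWithEncoding(encoding: string): string {
    return new TextDecoder(encoding).decode(this.buffer);
  }
}

// Toy detection: byte order marks only, with UTF-8 as the assumed default.
function guessCharset(binary: string): string {
  if (binary.charCodeAt(0) === 0xff && binary.charCodeAt(1) === 0xfe) {
    return "utf-16le";
  }
  if (
    binary.charCodeAt(0) === 0xef &&
    binary.charCodeAt(1) === 0xbb &&
    binary.charCodeAt(2) === 0xbf
  ) {
    return "utf-8";
  }
  return "utf-8";
}

// Usage: "hi" encoded as UTF-16LE with a byte order mark, read exactly once.
const data = new ArrayBuffer(6);
new Uint8Array(data).set([0xff, 0xfe, 0x68, 0x00, 0x69, 0x00]);
const result = new DecodableBlob(data);

const charset = guessCharset(result.binaryString); // look at the raw bytes first
const text = result.getAsTextWithEncoding(charset); // then decode the same data
console.log(charset, text);
```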

Given interfaces like the above, putting Blob to work for XHR and even structured storage seems very straightforward -- for XHR, you'd have a responseBlob property in the result.  There'd be some overlap, for example responseBlob.text is likely to be the same as responseText (I don't know the details, so don't know if they're specified differently), but that's not an issue.

     - Vlad


----- "Arun Ranganathan" <arun@mozilla.com> wrote:
Greetings WebGL WG,

The latest editor's draft of the File API now includes a reference to the Typed Arrays work; basically, the underlying Blob interface exposes an ArrayBuffer property.  I believe this is the right way forward; web developers can make use of the view they wish to use.

-------- Original Message --------
Subject: Updates to File API
Date: Thu, 13 May 2010 05:27:47 -0700
From: Arun Ranganathan <arun@mozilla.com>
Reply-To: arun@mozilla.com
Organization: Mozilla Corporation
To: Web Applications Working Group WG <public-webapps@w3.org>
CC: public-device-apis <public-device-apis@w3.org>


Greetings WebApps WG,

I have updated the editor's draft of the File API to reflect changes 
that have been in discussion.

http://dev.w3.org/2006/webapi/FileAPI

Notably:

1. Blobs now allow further binary data operations by exposing an 
ArrayBuffer property that represents the Blob.  ArrayBuffers, and 
affiliated "Typed Array" views of data, are specified in a working draft 
as a part of the WebGL work [1].  This work has been proposed to ECMA's 
TC-39 WG as well.  We intend to implement some of this in the Firefox 4 
timeframe, and have reason to believe other browsers will as well.  I 
have thus cited the work as a normative reference [1].  Eventually, we 
ought to consider further read operations given ArrayBuffers, but for 
now, I believe exposing Blobs in this way is sufficient.

2. url and type properties have been moved to the underlying Blob 
interface.  Notably, the property is now called 'url' and not 'urn.'  
Use cases for triggering 'save as' behavior with Content-Disposition 
have not been addressed[2], although I believe that with FileWriter and 
BlobBuilder[3] they may be addressed differently.  This change reflects 
lengthy discussion (e.g. start here[4])

3. The renaming of the property to 'url' also suggests that we should 
cease to consider an urn:uuid scheme.  I solicited implementer feedback 
about URLs vs. URNs in general.  There was a general preference for 
URLs[5], though this wasn't a strong preference.   Moreover, Mozilla's 
implementation currently uses moz-filedata: .  The current draft has an 
editor's note about the use of HTTP semantics, and origin issues in the 
context of shared workers.  This is work in progress; I have removed the 
section specifying urn:uuid and hope to have an update with a section 
covering the filedata: scheme (with filedata:uuid as a suggestion).  I 
welcome discussion about this.  I'll point out that we are coining a new 
scheme, which we originally sought to avoid :-)

4. I have changed event order; loadend now fires after an error event [6].

-- A*

[1] https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html
[2] http://www.mail-archive.com/public-webapps@w3.org/msg06137.html
[3] http://dev.w3.org/2009/dap/file-system/file-writer.html
[4] http://lists.w3.org/Archives/Public/public-webapps/2010JanMar/0910.html
[5] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0462.html
[6] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0062.html