Page 4 of 5
Results 31 to 40 of 47

Thread: Memory and Speed issues

  1. #31
    Senior Member
    Join Date
    Aug 2004
    Location
    California
    Posts
    771
I think external referencing is the best approach for using binary data with XML. The CDATA section doesn't live up to expectations, and base64 encoding just makes things about 33% bigger.

Just as with texture image data, a URL to the file is preferred over embedding it in COLLADA documents. I envision that large chunks of binary data can be in well-known formats, just as image data is stored in .png, .tif, .jpg, .bmp and so forth. None of those binary formats are within the scope of the COLLADA specification. I think everyone understands that that is not necessary.

In COLLADA, a URL from an <accessor> element is expected to resolve to an <array> element. The <array> element has the metadata to describe the block of data. If that URL refers to an external binary file, we can expect the file extension to indicate the format of the data, just as with imagery. If we want to read .png data, then everyone needs a library of code for that. It's the same with external data... if the data is in .xls format, for example, then we need those routines, etc.

We can use the <array> schema definition to create a binary form of that storage. In that case, I agree that COLLADA should define this schema. I think that is what some of you have been saying, while ignoring the more general well-known-binary-format use case that is prevalent with image data.
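A rough sketch of that extension-based dispatch might look like the following. The decoder table, file extensions, and on-disk layouts here are hypothetical illustrations, not anything the COLLADA specification defines:

```python
import struct

def load_raw_floats(data):
    # Assumption: tightly packed little-endian 32-bit floats.
    n = len(data) // 4
    return list(struct.unpack("<%df" % n, data[:n * 4]))

def load_ascii_floats(data):
    # Assumption: whitespace-separated decimal values.
    return [float(tok) for tok in data.decode("ascii").split()]

# Hypothetical extension-to-decoder table, analogous to picking a
# .png or .tif codec by file extension.
DECODERS = {".bin": load_raw_floats, ".txt": load_ascii_floats}

def load_array(filename, data):
    ext = filename[filename.rfind("."):]
    return DECODERS[ext](data)
```

The point is only that, once the extension names the format, each importer needs the matching decode routine, exactly as with imagery.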

  2. #32
    Junior Member
    Join Date
    Sep 2004
    Location
    Bellevue, WA
    Posts
    11
    I totally agree about the external binary data issue. It should be left as external files.

    There are two issues with this though:

    <issues>
I am not aware of a simple binary format that could be used to specify <array>-like data in a file. As such, I think it would be fairly important to specify a recommended binary format, so we don't make it much harder to deal with COLLADA as an intermediate format by having a variety of differing external file formats. So yeah, I agree with the assessment.

EDIT: After looking it over, you might want to add the p type (primitive index list) to the external binary representation, because that could also add up, though less so. It does seem to be similar to <array> except with no name, an implied count, and int as the type... so it would be a very simple extension in the binary format.

Also, the only issue I can think of with referencing binary data from separate files is that you'd end up with lots of little extra files all over the place. If you could specify a block format for storing a bunch of different <array> elements, that would leave you with one extra file (or as few as you want...). One possible solution is to have an archive that contains all the necessary info be what is imported or exported. Although, with only one extra file, it doesn't seem terribly necessary to archive them. Also, that could break things like source control if you wanted to use it for content files (though the binary data would break anyway...).
    </issues>

    Cool stuff though.

    Adruab

  3. #33
    Junior Member
    Join Date
    Sep 2004
    Posts
    6
I disagree with external referencing being a good approach for using binary data with XML.
It shifts work from the programmer to the artist.
The mesh data of a scene is stored inside the same file as the scene by default (in any 3D package I am aware of).
This is done because it is nice to have single files which you can just move around as one coherent unit. That is not robustly possible with multiple files.

COLLADA shouldn't change how artists normally work. Textures are stored externally; scenes, including all mesh data, are stored in a single file. This is how every 3D package I am aware of handles it, so it should be how COLLADA handles it.

Reasons why multiple files are bad:
- Robustness.
1) What happens if the user wants to move the COLLADA file and forgets to move the binary data as well? (I can assure you every artist will do that at least once.)
2) What happens if I check the binary data into version control but not the scene data?
- Clutter. Having two files instead of one means browsing folders takes twice as long for the eye.

COLLADA is an interchange format, so those issues actually matter. They wouldn't matter as much if it were just an intermediate format for importing 3D data into a game engine.


If you want the full speed of binary data, use a format which supports embedding binary data. XML does not support that, so if you are interested in the speed improvement, the only solution that is not a pain for the user is to make the COLLADA files non-XML.

One way to do that is, instead of embedding binary data in the XML file, to embed the XML data in a binary format.

For example, a very simple binary format would:
- store the number of bytes of XML data,
- store the XML data,
- store a lookup table so the XML data can index into the binary data,
- store all the binary data raw.

Specifying such a binary format, which embeds the XML file, is not any harder than using external references to binary files.
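As an illustration only, the four-part layout above could be sketched like this; the field widths (32-bit little-endian lengths and offsets) and section order are assumptions for the example, not a proposed standard:

```python
import struct

def pack_container(xml_bytes, blobs):
    """Layout: [xml length][xml][blob count][offset/size table][raw blobs].
    All integers are little-endian uint32 (an assumption)."""
    header = struct.pack("<I", len(xml_bytes)) + xml_bytes
    table = struct.pack("<I", len(blobs))
    body = b""
    offset = 0
    for blob in blobs:
        table += struct.pack("<II", offset, len(blob))
        body += blob
        offset += len(blob)
    return header + table + body

def unpack_container(data):
    (xml_len,) = struct.unpack_from("<I", data, 0)
    xml_bytes = data[4:4 + xml_len]
    pos = 4 + xml_len
    (count,) = struct.unpack_from("<I", data, pos)
    pos += 4
    entries = [struct.unpack_from("<II", data, pos + 8 * i) for i in range(count)]
    base = pos + 8 * count  # raw blobs start right after the table
    blobs = [data[base + off:base + off + size] for off, size in entries]
    return xml_bytes, blobs
```

The XML side would then index into the table by position instead of carrying a URL to a separate file.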

  4. #34
    Junior Member
    Join Date
    Aug 2004
    Location
    SCEE, UK
    Posts
    9
    Kind of an aside: I just got a new phone (K700i) and was downloading some themes for it. Out of curiosity I opened a theme file up in a text editor, and lo and behold it seemed to be some kind of combined binary/xml format. Anyone know the details?
    Andrew Ostler
    Senior Principal Programmer
    SCEE

  5. #35
    Junior Member
    Join Date
    Sep 2004
    Location
    Bellevue, WA
    Posts
    11
    Ok, well what do you think of the possibility of a dae file being a gzipped archive with all the relevant contained files within? This is exactly what FX Composer does, and it seems to work well for that.

    It wouldn't complicate importers/exporters that much since you could just include zlib in the distribution.

As for a non-XML format... I think it would be a bad idea considering the entire spec is currently built on XML. The farthest I would go is a block region of the file for XML and a block for binary, as you mentioned. Even then, that's pretty much the same as archiving it....
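For what it's worth, the archive idea is simple to sketch with a standard zip library; the entry names here ("scene.xml", the sidecar file names) are made up for the example:

```python
import io
import zipfile

def write_dae_archive(buf, xml_text, binary_blobs):
    # Pack the scene XML plus its binary sidecar files into one
    # compressed archive, so the asset moves around as a single file.
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("scene.xml", xml_text)
        for name, blob in binary_blobs.items():
            zf.writestr(name, blob)

def read_dae_archive(buf):
    with zipfile.ZipFile(buf) as zf:
        xml_text = zf.read("scene.xml").decode("utf-8")
        blobs = {n: zf.read(n) for n in zf.namelist() if n != "scene.xml"}
    return xml_text, blobs
```

Since zlib-style compression is ubiquitous, bundling it with an importer/exporter adds little burden.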

    Adruab

  6. #36
    Senior Member
    Join Date
    Aug 2004
    Location
    California
    Posts
    771

    file or schema

I encourage you all to think less about files and their limitations in large projects. Consider instead COLLADA integration with a database system like Oracle, MS SQL, MySQL, or PostgreSQL, to name a few.

    COLLADA is an XML schema for (database) transactions as much as it is a "file format".

  7. #37
    Junior Member
    Join Date
    Sep 2004
    Location
    Bellevue, WA
    Posts
    11
Hmmm... Interesting. Obviously, many (all?) of the elements can be referenced externally. Does your comment imply that there will be an implementation of COLLADA that works by exporting/importing data directly to/from a database? It's certainly true that that could eliminate the extra memory overhead required by the string representation. If it doesn't export directly, however, you'd still have the file-size problem for the XML representation.

I'm still not sure that makes a lot of sense though.... I suppose the XML schema specification could be translated to a database system easily enough. That leads to two questions:

    Was having a database interface for collada information part of the design to begin with? And if so, will a default table layout/interface be specified to keep things consistent?

  8. #38
    Junior Member
    Join Date
    Sep 2004
    Posts
    6
    Quote Originally Posted by gabor_nagy
    - I'm still not convinced that it would be significantly faster than our current decimal-ASCII <-> float converters, but we should do some tests to determine that (see earlier message)
So did you do some ASCII vs. binary speed tests? What are the results?

    Joachim Ante
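A rough sketch of such a test, for reference (the data size and the "%f" formatting are arbitrary assumptions; real results depend heavily on the parser and platform):

```python
import struct
import time

def bench_ascii_vs_binary(n=100000):
    # Build the same float data in both representations.
    values = [i * 0.001 for i in range(n)]
    ascii_data = " ".join("%f" % v for v in values)
    binary_data = struct.pack("<%df" % n, *values)

    # Time decimal-ASCII parsing.
    t0 = time.perf_counter()
    parsed_ascii = [float(tok) for tok in ascii_data.split()]
    t_ascii = time.perf_counter() - t0

    # Time raw binary unpacking.
    t0 = time.perf_counter()
    parsed_binary = list(struct.unpack("<%df" % n, binary_data))
    t_binary = time.perf_counter() - t0

    return t_ascii, t_binary, len(parsed_ascii), len(parsed_binary)
```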

  9. #39
    Senior Member
    Join Date
    Aug 2004
    Location
    California
    Posts
    771
    Quote Originally Posted by adruab
    Hmmm... Interesting. Obviously, many (all?) of the elements can be referenced externally. Does your comment imply that there will be an implementation of Collada that works by exporting/importing data directly to/from a database?
    Yes that is a milestone we are planning to achieve next year.
    Quote Originally Posted by adruab
I'm still not sure that makes a lot of sense though.... I suppose the XML schema specification could be translated to a database system easily enough.
Several database tool sets automate this process more or less, e.g. Altova's XMLSPY, and are getting better at importing schemas (something they are weak at now). Also related and interesting is XDB (XML database) technology, which is where large-scale business data is heading.

    Quote Originally Posted by adruab
    Was having a database interface for collada information part of the design to begin with?
    Yes as COLLADA is (also) a research project, SCEA R&D has been exploring database generation and tool integration for several months.
    Quote Originally Posted by adruab
    And if so, will a default table layout/interface be specified to keep things consistent?
I think there will be defaults, like SQL with XPath and XQuery as emerging standards for access and queries. Is COLLADA data-centric or document-centric? At this point I think it is data-centric, but will it remain so? Time will tell. Database systems are evolving, and COLLADA is exploring their application in the digital media markets.

  10. #40
    Senior Member
    Join Date
    Aug 2004
    Location
    California
    Posts
    771
    Quote Originally Posted by joeante
    Collada shouldn't change how artists normally work. Textures are stored externally, scenes including all mesh data are stored in a single file. This is how every 3D Package i am aware of handles it, so it should be how Collada handles it.
I understand the sentiment, but the premise of COLLADA is that the existing art pipeline is an expensive problem for most game developers. Developing games and movies with gigabytes and terabytes of data is a very expensive and growing concern. Companies that are experiencing this first hand do not use files as primary storage even now; they use Oracle or similar. 3D packages need to catch up to their customers' demands for data storage and asset management too. It's a challenge for everyone in our business, as rising quality expectations require us to store (and process) incredible amounts of data.
