
Thread: Memory and Speed issues

  1. #1
    Member
    Join Date
    Aug 2004
    Location
    SCEJ - Tokyo, Japan
    Posts
    34

    Memory and Speed issues

    continuing from a comment in the design thread...

everything is peachy until you actually start making real data. Real data = millions of polygons. For example, Jak and Daxter's levels are millions of polys, as are the levels of Unreal 3.

What that means in terms of Collada is that things like using an ASCII-based format for vertex positions, UVs, colors, weights and normals may not be a good idea. Maybe a well-defined CDATA format would be better in those cases.

It can also mean issues for XML parsers and the design of the format. Since most XML parsers store lots of extra info PER internal element, any schema that ends up with thousands or tens of thousands of elements should be avoided. A simple example would be if every vertex were a separate element. That's not the case with Collada so far.
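
    To illustrate (this is a hypothetical layout, not any particular parser's actual structure), a DOM-style parser typically keeps something like this per element:

    // Hypothetical sketch of the per-element bookkeeping a DOM-style XML
    // parser keeps in memory; the fields are illustrative, not any real
    // parser's layout.
    #include <string>
    #include <vector>

    struct XmlNode {
        std::string           name;      // tag name
        std::string           text;      // character data
        std::vector<XmlNode*> children;  // child elements
        XmlNode*              parent;    // back-pointer for traversal
        // attributes, namespaces, source positions etc. omitted
    };

    Even this stripped-down node is on the order of 100 bytes before counting the heap allocations behind the strings and the child list, so one element per vertex multiplies quickly.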

  2. #2
    Senior Member
    Join Date
    Aug 2004
    Location
    California
    Posts
    771
    The COLLADA schema will certainly grow in the number of element types and attributes as we add features. I don't foresee an explosion of elements though because the design is both generic and parametric at the level of containing data blocks.

    I expect to see reuse of the <param>, <source> and <array> elements and not the introduction of domain specific elements to hold data.

  3. #3
    Member
    Join Date
    Aug 2004
    Location
    SCEJ - Tokyo, Japan
    Posts
    34
I wasn't suggesting more types of elements. I was suggesting two things. The first is that, for example,

    <Array size="5" type="xs:float">
    1.0 2.0 3.0 4.0 5.0
    </Array>

be changed, or at least optionally allowed to be, something like

    <Array size="5" type="xs:float" binary="true">
    <![CDATA[$%%$$#"%"#$!$"$%$%&"]]>
    </Array>

where the CDATA payload is the binary representation of those 5 floats in some format specified by Collada, most likely the standard PC float format in PC byte order, since the PC is the most likely place for this data to be used.

Because while parsing 5 ASCII representations of floats and converting them to real floats is not a big deal, parsing 5 million of them is. If the format allowed this binary option, and all the exporters supported it either by default or as an option, then when parsing an array I would already know the size ("5") and the binary representation of the type (because it would be specified in the Collada spec). So once I hit <![CDATA[ I would know exactly how many bytes to read out of the file, and I could load them directly into memory and use them immediately.
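
    As a rough sketch of what that inner loop could look like (the function name and the fixed IEEE-754 PC-byte-order encoding are assumptions here, not anything the spec says yet):

    #include <cstddef>
    #include <cstring>
    #include <vector>

    // Hypothetical reader: once the parser reaches the CDATA payload, the
    // byte count is implied by count * sizeof(float), so the values can be
    // copied straight into memory with no text-to-float conversion.
    std::vector<float> readBinaryFloatArray(const char* cdata, std::size_t count)
    {
        std::vector<float> values(count);
        // Assumes the file and the host agree on float format and byte
        // order (e.g. IEEE-754, PC byte order), as the spec would mandate.
        std::memcpy(values.data(), cdata, count * sizeof(float));
        return values;
    }

    Compare one memcpy to calling a text-to-float conversion five million times.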

Given that polygon counts are going to go up at least an order of magnitude for the next gen, I think it would be a good idea, while it's still possible, to optimize this format to be better aimed at large data sets.

This wouldn't sacrifice any usefulness or genericness as far as I can see, but it would make it possible to use the data faster and bring conversion times down.

The other thing I was suggesting is to try to avoid, where possible, massively repeating elements. The spec already does this as far as I can tell; I was just pointing it out because I wasn't sure whether that was intentional or just luck.

For example, IF the vertex format were something like

    <vertices num="3">
    <vert>1 2 3</vert>
    <vert>3 2 1</vert>
    <vert>5 5 6</vert>
    </vertices>

    or worse

    <vertices num="3">
    <vert><x>1</x><y>2</y><z>3</z></vert>
    <vert><x>3</x><y>2</y><z>1</z></vert>
<vert><x>5</x><y>5</y><z>6</z></vert>
    </vertices>


That would end up being hugely expensive to parse, and most XML parsers would choke on it once the file sizes got really large, since every single vertex would be stored as a separate element structure in the internal parse tree.
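
    As a rough back-of-the-envelope (assuming ~100 bytes of parser bookkeeping per element): the second layout allocates four elements per vertex, so at 5 million vertices that is 20 million nodes and on the order of 2 GB of parse-tree overhead before a single float of actual vertex data is stored.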

  4. #4

    Reading speed

    As a reference:
On a fairly average PC, the EQUINOX-3D COLLADA importer takes less than 3 seconds to read a 526,338-triangle terrain model.
    The file has more than 5.5 million floats (vertex array + normals) and about 1.6 million ints.

  5. #5
    Quote Originally Posted by greggman
The other thing I was suggesting is to try to avoid, where possible, massively repeating elements. The spec already does this as far as I can tell; I was just pointing it out because I wasn't sure whether that was intentional or just luck.
    Luck, huh?

    It is very intentional. We actually had to fight some forces that wanted more verbose vertex representations...

  6. #6
    Senior Member
    Join Date
    Jul 2004
    Location
    Santa Clara
    Posts
    356
Quote Originally Posted by greggman
What that means in terms of Collada is that things like using an ASCII-based format for vertex positions, UVs, colors, weights and normals may not be a good idea.
    You can already store this information in the most compact binary form you want by using external references in COLLADA if you need to.

The main issue is the way the data is segmented, and having the capability for dynamic paging and/or partial updates in the game engine/content tools. That provides several orders of magnitude of speed improvement, while speeding up floating-point loading will only help marginally.
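
    As a minimal sketch of the paging idea (POSIX-specific; the helper name and the raw-float file layout are assumptions for illustration): keep the heavy arrays in an externally referenced binary file and map it, so the OS pages vertex data in as it is touched instead of the importer parsing it all up front.

    #include <cstddef>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    // Hypothetical sketch: map an externally referenced vertex file so the
    // OS pages data in on first touch instead of loading it all at startup.
    const float* mapVertexFile(const char* path, std::size_t* countOut)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return nullptr;

        struct stat st;
        if (fstat(fd, &st) != 0) { close(fd); return nullptr; }

        void* mem = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);  // the mapping keeps the file contents accessible
        if (mem == MAP_FAILED) return nullptr;

        *countOut = st.st_size / sizeof(float);
        return static_cast<const float*>(mem);
    }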

  7. #7
    Quote Originally Posted by remi
What that means in terms of Collada is that things like using an ASCII-based format for vertex positions, UVs, colors, weights and normals may not be a good idea.
    You can already store this information in the most compact binary form you want by using external references in COLLADA if you need to.
Which would mean that he would have to write a partial exporter again.

Am I right?

I fear he would not be the only developer who would want to do that and then decide they do not need COLLADA.

Making everyone happy is not possible, but an industry standard should aim to please at least a large chunk of developers to ensure acceptance.

  8. #8
    Member
    Join Date
    Aug 2004
    Location
    SCEJ - Tokyo, Japan
    Posts
    34

    Re: Reading speed

    Quote Originally Posted by gabor_nagy
    As a reference:
On a fairly average PC, the EQUINOX-3D COLLADA importer takes less than 3 seconds to read a 526,338-triangle terrain model.
    The file has more than 5.5 million floats (vertex array + normals) and about 1.6 million ints.
The Unreal 3 site claims 100 million polys in their outdoor levels, so at 3 seconds per 0.5 million polys that would take 10 minutes to load.

I'm not suggesting any less genericness. I'm only suggesting a simple optimization for fairly standard types: arrays of bits, arrays of ints, arrays of floats, and possibly arrays of 2, 3 and 4 floats if plain arrays of floats don't cover that. I wouldn't give up on Collada if it wasn't added, but it seemed like a pretty simple thing to ask for: it didn't seem like it would really break anything, and it would speed things up to some extent.

  9. #9
    Junior Member
    Join Date
    Aug 2004
    Location
    SCEE, UK
    Posts
    9

    Re: Reading speed

    Quote Originally Posted by greggman
The Unreal 3 site claims 100 million polys in their outdoor levels, so at 3 seconds per 0.5 million polys that would take 10 minutes to load.
    Reality check - if our current-gen PS2 Maya scenes loaded in 10 minutes, we'd be over the moon.
    Andrew Ostler
    Senior Principal Programmer
    SCEE

  10. #10

    Re: Reading speed

    Quote Originally Posted by os
    Quote Originally Posted by greggman
The Unreal 3 site claims 100 million polys in their outdoor levels, so at 3 seconds per 0.5 million polys that would take 10 minutes to load.
That's the source data, before the detail bump-map generation! It's not what you see at runtime.

    Quote Originally Posted by os
    Reality check - if our current-gen PS2 Maya scenes loaded in 10 minutes, we'd be over the moon.
Thanks OS. The same scene (if it's the swamp) loads in ~5 seconds with the above-mentioned setup (that's the non-truncated Collada version, including reading all the textures from TIFFs!).
I'm sure that's not all the data, but hey, it's XML, and Maya has a binary format...

A binary format wouldn't be orders of magnitude faster unless it's a direct representation of the internal format of a specific runtime (i.e. non-interchangeable) and doesn't deal with issues such as byte-order independence.
    As a reference: my binary format is only about 2-3x faster than Collada.
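
    To illustrate the byte-order point (a minimal sketch, not anything Collada specifies): a byte-order-independent binary reader still does per-value work on a mismatched host, which eats into the win over text parsing:

    #include <cstdint>
    #include <cstring>

    // Minimal sketch of the per-float work a byte-order-independent binary
    // format still pays when the file and the host disagree on endianness.
    float readFloatBigEndian(const unsigned char* p)
    {
        std::uint32_t bits = (std::uint32_t(p[0]) << 24) |
                             (std::uint32_t(p[1]) << 16) |
                             (std::uint32_t(p[2]) <<  8) |
                              std::uint32_t(p[3]);
        float value;
        std::memcpy(&value, &bits, sizeof value);  // avoid pointer aliasing
        return value;
    }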

    Also, I doubt that they read the whole scene into Unreal all at once.
That would be about 3.5 GB with triangles that have only positions per vertex (100 million triangles × 3 vertices × 12 bytes; no normals, texcoords, etc.), less with triangle strips, but still...
