DOM guide: Importing documents

From COLLADA Public Wiki

Revision as of 01:31, 26 April 2008

Be sure to read the section on creating documents first. It covers some important topics relevant to this section.

A simple example

Let's begin with a simple example of reading some information from a Collada document. We'll open the document and print the ID of the first <node> we find.

DAE dae;
daeElement* root = dae.open("simpleImport.dae");
if (!root) {
    cout << "Document import failed.\n";
    return 0;
}

We create the DAE object then call DAE::open to open a file called "simpleImport.dae". If there is no file of that name in the current directory, or the file failed to open for some other reason, then the DAE::open method will return null. We check for that and print an error message if opening the document failed.

daeElement* node = root->getDescendant("node");
if (!node)
    cout << "No nodes found\n";
else
    cout << "node id: " << node->getAttribute("id") << endl;

Here we use the daeElement::getDescendant method to do a breadth-first search through the xml element tree for an element with the given name. This method will return null if it couldn't find an element with a matching name, which we check for. If it did find a matching element we use the daeElement::getAttribute method to print the value of the 'id' attribute.

The complete code.

#include <iostream>
#include <dae.h>
#include <dom/domCOLLADA.h>
using namespace std;

int main() {
	DAE dae;
	daeElement* root = dae.open("simpleImport.dae");
	if (!root) {
		cout << "Document import failed.\n";
		return 0;
	}

	daeElement* node = root->getDescendant("node");
	if (!node)
		cout << "No nodes found\n";
	else
		cout << "node id: " << node->getAttribute("id") << endl;

	return 0;
}

The simpleImport.dae document.

<?xml version="1.0" encoding="UTF-8"?>
<COLLADA xmlns="" version="1.4.1">
    <node id="hello"/>
</COLLADA>

And the results of running the program.

node id: hello

Reading data from elements

Any individual xml element has four types of data you might need: the element name, the element's attributes, the element's character data, and the element's child elements. The DOM provides easy access to all of this data via the daeElement interface.

Element name

Use the daeElement::getElementName method to get an element's name.

daeString getElementName() const; // Function signature
cout << elt->getElementName() << endl; // Example: print an element's name

Element attributes

To get the value of an attribute given the attribute's name, use the daeElement::getAttribute method.

std::string getAttribute(daeString name);

We've already seen an example of daeElement::getAttribute usage in the simple import example.

cout << "node id: " << node->getAttribute("id") << endl;

If you don't know what attributes an element has, you can iterate over its attribute list using the following methods of daeElement.

size_t getAttributeCount();
std::string getAttributeName(size_t i);
std::string getAttribute(size_t i);

This code snippet prints all the attribute names and values of the root element.

for (size_t i = 0; i < root->getAttributeCount(); i++) {
	cout << "attr " << i << " name: " << root->getAttributeName(i) << endl;
	cout << "attr " << i << " value: " << root->getAttribute(i) << endl;
}

Character data

You can retrieve an element's character data with the daeElement::getCharData method.

std::string getCharData();

For example, let's say you have an <asset> element and you want to tell if the <up_axis> setting is Z_UP. You could do that as follows.

daeElement* upAxis = asset->getDescendant("up_axis");
if (upAxis && upAxis->getCharData() == "Z_UP") {
    // We have a match!
}

Child elements

If you know the name of the child element you want, you can access it with daeElement::getChild.

daeElement* getChild(daeString eltName);

This will return null if the element with the given name doesn't exist. You might use this function to test for the existence of a particular child element.

if (root->getChild("asset") == NULL)
    cout << "Missing <asset> element!\n";

If you don't have a specific element in mind you can get a list of all the child elements instead with the daeElement::getChildren method.

daeTArray< daeSmartRef<daeElement> > getChildren();

It returns an array of smart pointers to daeElement objects, which you can simply treat like ordinary daeElement pointers. You can use daeElement::getChildren to print a list of all the child elements of root like this.

daeTArray<daeElementRef> children = root->getChildren();
for (size_t i = 0; i < children.getCount(); i++)
	cout << "child " << i << " name: " << children[i]->getElementName() << endl;

daeElementRef is just a typedef for daeSmartRef<daeElement> that's made available to DOM clients to keep code simpler.

The dom* classes

As was mentioned in the creating documents section, the dom* classes provide an alternative interface to working with elements in the DOM. All of the operations discussed so far can be done with the dom* classes instead of the daeElement interface. For example, the code to print the id attribute of the first <node> in the document could've been written like this instead:

domNode* node = (domNode*)root->getDescendant("node");
if (!node)
	cout << "No nodes found\n";
else
	cout << "node id: " << node->getId() << endl;

The dom* classes provide a more strongly typed interface to the Collada elements, and sometimes this can be convenient. Use your judgment to decide between the daeElement interface and a dom* class for a given task.

Element hierarchy traversal

An xml document contains a tree of elements. Each element has a list of children, and each child has its own list of children, and so on. The DOM provides several methods in the daeElement interface for easily navigating a document's element tree.

// Search downward
daeElement* getChild(daeString eltName);
daeElement* getDescendant(daeString eltName);
// Search upward
daeElement* getParent();
daeElement* getAncestor(daeString eltName);

The first two methods, getChild and getDescendant, are used for searching downward through the element tree. We've already seen these methods used in previous examples. getDescendant does a breadth-first search down the element tree, looking for an element with the given name. getChild works exactly the same, except that it only goes one level deep.
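
To make the difference between the two downward searches concrete, here is a minimal sketch of a one-level search and a breadth-first search over a toy element tree. The Element struct and the findChild and findDescendant functions are hypothetical stand-ins invented for this illustration; they are not the DOM's actual implementation.

```cpp
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for an XML element; not the DOM's daeElement.
struct Element {
    std::string name;
    std::vector<Element*> children;
};

// One level deep: return the first direct child with the given name.
Element* findChild(Element* parent, const std::string& name) {
    for (Element* child : parent->children)
        if (child->name == name)
            return child;
    return nullptr;
}

// Breadth-first: visit every element at depth 1, then depth 2, and so on,
// so the match closest to the starting element is found first.
Element* findDescendant(Element* root, const std::string& name) {
    std::queue<Element*> pending;
    for (Element* child : root->children)
        pending.push(child);
    while (!pending.empty()) {
        Element* current = pending.front();
        pending.pop();
        if (current->name == name)
            return current;
        for (Element* child : current->children)
            pending.push(child);
    }
    return nullptr;  // No element with that name below root.
}
```

Because the search is breadth-first, a shallow match is returned before a deeper one, even if the deeper one appears earlier in the document text.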

To search upward, use the daeElement::getParent and daeElement::getAncestor functions. getParent doesn't do a "search" exactly. Since an element only has one parent, getParent simply returns that element. getAncestor goes all the way up the element tree to the root searching for an element with the given name.

All the methods for element hierarchy traversal return null if a matching element isn't found.
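
The upward search can be pictured as a walk along parent pointers. The following is only a sketch under that assumption; the Element struct and findAncestor function are hypothetical names for this illustration, not DOM code.

```cpp
#include <string>

// Hypothetical stand-in for an XML element that knows its parent.
struct Element {
    std::string name;
    Element* parent;  // nullptr for the root element
};

// Walk from the element's parent up to the root, returning the first
// ancestor with the given name, or nullptr if there is none.
Element* findAncestor(Element* elt, const std::string& name) {
    for (Element* current = elt->parent; current; current = current->parent)
        if (current->name == name)
            return current;
    return nullptr;
}
```

Note that the starting element itself is never considered a match in this sketch; only its ancestors are.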

Using the database to get elements by type or ID

The DOM also comes with an efficient mechanism for finding elements by type or ID. This functionality is implemented by the daeDatabase class, but calling it a 'database' might be a bit misleading. Internally the DOM uses standard C++ multimaps to implement a cache to quickly find a daeElement given the element's ID or type.

Each DAE object has an associated daeDatabase that can be retrieved with the DAE::getDatabase method.

virtual daeDatabase* getDatabase();

Finding an element by ID

Retrieving a daeElement given the element's ID is a fairly common operation, and is performed frequently by the DOM internally when working with URIs and ID references. Sometimes you'll need to do it in your own code also. The method to use is daeDatabase::idLookup.

virtual std::vector<daeElement*> idLookup(const std::string& id) = 0;

You might be surprised to see that this method returns an array of elements via std::vector. After all, an ID must be unique within an entire Collada document, so how could there be multiple elements with a given ID? The answer is that the DOM can have multiple documents loaded at the same time. So for a given ID, there might be multiple matching elements in different documents, and each of these elements is returned by the idLookup method.

More commonly you'll want to find an element by ID in a specific document. For that, another version of the idLookup method is provided.

daeElement* idLookup(const std::string& id, daeDocument* doc);

This method is just like the previous idLookup method, except that it takes a daeDocument object as the second parameter. Since there can only be one element with the given ID in the specified document, this method returns a single daeElement instead of an array of daeElements.
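
To see why one lookup returns a list and the other a single pointer, consider this rough sketch of an ID cache built on std::multimap. The Document, Element and IdIndex types and both idLookup functions here are hypothetical illustrations; the real daeDatabase may be organized differently.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; not the DOM's daeDocument or daeElement.
struct Document { std::string uri; };
struct Element  { std::string id; Document* doc; };

// A multimap allows the same ID to appear once per loaded document.
typedef std::multimap<std::string, Element*> IdIndex;

// All elements with the given ID, across every loaded document.
std::vector<Element*> idLookup(const IdIndex& index, const std::string& id) {
    std::vector<Element*> matches;
    auto range = index.equal_range(id);
    for (auto it = range.first; it != range.second; ++it)
        matches.push_back(it->second);
    return matches;
}

// The single element with the given ID in one document, or nullptr.
Element* idLookup(const IdIndex& index, const std::string& id,
                  const Document* doc) {
    auto range = index.equal_range(id);
    for (auto it = range.first; it != range.second; ++it)
        if (it->second->doc == doc)
            return it->second;
    return nullptr;
}
```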

You can get the daeDocument from any other element in the same document with the daeElement::getDocument method. For example, you might find the element with id 'myElement' in the same document as element 'root' like this.

daeElement* elt = dae.getDatabase()->idLookup("myElement", root->getDocument());

Element types in the DOM

So far we've discussed types in the DOM very little. I've explained that each type in the Collada schema gets mapped to a dom* class, and that each of these classes implements the daeElement interface. In the DOM, every dom* class has an associated type ID which can be queried at runtime using the 'ID' method. For example, to get the type ID of the domNode class (which corresponds to the <node> Collada element), you would write domNode::ID(), to get the domGeometry type ID you would write domGeometry::ID(), etc.

The daeElement interface provides a method typeID to query the type of any daeElement. This is useful when you want to confirm that a daeElement is of a particular type, for example to cast to a dom* class, like this.

daeElement* elt = root->getDescendant("surface");
if (elt && elt->typeID() == domFx_surface_common::ID()) {
    // We have a match!
    domFx_surface_common* surface = (domFx_surface_common*)elt;
}

Checking the type of the returned element is especially important in this case because the Collada schema uses the element name "surface" with many different schema types. The getDescendant call could return an element of a type other than domFx_surface_common, in which case casting to domFx_surface_common would be invalid. By checking the type first we guard against any problems.

Type checking in this fashion is common enough that the DOM provides a cast operator, daeSafeCast, which could be used to shorten the previous example.

domFx_surface_common* surface = daeSafeCast<domFx_surface_common>(root->getDescendant("surface"));
if (surface) {
    // We have a match!
}
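
As a rough picture of how a type-checked cast like this can work, here is a small sketch that compares a per-class type ID with the element's runtime type ID before casting. The Element, Node and Geometry classes, their ID values, and the safeCast template are all hypothetical; this is not the DOM's daeSafeCast source.

```cpp
// Hypothetical base class; the real daeElement is much richer.
struct Element {
    virtual ~Element() {}
    virtual int typeID() const = 0;
};

// Each concrete class exposes a static ID() and reports it at runtime.
struct Node : Element {
    static int ID() { return 1; }
    int typeID() const override { return ID(); }
};

struct Geometry : Element {
    static int ID() { return 2; }
    int typeID() const override { return ID(); }
};

// Cast only when the element is non-null and its type ID matches T.
template <typename T>
T* safeCast(Element* elt) {
    if (elt && elt->typeID() == T::ID())
        return static_cast<T*>(elt);
    return nullptr;
}
```

The null check inside the cast is what lets the result of a failed search be passed straight in without a separate test.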

Finding elements by type

Sometimes it's useful to perform an operation on all elements of a specific type. For example when writing a Collada conditioner you might want to find all the <geometry> elements and do some processing on them. The method daeDatabase::typeLookup is useful for these types of tasks.

std::vector<daeElement*> typeLookup(daeInt typeID, daeDocument* doc = NULL);
template<typename T> std::vector<T*> typeLookup(daeDocument* doc = NULL);

The first method returns an array of daeElements, while the second returns an array of dom* elements. For example, you could print the IDs of all nodes like this.

vector<daeElement*> nodes = dae.getDatabase()->typeLookup(domNode::ID());
for (size_t i = 0; i < nodes.size(); i++)
	cout << "node " << i << " id: " << nodes[i]->getAttribute("id") << endl;

You could also do it using the second typeLookup method instead.

vector<domNode*> nodes = dae.getDatabase()->typeLookup<domNode>();
for (size_t i = 0; i < nodes.size(); i++)
	cout << "node " << i << " id: " << nodes[i]->getId() << endl;

Note that the typeLookup methods search through all documents by default, but take an optional document argument to restrict the search to that document.

Working with URIs

URIs are used all throughout Collada to establish references to elements and external resources (such as texture files). The DOM represents URIs with the daeURI class. Wherever the schema uses a URI, the DOM creates a daeURI object. Detailed information about the daeURI class can be found in daeURI.h, but we'll cover some of the more common uses of the daeURI class here.

Retrieving the URI components

As is discussed in the URI spec, all URIs can be broken down into five component parts: scheme, authority, path, query, and fragment. Sometimes you need to access these components, and the daeURI class provides convenient accessor methods for that purpose.

const std::string& scheme() const;
const std::string& authority() const;
const std::string& path() const;
const std::string& query() const;
const std::string& fragment() const;
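
For readers unfamiliar with the five components, the sketch below splits a URI string using the regular expression given in Appendix B of RFC 3986. The UriParts struct and splitUri function are hypothetical helpers for this illustration; daeURI does its own parsing and this is not its implementation.

```cpp
#include <regex>
#include <string>

// Hypothetical container for the five URI components.
struct UriParts {
    std::string scheme, authority, path, query, fragment;
};

// Split a URI reference into its components using the regular
// expression from RFC 3986, Appendix B.
UriParts splitUri(const std::string& uri) {
    static const std::regex pattern(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    UriParts parts;
    std::smatch m;
    if (std::regex_match(uri, m, pattern)) {
        parts.scheme    = m[2].str();
        parts.authority = m[4].str();
        parts.path      = m[5].str();
        parts.query     = m[7].str();
        parts.fragment  = m[9].str();
    }
    return parts;
}
```

For an element reference such as "#myGeom", every component except the fragment comes back empty.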

The DOM offers some utility functions to break the path component down further.

// Individual path component accessors. If you need access to multiple path
// components, calling pathComponents() will be faster.
std::string pathDir() const;      // daeURI("/folder/file.dae").pathDir() == "/folder/"
std::string pathFileBase() const; // daeURI("/folder/file.dae").pathFileBase() == "file"
std::string pathExt() const;      // daeURI("/folder/file.dae").pathExt() == ".dae"
std::string pathFile() const;     // daeURI("/folder/file.dae").pathFile() == "file.dae"

All of these functions should be fairly self-explanatory.
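
If the behavior is still unclear, this sketch reproduces the results shown in the comments above using plain string operations on a path. These free functions are hypothetical and operate on a std::string rather than a daeURI; edge cases such as dot-files or trailing slashes may be handled differently by the DOM.

```cpp
#include <string>

// Everything up to and including the last '/'.
std::string pathDir(const std::string& path) {
    std::string::size_type slash = path.rfind('/');
    return slash == std::string::npos ? std::string()
                                      : path.substr(0, slash + 1);
}

// Everything after the last '/'.
std::string pathFile(const std::string& path) {
    std::string::size_type slash = path.rfind('/');
    return slash == std::string::npos ? path : path.substr(slash + 1);
}

// The file name without its extension.
std::string pathFileBase(const std::string& path) {
    std::string file = pathFile(path);
    return file.substr(0, file.rfind('.'));
}

// The extension including the leading '.', or empty if there is none.
std::string pathExt(const std::string& path) {
    std::string file = pathFile(path);
    std::string::size_type dot = file.rfind('.');
    return dot == std::string::npos ? std::string() : file.substr(dot);
}
```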

Obtaining daeElements from URI element references

Many (but not all) of the URIs in Collada are element references. That is, they're meant to point to Collada elements. For these types of URIs, you can use the daeURI::getElement method to retrieve the daeElement referenced by a URI. Internally the DOM uses the daeDatabase to do a quick lookup of the element based on the URI's fragment, which is the element's ID.

daeElementRef getElement();

An example of an element reference URI is the 'url' attribute of the <instance_geometry> element. That attribute is a URI that points to a <geometry> element. Here's an example of finding an <instance_geometry> element in a document and then using the daeURI class to get the referenced <geometry> element.

domInstance_geometry* geomInst = dae.getDatabase()->typeLookup<domInstance_geometry>().at(0);
daeElement* geom = geomInst->getUrl().getElement();

External document references

URIs enable you to reference elements in external documents, which is an important feature of Collada. When the DOM loads a document that contains external references, the referenced documents are left unloaded at first. When you attempt to call daeURI::getElement to obtain a daeElement from another document, that document is loaded and the element is found in the other document and returned. This means that calling daeURI::getElement can trigger a document load and is therefore a potentially expensive operation. This is all handled behind the scenes for you and is one of the nice conveniences provided by the DOM.

In some cases though it might be useful to check if a URI is a local reference or an external reference. The daeURI class provides the isExternalReference method for this purpose.

daeBool isExternalReference() const;

This method returns true if the URI references a document other than the document the URI lives in (i.e. it's an external reference), and false if the URI is a normal local reference.
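
The idea behind the check can be sketched with plain strings: strip the fragment from the reference and see whether what remains names a different document. The function below is a hypothetical simplification that compares strings exactly and does not resolve relative URIs, which the real daeURI class presumably does.

```cpp
#include <string>

// Returns true if 'reference' points into a document other than the one
// identified by 'containingDocUri'. Simplified: no relative-URI resolution.
bool isExternalReference(const std::string& reference,
                         const std::string& containingDocUri) {
    // Everything before '#' identifies the target document.
    std::string docPart = reference.substr(0, reference.find('#'));
    // An empty document part (e.g. "#myGeom") means "this document".
    return !docPart.empty() && docPart != containingDocUri;
}
```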

Converting a URI to a file path

In some cases it's necessary to convert a URI to/from a file path. It's important to note that only file scheme URIs can be converted to file paths. For other URIs (like an http URI), it makes no sense to convert to a file path. The DOM provides functions to convert in both directions.

namespace cdom {
    std::string nativePathToUri(const std::string& nativePath,
                                systemType type = getSystemType());
    std::string uriToNativePath(const std::string& uriRef,
                                systemType type = getSystemType());
}

The first function converts a native file system path to a URI, and the second function converts a URI to a native file system path. The type parameters allow you to specify a path type other than the native system type (Posix (Linux, Mac) and Windows paths are supported). It can usually be left alone.
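
To give a feel for what such a conversion involves, here is a deliberately simplified sketch for absolute Posix paths only. The function names are hypothetical, and the sketch ignores percent-encoding, authorities, relative references and Windows paths, all of which a real conversion has to deal with.

```cpp
#include <string>

// Convert a file-scheme URI to a Posix path. Returns an empty string
// for any URI that does not start with "file://".
std::string fileUriToPosixPath(const std::string& uri) {
    const std::string prefix = "file://";
    if (uri.compare(0, prefix.size(), prefix) != 0)
        return std::string();
    return uri.substr(prefix.size());
}

// Convert an absolute Posix path to a file-scheme URI.
std::string posixPathToFileUri(const std::string& absolutePath) {
    return "file://" + absolutePath;
}
```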

An example of when you might want to use these functions is when you want to load a texture image. Like all external resources in Collada, textures are referenced using URIs. Most texture loading libraries don't understand URIs, though; they work with file paths. You can use the uriToNativePath function to convert the URI reference to a file path for loading.

domImage* image = dae.getDatabase()->typeLookup<domImage>().at(0);
string uri = image->getInit_from()->getValue().str();
string filePath = cdom::uriToNativePath(uri);
if (filePath.empty())
	cout << "The uri couldn't be represented as a file path. Perhaps an http scheme uri.\n";
else
	cout << filePath << endl;

Reading <extra> data

The daeElement interface provides a schema independent mechanism to work with xml data, and this works perfectly for reading <extra> data, for which there is no schema.

In the creating documents section I showed how you could use the DOM to create a document with <extra> data. The resulting document looked like this.

    <extra>
        <technique profile="steveT">
            <myElement myAttr="myValue">this is some text</myElement>
        </technique>
    </extra>

Now let's read that document back in and parse the <extra> content. Here's a complete annotated program that shows how you could do that.

#include <iostream>
#include <dae.h>
#include <dom/domCOLLADA.h>
using namespace std;

int main() {
    DAE dae;
    daeElement* root = dae.open("extra.dae");
    if (!root) {
        cout << "Document import failed.\n";
        return 0;
    }

    // Get a daeElement pointer to the <extra> element
    if (daeElement* extra = root->getDescendant("extra")) {

        // Check for a <technique> child element
        if (daeElement* technique = extra->getChild("technique")) {

            // Check the <technique>'s 'profile' attribute and make sure it matches what we expect.
            // This info could also be encoded in the 'type' attribute on the <extra> element.
            if (technique->getAttribute("profile") == "steveT") {

                // Get our custom element and print some info
                if (daeElement* elt = technique->getChild("myElement")) {
                    cout << "myAttr = " << elt->getAttribute("myAttr") << endl;
                    cout << "char data = " << elt->getCharData() << endl;
                }
            }
        }
    }

    return 0;
}

Note that at each step we're checking the return value to make sure we have an element where we expect it to be. Without the checks we could dereference a null pointer and crash if the document doesn't contain the exact elements we expect. The important thing to note is that when you use the daeElement interface, <extra> data can be read and processed just like normal Collada data.