Khronos Public Bugzilla
Bug 1924 - SIDREF syntax is unsane as described and used
Status: NEW
Alias: None
Product: COLLADA
Classification: Unclassified
Component: Schema
Version: 1.4.1
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: COLLADA Work Group email alias
QA Contact: COLLADA Work Group email alias
Depends on:
Reported: 2017-01-31 08:03 PST by Mick P.
Modified: 2017-02-01 02:54 PST

Description Mick P. 2017-01-31 08:03:50 PST
From a design perspective, the existing SIDREF syntax is not a good foundation as it stands. It needs slight changes so that it conforms to the description in the specification and no longer allows some unsightly degenerate cases.

In places, examples in the PDF manual use the ID part of a SIDREF to refer to a SID. The explanation at the top of the manual says nothing about this usage. It seems like a useful construction, but implementing it cleanly seems to require two rules:

1) A SIDREF must always begin with an ID. This conforms with the guidelines in the manual, but not with its examples.

2) To omit the ID part, it should be necessary to begin the SIDREF with a / character. The manual suggests the ID part is required. From an application perspective it is messy to look for an ID first and then fall back to a SID as an ill-defined countermeasure. If the reference is allowed to begin with /, that clearly indicates that the root of the document is the ID, or that the # fragment is the ID.

Another issue is using the member-selection feature on the ID itself. This seems ill-formed, and it produces especially strange parsing edge cases when the ID is the special ./ relative addressing syntax, or when it is the / syntax proposed above.

It simply should not be possible to use member-selection syntax on an ID, and it should not be possible to omit the SID path part. If the SID path part is present, then the member-selection part is well defined.
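To make the restrictions above concrete, here is a minimal sketch of a parser that accepts only the strict form. It is not taken from any existing COLLADA library; the names SidRef and parse_strict_sidref are hypothetical, and the character classes are a rough assumption (in particular, a SID is assumed to contain no "." so that the member selector is unambiguous, which is stricter than xs:NCName).

```python
import re
from typing import NamedTuple, Optional, Tuple


class SidRef(NamedTuple):
    base: str              # an ID, "." (relative), or "" for the proposed leading "/"
    sids: Tuple[str, ...]  # one or more SIDs; never empty under the strict rules
    member: Optional[str]  # e.g. "X" or "(3)"; None when no member is selected


# Assumed lexical rules (hypothetical, simplified from xs:NCName).
_BASE = re.compile(r"^(?:|\.|[^\s/()]+)$")
_SID = re.compile(r"^[^\s./()]+$")
# Last path step: a SID, optionally followed by ".NAME" or one or two "(n)" groups.
_LAST = re.compile(r"^([^\s./()]+)(?:\.([^\s./()]+)|((?:\(\d+\)){1,2}))?$")


def parse_strict_sidref(text: str) -> SidRef:
    base, slash, path = text.partition("/")
    if not slash:
        # Covers a bare ID, a bare SID, and member selection applied to an ID.
        raise ValueError("no SID path part")
    if not _BASE.match(base):
        raise ValueError("malformed ID part")
    *middle, last = path.split("/")
    m = _LAST.match(last)
    if m is None or not all(_SID.match(s) for s in middle):
        raise ValueError("malformed SID path")
    return SidRef(base, (*middle, m.group(1)), m.group(2) or m.group(3))
```

With these rules "here/rot.ANGLE", "./rot.ANGLE" and "/trans.X" parse, while "here.X", "here(3)" and "here/" are rejected outright instead of being resolved by guesswork.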

Having software meet all of these strange requirements seems very counterproductive and not forward-looking enough. The specifications should adopt these restrictions, and should also consider changing the type of SIDREF so that a name beginning with / validates. They should further recommend that software deprecate degenerate syntax and ambiguous resolution countermeasures, and not support them by default.
Comment 1 Mick P. 2017-01-31 08:16:45 PST
A third issue with SIDREF is the double-parentheses, 2D-array-like member-selection syntax. I feel strongly that it should not be allowed going forward, and that it should be kept only for backward compatibility with <matrix> and <lookat>, and only to support legacy documents.

What this would mean is that (3)(3) is no longer allowed for selecting the last element in a 4x4 matrix. Instead (15) would be required. XML Schema has no concept of 2D arrays.
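As a rough illustration of the migration, the sketch below rewrites a trailing two-index selector into a single flat index. The function name to_flat_member is hypothetical, and the ordering is an assumption: the first index is treated as the major one (first * width + second). The specification's actual index order would need to be checked before relying on anything other than symmetric cases such as (3)(3).

```python
import re

# A trailing "(a)(b)" pair at the very end of the reference.
_PAIR = re.compile(r"\((\d+)\)\((\d+)\)$")


def to_flat_member(ref: str, width: int = 4) -> str:
    """Rewrite a trailing (a)(b) selector as a single (n) selector.

    ASSUMPTION: the first index is the major one, so n = a * width + b.
    References without a trailing pair are returned unchanged.
    """
    def flatten(match: "re.Match[str]") -> str:
        first, second = int(match.group(1)), int(match.group(2))
        if second >= width:
            raise ValueError("second index out of range for the given width")
        return "(%d)" % (first * width + second)

    return _PAIR.sub(flatten, ref)
```

For a 4x4 matrix, to_flat_member("here/mat4(3)(3)") gives "here/mat4(15)", and a one-index reference such as "here/rot(3)" passes through untouched.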

The following forum discussion discusses SIDREFs.

In order to support the two-index syntax it would be necessary either to maintain lists of element names, which would be very error-prone, or to require the existence of a schema and to parse the names of its datatypes looking for strings like something2x4something. This is too error-prone, computationally intensive, and difficult to maintain, and it frustrates users who would do novel things with COLLADA.
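To show why the type-name approach is fragile, here is a minimal sketch of the heuristic. The function guess_matrix_shape is hypothetical, and the type names passed to it below are only illustrative inputs, not a survey of either schema.

```python
import re
from typing import Optional, Tuple

# Look for "<digits>x<digits>" anywhere in a datatype name.
_DIMS = re.compile(r"(\d+)x(\d+)")


def guess_matrix_shape(type_name: str) -> Optional[Tuple[int, int]]:
    """Guess a 2D shape from a datatype name, or return None if no NxM appears."""
    m = _DIMS.search(type_name)
    return (int(m.group(1)), int(m.group(2))) if m else None
```

A name like "float4x4" yields (4, 4), but an element whose type name carries no dimensions yields None and needs a hand-maintained table, and an unrelated user-defined name such as "max16x9aspect" yields a bogus (16, 9). The name also does not say which number is the row count.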
Comment 2 Mick P. 2017-02-01 02:54:48 PST
<texture texture> is a highly visible source of these kinds of degenerate SIDREFs. However, in its case it is NOT a SIDREF at all, since its type is xs:NCName, although its target is <newparam sid>. Because a value of that type cannot contain a slash (/), it cannot work according to the descriptions of SIDREF addressing, and no alternative addressing scheme is provided.
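The point about xs:NCName can be checked mechanically. The sketch below uses an ASCII-only approximation of NCName (the real production also admits many non-ASCII characters), and the function name is_ascii_ncname is hypothetical.

```python
import re

# ASCII-only approximation of xs:NCName: a letter or underscore, followed by
# letters, digits, ".", "-" or "_". No ":" and, notably, no "/".
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._\-]*")


def is_ascii_ncname(value: str) -> bool:
    return _ASCII_NCNAME.fullmatch(value) is not None
```

A bare name such as "diffuse-sampler" passes, while anything shaped like a SIDREF path, for example "effect/diffuse-sampler" or "./diffuse-sampler", cannot be a valid value of the attribute.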

<bind target> is another anomaly. Its type in both extant schemas is xs:token. Its annotation and the manual's documentation refer the reader to the "Addressing Syntax" section, which does not exist. In the provided example a slash (/) is present, so it seems safe to assume it is a SIDREF, albeit one with spaces that would fail to parse, because neither an ID nor a SID can contain spaces. There are likely many more like this. 1.5.0 has a sidref-type with a matching pattern.

In the former example (<texture texture>) the user is tempted to use SIDREF resolution, but it is not clear whether that is correct. If the "sid" attribute is duplicated anywhere in the document, there will be problems. A user who is wise to this may limit the scope of the "sid" reference, but no guidelines are offered for this case. In fact the description of <texture> is very poor. It doesn't even have its own section.
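The duplication problem is easy to reproduce. The sketch below uses Python's xml.etree.ElementTree on a made-up, heavily simplified fragment (no namespace, elements trimmed, IDs and SIDs invented). Treating the enclosing <effect> as the search scope is purely an assumption for illustration, since the manual gives no rule.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: two effects that each define a newparam with the same sid.
DOC = """
<library_effects>
  <effect id="fx1">
    <profile_COMMON>
      <newparam sid="sampler"><sampler2D/></newparam>
      <technique sid="common"><texture texture="sampler"/></technique>
    </profile_COMMON>
  </effect>
  <effect id="fx2">
    <profile_COMMON>
      <newparam sid="sampler"><sampler2D/></newparam>
    </profile_COMMON>
  </effect>
</library_effects>
"""


def find_newparams(scope: ET.Element, sid: str) -> list:
    """Return every <newparam> at or below `scope` whose sid attribute matches."""
    return [e for e in scope.iter("newparam") if e.get("sid") == sid]


root = ET.fromstring(DOC)
# Searching the whole document is ambiguous: both effects match.
whole_document = find_newparams(root, "sampler")
# Limiting the search to the enclosing effect (an assumed scope) gives one match.
within_effect = find_newparams(root.find("effect[@id='fx1']"), "sampler")
```

Here whole_document holds two candidates and within_effect holds one, which is the kind of scoping decision each application currently has to invent for itself.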

1.4.1's <sampler*><source>value</source> is similarly confused, although the manual provides a better description, one that should also apply to <texture>, where "value" is the name of a SID local to? <texture texture> is the same, albeit the value is inconsistently stored in an attribute. The manual also conflates the SID of the <newparam> with the variant type housed within the <newparam>.

IN HINDSIGHT these are not SIDREFs. It's difficult to talk about SID references when there is more than one kind, and one kind is not described. These may sound like minor issues, but when the rubber hits the code, they pose real challenges for users that are frustrating, and they make the schemas an easy target for ridicule.

NOTE: <instance_material target> is xs:anyURI whereas this attribute is called "url" for most if not all other <instance_*> types.