One of the goals of the Open Services initiative is to enable loosely coupled tools to tightly collaborate through RESTful web services, a common data linkage strategy, and common resource representations using an RDF model. The importance of focusing on good, consumable resource designs cannot be overstated. In fact, tool designers should view the resource formats as the contract among collaborating tools. The reason is simple: programming languages, architectures, APIs, and server platforms will come and go, but if the data matters it will live on for a very long time.
In our experience, many application developers are not accustomed to thinking that resource formats are this important. Often, developers live and breathe their objects and run-time data structures, with files serving only to serialize these data structures. This perspective leads to formats that are tightly bound to particular implementation technologies. These tight bindings lead to situations where the only reasonable way to access the application data is through the application itself. A more thoughtful approach to resource design is required to enable loosely coupled programs written in a variety of technologies that collaborate via shared data.
Overall, the goals of the guidance are to ensure:
This document offers guidance to resource designers on their resource formats, focusing on XML-based resources for use in RESTful applications that can participate in the Open Services for Lifecycle Collaboration initiative. We provide guidance on:
There are several reasons to make resources as granular as possible.
Performance is often offered as an argument for aggregating resources rather than seeking granularity. The fear is that requiring a series of GETs to retrieve all of the resources that make up a larger resource (for example, a user-visible element of work such as a diagram showing many model elements) will be prohibitively expensive. However, there are strategies for mitigating this cost, including careful design of the containing artifact so that the primary use cases can be served without a GET for every linked-to resource. Additionally, it is easier to imagine server-side constructs (such as a good query implementation) that return aggregations of resources in one request than it is to imagine a fine-grained PUT that lets users update only a small portion of a resource.
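One way to picture the "careful design of the containing artifact" is a container that carries just enough about each linked resource to serve its primary use case. The Python fragment below is a minimal sketch under that assumption: the diagram namespace, the element, href, title, and kind names, and the example.org URLs are all hypothetical and not part of any OSLC specification. It builds an outline of a diagram from the container alone, without issuing a GET for any linked element.

```python
import xml.etree.ElementTree as ET

# Hypothetical container format: each entry links to a separate, fine-grained
# resource but also carries the title needed to render an outline.
DIAGRAM = """\
<diagram xmlns="http://example.org/diagram#">
  <element href="http://example.org/elements/1" title="Order" kind="class"/>
  <element href="http://example.org/elements/2" title="Customer" kind="class"/>
</diagram>
"""

NS = {"d": "http://example.org/diagram#"}


def outline(xml_text):
    """Return (title, link) pairs using only the container's own content."""
    root = ET.fromstring(xml_text)
    return [(e.get("title"), e.get("href")) for e in root.findall("d:element", NS)]


entries = outline(DIAGRAM)
```

A client would follow an href only when the user opens a specific element, so the common case costs a single request while each element remains its own resource.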
As a result, resource designers should choose as fine-grained a representation as possible (but no finer). In particular, consider breaking entities into individual resources when:
For example, in a glossary, individual terms are entities that users want to find by name; they are also candidates for reuse across multiple glossaries. Therefore, each term should be in its own resource. In a requirements document, different permissions may apply to requirement comments and discussions. Therefore, comments and discussions should be defined as resources separate from the requirement. Since it is desirable to track the revision history of each requirement individually, each requirement should be in its own resource as well.
As another example, workflow-related data (for example, the state of a requirement) introduces some interesting characteristics. Often, the permissions governing who may change the state of a requirement differ from those governing who may change the requirement's data. Moreover, requirements are often shared across multiple projects, and the shared instances may be in different states in different projects. Changes to the requirement's data might also be tracked independently from changes to its state. Therefore, the state of a requirement might best be defined as a resource separate from the requirement. In fact, a reasonable approach is to make the requirement state a work item resource that tracks what needs to be done to complete the requirement.
Conversely, when there are entities with such strong cohesion that they must share a common version history, cannot sensibly be reused independently, must have identical permissions, and can only be conceived of in some common context, aggregating them into a single resource may make sense. Care must be taken, however, because once this decision is made, these entities can, in fact, only be created within that common context, which might not always correspond to a sound user experience for the application.
For data in category one, most designers will choose RDF Property and Class names that correspond to the domain data being represented. That is, a glossary editing tool would call the root element of its glossary resource glossary; the element describing the glossary would be called description, etc.
For opaque data (category two), there are several choices, depending on the application. Often the data may be embedded in the primary resource in its native format (RDF). To enable this, resources should be designed to accept "open content." That is, tools should accept, without complaint, RDF Property names of the form <foons:my-attr>foo</foons:my-attr>. When the data cannot be expressed in RDF Statements, it may either be embedded as CDATA or stored in its own resource and linked to from the primary resource. In the latter case, the name of the element carrying the link should correspond to its role in the primary resource (as in <attachment href="http://example.org/myattachment.pdf"/>).
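As a minimal sketch of what accepting open content can look like in a consuming tool, the Python fragment below reads the one property it understands and sets aside everything else untouched so that it can be written back later. The glossary and foons namespace URIs are hypothetical, and the sketch uses the standard-library ElementTree parser on a simple XML document rather than a full RDF parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces, used only for this illustration.
GLOSSARY_NS = "http://example.org/glossary#"
FOO_NS = "http://example.org/foons#"

DOCUMENT = """\
<glossary xmlns="http://example.org/glossary#"
          xmlns:foons="http://example.org/foons#">
  <description>Terms used by the payments team</description>
  <foons:my-attr>foo</foons:my-attr>
</glossary>
"""

# Properties this particular tool knows how to interpret.
KNOWN = {f"{{{GLOSSARY_NS}}}description"}


def split_content(root):
    """Separate the properties this tool understands from open content."""
    known, opaque = {}, []
    for child in root:
        if child.tag in KNOWN:
            known[child.tag] = child.text
        else:
            # Unknown property: do not reject it, keep it for round-tripping.
            opaque.append(child)
    return known, opaque


root = ET.fromstring(DOCUMENT)
known, opaque = split_content(root)
```

The design choice worth noting is that the unknown element is neither an error nor silently dropped; the tool simply does not interpret it.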
Generic extension data provides an interesting case. As an example, many XML formats are defined to include a generic extension mechanism in the form <string-attribute key="my-attr" value="foo"/>. This closely mimics what many developers have historically done in their run-time data structures. However, RDF is already an extensible and open model; if every tool invents its own extension mechanism, then we will have lost one of the benefits of RDF.
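To make the contrast concrete, the following Python sketch maps a generic key/value extension element onto an ordinary namespaced property, which is the form an RDF-based resource would use directly. The to_property helper and the foons namespace URI are hypothetical names introduced for illustration; this is one possible migration approach, not a prescribed OSLC mechanism.

```python
import xml.etree.ElementTree as ET

FOO_NS = "http://example.org/foons#"  # hypothetical namespace for the tool's own terms


def to_property(extension, namespace):
    """Turn <string-attribute key=".." value=".."/> into a namespaced property element."""
    prop = ET.Element(f"{{{namespace}}}{extension.get('key')}")
    prop.text = extension.get("value")
    return prop


legacy = ET.fromstring('<string-attribute key="my-attr" value="foo"/>')
prop = to_property(legacy, FOO_NS)
```

After the conversion, the extension data is just another property in a namespace, so any tool that tolerates open content can carry it without knowing about a tool-specific extension element.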
Class names should follow upper camel case; for example, a change request would be called ChangeRequest.
Property names should follow lower camel case; for example, the resource shape predicate would be called resourceShape.
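The two conventions can be sketched as a pair of small Python helpers that derive a name from a plain-language phrase. The function names upper_camel and lower_camel are hypothetical, and the sketch assumes the phrase is a sequence of whitespace-separated words.

```python
def upper_camel(phrase: str) -> str:
    """Class-name style: capitalize every word and join them."""
    return "".join(word[:1].upper() + word[1:].lower() for word in phrase.split())


def lower_camel(phrase: str) -> str:
    """Property-name style: like upper camel case, but with a lowercase first letter."""
    name = upper_camel(phrase)
    return name[:1].lower() + name[1:]


class_name = upper_camel("change request")
property_name = lower_camel("resource shape")
```

Here class_name is "ChangeRequest" and property_name is "resourceShape", matching the examples above.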
There are many common RDF vocabularies already defined, and resource designers should try to reuse them where possible. For example, the Dublin Core Metadata Initiative (http://purl.org/dc) describes a common set of metadata (title, description, creator, etc.) and a corresponding RDF vocabulary. By including the http://purl.org/dc/terms/ namespace and using terms from that namespace in documents, designers can avoid re-inventing terminology for these common concepts. Additionally, designers will have enabled tools that already recognize that vocabulary to understand their resources. If a family of tools agrees to support Dublin Core, any tool in this family can at least discover, for example, the title and description of a resource, even if it does not know the details of the format. Reusing common RDF vocabularies in this way helps tools interoperate while remaining loosely coupled.
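As an illustration of that last point, the Python sketch below pulls the Dublin Core title and description out of a resource whose own vocabulary it knows nothing about. The ex namespace, the Widget class, and the torque property are hypothetical. The sketch also assumes a simple RDF/XML layout and uses the standard-library ElementTree parser; because RDF/XML allows several serializations of the same graph, a real tool would more likely use an RDF parser.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"

# A resource in a format the consuming tool has never seen (hypothetical vocabulary).
DOCUMENT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:ex="http://example.org/vocab#">
  <ex:Widget rdf:about="http://example.org/widgets/1">
    <dcterms:title>Left-handed widget</dcterms:title>
    <dcterms:description>A widget for left-handed users.</dcterms:description>
    <ex:torque>12</ex:torque>
  </ex:Widget>
</rdf:RDF>
"""


def dublin_core_summary(xml_text):
    """Return the dcterms title and description, or None where one is absent."""
    root = ET.fromstring(xml_text)
    summary = {}
    for name in ("title", "description"):
        element = root.find(f".//{{{DCTERMS}}}{name}")
        summary[name] = element.text if element is not None else None
    return summary


summary = dublin_core_summary(DOCUMENT)
```

The function never refers to the ex namespace, which is the sense in which the consuming tool stays loosely coupled to the producing one.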
See also Core URI Naming Guidance.
It is important to consider the stability of resource representations as part of their design and evolution. As we have indicated, designers should view resource formats as the contract between tools; as such, the handling of resources must be documented and managed throughout the life of the product. More importantly, resources must be designed with evolution in mind, evolution must happen with backwards compatibility in mind, and processors must be aware of these trade-offs.
Concretely:
See also the OSLC Primer for some additional guidelines.