DEPRECATED and inactive proposal, NOT recommended for implementation
Tracked Resource Set Specification
Draft 3 of May 13, 2011
STATUS: DEPRECATED and inactive proposal, NOT recommended for implementation
The Tracked Resource Set protocol allows a server to expose a set of resources in a way that allows clients to discover the exact set of resources in the set, to track all additions to and removals from the set, and to track state changes to all resources in the set. The protocol does not assume that clients will dereference the resources. The protocol is suitable for dealing with large sets containing a large number of resources, as well as highly active resource sets that undergo continual change. The protocol is HTTP-based and follows RESTful principles.
Terminology
- Resource Set - an enumerable, finite, collection of Resources
- Resource - web resource identified by URI; the Resource Set members
- Server - party playing the role of Resource Set provider
- Client - party playing the role of consumer; interacts with a Server to enumerate and track Resources in the Server's Resource Set
- Tracked Resource Set - describes the set of Resources in a Resource Set, expressed as a Base and a Change Log
- Base - portion of a Tracked Resource Set representation that lists member Resources
- Change Log - portion of a Tracked Resource Set representation detailing a series of Change Events
- Change Event - describes the addition, removal, or state change of a member Resource
Overview
The Server maintains a Resource Set. A Resource Set consists of a finite, enumerable set of Resources. Each Resource is identified by a URI. The Server will have its own well-defined criteria for determining the exact set of member Resources at any point in time. However, clients need not be aware of the Server's criteria, and will instead discover a Resource Set's members by interacting with the Server using the Tracked Resource Set protocol.
The Server MUST provide an HTTP(S) URI corresponding to its Resource Set. This is referred to as the Tracked Resource Set URI. (Mechanisms for discovering Tracked Resource Set URIs are outside the scope of the Tracked Resource Set specification.)
A GET request sent to the Tracked Resource Set URI returns a representation of the state of the Resource Set. A Tracked Resource Set representation characterizes the Resource Set in terms of a Base and a Change Log: the Base provides an initial approximation of the membership of the Resource Set, and the Change Log provides a time series of adjustments describing changes to members of the Resource Set. When the Base is empty, the Change Log describes a history of how the Resource Set has grown and evolved since its inception. When the Change Log is empty, the Base is an ahistorical enumeration of the Resources in the Resource Set. This hybrid base+delta form gives the Server flexibility to structure the representation in ways that are most useful to its Clients.
The Base portion of a Tracked Resource Set representation is represented as an RDF container where each member references a Resource that was in the Resource Set at the time the Base was computed. The Change Log portion is represented as an RDF collection, where the entries correspond to Change Events arranged in reverse chronological order. A “cutoff” property of the Base identifies the most recent Change Event that is already covered by the Base portion. There must not be a gap between the Base portion and the Change Log portion of a Tracked Resource Set representation; however, the Change Log portion may contain earlier Change Event entries that would be accounted for by the Base portion.
Tracked Resource Set
An HTTP GET on a Tracked Resource Set URI returns a representation structured as follows (note: for exposition, the example snippets show the RDF information content using Turtle; the actual representation of these resources “on the wire” is ordinarily RDF/XML):
@prefix oslc_trs: <http://open-services.net/ns/core/trs#> .
<https://.../myTrackedResourceSet>
    a oslc_trs:TrackedResourceSet ;
    oslc_trs:base <https://.../myResources> ;
    oslc_trs:changeLog [
        a oslc_trs:ChangeLog ;
        oslc_trs:changes ( ... )
    ] .
A Tracked Resource Set MUST provide references to the Base and Change Log using the oslc_trs:base and oslc_trs:changeLog predicates respectively. The Change Log MUST be a local reference, allowing Servers to include the most recent Change Events as part of the Tracked Resource Set’s HTTP response. The Base portion is always in the form of an external reference (i.e., a resource URI), which requires another HTTP GET to access but is generally only of interest to a Client during initialization.
The Server SHOULD support ETags, caching, and conditional GETs for Tracked Resource Set resources.
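As a minimal sketch of what a conditional GET might look like from the Client side, the following Python builds (but does not send) a request that carries a previously received ETag. The URI, the ETag value, the Accept preference, and the helper name `build_trs_request` are illustrative assumptions, not part of this specification.

```python
import urllib.request
from typing import Optional

# Assumed preference: RDF/XML first, Turtle as an alternative.
ACCEPT = "application/rdf+xml, text/turtle;q=0.9"


def build_trs_request(trs_uri: str, etag: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for a Tracked Resource Set, conditional if an ETag is known."""
    headers = {"Accept": ACCEPT}
    if etag is not None:
        # Lets the Server answer 304 Not Modified when nothing has changed.
        headers["If-None-Match"] = etag
    return urllib.request.Request(trs_uri, headers=headers, method="GET")


# Placeholder URI and ETag; the request is only constructed here, not sent.
request = build_trs_request("https://example.org/trs", etag='"v42"')
```

When such a request is sent with `urllib.request.urlopen`, a 304 response surfaces as an `urllib.error.HTTPError` with `code == 304`, which a Client can treat as "no new Change Events".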
Change Log
A Change Log provides a list of changes organized in inverse chronological order, most recent first. The following example illustrates the contents of a Change Log:
@prefix oslc_trs: <http://open-services.net/ns/core/trs#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
_:myChangeLog
    a oslc_trs:ChangeLog ;
    oslc_trs:changes (
        [ a oslc_trs:Creation ;
          oslc_trs:changed <https://.../WorkItem/23> ;
          oslc_trs:order "103"^^xsd:integer ;
          dcterms:identifier "2010-10-27T17:39:33.000Z#103"
        ]
        [ a oslc_trs:Modification ;
          oslc_trs:changed <https://.../WorkItem/22> ;
          oslc_trs:order "102"^^xsd:integer ;
          dcterms:identifier "2010-10-27T17:39:32.000Z#102"
        ]
        [ a oslc_trs:Deletion ;
          oslc_trs:changed <https://.../WorkItem/21> ;
          oslc_trs:order "101"^^xsd:integer ;
          dcterms:identifier "2010-10-27T17:39:31.000Z#101"
        ]
    ) .
As shown, a Change Log provides a set of Change Event entries in a single-valued RDF collection-type property called oslc_trs:changes. An RDF collection, i.e., a linked list (reference: RDF Collections), is used in the Change Log to ensure that the entries retain their correct (inverse chronological) order.
Each Change Event has a unique identifier, dcterms:identifier, as well as a sequence number, oslc_trs:order; sequence numbers are non-negative integer values that increase over time. A Change Event entry carries the URI of the changed Resource, oslc_trs:changed, and an indication (i.e., via rdf:type) of whether the Resource was added to the Resource Set, removed from the Resource Set, or changed state while a member of the Resource Set. The first entry in the Change Log, i.e., "103" in this example, is the most recent change. As changes continue to occur, a Server MUST add new Change Events to the front of the list. The sequence number (i.e., oslc_trs:order) of newer entries MUST be greater than previous ones. The sequence numbers MAY be consecutive numbers but need not be.
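The ordering and uniqueness rules can be checked mechanically. The sketch below assumes a Change Log has already been parsed into a newest-first list of simple records; the `ChangeEvent` class, the `check_change_log` helper, and the `example.org` URIs are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChangeEvent:
    kind: str        # "Creation", "Modification", or "Deletion" (from rdf:type)
    changed: str     # oslc_trs:changed, the URI of the changed Resource
    order: int       # oslc_trs:order
    identifier: str  # dcterms:identifier


def check_change_log(events: List[ChangeEvent]) -> List[str]:
    """Return a list of problems found in a newest-first list of Change Events."""
    problems = []
    seen = set()
    for i, event in enumerate(events):
        if event.order < 0:
            problems.append(f"negative order {event.order}")
        if event.identifier in seen:
            problems.append(f"duplicate identifier {event.identifier}")
        seen.add(event.identifier)
        # Newer entries come first, so each order must be smaller than the one before it.
        if i > 0 and event.order >= events[i - 1].order:
            problems.append(f"order {event.order} does not decrease after {events[i - 1].order}")
    return problems


log = [
    ChangeEvent("Creation", "https://example.org/WorkItem/23", 103, "2010-10-27T17:39:33.000Z#103"),
    ChangeEvent("Modification", "https://example.org/WorkItem/22", 102, "2010-10-27T17:39:32.000Z#102"),
    ChangeEvent("Deletion", "https://example.org/WorkItem/21", 101, "2010-10-27T17:39:31.000Z#101"),
]
problems = check_change_log(log)
```

Gaps between sequence numbers are not reported, since consecutive numbering is optional.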
Note that the actual time of change is not included in a Change Event. Only a sequence number, representing the "sequence in time" of each change, and a unique identifier are provided. The identifier of a Change Event MUST be guaranteed unique, even in the wake of a Server rollback. A time stamp MAY be used to generate such an identifier, as in the above example, although other ways of generating a unique value are also possible.
A Change Log represents a series of changes to its corresponding Resource Set over some period of time. The Change Log MUST contain Change Events for every Resource creation, deletion, and modification during that period. A Server MUST report a Resource modification event if a GET on the Resource would return a response significantly different from the previous one. For a resource with RDF content, a modification is anything that would affect the set of RDF triples in a significant way. Unlike creations and deletions, a Server MAY safely report a modification event even in cases where there would be no significant difference in response.
The Server SHOULD NOT report unnecessary Change Events. A Client SHOULD ignore a creation event for a Resource that is already a member of the Resource Set, and SHOULD ignore a deletion or modification event for a Resource that is not a member of the Resource Set.
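A Client-side sketch of this filtering is shown below, with the local Resource Set modeled as a Python set of URIs. The function name `apply_event` and its return values are hypothetical. How a Client reacts to a modification of a current member is an assumption of this sketch: it is reported so that the caller can, for example, refetch the Resource.

```python
from typing import Set


def apply_event(members: Set[str], event_type: str, changed: str) -> str:
    """Apply one Change Event to the local member set and report what happened."""
    if event_type not in ("Creation", "Deletion", "Modification"):
        raise ValueError(f"unknown Change Event type: {event_type}")
    is_member = changed in members
    if event_type == "Creation" and not is_member:
        members.add(changed)
        return "added"
    if event_type == "Deletion" and is_member:
        members.remove(changed)
        return "removed"
    if event_type == "Modification" and is_member:
        return "modified"  # membership unchanged; the caller may refetch the Resource
    return "ignored"       # redundant event


members = {"https://example.org/WorkItem/22"}
results = [
    apply_event(members, "Creation", "https://example.org/WorkItem/23"),
    apply_event(members, "Creation", "https://example.org/WorkItem/23"),   # already a member
    apply_event(members, "Deletion", "https://example.org/WorkItem/21"),   # not a member
    apply_event(members, "Modification", "https://example.org/WorkItem/22"),
]
```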
Change Log Segmentation
The Change Log in the previous example consisted of a single oslc_trs:ChangeLog resource. Typically, however, the Change Log will be very large, requiring the changes to be segmented into multiple smaller oslc_trs:ChangeLog resources:
@prefix oslc_trs: <http://open-services.net/ns/core/trs#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
_:myChangeLog
    a oslc_trs:ChangeLog ;
    oslc_trs:changes ( ... ) ;
    oslc_trs:previous <https://.../myChangeLog/1> .
As shown, the oslc_trs:previous reference is used in this case to connect to the Change Log resource containing the next group of chronologically earlier Change Events. The most recent Change Events SHOULD be included in the Tracked Resource Set itself. This allows a Client to easily discover the most recent Change Event, and retrieve successively older Change Log resources until it encounters a Change Event that has already been processed (on a previous check). The protocol does not attach significance to where a Server breaks the Change Log into separate parts, i.e., the number of entries in an oslc_trs:ChangeLog is entirely up to the Server.
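The walk through successive segments can be sketched as a lazy generator. In this illustration each segment is assumed to be already parsed into a dictionary with a newest-first `changes` list and an optional `previous` URI; the `fetch` callback, the in-memory `store`, and the `example.org` URIs stand in for real HTTP retrieval and are hypothetical.

```python
from itertools import takewhile
from typing import Any, Callable, Dict, Iterator, Optional

Segment = Dict[str, Any]


def iter_events(first: Segment, fetch: Callable[[str], Optional[Segment]]) -> Iterator[Dict[str, Any]]:
    """Yield Change Events newest-first, following oslc_trs:previous links."""
    segment: Optional[Segment] = first
    while segment is not None:
        yield from segment["changes"]
        previous = segment.get("previous")
        # fetch is expected to return None when the older segment cannot be retrieved.
        segment = fetch(previous) if previous else None


# In-memory stand-in for the Server.
store = {
    "https://example.org/changeLog/1": {
        "changes": [{"order": 101, "identifier": "e101"}],
        "previous": "https://example.org/changeLog/0",  # not present in the store
    },
}
newest = {
    "changes": [{"order": 103, "identifier": "e103"}, {"order": 102, "identifier": "e102"}],
    "previous": "https://example.org/changeLog/1",
}
fetched = []


def fetch(uri: str) -> Optional[Segment]:
    fetched.append(uri)
    return store.get(uri)


all_orders = [event["order"] for event in iter_events(newest, fetch)]

# Stop at an event processed on a previous check; older segments are never requested.
fetched.clear()
unseen = list(takewhile(lambda e: e["identifier"] != "e102", iter_events(newest, fetch)))
```

Because the generator is lazy, a Client that finds its last processed event in the first segment issues no further requests.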
Truncated Change Logs
A chain of Change Logs MAY continue all the way back to the inception of the Resource Set and contain Change Events for every change made since then. However, to avoid maintaining this ever-growing list of Change Logs indefinitely, a Server MAY truncate the log at a suitable point in the chain. This can be accomplished by removing the target of an oslc_trs:previous reference and/or removing the reference itself. In either case, Clients MUST be prepared to receive an HTTP 404 (Not Found) response when navigating the "previous" reference from a final or stale Change Log segment.
To ensure that a new Client can always get started, the Change Log MUST contain the base cutoff event of the corresponding Base, and all Change Events more recent than it. Thus the Server is only allowed to truncate Change Events older than the base cutoff event, because these duplicate information contained in the Base. When the Base has no base cutoff event (i.e., the Base enumerates the Resource Set at the start of time), the Change Log MUST contain all Change Events back to the start of time; i.e., no truncation is allowed.
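The truncation rule can be expressed as a small Server-side check. This is a sketch under simplifying assumptions: events are represented only by their identifiers in newest-first order, and the helper name `split_for_truncation` is hypothetical.

```python
from typing import List, Optional, Tuple


def split_for_truncation(identifiers: List[str], cutoff: Optional[str]) -> Tuple[List[str], List[str]]:
    """Split newest-first event identifiers into (must_keep, may_truncate)."""
    if cutoff is None:
        # The Base enumerates the set at the start of time: nothing may be dropped.
        return list(identifiers), []
    if cutoff not in identifiers:
        raise ValueError("base cutoff event is missing from the Change Log")
    index = identifiers.index(cutoff)
    # Keep the cutoff event itself and everything more recent than it.
    return identifiers[: index + 1], identifiers[index + 1:]


keep, droppable = split_for_truncation(["e103", "e102", "e101", "e100"], "e101")
```

A Server that stores its Change Log in segments would, in practice, only be able to drop whole segments that consist entirely of droppable events.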
To minimize the likelihood of Clients falling too far behind and losing information, it is highly RECOMMENDED that a Server retain a minimum of seven days' worth of Change Events.
Base Resources
The Base resources of a Tracked Resource Set are represented by an RDF container where each member references a Resource that was in the Resource Set at the time the Base was computed. HTTP GET on a Base URI returns an RDF container with the following structure:
@prefix oslc_trs: <http://open-services.net/ns/core/trs#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
<https://.../myResources>
    oslc_trs:cutoffIdentifier "2010-10-27T17:39:31.000Z#101" ;
    rdfs:member <https://.../WorkItem/1> ;
    rdfs:member <https://.../WorkItem/2> ;
    rdfs:member <https://.../WorkItem/3> ;
    ...
    rdfs:member <https://.../WorkItem/199> ;
    rdfs:member <https://.../WorkItem/200> .
Each Resource in the Resource Set MUST be referenced from the container using an rdfs:member predicate. The Base MAY be broken into multiple pages, in which case the standard OSLC paging (reference: Resource Paging) mechanism is used to connect one page to the next. (Note that an OSLC queryBase satisfies these requirements, which may allow a Server to use existing queryBases as Base containers, but there is no requirement that the Tracked Resource Set Base also be a queryBase.) The Tracked Resource Set protocol does not attach significance to the order in which a Server enumerates the resources in the Base or breaks the Base up into pages.
As shown above, a Base usually provides an oslc_trs:cutoffIdentifier property, whose value is the identifier (i.e., dcterms:identifier) of the most recent Change Event in the corresponding Change Log that is already reflected in the Base. This corresponds to the latest point in the Change Log from which a Client can begin incremental monitoring/updating if it wants to remain synchronized with further changes to the Resource Set. As mentioned above, the cutoff Change Event MUST appear in the non-truncated portion of the Change Log. When the oslc_trs:cutoffIdentifier is omitted, the Base enumerates the (possibly empty) Resource Set at the beginning of time.
The Base is only an approximation of the Resource Set. A Base might omit mention of a Resource that ought to have been included or include a Resource that ought to have been omitted. For each erroneously reported Resource in the Base, the Server MUST at some point include a corrective Change Event in the Change Log more recent than the base cutoff event. The corrective Change Event corrects the picture for that Resource, allowing the Client to compute the correct set of member Resources. A corrective Change Event might not appear in the Change Log that was retrieved when the Client dereferenced the Tracked Resource Set URI. The Client might only see a corrective Change Event when it processes the Change Log resource obtained by dereferencing the Tracked Resource Set URI on later occasions.
A Server MUST refer to a given resource using the exact same URI in the Base (rdfs:member reference) and every Change Event (oslc_trs:changed reference) for that resource.
Resources
This section defines the resources of the Tracked Resource Set specification. Implementations MUST support RDF/XML (i.e., application/rdf+xml) and MAY support Turtle (i.e., text/turtle or application/x-turtle) representations of these resources. Normal HTTP content negotiation is used to select the representation actually used.
Tracked Resource Set Namespace
The namespace used for resources and properties defined in this specification is as follows:
http://open-services.net/ns/core/trs# (prefix oslc_trs, as declared in the examples above)
Resource: Tracked Resource Set
A Tracked Resource Set provides a representation of the current state of a Resource Set.
- Name: TrackedResourceSet
- Type URI: http://open-services.net/ns/core/trs#TrackedResourceSet
Prefixed Name | Occurs | Value-type | Representation | Range | Description
oslc_trs:base | exactly-one | Resource | Reference | n/a | An enumeration of the Resources in the Resource Set.
oslc_trs:changeLog | exactly-one | Local Resource | n/a | oslc_trs:ChangeLog | A Change Log providing a time series of incremental adjustments to the Resource Set.
A Base resource (i.e., target of the oslc_trs:base predicate) has the following properties:
Prefixed Name | Occurs | Value-type | Representation | Range | Description
rdfs:member | zero-or-many | Resource | Reference | n/a | A member Resource of the Resource Set.
oslc_trs:cutoffIdentifier | zero-or-one | String | n/a | n/a | The value of dcterms:identifier of the most recent Change Log entry that is accounted for in this Base. When omitted, the Base is an enumeration at the start of time.
Resource: Change Log
A Change Log describes what resources have been created, modified or deleted, and when.
- Name: ChangeLog
- Type URI: http://open-services.net/ns/core/trs#ChangeLog
Prefixed Name | Occurs | Value-type | Representation | Range | Description
oslc_trs:changes | exactly-one | Local Resource | n/a | rdf:List | The list of Change Event entries, ordered by decreasing Change Event oslc_trs:order. Events that occurred later appear earlier in the list.
oslc_trs:previous | zero-or-one | Resource | Reference | oslc_trs:ChangeLog | The continuation of the Change Log, containing the next group of chronologically earlier Change Events.
Each entry in an oslc_trs:changes list is an anonymous resource (blank node) representing a Change Event with the following properties:
Prefixed Name | Occurs | Value-type | Representation | Range | Description
rdf:type | exactly-one | Resource | n/a | oslc_trs:Creation, oslc_trs:Modification, oslc_trs:Deletion | The type of the Change Event.
oslc_trs:changed | exactly-one | Resource | Reference | any | The Resource that has changed.
dcterms:identifier | exactly-one | String | n/a | n/a | The unique identifier for the Change Event.
oslc_trs:order | exactly-one | Non-negative Integer | n/a | n/a | The sequence in time of the Change Event.
Client Behavior
This section describes one (relatively straightforward) way that a Client can use the Tracked Resource Set protocol to build and maintain its own local internal representation of a Server’s Resource Set.
Initialization procedure
A Client wishing to determine the complete collection of Resources in a Server's Resource Set, so that it can build its own local internal representation, proceeds as follows:
- Send a GET request to the Tracked Resource Set URI to retrieve the Tracked Resource Set representation to learn the URI of the Base.
- Use GET to retrieve successive pages of the Base, adding each of the member Resources to the Client's local internal representation of the Resource Set.
- Invoke the Incremental Update procedure (below). The sync point event is either the one named by the oslc_trs:cutoffIdentifier property (on the first page of the Base) or the beginning of time (when the Base has no oslc_trs:cutoffIdentifier property). A clever Client might run this step in parallel with the previous one in an effort to prevent the case where the Client can’t catch up to the current state of the Resource Set using the Change Log (after initial processing) because initial processing takes too long.
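The Base-reading part of the steps above can be sketched as follows. Each Base page is assumed to be already retrieved and parsed into a dictionary with a `members` list and, on the first page, an optional `cutoffIdentifier`; these keys, the `read_base` helper, and the `example.org` URIs are illustrative assumptions.

```python
from typing import Any, Dict, Iterable, Optional, Set, Tuple


def read_base(pages: Iterable[Dict[str, Any]]) -> Tuple[Set[str], Optional[str]]:
    """Collect member URIs from Base pages; return (members, sync point identifier)."""
    members: Set[str] = set()
    sync_point: Optional[str] = None
    for index, page in enumerate(pages):
        if index == 0:
            # The cutoff is read from the first page; None stands for the beginning of time.
            sync_point = page.get("cutoffIdentifier")
        members.update(page.get("members", []))
    return members, sync_point


pages = [
    {
        "cutoffIdentifier": "2010-10-27T17:39:31.000Z#101",
        "members": ["https://example.org/WorkItem/1", "https://example.org/WorkItem/2"],
    },
    {"members": ["https://example.org/WorkItem/3"]},
]
members, sync_point = read_base(pages)
```

The returned sync point is what the Client would hand to its Incremental Update step.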
The overall work to build the local internal representation of the Resource Set is linear in the size of the Base plus the number of Change Events that occurred after the base cutoff event. The Server can help Clients building new local internal representations of its Resource Set by providing as recent a Base as possible, because that means the Client will have to process fewer Change Events. It is entirely up to the Server how often it computes a new Base, if ever. It is also up to the Server how it computes the members of a Base, whether by enumerating its Resource Set directly (e.g., by querying an underlying database), or perhaps by coalescing its internal change log entries into a previous base.
Incremental Update procedure
Suppose now that a Client has a local internal representation of the Server's Resource Set that is accurate as of a particular sync point event known to the Client. A Client wishing to update its local internal representation of the Server's Resource Set acts as follows:
- Send a GET request to the Tracked Resource Set URI to retrieve the Tracked Resource Set representation to learn its current Change Log.
- Search through the chain of Change Logs from newest to oldest to find the sync point event. The incremental update fails if the Client is unable to locate the sync point.
- Process all Change Events after the sync point event, from oldest to newest, making corresponding changes to the Client's local internal representation of the Resource Set. Record the latest event processed as the new sync point event. A clever Client might record (some number of) recently processed events for possible future undo.
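A minimal sketch of the three steps above follows, assuming the chain of Change Logs has already been flattened into a newest-first sequence of dictionaries with `type`, `changed`, and `identifier` keys. Those keys, the `SyncPointNotFound` exception, and the use of `None` for "beginning of time" are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional, Set


class SyncPointNotFound(Exception):
    """Raised when the Change Log no longer contains the Client's sync point."""


def incremental_update(members: Set[str],
                       events_newest_first: Iterable[Dict[str, str]],
                       sync_point: Optional[str]) -> Optional[str]:
    """Apply events newer than sync_point to members; return the new sync point."""
    pending: List[Dict[str, str]] = []
    found = sync_point is None  # None means every event in the log is new
    for event in events_newest_first:
        if event["identifier"] == sync_point:
            found = True
            break
        pending.append(event)
    if not found:
        raise SyncPointNotFound(sync_point)
    # Process from oldest to newest.
    for event in reversed(pending):
        if event["type"] == "Creation":
            members.add(event["changed"])
        elif event["type"] == "Deletion":
            members.discard(event["changed"])
        # A Modification leaves membership unchanged.
    return pending[0]["identifier"] if pending else sync_point


log = [
    {"type": "Creation", "changed": "https://example.org/WorkItem/23", "identifier": "e103"},
    {"type": "Modification", "changed": "https://example.org/WorkItem/22", "identifier": "e102"},
    {"type": "Deletion", "changed": "https://example.org/WorkItem/21", "identifier": "e101"},
    {"type": "Creation", "changed": "https://example.org/WorkItem/22", "identifier": "e100"},
]
local = {"https://example.org/WorkItem/21", "https://example.org/WorkItem/22"}
new_sync_point = incremental_update(local, log, "e100")
```

Because `set.add` and `set.discard` are idempotent, redundant creation and deletion events do not disturb the local set.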
When the procedure succeeds, the Client will have updated its own local internal representation of the Server's Resource Set to be an accurate reflection of the set of resources as described by the retrieved representation of the Tracked Resource Set. Of course, the Server’s actual Resource Set may have undergone additional changes since then. While the Client may never catch up to the Server, it can at least keep its local internal representation of the Resource Set almost up to date. By choosing the interval at which it polls for updates, a Client controls how long the two are allowed to drift apart. The overall work to maintain the local internal representation of the Resource Set is linear in the length of the Change Event stream.
In the (hopefully rare) situation that the Client fails to find its sync point event, one of two things is likely to have happened on the Server: either the Server has truncated its Change Log, or the Server has been rolled back to an earlier state.
A Client can detect a Server rollback when it notices the Server reusing a range of event sequence numbers that it used before but with distinct event identifiers. If the Client had been retaining a local record of previously processed events, the Client may be able to work out a substitute sync point event, undo changes to its local internal representation back to that sync point, and then pick up processing from there.
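The detection described above can be sketched by comparing the Client's record of processed events with the events currently served. The record is assumed to be a mapping from sequence number to identifier; the helper names and the sample identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def reused_orders(processed: Dict[int, str], current: Iterable[Dict]) -> List[int]:
    """Return sequence numbers the Server now serves with a different identifier."""
    return sorted(
        event["order"]
        for event in current
        if event["order"] in processed and processed[event["order"]] != event["identifier"]
    )


def substitute_sync_point(processed: Dict[int, str], current: Iterable[Dict]) -> Optional[int]:
    """Return the highest sequence number on which Client and Server still agree."""
    agreed = [
        event["order"]
        for event in current
        if processed.get(event["order"]) == event["identifier"]
    ]
    return max(agreed) if agreed else None


# Events the Client processed before the rollback.
processed = {101: "a101", 102: "a102", 103: "a103"}
# Events served after the rollback: 102 and 103 were reissued with new identifiers.
current = [
    {"order": 103, "identifier": "b103"},
    {"order": 102, "identifier": "b102"},
    {"order": 101, "identifier": "a101"},
]
conflicts = reused_orders(processed, current)
fallback = substitute_sync_point(processed, current)
```

In this example the Client would undo its local effects of events 102 and 103 and resume from event 101.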
Once the Incremental Update procedure fails, it is unlikely to succeed in the future. The Client has reached an impasse. The Client’s only way forward is to discard its local internal representation and start over.
References