This is the revision from 2013 June 12 at 08:39 am.

Open Services for Lifecycle Collaboration
Tracked Resource Set Specification Version 2.0

Status: Finalizing Draft

This Version

Latest Version

Previous Version

  • This is the first version of this specification.

Authors

Contributors


Notation and Conventions

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC2119. Domain name examples use RFC2606.

Introduction

The Tracked Resource Set protocol allows a server to expose a set of resources so that clients can discover exactly which resources are in the set, track all additions to and removals from the set, and track state changes to all resources in the set. The protocol does not assume that clients will dereference the resources. The protocol is suitable for sets containing a large number of resources, as well as for highly active resource sets that undergo continual change. The protocol is HTTP-based and follows RESTful principles.

Terminology

Resource Set - an enumerable, finite collection of Resources

Resource - web resource identified by URI; the Resource Set members

Server - party playing the role of Resource Set provider

Client - party playing the role of consumer; interacts with a Server to enumerate and track Resources in the Server’s Resource Set

Tracked Resource Set (TRS) - describes the set of Resources in a Resource Set, expressed as a Base and a Change Log

Base - portion of a Tracked Resource Set representation that lists member Resources

Change Log - portion of a Tracked Resource Set representation detailing a series of Change Events

Change Event - describes the addition, removal, or state change of a member Resource

Overview

The Server maintains a Resource Set. A Resource Set consists of a finite, enumerable set of Resources. Each Resource is identified by a URI. The Server will have its own criteria for determining the exact set of member Resources at any point in time. However, clients need not be aware of the Server’s criteria, and will instead discover a Resource Set’s members by interacting with the Server using the Tracked Resource Set protocol.

The Server MUST provide an HTTP(S) URI corresponding to its Resource Set. This is referred to as the Tracked Resource Set URI.

An HTTP GET request sent to the Tracked Resource Set URI returns a representation of the state of the Resource Set characterized in terms of a Base and a Change Log: the Base provides a point-in-time enumeration of the members of the Resource Set, and the Change Log provides a time series of adjustments describing changes to members of the Resource Set. When the Base is empty, the Change Log describes a history of how the Resource Set has grown and evolved since its inception. When the Change Log is empty, the Base is a simple enumeration of the Resources in the Resource Set. This hybrid base+delta form gives the Server flexibility to structure the representation in ways that are most useful to its Clients.

The Base portion of a Tracked Resource Set representation is an RDF container where each member references a Resource that was in the Resource Set at the time the Base was computed. The Change Log portion is represented as multiple same-subject and same-predicate triples, where the objects correspond to Change Events. The order information is indicated within the Change Event entry itself. There MUST NOT be a gap between the Base portion and the Change Log portion of a Tracked Resource Set representation; however, the Change Log portion may contain earlier Change Event entries that would be accounted for by the Base portion. A “cutoff” property of the Base identifies the point in the Change Log at which processing of Change Events can be cut off because older changes are already covered by the Base portion.

Tracked Resource Set

An HTTP GET on a Tracked Resource Set URI returns a representation structured as follows (note: for exposition, the example snippets show the RDF information content using Turtle; the actual representation of these resources “on the wire” may vary):

# Resource: http://cm1.example.com/trackedResourceSet
@prefix trs: <http://open-services.net/ns/core/trs#> .

<http://cm1.example.com/trackedResourceSet>
  a trs:TrackedResourceSet ;
  trs:base <http://cm1.example.com/baseResources> ;
  trs:changeLog [
    a trs:ChangeLog ;
    trs:change  ...
  ] .

A Tracked Resource Set MUST provide references to the Base and Change Log using the trs:base and trs:changeLog predicates respectively.

A typical Client will periodically poll the Tracked Resource Set looking for recent Change Events. In order to cater to this usage, the Tracked Resource Set’s HTTP response MUST contain the triples for the referenced Change Log (i.e., via a Blank Node, or an inline named Resource). The Server SHOULD also support etags, caching, and conditional GETs for Tracked Resource Set resources and relegate the Base to separate resources.
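The polling pattern described above can be sketched from the Client side as follows. This is a minimal illustration rather than part of the specification: the helper name build_poll_request and the sample ETag value are hypothetical, and the sketch only builds the request, it does not send it.

```python
import urllib.request


def build_poll_request(trs_uri, etag=None):
    """Build (but do not send) a GET request for a Tracked Resource Set.

    etag is the entity tag remembered from the previous poll, if any.
    """
    headers = {"Accept": "text/turtle"}
    if etag is not None:
        # Lets a Server that supports conditional GET answer
        # 304 Not Modified when nothing has changed since the last poll.
        headers["If-None-Match"] = etag
    return urllib.request.Request(trs_uri, headers=headers, method="GET")


# Hypothetical usage with the example URI used in this document.
request = build_poll_request(
    "http://cm1.example.com/trackedResourceSet", etag='"abc123"'
)
```

When such a request is sent with urllib.request.urlopen, a 304 response is expected to surface as an urllib.error.HTTPError whose code is 304, which the Client can treat as "no new Change Events".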

Change Log

A Change Log provides a set of changes; the position of each change in the sequence is recorded in the Change Event itself. The following example illustrates the contents of a Change Log:

# Resource: http://cm1.example.com/trackedResourceSet
@prefix trs: <http://open-services.net/ns/core/trs#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://cm1.example.com/trackedResourceSet>
  a trs:TrackedResourceSet ;
  trs:base <http://cm1.example.com/baseResources> ;
  trs:changeLog [
    a trs:ChangeLog ;
    trs:change <#3>, <#2>, <#1>
  ] .

<#3> 
  a trs:Creation ;
  trs:changed <http://cm1.example.com/bugs/23> ;
  trs:order "103"^^xsd:integer .

<#2>
  a trs:Modification ;
  trs:changed <http://cm1.example.com/bugs/22> ;
  trs:order "102"^^xsd:integer .

<#1>
  a trs:Deletion ;
  trs:changed <http://cm1.example.com/bugs/21> ;
  trs:order "101"^^xsd:integer .

As shown, a Change Log provides a set of Change Event entries in a multi-valued RDF property called trs:change.

Change Events MUST have URIs (i.e., they cannot be Blank Nodes) to allow Clients to recognize entries they have seen before. The URI is only used to identify an event (i.e., it need not be HTTP GETable).

Each Change Event has a sequence number, trs:order; sequence numbers are non-negative integer values that increase over time. A Change Event entry carries the URI of the changed Resource, trs:changed, and an indication, via rdf:type (a.k.a. “a” in Turtle), of whether the Resource was added to the Resource Set, removed from the Resource Set, or changed state while a member of the Resource Set. The entry with the highest trs:order value (i.e., 103 in this example) is the most recent change. As changes continue to occur, a Server MUST add new Change Events to the newest Change Log segment. The sequence number (i.e., trs:order) of newer entries MUST be greater than previous ones. The sequence numbers MAY be consecutive numbers but need not be.

Note that the actual time of change is not included in a Change Event. Only a sequence number, representing the “sequence in time” of each change, is provided. The URI of a Change Event MUST be guaranteed unique, even in the wake of a Server roll back where sequence numbers get reused. A time stamp MAY be used to generate such a URI, although other ways of generating a unique URI are also possible.
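One way a Server might satisfy the uniqueness requirement without relying on time stamps is sketched below. The use of random UUID URNs is an assumption made for illustration; the specification does not prescribe any particular URI scheme, and the function name is hypothetical.

```python
import uuid


def new_change_event_uri():
    """Return a fresh URI for a Change Event.

    The URI is random rather than derived from trs:order, so it stays
    unique even if sequence numbers are reused after a roll back.
    """
    return uuid.uuid4().urn


first = new_change_event_uri()
second = new_change_event_uri()
```

Because the URI only identifies the event and need not be HTTP GETable, a non-HTTP scheme such as a URN is sufficient for this purpose.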

A Change Log represents a series of changes to its corresponding Resource Set over some period of time. The Change Log MUST contain Change Events for every Resource creation, deletion, and modification during that period. A Server MUST report a Resource modification event if a GET on the Resource would return a semantically different response than it did previously. For a resource with RDF content, a modification is anything that would affect the set of RDF triples in a significant way. A Server MAY safely report a modification event even in cases where there would be no significant difference in response.

The Server SHOULD NOT report unnecessary Change Events, although this might happen, for example, if changes occur while the Base is being computed. A Client SHOULD ignore a creation event for a Resource that is already a member of the Resource Set, and SHOULD ignore a deletion or modification event for a Resource that is not a member of the Resource Set.
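The rules for ignoring unnecessary Change Events can be sketched as a small Client-side function. This is a minimal sketch under stated assumptions: the dictionary layout for events and the replica (member URI mapped to the order of the last applied event) are hypothetical in-memory stand-ins, not structures defined by the specification.

```python
CREATION, MODIFICATION, DELETION = "Creation", "Modification", "Deletion"


def apply_change_event(replica, event):
    """Apply one Change Event to a local replica.

    replica maps member Resource URI -> order of the last event applied.
    event is a dict with keys "type", "changed" and "order".
    Returns True if the event had an effect, False if it was ignored.
    """
    if event["type"] not in (CREATION, MODIFICATION, DELETION):
        raise ValueError("unknown Change Event type: %r" % event["type"])
    uri = event["changed"]
    is_member = uri in replica
    if event["type"] == CREATION:
        if is_member:
            return False  # unnecessary creation: already a member
        replica[uri] = event["order"]
        return True
    if not is_member:
        return False  # deletion or modification of a non-member
    if event["type"] == DELETION:
        del replica[uri]
    else:
        replica[uri] = event["order"]  # member changed state
    return True


# The three events from the Change Log example, applied oldest first.
bugs = "http://cm1.example.com/bugs/"
replica = {bugs + "21": 0, bugs + "22": 0}
events = [
    {"type": CREATION, "changed": bugs + "23", "order": 103},
    {"type": MODIFICATION, "changed": bugs + "22", "order": 102},
    {"type": DELETION, "changed": bugs + "21", "order": 101},
]
results = [
    apply_change_event(replica, e)
    for e in sorted(events, key=lambda e: e["order"])
]
# Replaying the creation is harmless: it is ignored the second time.
replayed = apply_change_event(replica, events[0])
```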

Change Log Segmentation

The Change Log in the previous example consisted of a single trs:ChangeLog resource. Typically, however, the Change Log will be very large, requiring the changes to be segmented into multiple smaller trs:ChangeLog resources:

# Resource: http://cm1.example.com/trackedResourceSet
@prefix trs: <http://open-services.net/ns/core/trs#> .

<http://cm1.example.com/trackedResourceSet>
  a trs:TrackedResourceSet ;
  trs:base <http://cm1.example.com/baseResources> ;
  trs:changeLog [
    a trs:ChangeLog ;
    trs:change <#3>, <#2>, <#1> ;
    trs:previous <http://cm1.example.com/changeLog/1>
  ] .

<#3> 
...

and then…

# Resource: http://cm1.example.com/changeLog/1
@prefix trs: <http://open-services.net/ns/core/trs#> .

<http://cm1.example.com/changeLog/1>
  a trs:ChangeLog ; 
  trs:change <#0>, {more stuff} .

<#0>
...

As shown, the trs:previous reference is used in this case to connect to the Change Log resource containing the next group of chronologically earlier Change Events. The most recent Change Events are included in the Tracked Resource Set itself. This allows a Client to easily discover the most recent Change Event, and retrieve successively older Change Log resources until it encounters a Change Event that has already been processed (on a previous check). The protocol does not attach significance to where a Server breaks the Change Log into separate parts, i.e., the number of entries in a trs:ChangeLog is entirely up to the Server.

To allow Clients to retrieve the Change Events in a Change Log segment using a single HTTP GET request, Servers MUST include all of the triples corresponding to a Change Log segment in the same HTTP response (i.e., in the representation of either the Tracked Resource Set or a trs:previous Change Log). This includes triples whose subject is the Change Log, the trs:change entries, and the Change Events themselves. Other than the Change Events, all of these MAY be represented using Blank Nodes.
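Navigating a segmented Change Log can be sketched as a generator that follows trs:previous links. The sketch assumes segments have already been parsed into plain dictionaries and that a caller-supplied fetch_segment function returns None for an HTTP 404; both the dictionary layout and the function names are hypothetical, and the event URIs are illustrative.

```python
def iter_events_newest_first(first_segment, fetch_segment):
    """Yield Change Events from newest to oldest across segments.

    first_segment is the Change Log found inline in the Tracked
    Resource Set. Each segment is a dict with "changes" (a list of
    event dicts carrying "order") and an optional "previous" URI.
    fetch_segment(uri) returns the segment at uri, or None on 404.
    """
    segment = first_segment
    while segment is not None:
        # trs:change is an unordered multi-valued property, so each
        # segment is sorted by trs:order before its events are yielded.
        ordered = sorted(
            segment["changes"], key=lambda e: e["order"], reverse=True
        )
        for event in ordered:
            yield event
        previous = segment.get("previous")
        segment = fetch_segment(previous) if previous else None


# Hypothetical in-memory stand-in for the Server's older segments.
older = {
    "http://cm1.example.com/changeLog/1": {
        "changes": [
            {"uri": "http://cm1.example.com/changeLog/1#0", "order": 100}
        ],
        "previous": None,
    }
}
newest = {
    "changes": [
        {"uri": "http://cm1.example.com/trackedResourceSet#1", "order": 101},
        {"uri": "http://cm1.example.com/trackedResourceSet#3", "order": 103},
        {"uri": "http://cm1.example.com/trackedResourceSet#2", "order": 102},
    ],
    "previous": "http://cm1.example.com/changeLog/1",
}
orders = [e["order"] for e in iter_events_newest_first(newest, older.get)]
```

Because the function is a generator, a Client that stops iterating once it recognizes an already processed event never requests the older segments.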

Truncated Change Logs

A chain of Change Logs MAY continue all the way back to the inception of the Resource Set and contain Change Events for every change made since then. However, to avoid maintaining this ever-growing list of Change Logs indefinitely, a Server MAY truncate the log at a suitable point in the chain. This can be accomplished by deleting the oldest segments of the Change Log and/or by removing the triples that reference them. In any case, Clients MUST be prepared to receive HTTP status code 404 (Not Found) when navigating the “previous” reference from a final or stale Change Log segment.

To ensure that a new Client can always get started, the Change Log MUST contain the base cutoff event of the corresponding Base, and all Change Events more recent than it. Thus the Server is only allowed to truncate Change Events older than the base cutoff event. When the Base has no base cutoff event (i.e., the Base enumerates the Resource Set at the start of time), the Change Log MUST contain all Change Events back to the start of time; i.e., no truncation is allowed.
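The truncation rule above can be expressed as a Server-side check. This is a sketch under assumptions: the Change Log is modelled as a flat list of dictionaries, None stands in for an rdf:nil cutoff, and the function name and short event identifiers are hypothetical.

```python
def truncatable_events(events, cutoff_uri):
    """Return the Change Events a Server is allowed to truncate.

    events is the whole Change Log as dicts with "uri" and "order".
    cutoff_uri is the Base's trs:cutoffEvent, or None for rdf:nil.
    """
    if cutoff_uri is None:
        # The Base describes the start of time: nothing may be dropped.
        return []
    cutoff = next((e for e in events if e["uri"] == cutoff_uri), None)
    if cutoff is None:
        raise ValueError("cutoff event is missing from the Change Log")
    # Only events strictly older than the cutoff event may go; the
    # cutoff event itself and everything newer must be kept.
    return [e for e in events if e["order"] < cutoff["order"]]


log = [{"uri": "#%d" % n, "order": 100 + n} for n in range(4)]
droppable = truncatable_events(log, "#1")
```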

To minimize the likelihood of Clients falling too far behind and losing information, it is highly RECOMMENDED that a Server retain a minimum of seven days' worth of Change Events.

Base Resources

The Base of a Tracked Resource Set is a W3C Linked Data Platform (LDP) Container where each member references a Resource that was in the Resource Set at the time the Base was computed. HTTP GET on a Base URI returns an RDF container with the following structure:

# Resource: http://cm1.example.com/baseResources
@prefix trs: <http://open-services.net/ns/core/trs#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .

<http://cm1.example.com/baseResources> 
  a ldp:AggregateContainer;
  trs:cutoffEvent <#1> ;
  rdfs:member <http://cm1.example.com/bugs/1> ;
  rdfs:member <http://cm1.example.com/bugs/2> ;
  rdfs:member <http://cm1.example.com/bugs/3> ;
  ...
  rdfs:member <http://cm1.example.com/bugs/199> ;
  rdfs:member <http://cm1.example.com/bugs/200> .

Each Resource in the Resource Set MUST be referenced from the container using an rdfs:member predicate. The Base MAY be broken into multiple pages in which case the Server will respond with a 30x redirect message, directing the Client to the first “page resource”. The representation of a page resource will contain a subset of the Base’s rdfs:member predicates. In addition, it will contain another triple, whose subject is the page resource itself (i.e., not the Base resource), with a reference to the next page:

# Resource: http://cm1.example.com/baseResources/page1
@prefix trs: <http://open-services.net/ns/core/trs#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .

<http://cm1.example.com/baseResources> 
  a ldp:AggregateContainer;
  trs:cutoffEvent <#1> ;
  rdfs:member <http://cm1.example.com/bugs/1> ;
  rdfs:member <http://cm1.example.com/bugs/2> ;
  rdfs:member <http://cm1.example.com/bugs/3> .

<http://cm1.example.com/baseResources/page1> 
  a ldp:Page;
  ldp:pageOf <http://cm1.example.com/baseResources>;
  ldp:nextPage <http://cm1.example.com/baseResources/page2> .

The last page in the list is indicated by an ldp:nextPage value of rdf:nil. The Tracked Resource Set protocol does not attach significance to the order in which a Server enumerates the resources in the Base or breaks the Base up into pages.
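Retrieving a paged Base can be sketched as a loop that follows ldp:nextPage until it reaches rdf:nil. The sketch assumes pages have already been parsed into dictionaries with hypothetical keys "members", "cutoffEvent" and "nextPage"; the fetch_page callable, the second page, and the cutoff event URI are illustrative stand-ins.

```python
RDF_NIL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"


def collect_base_members(first_page_uri, fetch_page):
    """Follow ldp:nextPage links; return (members, cutoff_event)."""
    members = set()
    cutoff = None
    visited = set()
    page_uri = first_page_uri
    while page_uri != RDF_NIL:
        if page_uri in visited:
            raise ValueError("page cycle detected at " + page_uri)
        visited.add(page_uri)
        page = fetch_page(page_uri)
        if page_uri == first_page_uri:
            # Only the first page carries the cutoff event in this sketch.
            cutoff = page["cutoffEvent"]
        members.update(page["members"])
        page_uri = page["nextPage"]
    return members, cutoff


base = "http://cm1.example.com/baseResources/"
bugs = "http://cm1.example.com/bugs/"
pages = {
    base + "page1": {
        "members": [bugs + "1", bugs + "2", bugs + "3"],
        "cutoffEvent": "http://cm1.example.com/trackedResourceSet#1",
        "nextPage": base + "page2",
    },
    base + "page2": {
        "members": [bugs + "199", bugs + "200"],
        "nextPage": RDF_NIL,
    },
}
members, cutoff = collect_base_members(base + "page1", pages.__getitem__)
```

The sketch walks pages sequentially for simplicity; since page order carries no significance, a Client could equally fetch pages in parallel once their URIs are known.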

The first page of a Base MUST include a trs:cutoffEvent property, whose value is the URI of the most recent Change Event in the corresponding Change Log that is already reflected in the Base. This corresponds to the latest point in the Change Log from which a Client can begin incremental monitoring/updating if it wants to remain synchronized with further changes to the Resource Set. As mentioned above, the cutoff Change Event MUST appear in the non-truncated portion of the Change Log. When the trs:cutoffEvent is rdf:nil, the Base enumerates the (possibly empty) Resource Set at the beginning of time.

Because of the highly dynamic nature of the Resource Set, a Server may have difficulty enumerating the exact set of resources at a point in time. Because of that, the Base can be only an approximation of the Resource Set. A Base might omit mention of a Resource that ought to have been included or include a Resource that ought to have been omitted. For each erroneously reported Resource in the Base, the Server MUST at some point include a corrective Change Event in the Change Log more recent than the base cutoff event. The corrective Change Event corrects the picture for that Resource, allowing the Client to compute the correct set of member Resources. A corrective Change Event might not appear in the Change Log that was retrieved when the Client dereferenced the Tracked Resource Set URI. The Client might only see a corrective Change Event when it processes the Change Log resource obtained by dereferencing the Tracked Resource Set URI on later occasions.

When a Base is broken into pages, the Client will discover and retrieve Base page resources to determine the Resources in the Base. A Client MUST retrieve all the page resources of the Base. A Client MAY retrieve the Base page resources in any order, including retrieving some Base page resources in parallel. A Client retrieves the Base page resources at its own pace, and MAY retrieve any of the Base page resources more than once. If the Server allows the representation of Base page resources to vary over time, the Server MUST ensure that the set of Resources a Client would infer as members is necessarily an approximation of the Resource Set which, when corrected by Change Events after the Base’s cutoff event, yields the correct set of member Resources in the Resource Set.

A Server MUST refer to a given resource using the exact same URI in the Base (rdfs:member reference) and every Change Event (trs:changed reference) for that resource.

Resources

This section defines the resources of the Tracked Resource Set specification. TRS servers MUST support Turtle (i.e., text/turtle) representations of these resources. TRS servers MAY provide representations of the requested TRS resources beyond those necessary to conform to this specification, using standard HTTP content negotiation. If the client does not indicate a preference, text/turtle MUST be returned.

Tracked Resource Set Namespace

The namespace used for resources and properties defined in this specification is as follows:

  • Namespace URI: http://open-services.net/ns/core/trs#
  • Default Prefix: trs

Resource: TrackedResourceSet

  • Name: TrackedResourceSet
  • Description: A Tracked Resource Set provides a representation of the current state of a Resource Set.
  • Type URI: http://open-services.net/ns/core/trs#TrackedResourceSet

TrackedResourceSet Properties

Prefixed Name | Occurs | Read-only | Value-type | Representation | Range | Description
trs:base | exactly-one | False | Resource | Either | any | An enumeration of the Resources in the Resource Set.
trs:changeLog | zero-or-one | False | Local Resource | Inline | trs:ChangeLog | A Change Log providing a time series of incremental adjustments to the Resource Set.

A Base resource (i.e., target of the trs:base predicate) has the following properties:

Prefixed Name | Occurs | Read-only | Value-type | Representation | Range | Description
rdfs:member | zero-or-many | False | Resource | Reference | any | A member Resource of the Resource Set.
trs:cutoffEvent | exactly-one | False | Local Resource | Inline | trs:Creation, trs:Modification, trs:Deletion | The most recent Change Log entry that is accounted for in this Base. When rdf:nil, the Base is an enumeration at the start of time.

Resource: ChangeLog

  • Name: ChangeLog
  • Description: A Change Log describes what resources have been created, modified or deleted, and when.
  • Type URI: http://open-services.net/ns/core/trs#ChangeLog

ChangeLog Properties

Prefixed Name | Occurs | Read-only | Value-type | Representation | Range | Description
trs:change | zero-or-many | False | Local Resource | n/a | Resource | The Change Event entries.
trs:previous | zero-or-one | False | Resource | n/a | trs:ChangeLog | The continuation of the Change Log, containing the next group of chronologically earlier Change Events.

Each object of a trs:change triple is a Local Resource representing a Change Event with the following properties:

Prefixed Name | Occurs | Read-only | Value-type | Representation | Range | Description
rdf:type | exactly-one | False | Resource | n/a | any | The type of the Change Event. One of trs:Creation, trs:Modification, or trs:Deletion.
trs:changed | exactly-one | False | Resource | n/a | any | The Resource that has changed.
trs:order | exactly-one | False | Integer (Non-negative) | n/a | n/a | The sequence in time of the Change Event.

Client Behavior

This section describes one (relatively straightforward) way that a Client can use the Tracked Resource Set protocol to build and maintain its own local replica of a Server’s Resource Set.

Initialization procedure

A Client wishing to determine the complete collection of Resources in a Server’s Resource Set, so that it can build its local replica of the Resource Set, proceeds as follows:

  • Send a GET request to the Tracked Resource Set URI to retrieve the Tracked Resource Set representation to learn the URI of the Base.
  • Use GET to retrieve successive pages of the Base, adding each of the member Resources to the Client’s local replica of the Resource Set.
  • Invoke the Incremental Update procedure (below). The sync point event is the trs:cutoffEvent property (on the first page of the Base). A clever Client might run this step in parallel with the previous one in an effort to prevent the case where the Client can’t catch up to the current state of the Resource Set using the Change Log (after initial processing) because initial processing takes too long.

The overall work to build the local replica of the Resource Set is linear in the size of the Base plus the number of Change Events that occurred after the base cutoff event. The Server can help Clients building new local replicas of its Resource Set by providing as recent a Base as possible, because that means the Client will have to process fewer Change Events. It is entirely up to the Server how often it computes a new Base. It is also up to the Server how it computes the members of a Base, whether by enumerating its Resource Set directly (e.g., by querying an underlying database), or perhaps by coalescing its internal change log entries into a previous base.
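The Initialization procedure can be sketched as a single function that starts from the Base members and then applies only the Change Events newer than the cutoff event. This is an illustrative sketch: events are plain dictionaries with hypothetical keys, None stands in for an rdf:nil cutoff, and the short event identifiers are made up for the example.

```python
def initialize_replica(base_members, cutoff_uri, events):
    """Build a replica from a Base plus the Change Log.

    events are dicts with "uri", "type", "changed" and "order".
    cutoff_uri is the Base's trs:cutoffEvent, or None for rdf:nil.
    Returns (replica, sync_point), where replica is a set of URIs.
    """
    replica = set(base_members)
    ordered = sorted(events, key=lambda e: e["order"])
    if cutoff_uri is not None:
        uris = [e["uri"] for e in ordered]
        if cutoff_uri not in uris:
            raise LookupError("cutoff event not found in the Change Log")
        # Events up to and including the cutoff are already in the Base.
        ordered = ordered[uris.index(cutoff_uri) + 1:]
    sync_point = cutoff_uri
    for event in ordered:
        if event["type"] == "Creation":
            replica.add(event["changed"])
        elif event["type"] == "Deletion":
            replica.discard(event["changed"])
        # A Modification does not change membership.
        sync_point = event["uri"]
    return replica, sync_point


bugs = "http://cm1.example.com/bugs/"
log = [
    {"uri": "e100", "type": "Creation", "changed": bugs + "1", "order": 100},
    {"uri": "e101", "type": "Modification", "changed": bugs + "2", "order": 101},
    {"uri": "e102", "type": "Deletion", "changed": bugs + "21", "order": 102},
    {"uri": "e103", "type": "Creation", "changed": bugs + "23", "order": 103},
]
# The Base still lists bugs/21; the deletion after the cutoff corrects it.
replica, sync_point = initialize_replica(
    {bugs + "1", bugs + "2", bugs + "21"}, "e101", log
)
```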

Incremental Update procedure

Suppose now that a Client has a local replica of the Server’s Resource Set that is accurate as of a particular sync point event known to the Client. A Client wishing to update its local replica of the Server’s Resource Set acts as follows:

  • Send a GET request to the Tracked Resource Set URI to retrieve the Tracked Resource Set representation to learn its current Change Log.
  • Search through the chain of Change Logs from newest to oldest to find the sync point event. The incremental update fails if the Client is unable to locate the sync point (i.e., it gets to the end of the log).
  • Process all Change Events after the sync point event, from oldest to newest, making corresponding changes to the Client’s local replica of the Resource Set. Record the latest event processed as the new sync point event. A clever Client might record (some number of) recently processed events for possible future undo in the event of a server roll-back.
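The steps above can be sketched as follows. The sketch assumes the Client can obtain Change Events as an iterable ordered from newest to oldest (for example by walking trs:previous links), that the sync point is a real event URI rather than rdf:nil, and that events are plain dictionaries; the exception and function names are hypothetical.

```python
class SyncPointNotFound(Exception):
    """The sync point event is no longer present in the Change Log."""


def incremental_update(replica, sync_point, events_newest_first):
    """Update replica (a set of member URIs); return the new sync point."""
    pending = []
    for event in events_newest_first:
        if event["uri"] == sync_point:
            break  # everything older has already been processed
        pending.append(event)
    else:
        # Reached the end of the log without meeting the sync point.
        raise SyncPointNotFound(sync_point)
    for event in reversed(pending):  # oldest to newest
        if event["type"] == "Creation":
            replica.add(event["changed"])
        elif event["type"] == "Deletion":
            replica.discard(event["changed"])
        # A Modification leaves membership unchanged; a real Client
        # would refresh what it stores about event["changed"] here.
        sync_point = event["uri"]
    return sync_point


bugs = "http://cm1.example.com/bugs/"
log = [
    {"uri": "#3", "type": "Creation", "changed": bugs + "23"},
    {"uri": "#2", "type": "Modification", "changed": bugs + "22"},
    {"uri": "#1", "type": "Deletion", "changed": bugs + "21"},
    {"uri": "#0", "type": "Creation", "changed": bugs + "21"},
]
replica = {bugs + "21", bugs + "22"}
new_sync_point = incremental_update(replica, "#0", log)
```

Collecting the pending events before applying any of them means a failed search leaves the replica untouched, so the Client can decide how to recover.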

When the procedure succeeds, the Client will have updated its own local replica of the Server’s Resource Set to be an accurate reflection of the set of resources as described by the retrieved representation of the Tracked Resource Set. Of course, the Server’s actual Resource Set may have undergone additional changes since then. While the Client may never catch up to the Server, it can at least keep its local replica of the Resource Set almost up to date. By choosing the interval at which it polls for updates, a Client controls how long the two are allowed to drift apart. The overall work to maintain the local replica of the Resource Set is linear in the length of the Change Event stream. In the (hopefully rare) situation that the Client fails to find its sync point event, one of two things is likely to have happened on the Server: either the Server has truncated its Change Log, or the Server has been rolled back to an earlier state.

If the Client had been retaining a local record of previously processed events, the Client may be able to detect a Server rollback if it notices the successor event of some previously processed event has been removed or changed to one with a different identifier than before. In this case, the Client can undo changes to its local replica back to that sync point, and then pick up processing from there.
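One possible way to detect such a rollback is sketched below. It assumes the Client keeps the URIs of recently processed events and can list the event URIs currently in the Server's Change Log, both ordered oldest to newest; the function name and event identifiers are hypothetical, and undoing the affected changes is left to the Client.

```python
def find_rollback_point(processed_uris, server_uris):
    """Compare the Client's retained history with the Server's log.

    Both arguments list Change Event URIs from oldest to newest.
    Returns (sync_point, to_undo): the newest processed event the
    Server still reports, and the later processed events (newest
    first) that the Client would need to undo. sync_point is None
    when the two histories share no event.
    """
    on_server = set(server_uris)
    for index in range(len(processed_uris) - 1, -1, -1):
        if processed_uris[index] in on_server:
            later = processed_uris[index + 1:]
            return processed_uris[index], list(reversed(later))
    return None, list(reversed(processed_uris))


# The Server was rolled back after e2 and then issued new events.
processed = ["e1", "e2", "e3", "e4"]
server_log = ["e1", "e2", "e3b", "e4b"]
sync_point, to_undo = find_rollback_point(processed, server_log)
```

An empty to_undo list means no rollback was detected, while a sync_point of None corresponds to the impasse case in which the Client has no common event to resume from.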

Once the Incremental Update procedure fails, it is unlikely to succeed in the future. The Client has reached an impasse. The Client’s only way forward is to discard its local replica and start over.

Discoverability

TRS servers MUST document the URLs of their Tracked Resource Set resources.

In order to help an administrator of a TRS client-based application in configuring its TRS server endpoints, a TRS server MAY also make its Tracked Resource Sets discoverable. Discoverability is a convenience; an administrator can configure a TRS client with a particular Tracked Resource Set knowing just its URL.

An application declares the existence and location of a Tracked Resource Set with a service declaration of the following type:

@prefix trs: <http://open-services.net/ns/core/trs#> .

<http://cm1.example.com/app/> trs:trackedResourceSet <trackedResourceSet>.

Servers MAY contain more than one statement with the property of trs:trackedResourceSet.

Appendix A: Samples

(this section is informative)

See samples within the body of this specification.

Appendix B: Resource Shapes

Not applicable

Appendix C: Notices and References

Contributors

  • Steve Speicher (IBM, OSLC-Core Lead)
  • Frank Budinsky (IBM, OSLC-Core)
  • Vivek Garg (IBM, OSLC-Core)
  • Arthur Ryman (IBM, OSLC-Core)
  • Nick Crossley (IBM, OSLC-Core)
  • Ian Green (IBM, OSLC-Core)

Reporting Issues on the Specification

The working group participants who author and maintain this working draft specification monitor a distribution list where issues or questions can be raised; see Core Mailing List.

Issues found with this specification, and their resolutions, can be found at Core 2.0 Issues.

License and Intellectual Property

We make this specification available under the terms and conditions set forth in the site Terms of Use, IP Policy, and the Workgroup Participation Agreement for this Workgroup.

References

Appendix D: Changes

  • Changed property trs:changes to trs:change (dropped the s)
