Resource paging

It sometimes happens that a resource is too large to reasonably transmit in a single HTTP message. A client may anticipate that a resource will be too large. For example, a client tool that accesses defects may assume that an individual defect is usually small enough to request all at once, but that the list of all the defects ever created will typically be too big (footnote #4). Alternatively, a server may recognize that a resource that has been requested is too big to return in a single message.

To address this problem, OSLC resources may support a technique called Resource Paging that enables clients to retrieve representations of resources one page at a time. For every resource whose URL is <url>, an OSLC implementation may define a companion resource whose URL is <url>?oslc.paging=true. The meaning of this resource is “the first page of <url>”. Clients that anticipate that a particular resource will be too large may instead fetch this alternate resource. Servers that determine that a requested resource is too large may respond with a 302 redirect message, directing the client to the “firstPage” resource (footnote #5).
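As a minimal sketch of the client side, the companion "first page" URL can be derived from a resource URL as shown below. The function name first_page_url is hypothetical, and appending with "&" when the URL already carries a query string is an assumption; the text above only describes the <url>?oslc.paging=true form.

```python
from urllib.parse import urlsplit, urlunsplit


def first_page_url(url: str) -> str:
    """Return the companion 'first page' URL for an OSLC resource URL."""
    parts = urlsplit(url)
    # Assumption: if a query string is already present, append with '&'.
    if parts.query:
        query = parts.query + "&oslc.paging=true"
    else:
        query = "oslc.paging=true"
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


print(first_page_url("http://acme.com/oslc/container/1"))
# prints http://acme.com/oslc/container/1?oslc.paging=true
```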

The representation of <url>?oslc.paging=true will contain a subset of the triples that define the state of the resource whose URL is <url>. The subject of those triples will be <url>, not <url>?oslc.paging=true. In addition, the representation of <url>?oslc.paging=true may include a few triples whose subject is <url>?oslc.paging=true itself. Examples are triples whose predicate is oslc:nextPage, dcterms:description and so on.

Note that pagination is only defined for resources whose state can be expressed in RDF as a set of RDF triples. Pagination is undefined for resources whose state cannot be represented in RDF. Pure binary resources, encrypted resources, or digitally signed resources might be examples. The representation of a Page is defined by first paginating the underlying triples that express the state of the resource being paginated, and then performing whatever standard mapping is used to map from each page of triples to the requested representation. In other words, we do not paginate the representations; we paginate the RDF resource state itself and then create the representations of each page in whatever media type is requested. This provides a general specification for both RDF and non-RDF representations of pages of RDF resources. Examples of non-RDF representations are HTML and JSON.
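The "paginate the state, then map each page to a representation" idea can be sketched as follows. This is an illustration under stated assumptions, not how any particular OSLC server works: triples are modeled as plain string tuples, and the paginate and to_json helpers and the JSON layout are hypothetical.

```python
import json
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


def paginate(triples: Iterable[Triple], page_size: int) -> List[List[Triple]]:
    """Split the resource state (a set of triples) into pages."""
    # RDF triples are unordered; sorting only makes this sketch deterministic.
    ordered = sorted(set(triples))
    return [ordered[i:i + page_size] for i in range(0, len(ordered), page_size)]


def to_json(page: List[Triple]) -> str:
    """Hypothetical mapping from one page of triples to a JSON representation."""
    return json.dumps([{"s": s, "p": p, "o": o} for s, p, o in page])


MEMBER = "http://www.w3.org/2000/01/rdf-schema#member"
CONTAINER = "http://acme.com/oslc/container/1"
state = {(CONTAINER, MEMBER, f"http://acme.com/oslc/resource/{i:09d}") for i in range(5)}

pages = paginate(state, 2)                        # step 1: paginate the triples
representations = [to_json(p) for p in pages]     # step 2: render each page
print(len(pages))  # prints 3
```

The same pages could be handed to a Turtle or HTML renderer instead of to_json; the page boundaries do not depend on the media type.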

For example, if I have an OSLC container with the URL http://acme.com/oslc/container/1, it might have the following representation (in Turtle notation):


@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
<http://acme.com/oslc/container/1>
    rdfs:member <http://acme.com/oslc/resource/000000000>;
    # … 999999998 more triples here …
    rdfs:member <http://acme.com/oslc/resource/999999999>.


This representation has a billion triples and over 90 billion characters, which might be a bit big. Assuming that the implementation that backs this resource supports paging, a client can choose instead to GET the related resource http://acme.com/oslc/container/1?oslc.paging=true. The representation of this latter resource would look like this:


@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix oslc: <http://open-services.net/ns/core#>.
<http://acme.com/oslc/container/1>
    rdfs:member <http://acme.com/oslc/resource/000000000>;
    # … 98 more triples here …
    rdfs:member <http://acme.com/oslc/resource/000000099>.
# pay attention to the subject URL of the following triple
<http://acme.com/oslc/container/1?oslc.paging=true> oslc:nextPage <http://acme.com/oslc/xxxxxxxxx/page2>.


As you can see, the representation of this smaller “firstPage” resource contains the first 100 triples that you would have gotten in the representation of the large resource, in exactly the same form (the same subject, predicate and object) as in the representation of the large resource. In addition, it contains another triple, whose subject is the “firstPage” resource itself and not the bigger resource, that provides the URL of a third resource that will contain the next page of triples from the bigger resource. The format of the URLs of the second and subsequent pages (if they exist) is not defined by the OSLC specification: an OSLC implementation can use whichever URL it pleases. Note that although this example shows the triples in a precise order for simplicity and clarity, there is no concept of ordering of triples in RDF, so the triples can be in any order both within and across pages.

As illustrated above, when a page is returned it will include the triple:


<url of current page> oslc:nextPage <url of next page>.
                

You can tell when you are on the last page by the absence of an oslc:nextPage triple.

Because paging is unstable (see below), by the time a client follows an oslc:nextPage link there may no longer be a next page. The OSLC server implementation in this case may respond with an HTTP 404 error.
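A client-side loop that follows oslc:nextPage until it is absent, and that tolerates a missing next page, might look like the sketch below. It is illustrative only: the fetch callable is a hypothetical stand-in for an HTTP GET plus RDF parsing, returning a set of string triples, or None when the server answers 404. The page3 URL is invented for the demonstration.

```python
from typing import Callable, Optional, Set, Tuple

Triple = Tuple[str, str, str]
NEXT_PAGE = "http://open-services.net/ns/core#nextPage"


def fetch_all(first_page: str,
              fetch: Callable[[str], Optional[Set[Triple]]]) -> Set[Triple]:
    """Collect the triples of a paged resource by following oslc:nextPage."""
    collected: Set[Triple] = set()
    visited: Set[str] = set()          # guards against a cycle of page links
    page_url: Optional[str] = first_page
    while page_url is not None and page_url not in visited:
        visited.add(page_url)
        triples = fetch(page_url)
        if triples is None:            # e.g. HTTP 404: the next page is gone
            break
        next_url = None
        for s, p, o in triples:
            if s == page_url:          # triple about the page itself
                if p == NEXT_PAGE:
                    next_url = o
                continue
            collected.add((s, p, o))   # a set discards duplicate triples
        page_url = next_url            # None means this was the last page
    return collected


C = "http://acme.com/oslc/container/1"
MEMBER = "http://www.w3.org/2000/01/rdf-schema#member"
FIRST = C + "?oslc.paging=true"
PAGE2 = "http://acme.com/oslc/xxxxxxxxx/page2"
PAGE3 = "http://acme.com/oslc/xxxxxxxxx/page3"   # hypothetical, not served below
fake_pages = {
    FIRST: {(C, MEMBER, "http://acme.com/oslc/resource/000000000"), (FIRST, NEXT_PAGE, PAGE2)},
    PAGE2: {(C, MEMBER, "http://acme.com/oslc/resource/000000001"), (PAGE2, NEXT_PAGE, PAGE3)},
}
result = fetch_all(FIRST, fake_pages.get)
print(len(result))  # prints 2
```

In this sketch a vanished page simply ends the traversal; a real client might instead restart from the first page.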

The OSLC specification permits <url>?oslc.pageSize=n as an alias for <url>?oslc.paging=true. Because it is just an alias, it has exactly the same meaning and behavior. An OSLC server implementation may (but is not obliged to) adjust the number of triples on the first and subsequent pages based on the value of n.
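On the server side, treating oslc.pageSize as an alias for oslc.paging=true could be sketched as below. The default of 100, the cap of 1000, the fallback for unparsable values, and the function name paging_request are all assumptions made for illustration; the text above only says a server may adjust page sizes based on n.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit

DEFAULT_PAGE_SIZE = 100    # hypothetical server default
MAX_PAGE_SIZE = 1000       # hypothetical server limit


def paging_request(url: str) -> Tuple[bool, Optional[int]]:
    """Return (paging requested?, page size the server will use)."""
    query = parse_qs(urlsplit(url).query)
    if "oslc.pageSize" in query:
        try:
            n = int(query["oslc.pageSize"][0])
        except ValueError:
            n = DEFAULT_PAGE_SIZE      # assumption: fall back on bad input
        if n < 1:
            n = DEFAULT_PAGE_SIZE
        # The server is free to honor n, or to clamp it as done here.
        return True, min(n, MAX_PAGE_SIZE)
    if query.get("oslc.paging") == ["true"]:
        return True, DEFAULT_PAGE_SIZE
    return False, None


print(paging_request("http://acme.com/oslc/container/1?oslc.pageSize=25"))
# prints (True, 25)
```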

When Resource Paging is used, the values of a multi-valued property of a single resource may be split across resource pages. All triples that reference the same blank node must be contained on the same page, since a blank node cannot be referenced from a different page (this is simply an observation on how RDF works, not an OSLC policy or limitation).
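One way a server could respect the blank node constraint is to group triples that share blank nodes before packing them into pages. The sketch below is an assumption-laden illustration: blank nodes are modeled as strings starting with "_:", the example.org vocabulary is invented, and a group larger than the page size simply produces an oversized page.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]


def is_blank(term: str) -> bool:
    return term.startswith("_:")   # assumption: blank nodes written as _:label


def group_by_blank_node(triples: Iterable[Triple]) -> List[List[Triple]]:
    """Group triples so that all triples touching connected blank nodes stay together."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:           # union-find with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    triples = list(triples)
    for s, _, o in triples:
        if is_blank(s) and is_blank(o):
            parent[find(s)] = find(o)  # both blank nodes must share a page
    groups: Dict[str, List[Triple]] = {}
    singles: List[List[Triple]] = []
    for t in triples:
        blanks = [x for x in (t[0], t[2]) if is_blank(x)]
        if blanks:
            groups.setdefault(find(blanks[0]), []).append(t)
        else:
            singles.append([t])        # no blank node: free to go anywhere
    return list(groups.values()) + singles


def pack(groups: List[List[Triple]], page_size: int) -> List[List[Triple]]:
    """Greedily fill pages without ever splitting a group."""
    pages: List[List[Triple]] = []
    current: List[Triple] = []
    for g in groups:
        if current and len(current) + len(g) > page_size:
            pages.append(current)
            current = []
        current.extend(g)
    if current:
        pages.append(current)
    return pages


C = "http://acme.com/oslc/container/1"
MEMBER = "http://www.w3.org/2000/01/rdf-schema#member"
triples = [
    (C, MEMBER, "http://acme.com/oslc/resource/000000000"),
    (C, "http://example.org/vocab#contact", "_:b0"),
    ("_:b0", "http://example.org/vocab#name", '"Alice"'),
    (C, MEMBER, "http://acme.com/oslc/resource/000000001"),
]
pages = pack(group_by_blank_node(triples), 2)
print(len(pages))  # prints 2
```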

Unstable Paging

Because HTTP is a stateless protocol and OSLC Services manage resources that can change frequently, OSLC clients should assume that resources can change as they page through them using the oslc:nextPage mechanism. Nevertheless, each triple of the resource that exists when the first page is returned and is not subsequently deleted during the paging interaction must be included on at least one page. [Including the same triple more than once is permissible, because identical triples are always discarded in RDF, but servers need to ensure that the same triple is not returned multiple times with different object values.] Triples that are added after the first page is returned may or may not be included in subsequent pages by a server.

Learn more about Resource Paging in the OSLC Core specification
