[oslc-core] Fw: Issue with the Use of dcterms:title and dcterms:description with oslc:ResponseInfo

Mon Dec 6 18:48:17 EST 2010

I don't see the RDF information for a Page as being information about the
HTTP response. If we wanted to provide information about the HTTP response,
I would have expected us to put it in the HTTP response header. Modelling
the HTTP response in RDF seems to me like an awkward mixing of protocol
levels. The example of pagination does not require this, as we have seen.
Maybe you could motivate the "ResponseInfo" design with a different
example?

>> Each page would be a valid HTML document.

I had not thought of things this way, but I like this approach a lot. I had
a model in my head for "paginating" HTML by breaking up a single HTML
document into pages, but I like your model much better. In your model, the
resource is broken into page resources, each with its own valid HTML
representation, which is different from breaking a single HTML
representation into pieces. Ironically, your model is also better aligned
with the "page is a separate resource" model I've advocated for RDF. If we
agree your model is the right one for HTML, can we capture this somewhere
in the spec?

>> Conceptually, the initial request creates a large [representation] of
the given type which is then divided into pages and returned in subsequent
requests.

This seems to me to be in conflict with your approach for HTML above, and I
don't like this view nearly as much. Following your model for HTML, I
prefer to think of each page as a separate resource with its own
independent representation, not as a portion of a larger representation.
The pagination is happening at the resource level, not the representation
level. I suggest we drop your wording and mental model in this section and
stick to the one you propose for HTML, which is compatible with the one for
RDF and easier to understand.

It seems to me that your desire to avoid redoing content negotiation on
subsequent pages can be implemented by the server in the nextPage URL
simply by using a URL specific to the RDF (or other) representation.
Having separate URLs for specific representations of a resource is a
standard practice documented in multiple W3C documents.

I'm not sure which is the better approach for terminating a sequence of
pages, but in the grand scheme of things, I don't think it's important
enough to argue about. Let's leave it alone.

Maybe an example would help.
Best regards, Martin

Martin Nally, IBM Fellow
CTO and VP, IBM Rational
tel: +1 (714)472-2690

  From:       Arthur Ryman <ryman at ca.ibm.com>
  To:         oslc-core at open-services.net
  Cc:         Martin Nally/Raleigh/IBM at IBMUS, oslc-core-bounces at open-services.net
  Date:       12/06/2010 10:32 AM
  Subject:    Re: [oslc-core] Fw: Issue with the Use of dcterms:title and dcterms:description with oslc:ResponseInfo

I meant to say

Conceptually, the initial request creates a large REPRESENTATION of the
given type which is then divided into pages and returned in subsequent
requests.

Regards,
___________________________________________________________________________

Arthur Ryman, PhD, DE

Chief Architect, Project and Portfolio Management
IBM Software, Rational
Markham, ON, Canada | Office: 905-413-3077, Cell: 416-939-5063

From:
Arthur Ryman/Toronto/IBM at IBMCA
To:
Martin Nally <nally at us.ibm.com>
Cc:
oslc-core at open-services.net
Date:
12/06/2010 10:20 AM
Subject:
Re: [oslc-core] Fw: Issue with the Use of dcterms:title and
dcterms:description with oslc:ResponseInfo
Sent by:
oslc-core-bounces at open-services.net

Martin,

1. As a data modelling approach RDF is independent of HTTP, but there is
no reason why RDF can't model HTTP concepts, such as an HTTP Response.
Response is a more basic concept than Page since not all resources are
pages of some larger resource. To allow for potential non-page-related
properties being included in the response, the core wg adopted the more
general name ResponseInfo rather than PageInfo. This may have been more
generality than we needed, but that was the thought process.

2. For HTML paging, I meant that the service should be free to decide
where to create page breaks. Each page would be a valid HTML document. If
we regard HTML as being for direct human viewing, then the human will
simply look for a link labelled "Next". If we want to support machine
processing of HTML then we should specify how the next page link is
represented. I suggest we use RDFa for that and simply encode the
oslc:nextPage property. We could also use the usual HTML <link> tag, e.g.
the HTML reference on links [1] uses <link rel="next" href="objects.html">
Here "next" is a standard link type. [2]

There is one subtle point here related to content negotiation. We allow
content negotiation on the initial request, i.e. the URI is the same for
RDF or HTML or JSON. However, if the resource needs to be paged, then we
should not require the service to support content negotiation on the
subsequent pages, i.e. their content type should be determined by the
initial request. Conceptually, the initial request creates a large
resource of the given type which is then divided into pages and returned
in subsequent requests.

For example, the triples in RDF are unordered so the server could return
them in any order (unless sorting was requested). However, order is very
significant in HTML. Furthermore, HTML is more verbose so an HTML page
might contain fewer triples than an RDF page. It therefore seems
appropriate that we should not require that services support content
negotiation on subsequent pages.

3. I agree that http://example.com/bugs?oslc:firstPage would be a more
appropriate URI than http://example.com/bugs?oslc:paging=true for the
first page of a server-initiated paged response. However, as long as we
can distinguish between the page URI and the URI of the resource being
divided into pages, then the problem is solved.

4. I think the use of rdf:nil for RDF Collections is correct, but it
shouldn't be taken as a precedent in general for how to terminate a
sequence. The usual RDF approach for modelling optional properties, e.g.
"next" is to simply omit the triple rather than include a triple with a
special "nil" value. RDF Collections use rdf:nil because in list
processing rdf:nil represents the empty list, which is a very useful list
itself. It is the list of length zero and is often used in recursive
definitions of list functions. In the case of paging, it is not the case
that there is a standard empty page that always terminates any sequence of
pages. The precedent in hypertext is to omit links if the target is
absent.
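
As a rough illustration of point 2, here is how a machine client might find
the next page of an HTML representation through the standard "next" link
type. This is only a sketch using Python's standard html.parser module; the
names NextLinkParser and find_next_page are made up for the example, and it
assumes the server emits a <link rel="next"> element as in the HTML 4
reference [1]. A page without such a link is treated as the last page, in
line with point 4.

```python
from html.parser import HTMLParser
from typing import Optional


class NextLinkParser(HTMLParser):
    """Records the href of the first <link rel="next"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.next_href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.next_href is not None:
            return
        attr = dict(attrs)
        # rel holds a space-separated list of link types; compare
        # case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        if "next" in rels:
            self.next_href = attr.get("href")


def find_next_page(html_text: str) -> Optional[str]:
    """Return the next-page href, or None when the link is omitted."""
    parser = NextLinkParser()
    parser.feed(html_text)
    return parser.next_href


# Hypothetical pages: the first links to a successor, the last one does not.
page = ('<html><head><link rel="next" href="objects.html"></head>'
        '<body></body></html>')
last = '<html><head><title>Last page</title></head><body></body></html>'

next_of_page = find_next_page(page)
next_of_last = find_next_page(last)
```

The href may be relative, so a real client would resolve it against the URI
of the page it just retrieved before following it.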

[1] http://www.w3.org/TR/html4/struct/links.html
[2] http://www.w3.org/TR/html4/types.html#type-links

Regards,
___________________________________________________________________________

Arthur Ryman, PhD, DE

Chief Architect, Project and Portfolio Management
IBM Software, Rational
Markham, ON, Canada | Office: 905-413-3077, Cell: 416-939-5063

Martin Nally---12/05/2010 02:05:34 AM--->> there might be other
information associated with an HTTP response outside the context of
paging.

From:

Martin Nally/Raleigh/IBM at IBMUS

To:

Arthur Ryman/Toronto/IBM at IBMCA

Cc:

oslc-core at open-services.net

Date:

12/05/2010 02:05 AM

Subject:

Re: Fw: [oslc-core] Issue with the Use of dcterms:title and
dcterms:description with oslc:ResponseInfo

>> there might be other information associated with an HTTP response
outside the context of paging.

I don't really understand the mental model behind this statement. I
understand the mental model when I have two different resources,
http://example.com/bugs, and http://example.com/bugs?oslc:paging=true. In
my mental model for this, the second one is also poorly named - it would
be better named http://example.com/bugs?oslc:firstPage, since it is in
fact the first page of the bugs list. In this model, the properties of
this resource are not "information associated with an HTTP response", they
are just regular properties of a regular resource.

>> it's up to the server to decide how to paginate HTML

I'm not sure I understand what you mean. Clients also have to understand
what the server will do, so it would need a spec (unless it's disallowed)

>> oslc:nextPage is more easily comprehended than rdf:rest

Yes, I think that is reasonable. Since you provide a good argument for not
changing this, I'll drop the topic.

>> The explicit use of rdf:nil to signify the last page rather than simply
omitting the oslc:nextPage property means that clients need to understand
the special meaning of rdf:nil

I understand the argument, but that argument applies equally to standard
rdf collections - rdf collections could simply have omitted rdf:rest,
rather than setting it to rdf:nil. Even if you are right and the folks who
did the rdf standard were wrong, I think being inconsistent with rdf
precedent does more harm than good and we would do better to fall in line
with precedent.

Best regards, Martin

Martin Nally, IBM Fellow
CTO and VP, IBM Rational
tel: +1 (714)472-2690

Arthur Ryman---12/03/2010 12:18:39 PM---Martin, Sorry for the delayed
direct response. Most of these issues have been addressed in other not

From:

Arthur Ryman/Toronto/IBM at IBMCA

To:

Martin Nally/Raleigh/IBM at IBMUS

Cc:

oslc-core at open-services.net

Date:

12/03/2010 12:18 PM

Subject:

Re: Fw: [oslc-core] Issue with the Use of dcterms:title and
dcterms:description with oslc:ResponseInfo

Martin,

Sorry for the delayed direct response. Most of these issues have been
addressed in other notes and the spec is in a good state again. Here are
my comments on some of the items in your note.

I agree that clients need to understand redirects. I think 302 is OK for
paging, although not a perfect fit. 303 is intended for the case when you
are redirecting to an associated resource. You could argue that the first
page is associated with the requested resource. I think we just need to
pick one, and 302 is OK.

I believe we previously discussed other names for ResponseInfo, including
Page and PageInfo. I recall that the thinking was that there might be
other information associated with an HTTP response outside the context of
paging.

As for paging other resources types, it's up to the server to decide how
to paginate HTML. Thinking of HTML as a human-friendly representation of
the RDF data, the way to tie in totalCount is to use RDFa semantics, i.e.
the totalCount applies to the embedded RDF annotations that the full HTML
would carry. In fact, I think we should recommend that when HTML is
requested, that it include the equivalent RDF triples as RDFa.

Concerning RDF Collections, the URIs are rdf:rest and rdf:nil which are
based on classic list processing terminology. Their use to describe a
sequence of pages might seem cryptic to many users. OSLC should use
vocabulary that is natural (c.f. your comments on the cryptic nature of
ResponseInfo), and oslc:nextPage is more easily comprehended than
rdf:rest. The explicit use of rdf:nil to signify the last page rather than
simply omitting the oslc:nextPage property means that clients need to
understand the special meaning of rdf:nil. It is a valid resource, which
you can GET like any other resource, but the special meaning implies that
it is not a valid next page.

Regards,
___________________________________________________________________________

Arthur Ryman, PhD, DE

Chief Architect, Project and Portfolio Management
IBM Software, Rational
Markham, ON, Canada | Office: 905-413-3077, Cell: 416-939-5063

Martin Nally---11/26/2010 08:24:33 PM---I like where you are going with
this. I have a few comments, clarifications, and questions: One thin

From:

Martin Nally/Raleigh/IBM at IBMUS

To:

Arthur Ryman/Toronto/IBM at IBMCA

Cc:

oslc-core at open-services.net

Date:

11/26/2010 08:24 PM

Subject:

Re: Fw: [oslc-core] Issue with the Use of dcterms:title and
dcterms:description with oslc:ResponseInfo

I like where you are going with this. I have a few comments,
clarifications, and questions:

One thing I really like about the direction we appear to be heading in is
that it doesn't require a change to the core spec. Instead we would offer
a clarification/extension that explains what a server should do if it
wants to initiate pagination. The spec already explains what clients need
to do.

If the server wants to initiate paging, it could reply with a 302 or 303
redirect, or it could just go ahead and return the first page. Both seem
reasonable - the following link describes and implicitly endorses both
techniques for selecting a representation based on language -
http://www.w3.org/TR/cooluris/#conneg. Selecting non-paginated or
paginated representations based on size is a different case from selecting
the appropriate language representation based on the accept-language
header, but the analogy seems reasonable.

Using a 302 or 303 redirect seems very "classic" to me - it seems very
harsh to try to forbid its use. If we allow its use, well-behaved clients
have to anticipate the possibility and code for it. Would you recommend
forbidding the server from using redirect, even though it's a common and
valid solution for these sorts of cases? This question is actually broader
than the current topic of pagination - in the real web, servers can use
redirects whenever they see fit, as described in the link above. The core
spec should probably say whether or not OSLC clients need to be ready for
this. My first thought is that they do.

Regardless of the answer to the question on 302/303 redirects, I like your
suggestion that the server should be allowed to return the first page in
response to a GET on the whole list (contrary to what I originally wrote).
If we allow this, I would like us to mandate that the server provide a
content-location header in the response that indicates that the resource
that was returned is in fact http://example.org/bugs?oslc.paging=true,
not http://example.org/bugs. This gives the client two ways to recognize
what just happened - it can notice that the resource returned is different
from the one requested, or it can look for a nextPage property in the
representation. More generally, it provides the client with a specific
indicator of the "implicit redirect" that happened on the server without
having to guess based on the representation. When we get our verification
test suite going, it should check for this.
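
To make the "implicit redirect" check concrete, here is a minimal sketch of
the first of the two client-side tests described above: comparing the
requested URI with the Content-Location header. The function names are
invented for the example, and it assumes the header may hold either an
absolute or a relative URI, so it is resolved against the request URI before
comparing.

```python
from typing import Optional
from urllib.parse import urljoin


def returned_resource(request_url: str,
                      content_location: Optional[str]) -> str:
    """URI of the resource actually returned.

    Without a Content-Location header, assume the server returned the
    requested resource itself.
    """
    if not content_location:
        return request_url
    return urljoin(request_url, content_location)


def is_implicit_page(request_url: str,
                     content_location: Optional[str]) -> bool:
    """True when the server answered with a different resource, such as
    the first page of the list, instead of the one requested."""
    return returned_resource(request_url, content_location) != request_url


requested = "http://example.org/bugs"
paged = is_implicit_page(requested,
                         "http://example.org/bugs?oslc.paging=true")
whole = is_implicit_page(requested, None)
```

A verification test suite could apply the same comparison to any response
that carries a nextPage property and flag responses where the two
indicators disagree.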

Several people have noted that the term "ResponseInfo" does not fit the
conceptual model currently documented in the spec. I think this term
may be a relic of a different conceptual model that was previously proposed
and discarded. Any chance we can change the name to match the model? Page
would be the obvious choice of term.

RDF representations lend themselves very nicely to pagination, because an
RDF graph has no structure - it's just a set of triples, so is easy to
break into subsets. Some other representations, like HTML, are much harder
to paginate, and in fact I find it hard to imagine a satisfactory way to
paginate HTML (maybe 2 pages - one with the head and one with the body).
JSON might also be tricky to paginate. Is pagination only expected to work
with RDF? If so, the spec should say so - I didn't find anything when I
looked. If pagination is expected to work with other formats like JSON, I
think the spec needs to explain how to paginate them.

It's not clear what totalCount refers to. Is it simply the number of
triples? The spec says "the number of results", which is a bit vague. I
think we should clarify.

Frank Budinsky pointed out that the sequence of pages is a collection, and
that RDF already has a vocabulary for collections. This would suggest that
we do not need oslc:nextPage - we can use rdf:next instead. Regardless of
whether or not we drop oslc:nextPage, Frank also points out that it would
be more consistent with precedent if the collection was terminated with a
value of rdf:nil for next(Page), rather than absence of the property.

Best regards, Martin

Martin Nally, IBM Fellow
CTO and VP, IBM Rational
tel: +1 (714)472-2690

Arthur Ryman---11/26/2010 09:57:22 AM---Martin, I actually like this
alternative. I have some comments.

From:

Arthur Ryman/Toronto/IBM at IBMCA

To:

Martin Nally/Raleigh/IBM at IBMUS

Cc:

oslc-core at open-services.net

Date:

11/26/2010 09:57 AM

Subject:

Re: Fw: [oslc-core] Issue with the Use of dcterms:title and
dcterms:description with oslc:ResponseInfo

Martin,

I actually like this alternative. I have some comments.

There are really two different reasons why paging might occur 1) the
client has limitations, 2) the server has limitations. The core spec only
is explicit about the client limitations. If a client does not request
paging, and the result exceeds the server limits, then there are two
alternatives - the request can fail, or the server can return partial
results and a link to the rest. I think we'd agree that the latter is more
friendly and has precedents. Atom works that way, and so does Insight
since it copies Atom. In both cases there is a specified way to link to
the next page without the client initiating a paging request.

I think our spec should be more explicit, i.e. a client SHOULD always
check for an oslc:nextPage property. If we adopted this, then we wouldn't
need the redirects.
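
A client that always checks for the next-page property might look roughly
like the sketch below. It is illustrative only: the fetch function is
supplied by the caller (here a dictionary lookup stands in for an HTTP GET),
make_page and the page contents are invented for the example, and the
RDF/XML is read with Python's generic XML parser rather than a real RDF
library, which only works for documents shaped like the ones in this thread.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, Optional

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OSLC = "http://open-services.net/ns/core#"


def next_page(root: ET.Element) -> Optional[str]:
    """URI of the next page, or None when the property is omitted."""
    node = root.find(f".//{{{OSLC}}}nextPage")
    return None if node is None else node.get(f"{{{RDF}}}resource")


def members(fetch: Callable[[str], str], url: Optional[str]) -> Iterator[str]:
    """Yield every rdfs:member URI, following next-page links until a
    page arrives without one."""
    while url is not None:
        root = ET.fromstring(fetch(url))
        for member in root.iter(f"{{{RDFS}}}member"):
            yield member.get(f"{{{RDF}}}resource")
        url = next_page(root)


def make_page(ids, next_url=None) -> str:
    """Build a tiny RDF/XML page (hypothetical test data)."""
    body = "".join(
        f'<rdfs:member rdf:resource="http://example.org/bugs/{i}"/>'
        for i in ids)
    link = f'<oslc:nextPage rdf:resource="{next_url}"/>' if next_url else ""
    return (f'<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" '
            f'xmlns:oslc="{OSLC}">'
            f'<rdf:Description rdf:about="http://example.org/bugs">'
            f'{body}</rdf:Description>'
            f'<oslc:ResponseInfo>{link}</oslc:ResponseInfo></rdf:RDF>')


# The server chose to page even though the client did not ask for it.
PAGES = {
    "http://example.org/bugs":
        make_page([1, 2], "http://example.org/bugs/pages/2"),
    "http://example.org/bugs/pages/2": make_page([3]),
}

found = list(members(PAGES.__getitem__, "http://example.org/bugs"))
```

The loop ends when a page carries no next-page link, so the same client
code handles both an unpaged response and a server-initiated paged one.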

Regards,
___________________________________________________________________________

Arthur Ryman, PhD, DE

Chief Architect, Project and Portfolio Management
IBM Software, Rational
Markham, ON, Canada | Office: 905-413-3077, Cell: 416-939-5063

Martin Nally---11/25/2010 08:35:33 PM---I do not believe that the problem
with the core spec that you describe exists. I think the core spec

From:

Martin Nally/Raleigh/IBM at IBMUS

To:

Arthur Ryman/Toronto/IBM at IBMCA

Cc:

oslc-core at open-services.net

Date:

11/25/2010 08:35 PM

Subject:

Fw: [oslc-core] Issue with the Use of dcterms:title and
dcterms:description with oslc:ResponseInfo

I do not believe that the problem with the core spec that you describe
exists. I think the core spec is fine in this area and should be left
alone - I think the current design is superior to the one you propose.

Your description says that the problem arises when the user requests
http://example.org/bugs, and the server decides to respond with only the
first page. In the model upon which the core spec is based, this can't
happen. "The list of bugs" and "the first page of the list of bugs" are
two different concepts and are thus two different resources with two
different URLs, and the server does not have the right to respond with the
representation of one when the other was requested. The URL for "the first
page of the list of bugs" is clearly specified in the core spec - it is
http://example.org/bugs?oslc.paging=true. Although the server may not
respond with "the first page of the list of bugs" when the client asked
for "the list of bugs", it might be acceptable for the server to perform a
302 (or 303) redirect if it decided that the requested resource is too big
to return. An argument against this would be that it is unfriendly to
surprise a client that may not understand paging in this way, but on the
other hand, returning an unworkably large representation might be worse
and so the redirect might be the lesser of two evils. If the server did
perform a redirect to http://example.org/bugs?oslc.paging=true, a
subsequent GET on that URL would produce the following representation
according to the current core spec design:
<rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
       xmlns:dcterms="http://purl.org/dc/terms/"
       xmlns:oslc="http://open-services.net/ns/core#"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
       <rdf:Description rdf:about="http://example.org/bugs">
               <dcterms:title>Bug List</dcterms:title>
               <rdfs:member rdf:resource="http://example.org/bugs/1" />
               <rdfs:member rdf:resource="http://example.org/bugs/2" />
               <!--  etc. -->
               <rdfs:member rdf:resource="http://example.org/bugs/1000" />
       </rdf:Description>
       <oslc:ResponseInfo rdf:about="http://example.org/bugs?oslc.paging=true">
               <dcterms:title>Bug List - Page 1</dcterms:title>
               <oslc:totalCount>10000</oslc:totalCount>
               <oslc:nextPage rdf:resource="http://example.org/bugs/pages/2" />
       </oslc:ResponseInfo>
</rdf:RDF>

As you can see there is no problem because the two dcterms:title triples
have different subjects.

Best regards, Martin

From:

Arthur Ryman/Toronto/IBM at IBMCA

To:

oslc-core at open-services.net

Date:

11/23/2010 12:47 PM

Subject:

[oslc-core] Issue with the Use of dcterms:title and dcterms:description
with oslc:ResponseInfo

Sent by:

oslc-core-bounces at open-services.net

While reviewing an implementation I noticed that dcterms:title and
dcterms:description can be used with oslc:ResponseInfo. This can lead to
confusion in the case of requesting any resource, since that resource
itself may use those properties. The resource URI of the first page of a
multi-page response is the same as the URI of the resource itself.

For example, suppose we have a resource that is a list of bugs and that it
has the dcterms:title "Bug List". Suppose it contains 10,000 bugs, and
this is too much to return in one response. This resource is like:

<rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
       xmlns:dcterms="http://purl.org/dc/terms/"
       xmlns:oslc="http://open-services.net/ns/core#"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
       <rdf:Description rdf:about="http://example.org/bugs">
               <dcterms:title>Bug List</dcterms:title>
               <rdfs:member rdf:resource="http://example.org/bugs/1" />
               <rdfs:member rdf:resource="http://example.org/bugs/2" />
               <!--  etc. -->
               <rdfs:member rdf:resource="http://example.org/bugs/10000" />
       </rdf:Description>
</rdf:RDF>

Suppose the service will only return 1,000 or fewer bugs per response. When
you get the bug list URI, the response therefore gets paged. The OSLC Core
spec says that the first page looks something like:

<rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
       xmlns:dcterms="http://purl.org/dc/terms/"
       xmlns:oslc="http://open-services.net/ns/core#"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
       <rdf:Description rdf:about="http://example.org/bugs">
               <dcterms:title>Bug List</dcterms:title>
               <rdfs:member rdf:resource="http://example.org/bugs/1" />
               <rdfs:member rdf:resource="http://example.org/bugs/2" />
               <!--  etc. -->
               <rdfs:member rdf:resource="http://example.org/bugs/1000" />
       </rdf:Description>
       <oslc:ResponseInfo rdf:about="http://example.org/bugs">
               <dcterms:title>Bug List - Page 1</dcterms:title>
               <oslc:totalCount>10000</oslc:totalCount>
               <oslc:nextPage rdf:resource="http://example.org/bugs/pages/2" />
       </oslc:ResponseInfo>
</rdf:RDF>

The issue here is that now there are two dcterms:title triples associated
with the subject node <http://example.org/bugs>, which is confusing since
the second one (a child of the oslc:ResponseInfo element) is really the
title of the response.
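
The collision can be shown mechanically. The sketch below groups
dcterms:title values by subject for an abbreviated copy of the first page
(member triples left out). It is an illustration under simplifying
assumptions, not a real RDF parser: it only looks at top-level node elements
that carry rdf:about, and the names titles_by_subject and PAGE_1 are made up
for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"

# Abbreviated first page: both node elements are about the same URI.
PAGE_1 = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dcterms="{DCTERMS}"
         xmlns:oslc="http://open-services.net/ns/core#">
  <rdf:Description rdf:about="http://example.org/bugs">
    <dcterms:title>Bug List</dcterms:title>
  </rdf:Description>
  <oslc:ResponseInfo rdf:about="http://example.org/bugs">
    <dcterms:title>Bug List - Page 1</dcterms:title>
  </oslc:ResponseInfo>
</rdf:RDF>
"""


def titles_by_subject(rdf_xml: str) -> dict:
    """Map each subject URI to the dcterms:title values stated about it.

    Simplification: only direct children of rdf:RDF are treated as
    subjects; a node without rdf:about (a blank node) is keyed as None.
    """
    root = ET.fromstring(rdf_xml)
    titles = defaultdict(list)
    for node in root:
        subject = node.get(f"{{{RDF}}}about")
        for title in node.findall(f"{{{DCTERMS}}}title"):
            titles[subject].append(title.text)
    return dict(titles)


result = titles_by_subject(PAGE_1)
```

Here result holds a single subject with both titles, which is exactly the
ambiguity: once the triples are merged into a graph, nothing says which
title belongs to the list and which to the response.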

I can see two fixes. I prefer fix 1 since it cleanly separates the
response info from the request result data.

1. (Preferred) Introduce another property, e.g. oslc:request to identify
the request URI, and use a blank node for oslc:ResponseInfo. The result is
now:

<rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
       xmlns:dcterms="http://purl.org/dc/terms/"
       xmlns:oslc="http://open-services.net/ns/core#"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
       <rdf:Description rdf:about="http://example.org/bugs">
               <dcterms:title>Bug List</dcterms:title>
               <rdfs:member rdf:resource="http://example.org/bugs/1" />
               <rdfs:member rdf:resource="http://example.org/bugs/2" />
               <!--  etc. -->
               <rdfs:member rdf:resource="http://example.org/bugs/1000" />
       </rdf:Description>
       <oslc:ResponseInfo>
               <oslc:request rdf:resource="http://example.org/bugs" />
               <dcterms:title>Bug List - Page 1</dcterms:title>
               <oslc:totalCount>10000</oslc:totalCount>
               <oslc:nextPage rdf:resource="http://example.org/bugs/pages/2" />
       </oslc:ResponseInfo>
</rdf:RDF>

2. Use different properties for title and description, e.g.
oslc:responseTitle, oslc:responseDescription

<rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
       xmlns:dcterms="http://purl.org/dc/terms/"
       xmlns:oslc="http://open-services.net/ns/core#"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
       <rdf:Description rdf:about="http://example.org/bugs">
               <dcterms:title>Bug List</dcterms:title>
               <rdfs:member rdf:resource="http://example.org/bugs/1" />
               <rdfs:member rdf:resource="http://example.org/bugs/2" />
               <!--  etc. -->
               <rdfs:member rdf:resource="http://example.org/bugs/1000" />
       </rdf:Description>
       <oslc:ResponseInfo rdf:about="http://example.org/bugs">
               <oslc:responseTitle>Bug List - Page 1</oslc:responseTitle>
               <oslc:totalCount>10000</oslc:totalCount>
               <oslc:nextPage rdf:resource="http://example.org/bugs/pages/2" />
       </oslc:ResponseInfo>
</rdf:RDF>

Regards,
___________________________________________________________________________

Arthur Ryman, PhD, DE

Chief Architect, Project and Portfolio Management
IBM Software, Rational
Markham, ON, Canada | Office: 905-413-3077, Cell: 416-939-5063

_______________________________________________
Oslc-Core mailing list
Oslc-Core at open-services.net
http://open-services.net/mailman/listinfo/oslc-core_open-services.net
