[oslc-core] Indexing/ChangeLog resources

Sat Mar 26 18:10:32 EDT 2011

Hi all,

One final part of the Indexing/ChangeLog proposal (
http://open-services.net/bin/view/Main/IndexingProposals) that still
requires more discussion is the scope and enumeration of the resources
being indexed. In Section 1.2.1 of the current proposal document (
http://open-services.net/pub/Main/IndexingProposals/OSLC_indexing_0316.doc)
it says:

A fundamental requirement of the Indexing Profile is that a service
provider expose its set of indexable resources. This set of resources can,
but may not be, all of the resources owned by the service provider. This
set of resources defines the scope for the other capabilities of the
indexing profile (e.g., the resources reported by the change log).

This raises two questions: What exactly are "indexable resources" and how are
they "exposed"? I think our answer to the first question is that indexable
resources are simply "public resources", i.e., the resources that the
service provider does not consider to be internal implementation artifacts.
Anything more specific than that starts to put the service provider
implementer in the business of guessing what clients may or may not want
to be indexing, or more generally, tracking. (Note that we've been
discussing, at last week's WG meeting and on another thread, that the
capabilities we're defining for ChangeLog are really of more general use
than just for indexing.) Since we're already planning to rename "Indexing
Profile" to something more general, I suggest we also use the term "public
resources" instead of "indexable resources" to describe what we're
exposing/enumerating here.

The second question, "how are they exposed", is more interesting. Two goals
for this design have been identified during design discussions so far:

   1. Decouple this as much as possible from the OSLC service discovery model
   2. Make it possible to leverage/reuse OSLC Query Capabilities, if they are
      available

The current version of the proposal does a better job at meeting goal #2
than #1. It says this:

In the simplest case, a service provider can provide exactly one Query
Capability (i.e., queryBase URI) which includes all of its contained
resources. On the other hand, it may instead provide several Query
Capabilities, each exposing only a subset of the indexable resources. This
is convenient if different types are most easily returned using custom
member properties.

Fundamentally, what we require in the simplest case is a single URI on
which GET can be called to retrieve the complete set of public resources. A
queryBase URI is the existing OSLC mechanism that can be used to do
basically that (and it also provides the necessary paging mechanism). I was
suggesting that we simply reuse the existing way of publishing queryBase
URIs, that is, using the QueryCapabilities of the ServiceProvider. With this
approach, we only need to define one additional mechanism: a way to
identify the QueryCapability, among possibly many of a ServiceProvider,
that represents the complete set of public resources.
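
To make the simplest case concrete, here is a rough sketch of what a client
would do with that single URI: GET it, collect the members, and follow the
paging links until there are none left. The page shape (a dict with
"members" and "next_page"), the fetch callable, and the example.com URIs are
all assumptions for illustration; they are not the OSLC wire format, and a
real client would parse the RDF response instead.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical page shape (an assumption, not the OSLC wire format):
#   {"members": [resource URIs], "next_page": URI of the next page or None}
Page = Dict[str, Any]


def enumerate_public_resources(
    query_base: str, fetch: Callable[[str], Page]
) -> Iterator[str]:
    """Yield every member URI reachable from query_base, page by page."""
    url: Optional[str] = query_base
    visited = set()  # guards against a paging loop on a misbehaving server
    while url is not None and url not in visited:
        visited.add(url)
        page = fetch(url)  # stands in for an HTTP GET on the URI
        yield from page.get("members", [])
        url = page.get("next_page")


# Canned pages standing in for a service provider (made-up URIs).
FAKE_PAGES = {
    "http://example.com/all": {
        "members": ["http://example.com/r/1", "http://example.com/r/2"],
        "next_page": "http://example.com/all?page=2",
    },
    "http://example.com/all?page=2": {
        "members": ["http://example.com/r/3"],
        "next_page": None,
    },
}

resources = list(
    enumerate_public_resources("http://example.com/all", FAKE_PAGES.__getitem__)
)
```

Note that nothing here depends on query syntax; the client only needs the
URI and the paging links.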

However, this simple single queryBase design won't work if a service
provider wants to return its public resources using several (e.g., type
specific) lists. To support this we allow the service provider to identify
several QueryCapabilities, instead of just one. The current proposal uses
the obvious mechanism for this, the oslc:usage property of QueryCapability.
The current proposal says it like this:

The one or more Query Capabilities, of a service provider, that return the
indexable resources MUST be designated with an oslc:usage property with a
value of http://open-services/ns/core#index.

(Note that the exact value of the oslc:usage property will change, probably
to something more like http://open-services/ns/core/#publicResources)
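
As a rough illustration of how a client might apply this rule, the sketch
below picks out the queryBase URIs of the QueryCapabilities marked with the
usage value. The dict representation of a QueryCapability and the
example.com URIs are assumptions for illustration only; the usage URI is
copied as quoted from the proposal and, as noted, is expected to change.

```python
from typing import Dict, List

# Usage value as quoted in the current proposal; expected to be renamed.
INDEX_USAGE = "http://open-services/ns/core#index"


def public_query_bases(
    query_capabilities: List[Dict], usage: str = INDEX_USAGE
) -> List[str]:
    """Return the queryBase URI of each QueryCapability carrying `usage`.

    Each capability is modeled as a plain dict (an assumption), with
    "queryBase" holding one URI and "usage" holding zero or more URIs.
    """
    return [
        qc["queryBase"]
        for qc in query_capabilities
        if usage in qc.get("usage", [])
    ]


# Made-up capabilities for a hypothetical service provider.
capabilities = [
    {"queryBase": "http://example.com/defects", "usage": [INDEX_USAGE]},
    {"queryBase": "http://example.com/adhoc", "usage": []},
    {"queryBase": "http://example.com/tasks",
     "usage": [INDEX_USAGE, "http://example.com/ns#other"]},
]

bases = public_query_bases(capabilities)
```

With the single-list case, the result is simply a list of one URI.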

This design meets goal #2 above, but does a poor job of decoupling the
design from the OSLC model, especially Query Capability. It would seem
conceptually simpler to just list one or more URIs as the ones that
enumerate the public resources of the ServiceProvider. These could be the
same URIs that are also used as the queryBase URIs in some
QueryCapabilities, but if no query support is implemented, there would be no
QueryCapabilities needed. This simpler model (i.e., a list of queryBase
URIs - or even better, a single queryBase URI - as opposed to a marked
subset of a ServiceProvider's QueryCapabilities), however, is not quite
enough. An important property of a QueryCapability, oslc:resourceShape, is
used to specify a type specific member property (instead of rdfs:member)
that is used to aggregate the resources. A queryBase URI on its own does
not include a ResourceShape so, if not using QueryCapability, we would need
to provide it some other way.

A QueryCapability contains the following properties:

dcterms:title exactly-one
oslc:label zero-or-one
oslc:queryBase exactly-one
oslc:resourceShape zero-or-one
oslc:resourceType zero-or-many
oslc:usage zero-or-many

Notice that in addition to oslc:queryBase and oslc:resourceShape, both of
which we require, it only includes one additional required property:
dcterms:title. The others are optional (and oslc:resourceType might in
fact be useful in our use case). This makes me think it may not be such a
bad fit anyway. Otherwise we'd need to provide exactly what we need some
other way, even in the (likely) case that a QueryCapability is also
providing the same information.
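
To show how small the overhead is, here is a minimal sketch of the property
list above as a Python data structure, with the cardinalities mapped to
required fields, optional fields, and lists. The class layout, field names,
and the has_what_we_need helper are my own illustration, not anything
defined by the OSLC spec; the helper just encodes the assumption that our
use case wants both a queryBase and a resourceShape.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QueryCapability:
    title: str                                   # dcterms:title, exactly-one
    query_base: str                              # oslc:queryBase, exactly-one
    label: Optional[str] = None                  # oslc:label, zero-or-one
    resource_shape: Optional[str] = None         # oslc:resourceShape, zero-or-one
    resource_types: List[str] = field(default_factory=list)  # zero-or-many
    usages: List[str] = field(default_factory=list)          # zero-or-many


def has_what_we_need(qc: QueryCapability) -> bool:
    """True if the capability carries a queryBase and a resourceShape."""
    return bool(qc.query_base) and qc.resource_shape is not None


# Made-up example: only the title is extra relative to what we need.
minimal = QueryCapability(
    title="All public resources",
    query_base="http://example.com/all",
    resource_shape="http://example.com/shapes/all",
)
```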

Another concern with tying this to QueryCapability is that it can be
construed to imply that full query support will be required from every
ServiceProvider that simply wants to expose its public resources. However,
although the OSLC spec is not totally clear on this, it does say that a
Query Capability MAY support the default OSLC query syntax or it MAY
support some other query syntax. Therefore, I'm assuming it MAY support no
query syntax at all. If this is true, then a ServiceProvider is not
required to support query at all, even if it does expose queryBase URIs
using QueryCapabilities. We'd need to document clearly that we require only
the queryBase. No actual query syntax needs to be supported.

So, to summarize, there seems to be a fairly clean way to reuse existing
mechanisms to enumerate public resources, but it's not totally clear if we
should use it, or if we would be better off to come up with something else.

Please send me your thoughts.

Thanks,
Frank.