[oslc-cm] A Modest Proposal for Attachments

Tue Jul 19 14:18:30 EDT 2011

Hi all,

The discussion on attachments in the last couple of meetings has been very
interesting. Various issues and alternatives have been discussed, but
speaking personally, I was more trying to understand the ideas than
expressing strong preferences. I really needed to take some time to think
about it (and look into some things that came up in the discussion that I
didn't know much about) before forming opinions. I may be mistaken, but my
sense is that the same might have been true for some other participants. I
didn't get the sense that a consensus was formed, though if I'm incorrect
about that, my apologies.

Anyhow, now that I have thought things through, I thought I'd share my
thoughts. If you like, it's my proposal for what I believe is the simplest,
most straightforward approach to attachments that meets our needs.
Unfortunately, I'm going to be away over the next few days (and so unable
to attend tomorrow's meeting). Apologies for dropping this and running.
That wasn't my intention, but unfortunately, I just wasn't able to get it
done until today. If this spurs any discussion, please don't take my
silence as disinterest. I'll speak up again as soon as I'm back.

My proposal is to use a collection resource for attachments, with a URI
that's separate from the change request itself, i.e.

<http://example.com/bugs/2314>
      rdf:type oslc_cm:ChangeRequest ;
      dcterms:identifier "00002314" ;
      oslc:shortTitle "Bug 2314" ;
      ...
      oslc_cm:attachments <http://example.com/bugs/2314/attachments> .

The main reason for this approach is to allow adding attachments by simply
POSTing to this URI. That said, I think there are other benefits, too. It
means that a consumer need not be overwhelmed with attachment information
when retrieving a change request. I think that's quite reasonable, as
attachments are often handled as a sort of secondary concern. Moreover, in
cases where attachments are needed, it's just one additional GET to obtain
all of the information about the attachments (as you'll see in a moment).
Finally, I'll point out that there is precedent for the collection resource
approach in Core's discussion mechanism. I can't see any reason why it's
less appropriate here (indeed, I see reasons why it's even more
appropriate).
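
To make the "simply POSTing" idea concrete, here is a minimal Python sketch of what a consumer might build. It assumes, as proposed later in this message, that the request body is the attachment content itself. The function name build_attachment_post and the sample patch bytes are hypothetical, and the request is only constructed, never sent.

```python
import urllib.request


def build_attachment_post(collection_uri, content, media_type):
    """Build, without sending, a POST that adds one attachment."""
    request = urllib.request.Request(collection_uri, data=content, method="POST")
    # The media type travels in the standard Content-Type header.
    request.add_header("Content-Type", media_type)
    return request


# Hypothetical usage against the example collection URI from this message.
request = build_attachment_post(
    "http://example.com/bugs/2314/attachments",
    b"--- a/Foo.java\n+++ b/Foo.java\n",
    "text/x-diff",
)
print(request.get_method(), request.full_url)
```

How a provider would respond (status code, Location header, and so on) is left open here, since the proposal does not specify it.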

I think that the other proposal for creating attachments, a separate
factory, is just a bad fit here. The nice RESTful factory approach would be
to create the attachment resource by POSTing to it, then to add the
attachment to a change request. But, as we discussed, that's not possible
because some of underlying systems actually create and attach an attachment
in a single step. The proposal to deal with this was to require that a
change request be specified in/with the attachment, but I fear that might
seem non-obvious and arbitrary to consumers. The problem, I think, is that
it's backwards. Our model should make the change request the primary thing,
with the attachment subordinate to it (because that's the model used in
several systems). And the familiar RESTful approach for that scenario is to
use a collection property of the primary thing as the factory for the
subordinate thing, i.e. the approach I suggest above.

I should back up a bit and talk about what the attachments collection
resource should look like. Here's what my first suggestion would be:

<http://example.com/bugs/2314/attachments>
      rdf:type oslc_cm:AttachmentList ;
      rdf:_1  <http://example.com/bugs/2314/attachments/screenshot.png> ;
      rdf:_2  <http://example.com/bugs/2314/attachments/fix.patch> ;
      ... .

I'm just using the standard RDF container membership properties here. That
said, I know they're not popular in OSLC (though I don't really understand
why), so my secondary proposal would be to use a new property:

<http://example.com/bugs/2314/attachments>
      rdf:type oslc_cm:AttachmentList ;
      oslc_cm:attachment
            <http://example.com/bugs/2314/attachments/screenshot.png> ,
            <http://example.com/bugs/2314/attachments/fix.patch> , ... .

Here's where, I think, my proposal diverges from the ideas we discussed in
the meetings, and I'm strongly convinced it's an improvement: The objects
of the attachment statements represent the attachments themselves, that is,
if you did a GET on one of those URIs, you would retrieve the attachment
content itself. The reason I think that's a very good thing is that it
eliminates the intermediate per-attachment descriptive resource entirely.
That means that creating an attachment really is just one POST -- no need
for complications like multipart/mixed. So, where does that descriptive
information go? Right in the collection resource:

<http://example.com/bugs/2314/attachments>
      rdf:type oslc_cm:AttachmentList ;
      oslc_cm:attachment
            <http://example.com/bugs/2314/attachments/screenshot.png> ,
            <http://example.com/bugs/2314/attachments/fix.patch> , ... .

<http://example.com/bugs/2314/attachments/screenshot.png>
      dcterms:title "screenshot.png" ;
      dcterms:format "image/png" ;
      dcterms:created "2011-07-18T13:22:30.45-05:00" ;
      ... .

<http://example.com/bugs/2314/attachments/fix.patch>
      dcterms:title "fix.patch" ;
      dcterms:format "text/x-diff" ;
      dcterms:created "2011-07-19T15:03:54.00-05:00" ;
      ... .

The purpose of this information is to help a consumer that wants to render
a UI listing all of the attachments associated with the change request, so
putting it right in the list resource is ideal. I think it's highly
preferable to the proposal where an oslc_cm:attachment property would be
used repeatedly in the change request itself, with the value of each being
a separate intermediate resource. That would require the consumer to do a
separate GET for each attachment description in order to render such a UI.
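
As a rough illustration of why one GET is enough, the Python sketch below renders a listing purely from the statements in the collection resource. It assumes the response has already been parsed into (subject, predicate, object) tuples; the parsing step, the TRIPLES data, and the list_attachments helper are all hypothetical and only mirror the example above.

```python
# Statements a consumer might hold after a single GET on the collection
# resource. Parsing the RDF representation is out of scope for this sketch.
BASE = "http://example.com/bugs/2314/attachments"
TRIPLES = [
    (BASE, "oslc_cm:attachment", BASE + "/screenshot.png"),
    (BASE, "oslc_cm:attachment", BASE + "/fix.patch"),
    (BASE + "/screenshot.png", "dcterms:title", "screenshot.png"),
    (BASE + "/screenshot.png", "dcterms:format", "image/png"),
    (BASE + "/fix.patch", "dcterms:title", "fix.patch"),
    (BASE + "/fix.patch", "dcterms:format", "text/x-diff"),
]


def list_attachments(triples, collection_uri):
    """Return one (uri, title, format) row per attachment in the collection."""
    members = [o for s, p, o in triples
               if s == collection_uri and p == "oslc_cm:attachment"]
    rows = []
    for uri in members:
        # Gather the descriptive statements made about this attachment URI.
        props = {p: o for s, p, o in triples if s == uri}
        rows.append((uri, props.get("dcterms:title"), props.get("dcterms:format")))
    return rows


for uri, title, fmt in list_attachments(TRIPLES, BASE):
    print(f"{title} ({fmt}) -> {uri}")
```

No per-attachment request is made anywhere in the loop; everything needed for the listing is already in the one response.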

I'll point out that RDF was designed to handle this very scenario well:
we're just adding additional information about an existing resource (the
attachment) by making statements about it in another resource. There was
some talk of using reification in the last meeting, but I don't think it
would be appropriate here. These are definitely statements about the
attachment resource, not about the oslc_cm:attachment statements themselves
(e.g., the format of the patch attachment, not the statement about it, is
text/x-diff).

So now, I think there's just one more question to answer: How did this
information get there in the first place? I mentioned that eliminating the
intermediate descriptive resource means that creating an attachment can be
a simple POST, but what about providing this extra information?

A couple of observations about this information (dare I call it metadata?):
First, some of it (creator, created) probably doesn't need to be specified at
all. The provider can determine it by itself. Second, some of it (format,
contentSize) overlaps beautifully with the standard Content-* HTTP headers.
So that can be specified by the consumer right in the POST request and
returned by the provider right in the GET response, along with the content
itself. Thus, duplicating that information in the collection resource is
merely a convenience for consumers (for example, to render an attachment
listing UI, as I described above).
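
A provider-side sketch of that overlap, in Python, might look like the following. The describe_from_headers function is hypothetical, and the pairing of Content-Type with dcterms:format and Content-Length with contentSize is an assumption drawn from the examples in this message, not an agreed mapping. contentSize is written without a prefix because none has been settled here.

```python
def describe_from_headers(headers):
    """Map standard Content-* request headers to descriptive properties."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    description = {}
    if "content-type" in lowered:
        # Assumed mapping: media type becomes the attachment's format.
        description["dcterms:format"] = lowered["content-type"]
    if "content-length" in lowered:
        # Assumed mapping: body length in bytes becomes the content size.
        description["contentSize"] = int(lowered["content-length"])
    return description


# Hypothetical headers from a POST that creates the patch attachment.
print(describe_from_headers({"Content-Type": "text/x-diff",
                             "Content-Length": "1024"}))
```

Properties such as creator and created are deliberately absent: the provider would fill those in itself rather than read them from the request.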

For any information that we want to represent but for which there isn't a
corresponding standard header, we can just define a header ourselves. I
think it will be a very small set.

I haven't gotten into all the details of exactly what properties and
headers to use here (and particularly if we're defining our own headers --
I don't know if this has been done elsewhere in OSLC and there's a standard
form already). But, if people like my approach, I'm sure those details
could be worked out within the group.

So, that's my proposal. I hope people find it helpful. I'm happy to answer
any questions or address any concerns. Unfortunately, as I mentioned at the
top, I won't be able to do so for a few days.

Cheers,
Dave

--
Dave Steinberg
IBM Rational Software
davidms at ca.ibm.com