[oslc-cm] A Modest Proposal for Attachments
Steve K Speicher
sspeiche at us.ibm.com
Wed Jul 20 07:51:58 EDT 2011
Hi Dave,
I greatly appreciate you taking the time to put together such a
well-thought-out proposal. I have in fact been contemplating how to eliminate
the intermediate "AttachmentDescriptor" resource type, and I think this is a
nice alternative. I'll include specific comments inline in the proposal.
> From: Dave Steinberg <davidms at ca.ibm.com>
> To: oslc-cm at open-services.net
> Date: 07/19/2011 02:19 PM
> Subject: [oslc-cm] A Modest Proposal for Attachments
> Sent by: oslc-cm-bounces at open-services.net
>
> Hi all,
>
> The discussion on attachments in the last couple of meetings has been very
> interesting. Various issues and alternatives have been discussed, but speaking
> personally, I was more trying to understand the ideas than expressing strong
> preferences. I really needed to take some time to think about it (and look
> into some things that came up in the discussion that I didn't know much about)
> before forming opinions. I may be mistaken, but my sense is that the same
> might have been true for some other participants. I didn't get the sense that
> a consensus was formed, though if I'm incorrect about that, my apologies.
>
> Anyhow, now that I have thought things through, I thought I'd share my
> thoughts. If you like, it's my proposal for what I believe is the simplest,
> most straightforward approach to attachments that meets our needs.
> Unfortunately, I'm going to be away over the next few days (and so unable to
> attend tomorrow's meeting). Apologies for dropping this and running. That
> wasn't my intention, but unfortunately, I just wasn't able to get it done
> until today. If this spurs any discussion, please don't take my silence as
> disinterest. I'll speak up again as soon as I'm back.
>
> My proposal is to use a collection resource for attachments, with a URI that's
> separate from the change request itself, i.e.
>
> <http://example.com/bugs/2314>
> rdf:type oslc_cm:ChangeRequest ;
> dcterms:identifier "00002314" ;
> oslc:shortTitle "Bug 2314" ;
> ...
> oslc_cm:attachments <http://example.com/bugs/2314/attachments> .
>
> The main reason for this approach is to allow adding attachments by simply
> POSTing to this URI. That said, I think there are other benefits, too. It
> means that a consumer need not be overwhelmed with attachment information when
> retrieving a change request. I think that's quite reasonable, as attachments
> are often handled as a sort of secondary concern. Moreover, in cases where
> attachments are needed, it's just one additional GET to obtain all of the
> information about the attachments (as you'll see in a moment). Finally, I'll
> point out that there is precedent for the collection resource approach in
> Core's discussion mechanism. I can't see any reason why it's less appropriate
> here (indeed, I see reasons why it's even more appropriate).
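To make the "just POST to the collection URI" idea concrete, here is a minimal Python sketch of what a consumer's request might look like. It only builds the request and never sends it. The helper name build_attachment_post and the placeholder body are assumptions for illustration; nothing here is defined by the proposal beyond the collection URI and the use of POST.

```python
import urllib.request


def build_attachment_post(collection_uri, content, content_type):
    """Build (but do not send) a POST that would add one attachment.

    Hypothetical helper: the proposal only says that attachments are
    created by POSTing the content to the attachments collection URI.
    """
    return urllib.request.Request(
        collection_uri,
        data=content,
        headers={"Content-Type": content_type},
        method="POST",
    )


request = build_attachment_post(
    "http://example.com/bugs/2314/attachments",
    b"placeholder bytes standing in for the PNG content",
    "image/png",
)
print(request.get_method(), request.full_url)
```

The request body is the attachment content itself, so no multipart encoding is involved.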
>
> I think that the other proposal for creating attachments, a separate factory,
> is just a bad fit here. The nice RESTful factory approach would be to create
> the attachment resource by POSTing to it, then to add the attachment to a
> change request. But, as we discussed, that's not possible because some of the
> underlying systems actually create and attach an attachment in a single step.
> The proposal to deal with this was to require that a change request be
> specified in/with the attachment, but I fear that might seem non-obvious and
> arbitrary to consumers. The problem, I think, is that it's backwards. Our
> model should make the change request the primary thing, with the attachment
> subordinate to it (because that's the model used in several systems). And the
> familiar RESTful approach for that scenario is to use a collection property of
> the primary thing as the factory for the subordinate thing, i.e. the approach
> I suggest above.
>
> I should back up a bit and talk about what the attachments collection resource
> should look like. Here's what my first suggestion would be:
>
> <http://example.com/bugs/2314/attachments>
> rdf:type oslc_cm:AttachmentList ;
> rdf:_1 <http://example.com/bugs/2314/attachments/screenshot.png> ;
> rdf:_2 <http://example.com/bugs/2314/attachments/fix.patch> ;
> ... .
>
> I'm just using the standard RDF container membership properties here. That
> said, I know they're not popular in OSLC (though I don't really understand
> why),
I'm not really sure there is any true opposition to it.
> so my secondary proposal would be to use a new property:
>
> <http://example.com/bugs/2314/attachments>
> rdf:type oslc_cm:AttachmentList ;
> oslc_cm:attachment <http://example.com/bugs/2314/attachments/screenshot.png> ,
> <http://example.com/bugs/2314/attachments/fix.patch> , ... .
>
> Here's where, I think, my proposal diverges from the ideas we discussed in the
> meetings, and I'm strongly convinced it's an improvement: The objects of the
> attachment statements represent the attachments themselves, that is, if you
> did a GET on one of those URIs, you would retrieve the attachment content
> itself. The reason I think that's a very good thing is that it eliminates the
> intermediate per-attachment descriptive resource entirely. That means that
> creating an attachment really is just one POST -- no need for complications
> like multipart/mixed. So, where does that descriptive information go? Right in
> the collection resource:
I'm not sure this is a true statement. The triples have as their object the
actual attachment they describe. You are asking for additional semantics:
a GET on the attachment collection resource must also return the triples
about the attachments themselves. That makes this a special case of GET,
which may be worth doing, but I wanted to flag it as such. I'll suggest an
alternative in a bit.
>
> <http://example.com/bugs/2314/attachments>
> rdf:type oslc_cm:AttachmentList ;
> oslc_cm:attachment <http://example.com/bugs/2314/attachments/screenshot.png> ,
> <http://example.com/bugs/2314/attachments/fix.patch> , ... .
>
> <http://example.com/bugs/2314/attachments/screenshot.png>
> dcterms:title "screenshot.png" ;
> dcterms:format "image/png" ;
> dcterms:created "2011-07-18T13:22:30.45-05:00" ;
> ... .
>
> <http://example.com/bugs/2314/attachments/fix.patch>
> dcterms:title "fix.patch" ;
> dcterms:format "text/x-diff" ;
> dcterms:created "2011-07-19T15:03:54.00-05:00" ;
> ... .
>
> The purpose of this information is to help a consumer that wants to render a
> UI listing all of the attachments associated with the change request, so
> putting it right in the list resource is ideal. I think it's highly preferable
> to the proposal where an oslc_cm:attachment property would be used repeatedly
> in the change request itself, with the value of each being a separate
> intermediate resource. That would require the consumer to do a separate GET
> for each attachment description in order to render such a UI.
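As a rough illustration of the listing use case, the Python sketch below renders one line per attachment from the descriptions carried in the collection resource. It assumes the Turtle from the example has already been parsed into a plain dictionary keyed by attachment URI; the dictionary layout and the render_listing helper are hypothetical, and only the property names and values come from the example.

```python
# Hand-transcribed from the example collection resource; a real consumer
# would get this from an RDF parser after a single GET on the collection.
described_attachments = {
    "http://example.com/bugs/2314/attachments/screenshot.png": {
        "dcterms:title": "screenshot.png",
        "dcterms:format": "image/png",
        "dcterms:created": "2011-07-18T13:22:30.45-05:00",
    },
    "http://example.com/bugs/2314/attachments/fix.patch": {
        "dcterms:title": "fix.patch",
        "dcterms:format": "text/x-diff",
        "dcterms:created": "2011-07-19T15:03:54.00-05:00",
    },
}


def render_listing(described):
    """Return one display line per attachment, with no further GETs."""
    lines = []
    for props in described.values():
        lines.append(
            f'{props["dcterms:title"]} '
            f'({props["dcterms:format"]}, created {props["dcterms:created"]})'
        )
    return lines


listing = render_listing(described_attachments)
for line in listing:
    print(line)
```

With an intermediate descriptor resource per attachment, the same listing would need one extra GET for each entry.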
>
> I'll point out that RDF was designed to handle this very scenario well: we're
> just adding additional information about an existing resource (the attachment)
> by making statements about it in another resource. There was some talk of
> using reification in the last meeting, but I don't think it would be
> appropriate here. These are definitely statements about the attachment
> resource, not about the oslc_cm:attachment statements themselves (e.g., the
> format of the patch attachment, not the statement about it, is text/x-diff).
>
> So now, I think there's just one more question to answer: How did this
> information get there in the first place? I mentioned that eliminating the
> intermediate descriptive resource means that creating an attachment can be a
> simple POST, but what about providing this extra information?
>
> A couple of observations about this information (dare I call it metadata?):
> First, some of it (creator, created) probably doesn't need to be specified at
> all. The provider can determine it by itself. Second, some of it (format,
> contentSize) overlaps beautifully with the standard Content-* HTTP headers. So
> that can be specified by the consumer right in the POST request and returned
> by the provider right in the GET response, along with the content itself.
> Thus, duplicating that information in the collection resource is merely a
> convenience for consumers (for example, to render an attachment listing UI, as
> I described above).
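The overlap between headers and metadata can be sketched as a simple mapping. The Python below is only an illustration: the function name metadata_from_headers is hypothetical, the keys "format" and "contentSize" are the informal names used above (no vocabulary prefix has been settled), and the size value in the usage line is made up.

```python
def metadata_from_headers(headers):
    """Derive attachment metadata from standard Content-* HTTP headers.

    Header names are matched case-insensitively, as HTTP requires.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    metadata = {}
    if "content-type" in normalized:
        metadata["format"] = normalized["content-type"]
    if "content-length" in normalized:
        metadata["contentSize"] = int(normalized["content-length"])
    return metadata


# Illustrative header values only.
example = metadata_from_headers(
    {"Content-Type": "text/x-diff", "Content-Length": "2048"}
)
print(example)
```

A provider could run the same mapping when it receives the POST, and a consumer could run it on the GET response for the attachment.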
>
> For any information that we want to represent but for which there isn't a
> corresponding standard header, we can just define a header ourselves. I think
> it will be a very small set.
>
> I haven't gotten into all the details of exactly what properties and headers
> to use here (and particularly if we're defining our own headers -- I don't
> know if this has been done elsewhere in OSLC and there's a standard form
> already). But, if people like my approach, I'm sure those details could be
> worked out within the group.
>
> So, that's my proposal. I hope people find it helpful. I'm happy to answer any
> questions or address any concerns. Unfortunately, as I mentioned at the top, I
> won't be able to do so for a few days.
>
Let me suggest a slight modification to this concept: a way to handle the
attachment "metadata" independently of the attachment collection it belongs
to.
Let's start again with your example:
<http://example.com/bugs/2314/attachments>
rdf:type oslc_cm:AttachmentList ;
oslc_cm:attachment <http://example.com/bugs/2314/attachments/screenshot.png> ,
<http://example.com/bugs/2314/attachments/fix.patch> .
Now let's say I want to retrieve the metadata about the attachment
screenshot.png.
We could use some URL math to add on a request for metadata, such as:
GET http://example.com/bugs/2314/attachments/screenshot.png?metadata
This has the nice quality of being a different URL from the attachment
itself, and therefore, in RDF terms, a different resource. It can be
computed easily from the attachment URL. You can PUT to that URL to update
just the metadata, without requiring special request headers. The downside
is that if the server doesn't understand ?metadata, you may get back the
attachment content when you don't want it.
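The "URL math" could be as small as the following Python sketch. The helper name metadata_url is hypothetical, and the handling of an attachment URL that already carries a query string is my own assumption; only the ?metadata suffix comes from the suggestion above.

```python
from urllib.parse import urlsplit, urlunsplit


def metadata_url(attachment_url):
    """Compute the metadata URL for an attachment by adding ?metadata."""
    parts = urlsplit(attachment_url)
    # Assumption: if a query string is already present, append to it.
    query = f"{parts.query}&metadata" if parts.query else "metadata"
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


print(metadata_url("http://example.com/bugs/2314/attachments/screenshot.png"))
```

A careful consumer would still check the Content-Type of the response, since a server that ignores ?metadata would return the attachment content instead.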
Another alternative is HTTP content negotiation: you could request Accept:
text/turtle on the attachment URL and get just the metadata. This has the
drawback of being ambiguous when your attachment IS text/turtle. That may be
a rare case, but it is still an issue.
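To show where the ambiguity arises, here is a deliberately simplified Python sketch of how a server might choose a representation. The function name and its three return values are hypothetical, and the Accept parsing ignores quality values and wildcards, so it is not a complete implementation of HTTP content negotiation.

```python
def choose_representation(accept_header, attachment_format):
    """Decide what a GET on the attachment URL should return.

    Simplified: media types are compared literally, and q-values and
    wildcards such as */* are not interpreted.
    """
    accepted = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if "text/turtle" in accepted:
        if attachment_format.lower() == "text/turtle":
            # The attachment itself is Turtle, so the request is ambiguous.
            return "ambiguous"
        return "metadata"
    return "content"


print(choose_representation("text/turtle", "image/png"))
print(choose_representation("text/turtle", "text/turtle"))
```

The second call is the problem case: nothing in the request says whether the consumer wants the Turtle attachment or Turtle metadata about it.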
We will discuss this on today's call and will await your feedback.
I'll also capture the notes from today's discussion and send them out.
I have a conflict for the next meeting on August 3rd; perhaps we can
reschedule for next week to get something agreed on attachments. I
think we are close.
Thanks,
Steve Speicher | IBM Rational Software | (919) 254-0645