[oslc-core] OSLC Compact representation, titles with markup

Tue Aug 23 10:40:18 EDT 2011

Adam,

Your argument for escaping HTML is based purely on the use case of compact 
rendering. However, the OSLC specs are designed to enable many other use 
cases and to promote general interoperability among different tools.

The OSLC core spec provides guidelines for the use of some common RDF 
properties, including those from the Dublin Core. [1] dcterms:title and 
dcterms:description in general contain text that may include formatting 
information. There were several candidates for syntax - HTML, Wiki, RTF. 
The RDF spec contains the means for including XML literals, and XHTML is 
the W3C standard for formatted text. All the other text formats are 
convertible to XHTML. It was therefore chosen as the recommended way to 
package rich text.

The historical design of Jazz compact rendering is optimized for 
consumption by web browsers. However, the RDF plain text datatype does NOT 
imply that the text is HTML encoded. Another processor would be justified 
in displaying the verbatim encoding.

However, for the case of OSLC preview, I see no reason why the text 
couldn't be encoded, but it should be put in a property that makes that 
clear, e.g. oslc:htmlEncodedTitle.

FYI, I work on Focal Point and its REST API provides both a plain text and 
a marked up text version of some attributes.

[1] 
http://open-services.net/bin/view/Main/OSLCCoreSpecAppendixA?sortcol=table;up=#Dublin_Core_Properties

Regards, 
___________________________________________________________________________ 

Arthur Ryman 

DE, PPM Chief Architect

IBM Software, Rational 

Toronto Lab | +1-905-413-3077 

From:
Adam Archer/Toronto/IBM
To:
Arthur Ryman/Toronto/IBM at IBMCA
Cc:
Samuel Padgett <spadgett at us.ibm.com>, Randy Hudson <hudsonr at us.ibm.com>, 
"oslc-core at open-services.net" <oslc-core at open-services.net>, 
oslc-core-bounces at open-services.net
Date:
08/22/2011 05:00 PM
Subject:
Re: [oslc-core] OSLC Compact representation, titles with markup

The big concern to me is not the ability to process the RDF/XML with 
XPath, it's the ability to do so in a browser environment. Currently all 
implementations of all rich hovers in all Jazz-based products encode any 
HTML tags in their dcterms:title attributes (and doubly encode special 
characters). For the consumer on the browser side, this means simply 
taking the content of the attribute, decoding it (which browsers are very 
good at) and slapping the result into the DOM (which browsers are also 
very good at).

The alternative would be a total consumability nightmare from the point of 
view of a browser (which is the most important consumer of this entire 
spec). If the tags are actually child nodes in the XML representation, 
the document we get back from the XMLHttpRequest will contain child 
elements. We would then have to traverse a DOM tree and recreate a 
structure which could easily be represented as an escaped string, like 
everyone is doing today.
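
To make the two consumer paths concrete, here is a rough Python sketch 
(standing in for browser-side code) that recovers the displayable markup 
from each form. The wrapper document, the namespace URIs and both function 
names are assumptions made for illustration; they are not taken from any 
Jazz or OSLC implementation.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumed here for illustration.
NS = {"dcterms": "http://purl.org/dc/terms/"}

WRAPPER = (
    '<oslc:Compact xmlns:oslc="http://open-services.net/ns/core#" '
    'xmlns:dcterms="http://purl.org/dc/terms/">{}</oslc:Compact>'
)

# Markup escaped inside the title: the parser hands back one string.
escaped_doc = WRAPPER.format(
    "<dcterms:title>12345: &lt;s&gt;Null pointer exception during "
    "startup&lt;/s&gt;</dcterms:title>"
)

# Markup as real child elements of the title.
literal_doc = WRAPPER.format(
    "<dcterms:title>12345: <s>Null pointer exception during "
    "startup</s></dcterms:title>"
)


def markup_from_escaped(doc):
    """Escaped form: the element text already is the markup string."""
    title = ET.fromstring(doc).find("dcterms:title", NS)
    return title.text


def markup_from_literal(doc):
    """Child-element form: walk the children and re-serialize them."""
    title = ET.fromstring(doc).find("dcterms:title", NS)
    parts = [title.text or ""]
    for child in title:
        # tostring() also emits the text that follows the child (its tail).
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)
```

Both functions return the same markup string for these two documents; the 
second one simply needs the extra traversal step.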

I realize that implementation is not supposed to lead the spec, but I 
don't even think that would be the case here. The oslc compact spec grew 
organically out of the old jazz compact rendering spec which can be found 
here:

https://jazz.net/wiki/bin/view/Sandbox/CompactRenderingV1P1

If we look at the semantic description of the dc:title and jp:abbreviation 
it states explicitly that the content MUST be escaped:

> The HTML markup MUST be escaped; for example, "<b>" as "&lt;b&gt;". 

This decision was made consciously for very well defined technical reasons 
(discussed above) in the original spec. If that decision was reversed in 
the creation of the OSLC compact spec then I believe that to have been a 
huge mistake and would like to see the spec fixed rather than for all 
providers to have to change how their compact documents are served and all 
consumers to have to go to the trouble of walking the dom to determine 
what the provider is actually trying to show.

Adam Archer
Jazz Developer
IBM Toronto Lab

From:   Arthur Ryman/Toronto/IBM
To:     Samuel Padgett <spadgett at us.ibm.com>
Cc:     Adam Archer/Toronto/IBM at IBMCA, Randy Hudson <hudsonr at us.ibm.com>, 
"oslc-core at open-services.net" <oslc-core at open-services.net>, 
oslc-core-bounces at open-services.net
Date:   08/22/2011 04:40 PM
Subject:        Re: [oslc-core] OSLC Compact representation, titles with 
markup

Sam,

You wrote:

It's very difficult to parse the former using XPath. For instance, the
expression "/oslc:Compact/dcterms:title" takes out the "<s>" and "</s>"
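
As a small illustration of the quoted point, the Python sketch below uses 
ElementTree's itertext() as a stand-in for the XPath string-value of the 
selected node. The wrapper document and namespace URIs are assumed for the 
example.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for illustration.
NS = {"dcterms": "http://purl.org/dc/terms/"}

DOC = (
    '<oslc:Compact xmlns:oslc="http://open-services.net/ns/core#" '
    'xmlns:dcterms="http://purl.org/dc/terms/">'
    "<dcterms:title>12345: <s>Null pointer exception during startup</s>"
    "</dcterms:title></oslc:Compact>"
)


def title_string_value(doc):
    """Concatenate all descendant text of the title, dropping the tags."""
    title = ET.fromstring(doc).find("dcterms:title", NS)
    return "".join(title.itertext())
```

For this document the result is the title text with the "<s>" and "</s>" 
gone.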

I don't think problems using XPath are a valid reason to encode markup 
since RDF/XML itself is very difficult to process using XPath. At one 
point we tried to define an OSLC variant of RDF/XML that looked like 
"normal" XML. However, we abandoned that and now require support for 
generic RDF/XML.

There are many equivalent ways to represent a given set of triples in 
RDF/XML. It would therefore be very problematic to use XPath, XSLT, or 
XQuery to process RDF/XML. The safe way to process RDF/XML is to use an 
RDF toolkit like Jena.
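
A minimal sketch of why path-based processing is fragile, written in Python 
with only the standard library: the two documents below describe the same 
resource with the same type and title, once as a typed node element and 
once as rdf:Description plus rdf:type, yet a fixed path finds the title in 
only one of them. The resource URI and the function name are made up for 
the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "oslc": "http://open-services.net/ns/core#",
    "dcterms": "http://purl.org/dc/terms/",
}

# Form 1: typed node element.
TYPED_NODE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:oslc="http://open-services.net/ns/core#"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <oslc:Compact rdf:about="http://example.com/bugs/12345">
    <dcterms:title>12345: Null pointer exception</dcterms:title>
  </oslc:Compact>
</rdf:RDF>"""

# Form 2: same triples, written with rdf:Description and rdf:type.
DESCRIPTION = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:oslc="http://open-services.net/ns/core#"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.com/bugs/12345">
    <rdf:type rdf:resource="http://open-services.net/ns/core#Compact"/>
    <dcterms:title>12345: Null pointer exception</dcterms:title>
  </rdf:Description>
</rdf:RDF>"""


def titles_by_path(doc):
    """Look up titles with a fixed element path, as an XPath user would."""
    root = ET.fromstring(doc)
    return [t.text for t in root.findall("oslc:Compact/dcterms:title", NS)]
```

An RDF toolkit would report the same title for both documents; the fixed 
path only matches the first serialization.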

Regards, 
___________________________________________________________________________ 

Arthur Ryman 

DE, PPM Chief Architect

IBM Software, Rational 

Toronto Lab | +1-905-413-3077 

From:
Samuel Padgett <spadgett at us.ibm.com>
To:
"oslc-core at open-services.net" <oslc-core at open-services.net>
Cc:
Adam Archer/Toronto/IBM at IBMCA, Randy Hudson <hudsonr at us.ibm.com>
Date:
08/07/2011 01:01 PM
Subject:
[oslc-core] OSLC Compact representation, titles with markup
Sent by:
oslc-core-bounces at open-services.net

I believe the spec is a bit confusing when it comes to titles with markup
for UI Preview.

The Compact representation has a dcterms:title property. It's defined as an
XML Literal that can contain XHTML markup [1]. My understanding of XML
Literals as discussed in the RDF Primer [2] is that a title with markup
would look like this,

  <dcterms:title>12345: <s>Null pointer exception during
startup</s></dcterms:title>

The example [3] of this resource has a title like this, however,

  <dcterms:title> 12345: &lt;s&gt;Null pointer exception during
startup&lt;/s&gt; </dcterms:title>

The example doesn't seem to fit with the description.

It's very difficult to parse the former using XPath. For instance, the
expression "/oslc:Compact/dcterms:title" takes out the "<s>" and "</s>".
Most implementations I'm aware of also follow the example where markup is
encoded. It means special characters need to be "double encoded." For
instance, "12345: Values > 1000 incorrectly calculated" would be,

  <dcterms:title>12345: Values &amp;gt; 1000 incorrectly
calculated</dcterms:title>
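
The round trip implied by "double encoding" can be sketched in Python as 
follows. The function names and the bare <title> wrapper element are 
illustrative assumptions, and html.escape is used here for both the HTML 
layer and the XML layer.

```python
import html
import xml.etree.ElementTree as ET


def encode_title(plain_text):
    """Provider side: escape once for HTML, then once more for XML."""
    as_html = html.escape(plain_text, quote=False)
    return html.escape(as_html, quote=False)


def decode_title(xml_fragment):
    """Consumer side: the XML parser undoes one layer, unescape the other."""
    html_text = ET.fromstring(xml_fragment).text
    return html.unescape(html_text)


raw = "12345: Values > 1000 incorrectly calculated"
fragment = "<title>" + encode_title(raw) + "</title>"
```

Here decode_title(fragment) gives back the original text, and the 
intermediate string produced by the XML parser is the HTML-escaped form 
that a browser would render.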

I think we should add more clarity to the spec here, as getting this wrong
can open up consumers to cross-site scripting attacks. I'd also suggest we
say that providers MUST NOT use any markup with a <script> tag and
consumers MUST NOT display any markup with a <script> tag to guard against
this problem.
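
As a rough sketch of the proposed consumer-side check, the Python below 
flags title markup that contains a <script> start tag, using the standard 
library HTML parser. The class and function names are hypothetical. It 
covers only the <script> rule suggested above; other script vectors, such 
as event-handler attributes, are outside what it checks.

```python
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Records whether any <script> start tag appears in the input."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        if tag == "script":
            self.found = True


def contains_script(markup):
    """Return True if the decoded title markup has a <script> tag."""
    finder = ScriptFinder()
    finder.feed(markup)
    finder.close()
    return finder.found
```

A consumer could call contains_script() on the decoded title and fall back 
to showing it as plain text when the check returns True.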

Best Regards,
Sam

[1]
http://open-services.net/bin/view/Main/OslcCoreUiPreview?sortcol=table;up=#Representation_Compact

[2] http://www.w3.org/TR/rdf-syntax/#xmlliterals
[3]
http://open-services.net/bin/view/Main/OslcCoreUiPreview?sortcol=table;up=#XML_Representation_Format

_______________________________________________
Oslc-Core mailing list
Oslc-Core at open-services.net
http://open-services.net/mailman/listinfo/oslc-core_open-services.net