[oslc-core] OSLC Compact representation, titles with markup

Steve K Speicher sspeiche at us.ibm.com
Thu Oct 6 10:32:27 EDT 2011


The Core WG discussed this issue [1], which concerns the UI Preview spec [4],
on September 9 [2] and 21 [3].

The initial discussion led to the proposal that the spec was clear that the
title was an "XML Literal" and should therefore be treated as such. Having
gone through some research on the original intent of this property, its
development in the context of OSLC spec development and RDF concerns, and
where we are today (after discussion and seeing what implementations are
doing), I think we should consider an alternative approach to the one
suggested in the minutes.

New proposal: change the type of oslc:Compact's dcterms:title to xsd:string
and explain that the content is XML-encoded (being very explicit about the
encoding).
Impact: those implementations that generate XML literals and consume them
per spec. After some research I was not able to locate any such
implementations.
Justification: the UI Preview is an XML document; it is not RDF/XML (even
though it does validate as one), and the UI Preview's dcterms:title is
primarily consumed by JavaScript/XPath in browsers.
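
To make the proposal concrete, here is a minimal sketch of what a consumer would see if the title were a plain string with XML-encoded markup. The Compact document below is a hypothetical example written for illustration (it is not taken from the spec), and read_title is an invented helper name. The point is only that an ordinary XML parser hands back the markup already decoded, ready to be given to a browser.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical Compact document (an assumption for this sketch): the title
# is a plain string whose HTML markup was XML-escaped by the provider.
COMPACT_XML = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:oslc="http://open-services.net/ns/core#">
  <oslc:Compact rdf:about="http://example.com/bugs/2314">
    <dcterms:title>12345: &lt;s&gt;Null pointer exception during startup&lt;/s&gt;</dcterms:title>
  </oslc:Compact>
</rdf:RDF>
"""


def read_title(xml_text: str) -> str:
    """Return the character content of the first dcterms:title element."""
    root = ET.fromstring(xml_text)
    title = root.find(f".//{{{DCTERMS}}}title")
    if title is None or title.text is None:
        raise ValueError("no dcterms:title found")
    # The XML parser has already removed the escaping at this point.
    return title.text


title_markup = read_title(COMPACT_XML)
print(title_markup)
```

The consumer never has to walk child elements of dcterms:title; it receives one string and decides what to do with it.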

Please let me know if you see any issue with this approach (or if you support it).

Thanks,
Steve Speicher | IBM Rational Software | (919) 254-0645

[1] - http://open-services.net/bin/view/Main/OslcCoreV2Issues (Issue #25)
[2] - http://open-services.net/bin/view/Main/OslcCoreMeeting20110907
[3] - http://open-services.net/bin/view/Main/OslcCoreMeeting20110921
[4] - http://open-services.net/bin/view/Main/OslcCoreUiPreview


> From: Dave Steinberg <davidms at ca.ibm.com>
> To: oslc-core at open-services.net, 
> Date: 09/23/2011 10:05 AM
> Subject: Re: [oslc-core] OSLC Compact representation, titles with markup
> Sent by: oslc-core-bounces at open-services.net
> 
> Hi Arthur and all,
> 
> Sorry for being slow to respond.
> 
> Given all this, I agree with you: I think it makes sense to leave things
> alone, make this a special case in this particular content type, and
> document it as such. I still don't think that introducing a new property to
> represent the same information in another format is ideal, and it's not
> necessary if we document title as a plain text string that may contain valid
> HTML markup, as you suggest. Sticking with a single property also seems most
> pragmatic, as it means not imposing any additional burden on consumers.
> 
> As an aside, I'm still a fan of using typed literals in actual RDF
> representations, and I think it would be worth considering that for OSLC.
> Something interesting and related that was recently pointed out to me:
> There's a current effort to define a new version of RDF (1.1), and they are
> considering removing simple literals altogether, replacing them with
> syntactic sugar for string-typed literals (see
> http://www.w3.org/TR/2011/WD-rdf11-concepts-20110830/#section-Graph-Literal).
> 
> Cheers,
> Dave
> 
> -- 
> Dave Steinberg
> IBM Rational Software
> davidms at ca.ibm.com
> 
> 
> Arthur Ryman---09/09/2011 11:28:32 AM---Dave, I recently discussed this at
> the Core working group.
>
> From: Arthur Ryman/Toronto/IBM
> To: Dave Steinberg/Toronto/IBM at IBMCA
> Cc: oslc-core at open-services.net, oslc-core-bounces at open-services.net
> Date: 09/09/2011 11:28 AM
> Subject: Re: [oslc-core] OSLC Compact representation, titles with markup
> 
> 
> 
> Dave,
> 
> I recently discussed this at the Core working group.
> 
> I think the best approach is to not regard this as an RDF discussion at all,
> since the compact rendering format is explicitly NOT RDF. Its content type is
> application/x-oslc-compact+xml. By historical accident, it happens to be
> valid RDF/XML. However, its intended use is for Web UI, so it is very
> appropriate to have content that is only going to be presented in a Web
> browser.
>
> Since the content is not RDF, it does not seem useful to perpetuate the
> masquerade that it is RDF and go to the lengths of introducing a new RDF
> datatype. I therefore favour either leaving it as is and explicitly
> documenting the fact that the value is a plain text string that MAY contain
> valid HTML markup, or adding a new property, e.g. oslc:htmlTitle.
>
> I do not think we should provide a mechanism for adding HTML values to
> general RDF content since that leads us back to the multiple text format
> (XHTML, HTML, RTF, wiki, ...) set of problems.
> 
> Regards,
> ___________________________________________________________________________
> Arthur Ryman
> DE, PPM & Reporting Chief Architect
> IBM Software, Rational
> Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)
> 
> 
> Dave Steinberg---09/01/2011 11:10:56 AM---Hi Arthur, Thanks for the
> engagement, for seeing both sides, and for figuring out what
>
> From: Dave Steinberg/Toronto/IBM at IBMCA
> To: oslc-core at open-services.net
> Date: 09/01/2011 11:10 AM
> Subject: Re: [oslc-core] OSLC Compact representation, titles with markup
> Sent by: oslc-core-bounces at open-services.net
> 
> 
> 
> 
> Hi Arthur,
> 
> Thanks for the engagement, for seeing both sides, and for figuring out what
> was going on with the W3C Validator (and submitting a problem report).
>
> Regarding XHTML vs. HTML in general, I still think it would have been
> pragmatic to look at who is actually consuming/producing marked up text and
> where it's coming from/what's being done with it, to choose a format that
> minimizes the amount of conversion required. That said, I do see your
> reasons for favouring XHTML from the outset, and of course I recognize that
> the decision was made long ago and revisiting would have been difficult.
> Also, I do appreciate that you considered my point of view.
>
> On the particular issue of compact rendering, I would strongly advocate for
> option 2, defining a new datatype for HTML and using it together with the
> existing dcterms:title property. Defining such a type places no greater
> practical burden on providers or consumers than defining a new property. In
> either case, it's one new resource in the vocabulary to recognize, and they
> can handle values in exactly the same way (either by leaving the content as
> a string and leaving it to a browser to render, or by parsing and
> interpreting it themselves). However, using a new type separates the
> expression of the concerns in the standard RDF way: the property identifies
> the characteristic of the subject that the statement specifies, and the type
> suggests how to interpret the lexical form of the statement's object.
> Moreover, if we define a type, it can be reused with other properties, like
> dcterms:description, if that is ever needed.
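
The separation of concerns argued for above (the property says what is described, the datatype says how to read the value) can be sketched as a dispatch on the datatype URI alone. This is an illustration only: the HTML datatype URI is invented for the sketch, no such type is defined by OSLC or RDF, and Literal and to_browser_markup are hypothetical names.

```python
import html
from dataclasses import dataclass
from typing import Optional

XML_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"
# Hypothetical datatype URI, invented for this sketch only.
HTML_TYPE = "http://example.com/ns#HTML"


@dataclass(frozen=True)
class Literal:
    lexical_form: str
    datatype: Optional[str] = None  # None models a plain literal


def to_browser_markup(literal: Literal) -> str:
    """Pick a handling strategy from the datatype, never from the content."""
    if literal.datatype in (HTML_TYPE, XML_LITERAL):
        # Already markup: hand the lexical form to the browser unchanged.
        return literal.lexical_form
    # Plain or string-typed literal: escape so it is displayed verbatim.
    return html.escape(literal.lexical_form)


plain = to_browser_markup(Literal("1 < 2"))
marked_up = to_browser_markup(Literal("<s>fixed</s>", HTML_TYPE))
```

The same function works for any property (dcterms:title, dcterms:description, and so on), which is the reuse benefit claimed for a type over a new property.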
> 
> I would also suggest that the spec should explicitly provide guidance on
> typed vs. plain literals, hopefully in favour of the former.
> 
> Cheers,
> Dave
> 
> -- 
> Dave Steinberg
> IBM Rational Software
> davidms at ca.ibm.com
> 
> 
> Arthur Ryman---08/31/2011 09:17:40 AM---Dave/Randy, Thx for persisting on
> this point. It turns out that the W3C RDF Validator is in fact dis
>
> From: Arthur Ryman/Toronto/IBM
> To: Dave Steinberg/Toronto/IBM at IBMCA
> Cc: oslc-core at open-services.net, oslc-core-bounces at open-services.net
> Date: 08/31/2011 09:17 AM
> Subject: Re: [oslc-core] OSLC Compact representation, titles with markup
> 
> 
> 
> Dave/Randy,
> 
> Thx for persisting on this point. It turns out that the W3C RDF Validator is
> in fact displaying markup characters in strings incorrectly: it is escaping
> them. You can see the correct, unescaped results by turning on the Advanced
> option of N-Triples output.
>
> This discussion has made me realize that my suggested name for a new
> oslc:htmlEncodedTitle property is misleading. Encoding is only required when
> you put the triple in an XML document, e.g. the OSLC compact rendering
> resource. The encoding is removed by the parser and you end up with the
> unescaped string. Since we are defining RDF predicates, the reference to
> encoding is inappropriate because there is no encoding at the RDF value
> level.
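
The point that encoding belongs to the XML serialization and not to the value can be checked with a small round trip. This is a sketch using Python's standard XML library, not an RDF toolkit; the element name "title" is a stand-in chosen for the example.

```python
import xml.etree.ElementTree as ET

# The value as it exists at the data-model level: no escaping at all.
value = "12345: <s>Null pointer exception during startup</s>"

# Writing the value into an XML element forces the serializer to escape it.
element = ET.Element("title")
element.text = value
serialized = ET.tostring(element, encoding="unicode")

# Reading it back: the parser removes the escaping again.
parsed_value = ET.fromstring(serialized).text

print(serialized)
print(parsed_value)
```

The escaped form appears only in the serialized string; the value before writing and after parsing is the same unescaped text.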
> 
> We therefore have the following alternatives for markup in the title:
>
> 1. Use XML Literal datatype and XHTML content.
> 2. Define a new datatype for HTML.
> 3. Define a new predicate for HTML titles, e.g. oslc:htmlTitle.
>
> Using HTML within the context of the UI preview is OK since the UI is
> expected to be a web UI and you'd just copy the string.
>
> However, I think using HTML in RDF is not a good idea because all readers of
> the data would then have to cope with it. I mentioned that Tidy could be
> used by the writer of the data to convert it to XHTML. That does not mean
> this is practical for all readers of the data. In general, when you are
> designing a format for interoperability, you should convert diverse formats
> into one common format. We should therefore adopt XHTML as the one common
> format for marked up text interchange.
>
> Recall that HTML is only one alternate format. We also have sources that
> produce rich text (RTF), and wiki text. Agreeing on XHTML is a useful
> simplification.
> 
> Regards,
> ___________________________________________________________________________
> Arthur Ryman
> DE, PPM & Reporting Chief Architect
> IBM Software, Rational
> Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)
> 
> 
> Arthur Ryman---08/30/2011 05:49:46 PM---Dave, My point was that when you
> use rdf:datatype, the content of the element
>
> From: Arthur Ryman/Toronto/IBM at IBMCA
> To: Dave Steinberg/Toronto/IBM at IBMCA
> Cc: oslc-core at open-services.net, oslc-core-bounces at open-services.net
> Date: 08/30/2011 05:49 PM
> Subject: Re: [oslc-core] OSLC Compact representation, titles with markup
> Sent by: oslc-core-bounces at open-services.net
> 
> 
> 
> 
> Dave,
> 
> My point was that when you use rdf:datatype, the content of the element
> must be a string, not XML. When you use rdf:parseType="Literal" the
> content is expected to be XML. In the RDF data model, the lexical space of
> XML consists of well-formed XML fragments, i.e. there is no escaping other
> than that required by XML.
>
> You managed to get the rdf:datatype case to validate by escaping the XML
> markup, i.e. turning it into a string, which seems like unnecessary work
> if you already have an XML fragment.
>
> BTW, I don't understand why the W3C RDF Validation service is displaying
> the XML content as escaped. That means the data is actually
> double-escaped. I'd be happier seeing plain text N-Triples or Turtle.
>
> It seems to me that since RDF/XML is well-formed XML, then the natural way
> to include XML literals is as XML, not as a string that contains escaped
> XML markup. However, I concede your point that in principle we don't need
> rdf:parseType="Literal" if you are sure that we get exactly the same set
> of triples using just rdf:datatype. If so, you are correct in saying that
> rdf:parseType="Literal" is just syntactic sugar.
>
> I see where you are going with this. You want OSLC to create a new
> datatype for HTML and you are demonstrating that rdf:datatype gives you
> the mechanism to do this. As I said before, creating a new datatype will
> limit interoperability since other processors will not know how to process
> the new datatype. There is no standard way to define the meaning of a new
> RDF datatype.
> 
> Regards, 
> 
___________________________________________________________________________ 

> 
> Arthur Ryman 
> 
> DE, PPM & Reporting Chief Architect
> IBM Software, Rational 
> Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile) 
> 
> 
> 
> 
> 
> From:
> Dave Steinberg/Toronto/IBM at IBMCA
> To:
> oslc-core at open-services.net
> Date:
> 08/26/2011 05:31 PM
> Subject:
> Re: [oslc-core] OSLC Compact representation, titles with markup
> Sent by:
> oslc-core-bounces at open-services.net
> 
> 
> 
> Hi Arthur,
> 
> Sorry, but I just don't agree. The two links you gave are both to the
> RDF/XML spec, and they describe a special syntax for XMLLiteral-typed
> literals and a general syntax for typed literals. They do not state that
> the general syntax cannot be used for the case of XMLLiteral, and they
> don't say anything that contradicts my understanding of the RDF abstract
> data model.
>
> Indeed, if you follow the "XML literals" link in Section 2.8, the RDF
> Concepts spec defines XMLLiteral, like any other datatype, with a lexical
> space, a value space and a mapping between the two. So, given any XML
> value, what is to prevent you from using that mapping to compute a
> corresponding lexical form, combining it with the datatype URI, and using
> the ordinary literal notation (in any RDF concrete syntax)?
> 
> I just tried entering the following two RDF/XML documents into the
> validation service:
>
> <rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
> <rdf:Description rdf:about="http://example.com/bugs/2314">
> <dcterms:title rdf:parseType="Literal" xmlns="http://www.w3.org/1999/xhtml"> 12345: <s>Null pointer exception during startup</s></dcterms:title>
> </rdf:Description>
> </rdf:RDF>
>
> <rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
> <rdf:Description rdf:about="http://example.com/bugs/2314">
> <dcterms:title rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"> 12345: &lt;s xmlns="http://www.w3.org/1999/xhtml"&gt;Null pointer exception during startup&lt;/s&gt;</dcterms:title>
> </rdf:Description>
> </rdf:RDF>
> 
> It yielded exactly the same result in both cases:
> 
> 
> 
> I can also confirm Steve's claim that Jena can be configured to write out
> exactly the same triples using either syntax.
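
The difference between the two serializations can be seen at the XML level with a plain XML parser. The sketch below does not run an RDF parser, so it does not by itself show that the triples are identical; it only shows that the first form delivers the markup as a child element, the second delivers it as character data, and both carry the same XHTML content. It assumes the markup in the rdf:datatype document is escaped, as a later reply in this thread describes.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"
XHTML = "http://www.w3.org/1999/xhtml"

HEAD = (f'<rdf:RDF xmlns:dcterms="{DCTERMS}" xmlns:rdf="{RDF}">'
        '<rdf:Description rdf:about="http://example.com/bugs/2314">')
TAIL = "</rdf:Description></rdf:RDF>"

# Form 1: rdf:parseType="Literal" with real element content.
PARSE_TYPE_DOC = HEAD + (
    f'<dcterms:title rdf:parseType="Literal" xmlns="{XHTML}"> 12345: '
    "<s>Null pointer exception during startup</s></dcterms:title>") + TAIL

# Form 2: rdf:datatype with the markup escaped into character data.
DATATYPE_DOC = HEAD + (
    f'<dcterms:title rdf:datatype="{RDF}XMLLiteral"> 12345: '
    f'&lt;s xmlns="{XHTML}"&gt;Null pointer exception during startup&lt;/s&gt;'
    "</dcterms:title>") + TAIL


def title_of(document: str) -> ET.Element:
    """Return the dcterms:title element of an RDF/XML document."""
    return ET.fromstring(document).find(f".//{{{DCTERMS}}}title")


literal_form = title_of(PARSE_TYPE_DOC)
escaped_form = title_of(DATATYPE_DOC)

# Form 1: the markup arrives as a child element in the XHTML namespace.
child = literal_form[0]

# Form 2: the markup arrives as text; parse the part after "12345: ".
reparsed = ET.fromstring(escaped_form.text.split(": ", 1)[1])

print(child.tag, reparsed.tag)
```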
> 
> Cheers,
> Dave
> 
> -- 
> Dave Steinberg
> IBM Rational Software
> davidms at ca.ibm.com
> 
> 
> Arthur Ryman---08/26/2011 03:58:51 PM---Dave, No, it's not just syntactic
> sugar. You need rdf:parseType="Literal" if you include element con
> 
> 
> From:
> 
> Arthur Ryman/Toronto/IBM
> 
> To:
> 
> Dave Steinberg/Toronto/IBM at IBMCA
> 
> Cc:
> 
> oslc-core at open-services.net, oslc-core-bounces at open-services.net
> 
> Date:
> 
> 08/26/2011 03:58 PM
> 
> Subject:
> 
> Re: [oslc-core] OSLC Compact representation, titles with markup
> 
> 
> Dave,
> 
> No, it's not just syntactic sugar. You need rdf:parseType="Literal" if you
> include element content. If you use rdf:datatype then only character
> content is allowed.
>
> This is explained in the spec at [1] and [2]. rdf:parseType="Literal"
> allows XML Literal content. rdf:datatype="whatever" allows string content.
>
> However, since specs are hard to understand, I suggest you convince
> yourself of this, as I did, by using the W3C RDF Validation service. [3]
>
> [1] http://www.w3.org/TR/REC-rdf-syntax/#section-Syntax-XML-literals
> [2] http://www.w3.org/TR/REC-rdf-syntax/#section-Syntax-datatyped-literals
> [3] http://www.w3.org/RDF/Validator/
> 
> Regards, 
> 
___________________________________________________________________________ 

> 
> Arthur Ryman 
> 
> DE, PPM & Reporting Chief Architect
> IBM Software, Rational 
> Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile) 
> 
> 
> Dave Steinberg---08/26/2011 03:22:10 PM---Arthur, I believe you're 
> mistaken. I think that parseType="Literal" is just
> 
> 
> From:
> 
> Dave Steinberg/Toronto/IBM at IBMCA
> 
> To:
> 
> oslc-core at open-services.net
> 
> Date:
> 
> 08/26/2011 03:22 PM
> 
> Subject:
> 
> Re: [oslc-core] OSLC Compact representation, titles with markup
> 
> Sent by:
> 
> oslc-core-bounces at open-services.net
> 
> 
> 
> Arthur,
> 
> I believe you're mistaken. I think that parseType="Literal" is just
> syntactic sugar (RDF Primer: "RDF/XML provides a special notation to make
> it easy to write literals of this kind"). Either way you write it, you end
> up with the same statement. Two statements with the same subject, the same
> predicate and a typed literal with the same type
> (<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>) and the same
> lexical form are indistinguishable.
>
> Also, if you were correct, parseType="Literal" would provide RDF/XML with
> some sort of privileged XMLLiteral representation that couldn't be written
> out using any other RDF notation.
> 
> Cheers,
> Dave
> 
> -- 
> Dave Steinberg
> IBM Rational Software
> 905-413-3705
> davidms at ca.ibm.com
> 
> 
> Arthur Ryman---08/26/2011 02:22:29 PM---Randy, Your example makes the 
> content a string that looks like XHTML, i.e. the
> 
> From:
> 
> Arthur Ryman/Toronto/IBM
> 
> To:
> 
> Randy Hudson/Raleigh/IBM at IBMUS
> 
> Cc:
> 
> Dave Steinberg <davidms at ca.ibm.com>, oslc-core at open-services.net, 
> oslc-core-bounces at open-services.net
> 
> Date:
> 
> 08/26/2011 02:22 PM
> 
> Subject:
> 
> Re: [oslc-core] OSLC Compact representation, titles with markup
> 
> 
> 
> Randy,
> 
> Your example makes the content a string that looks like XHTML, i.e. the
> content contains no XHTML elements since all the markup characters are
> encoded. A string is simply parsed character data and is valid XML.
>
> The correct way to include the XHTML elements is:
>
> <dcterms:title rdf:parseType="Literal"> 12345: <s xmlns="http://www.w3.org/1999/xhtml">Null pointer exception during startup</s></dcterms:title>
>
> The OSLC Guidelines about escaping are for the case where you need to
> include characters that might get misinterpreted as XML markup. For
> example, consider a math statement like "1 < 2". When you put that in an
> XML element, you need to encode it as "1 &lt; 2".
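
The escaping rule for character data can be tried directly with Python's standard library. This is a small sketch; escape and unescape are existing functions in xml.sax.saxutils, and the variable names are chosen for the example.

```python
from xml.sax.saxutils import escape, unescape

statement = "1 < 2"

# What has to be written inside an XML element.
encoded = escape(statement)

# What an XML parser gives back to the application.
decoded = unescape(encoded)

print(encoded)
print(decoded)
```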
> 
> Regards, 
> 
___________________________________________________________________________ 

> 
> 
> Arthur Ryman 
> 
> DE, PPM & Reporting Chief Architect
> IBM Software, Rational 
> Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile) 
> 
> 
> 
> 
> 
> From:
> Randy Hudson/Raleigh/IBM at IBMUS
> To:
> Arthur Ryman <ryman at ca.ibm.com>
> Cc:
> Dave Steinberg <davidms at ca.ibm.com>, oslc-core at open-services.net, 
> oslc-core-bounces at open-services.net
> Date:
> 08/25/2011 07:06 PM
> Subject:
> Re: [oslc-core] OSLC Compact representation, titles with markup
> 
> 
> The following input is also equivalent:
>
> <dcterms:title rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"> 12345: &lt;s xmlns="http://www.w3.org/1999/xhtml"&gt;Null pointer exception during startup&lt;/s&gt;</dcterms:title>
>
> So there are (at least) two different ways to serialize a property value
> of type XML literal. But the OSLC guidelines state:
>
> 1.2 If property value is a Literal value-type
> 1.2.1 Inside the XML element add the value as a string with any required
> escaping
>
> That would seem to suggest that the above form should be used.
> 
> -Randy
> 
> 
> 
> 
> From:
> Arthur Ryman <ryman at ca.ibm.com>
> To:
> Dave Steinberg <davidms at ca.ibm.com>
> Cc:
> oslc-core at open-services.net, oslc-core-bounces at open-services.net
> Date:
> 08/25/2011 04:34 PM
> Subject:
> Re: [oslc-core] OSLC Compact representation, titles with markup
> Sent by:
> oslc-core-bounces at open-services.net
> 
> 
> 
> Dave,
> 
> 1. XML Namespaces. 
> 
> RDF/XML is well-formed XML so it must support namespaces correctly. For
> triples whose datatype is XML Literal, the value of this literal is a
> well-formed XML fragment, and therefore the namespaces should be present
> in the content. If there is an enclosing <span> element, then the
> namespace should be there. Otherwise, each element in the content should
> have the namespace.
>
> The spec doesn't say "for XHTML, you need to insert an xmlns attribute for
> http://www.w3.org/1999/xhtml" because that is part of the XHTML standard,
> i.e. it's not XHTML unless the elements are in the XHTML namespace.
> 
> 2. Jena
> 
> I loaded the sample RDF/XML into Fuseki, which uses Jena, and it produced
> the correct result. I assume the Jena API lets you get an XML DOM from the
> literal value.
>
> The input contained:
> <dcterms:title rdf:parseType="Literal" xmlns="http://www.w3.org/1999/xhtml"> 12345: <s>Null pointer exception during startup</s> </dcterms:title>
>
> The output value is:
> " 12345: <s xmlns="http://www.w3.org/1999/xhtml">Null pointer exception during startup</s> "^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
> 
> 3.  XHTML versus HTML
> 
> The primary reason is that RDF supports XHTML via the XMLLiteral datatype.
> There is no parsing support for HTML built into RDF.
>
> Another strong reason is that the syntax of HTML is very irregular and
> hard to parse correctly - that is one of the reasons XML was invented.
> This is very important from a security viewpoint. To guard against script
> injection attacks, you really should parse the input and remove any
> <script> elements or JavaScript attributes. Doing that correctly for HTML
> requires a full HTML parser. On the other hand, the XHTML is given to you
> as a DOM which you can easily traverse or process using XSLT or XPath.
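
The kind of tree traversal described above can be sketched with a standard XML parser. This is an illustration of the idea, not a complete sanitizer: it assumes well-formed XHTML input, removes only script elements and attributes whose names start with "on", and ignores other vectors such as javascript: URLs. The function names are invented for the sketch.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
SCRIPT_TAGS = {"script", f"{{{XHTML}}}script"}


def _remove_keeping_tail(parent, child):
    """Detach child but keep the text that followed it in the document."""
    siblings = list(parent)
    index = siblings.index(child)
    tail = child.tail or ""
    if index == 0:
        parent.text = (parent.text or "") + tail
    else:
        previous = siblings[index - 1]
        previous.tail = (previous.tail or "") + tail
    parent.remove(child)


def sanitize(fragment: str) -> ET.Element:
    """Parse an XHTML fragment and strip scripts and on* attributes."""
    root = ET.fromstring(fragment)  # raises ParseError if not well-formed
    # Collect first, then remove, so the tree is not mutated while iterating.
    doomed = [(parent, child)
              for parent in root.iter()
              for child in parent
              if child.tag in SCRIPT_TAGS]
    for parent, child in doomed:
        _remove_keeping_tail(parent, child)
    for element in root.iter():
        for name in list(element.attrib):
            if name.lower().startswith("on"):
                del element.attrib[name]
    return root


FRAGMENT = (f'<span xmlns="{XHTML}">12345: <script>alert(1)</script>'
            '<s onclick="evil()">Null pointer exception</s> during startup</span>')
clean = sanitize(FRAGMENT)
print("".join(clean.itertext()))
```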
> 
> 4. Datatypes
> 
> The specs do specify the datatypes for some properties. Look at the
> Value-Type column of the tables, e.g. [1]. You need to include the
> datatype explicitly for ints, dates, XML, etc. You specify that using
> rdf:datatype in RDF/XML, or using ^^ in Turtle.
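
The two notations mentioned above can be illustrated with a pair of small formatting helpers. This is a sketch only: the helper names and the ex:estimate property are hypothetical, the Turtle helper handles only the escapes needed for short quoted strings, and the RDF/XML helper assumes the rdf and ex prefixes are declared on an enclosing element.

```python
from xml.sax.saxutils import escape, quoteattr

XSD = "http://www.w3.org/2001/XMLSchema#"


def turtle_typed_literal(lexical_form: str, datatype_uri: str) -> str:
    """Render "lexical"^^<datatype> in Turtle's short string form."""
    escaped = (lexical_form.replace("\\", "\\\\")
                           .replace('"', '\\"')
                           .replace("\n", "\\n")
                           .replace("\r", "\\r"))
    return f'"{escaped}"^^<{datatype_uri}>'


def rdfxml_typed_property(qname: str, lexical_form: str, datatype_uri: str) -> str:
    """Render a property element that carries an rdf:datatype attribute."""
    return (f"<{qname} rdf:datatype={quoteattr(datatype_uri)}>"
            f"{escape(lexical_form)}</{qname}>")


turtle_form = turtle_typed_literal("42", XSD + "integer")
rdfxml_form = rdfxml_typed_property("ex:estimate", "42", XSD + "integer")
print(turtle_form)
print(rdfxml_form)
```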
> 
> I don't know what the state of adoption is. We really should get some test
> suites written for the specs.
> 
> 5. Inventing new Datatypes
> 
> The RDF spec defines the XSD datatypes and the XMLLiteral datatype. RDF
> parsers know how to parse those. If someone introduces a new datatype URI,
> it could break parsers since they won't know how to parse the contents.
> There is no standard way to define new datatypes.
> 
> Try it with the RDF Validation service [2]
> 
> [1] http://open-services.net/bin/view/Main/OSLCCoreSpecAppendixA
> [2] http://www.w3.org/RDF/Validator/
> 
> Regards, 
> 
___________________________________________________________________________ 

> 
> 
> 
> Arthur Ryman 
> 
> DE, PPM & Reporting Chief Architect
> IBM Software, Rational 
> Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile) 
> 
> 
> 
> 
> 
> From:
> Dave Steinberg/Toronto/IBM at IBMCA
> To:
> oslc-core at open-services.net
> Date:
> 08/24/2011 03:05 PM
> Subject:
> Re: [oslc-core] OSLC Compact representation, titles with markup
> Sent by:
> oslc-core-bounces at open-services.net
> 
> 
> 
> Hi Arthur,
> 
> Thanks for the response. Apologies for being slow in replying; I've been
> out sick for the last day and a half.
>
> I agree that putting the XML namespace on the enclosing element would be a
> convenience, but only if tools supported that. As far as I could find,
> Jena provides no fine-grained access to namespace declarations (i.e. other
> than at the model level), so I believe that one couldn't use it to produce
> or consume the fragment that you suggested. Moreover, the other RDF
> representations offer no such convenience, even in theory.
>
> So, it seems to me that the suggestion to use a namespace was actually a
> pretty significant one, and not one that's reflected in the specs, since
> you'd always need an enclosing element for your XML content.
>
> Thanks for the suggestion of using Tidy to convert from HTML to XHTML.
> That was very helpful for me. But I must admit, I'm still left wondering
> what makes XHTML superior to HTML for interchanging formatted text,
> especially in light of the compact representation example and my own
> experiences, where the opposite seems to be true.
>
> One last thing that I'll emphasize is that I mentioned a lack of guidance
> in the OSLC specs specifically about plain vs. typed literals. It seems so
> odd to me that plain literals seem to be favoured everywhere, except when
> it comes to using XMLLiteral with rdf:parseType="Literal", but none of
> this is acknowledged or explained anywhere. It looks like using a typed
> literal in this one case is accepted merely as a requirement to benefit
> from the prettier RDF/XML syntax for XML content. However, I view things
> completely in the opposite light. To me, typed literals are a powerful
> benefit of RDF. You can use a typed literal to decide how to handle a
> literal value, without looking at the value itself, but that advantage is
> lost without a sufficiently specific type. Thus, I don't understand how
> defining and using a new RDF datatype to identify something as widely
> recognized and understood as HTML would impair interoperability. I think
> it would do the opposite.
> 
> Cheers,
> Dave
> 
> -- 
> Dave Steinberg
> IBM Rational Software
> davidms at ca.ibm.com
> 
> 
> Arthur Ryman---08/23/2011 10:09:55 AM---Dave, Thx for the comments.
> 
> 
> From:
> 
> Arthur Ryman/Toronto/IBM
> 
> To:
> 
> Dave Steinberg/Toronto/IBM at IBMCA
> 
> Cc:
> 
> oslc-core at open-services.net, oslc-core-bounces at open-services.net
> 
> Date:
> 
> 08/23/2011 10:09 AM
> 
> Subject:
> 
> Re: [oslc-core] OSLC Compact representation, titles with markup
> 
> 
> Dave,
> 
> Thx for the comments.
> 
> I agree that the guidance on using XMLLiteral is not very clear in the
> spec. There was a lot of discussion about this at the time the spec was
> under development, but not much of that discussion survived the editorial
> process. The only place I see it is in the appendix on standard properties
> - dcterms:title and dcterms:description. [1]
>
> The guidance was that dcterms:title should be valid XHTML <span> content
> and dcterms:description valid XHTML <div> content. This means that the RDF
> datatype should be XMLLiteral and that appropriate namespaces should be
> used for XHTML content.
>
> Putting the XHTML namespace on the enclosing element is a convenience. The
> parser should propagate that to the content, i.e. when you look at the
> triples, the XML literal node should have the inherited namespace.
>
> If you wanted the namespace directly in the content then you could enclose
> the content in a <div> or <span> and put the namespace there.
>
> Using XHTML is the best way to achieve interchange of formatted text.
> There are converters from HTML to XHTML, e.g. Tidy. However, in the case of
> preview, why would conversion be needed? Shouldn't we be defining content
> that is XHTML?
>
> In another use case, people wanted to use native Wiki text as the content.
> However, that would cause a big interop problem since there are many Wiki
> syntaxes. All of these are convertible to XHTML since that is what the
> Wikis do to display the formatted result. In another use case, people
> wanted to include Rich Text.
>
> The general theme is that developers want to use whatever native format
> their tool supports, e.g. HTML, wiki text, and Rich Text, since it avoids
> conversions. However, this would couple the resource to the tool. OSLC is
> trying to achieve interoperability among heterogeneous tools. Therefore a
> common rich text format is needed.
>
> The alternative of defining new RDF datatypes for HTML, wiki text, RTF
> etc. would mean that OSLC resources would not be understood by other
> applications. In general, the creation of new RDF datatypes is discouraged
> since it impairs interoperability.
>
> [1] http://open-services.net/bin/view/Main/OSLCCoreSpecAppendixA?sortcol=table;up=#Dublin_Core_Properties
> 
> 
> 
> 
> Regards, 
> 
___________________________________________________________________________ 

> 
> 
> 
> Arthur Ryman 
> 
> 
> DE, PPM Chief Architect
> 
> IBM Software, Rational 
> 
> Toronto Lab | +1-905-413-3077 
> 
> 
> 
> 
> Dave Steinberg---08/23/2011 12:06:32 AM---Hi all, I've been following this
> thread with interest, as it touches on some of the
> 
> 
> From:
> 
> Dave Steinberg/Toronto/IBM at IBMCA
> 
> To:
> 
> oslc-core at open-services.net
> 
> Date:
> 
> 08/23/2011 12:06 AM
> 
> Subject:
> 
> Re: [oslc-core] OSLC Compact representation, titles with markup
> 
> Sent by:
> 
> oslc-core-bounces at open-services.net
> 
> 
> 
> Hi all,
> 
> I've been following this thread with interest, as it touches on some of
> the more general confusion/discomfort I've been developing over the past
> several weeks or months about the use of XMLLiteral with
> rdf:parseType="Literal" for HTML content.
> 
> Adam's comments below are particularly interesting. In general, it's not
> clear to me who benefits from the use of the unescaped literal
> representation, or in what scenario. And that approach, then, requires the
> use of the XMLLiteral type, which I also wonder about (as I'll explain
> further). If there is some benefit that I don't know about, perhaps it
> derails this whole line of thought. But if there isn't, could this be a
> case of the concrete representation tail wagging the abstract syntax dog?
> 
> One thing that always struck me as odd was that rdf:parseType="Literal"
> examples were the only ones I could find anywhere in OSLC that use typed
> literals (the XMLLiteral type is implicit with this special RDF/XML
> syntax). Moreover, I couldn't find any guidance in the specs about the use
> of plain vs. typed literals at all. From the perspective of a client,
> anyway, it would seem a very nice thing if a particular provider would use
> a typed literal to tell you that a title, for example, should be treated
> as a simple string or as HTML content. And that's the very thing that
> typed literals do. It could be that the presence of an XMLLiteral type is
> supposed to signal the use of XHTML content, and the absence of any type
> is supposed to signal plain text. But I couldn't find that spelled out
> anywhere -- if it is, perhaps it's hard to find, or perhaps I just did a
> poor job of looking -- and I'd argue it would be better to include types
> in both cases. [1]
> 
> It's this line of thinking that leads me to question the use of XMLLiteral
> in the first place. I saw in some old discussions that the intention in
> OSLC was not for XMLLiteral to imply XHTML necessarily. Using it for other
> XML languages was considered and endorsed, in principle. But where does
> that leave XHTML? With a type that doesn't really say what it is or what
> you can do with it. We have specs that communicate the XHTML intent in
> words, but we also have a mechanism built into RDF that could communicate
> this, and we're not really using it fully. Thus, I think it would be
> preferable to define and use a type that specifically represents HTML. And
> note, I suggest HTML, not XHTML, since using any type other than
> XMLLiteral eliminates the "benefit" of the special rdf:parseType="Literal"
> syntax. And without that, I don't see a particular benefit in the stricter
> XHTML syntax.
> 
> One other possibility that I've considered, which Arthur suggested
> previously, is using a namespace to identify that the XML is XHTML, in
> particular, instead of doing it directly in the literal type. And I
> believe that, strictly, the XHTML namespace is required for the elements
> to be valid XHTML. But I found no hint of this in the spec or any
> examples, and certainly RTC doesn't do this (I haven't checked other
> providers). Moreover, I believe it's also a worse approach, since there's
> no guarantee that your RDF runtime of choice will give you access to
> namespaces declared on the property element (I don't believe Jena does),
> and detecting a namespace inside the element content would require
> actually parsing the value as XML. If all you want to do is pass markup
> along for display in a browser, it would be unfortunate to have to
> actually parse the content to determine that it's XHTML.
> 
> And this is where I close the loop on my thinking, by coming back to how a
> consumer might actually want to make use of HTML content. Even outside of
> the compact rendering scenario, ultimately it's probably going to get
> displayed by a browser, whether as part of a larger Web page or in a
> browser-backed widget in a rich client. And for that, HTML is probably
> just as good as, if not better than, XHTML. Rather than worrying about
> whether the content is well-formed XML, it's probably sufficient to just
> give it to the browser and see what it can do with it. I would assert that
> "something a browser can render" has been the working definition of HTML
> for a good number of years now, while XHTML has largely faded in
> importance.
> 
> Going the other way, the appeal of HTML really shows. If a provider
> natively deals with HTML (without concern for XML well-formedness), it
> would be attractive to not have to convert that into XHTML to expose it
> via OSLC. Likewise, a consumer may use a rich text control that yields
> HTML. Generalized parsing of HTML for conversion to XHTML is non-trivial,
> and it seems unfortunate to impose that conversion task onto everyone,
> just so that we can use rdf:parseType="Literal" in RDF/XML and avoid
> applying normal XML encoding to markup content (of course, some encoding
> will likely be required for other RDF syntaxes anyway).
> 
> So, those are my thoughts on this (admittedly enlarged) topic. Even if
> they all do make perfect sense (and I'm not necessarily claiming they do),
> I realize we may be well past the point of being able to act on them.
> Still, I thought I'd put them out there and see what others make of them.
> 
> Cheers,
> Dave
> 
> 
> [1] In fact, I think that the consistent use of typed literals in general
> would be beneficial. You could even imagine exploiting them as a
> compatibility measure, if it was decided that the type of a property
> needed to change. This is a related, but separate, topic, which I'd be
> thrilled to discuss further, but I don't want to open too many cans of
> worms at once.
> 
> [2] Or, perhaps, a less kind way of putting that is that the XHTML 
> namespace is required for the elements to 
> 
> -- 
> Dave Steinberg
> IBM Rational Software
> davidms at ca.ibm.com
> 
> 
> From: Adam Archer/Toronto/IBM at IBMCA
> To: Arthur Ryman/Toronto/IBM at IBMCA
> Cc: "oslc-core at open-services.net" <oslc-core at open-services.net>,
> Randy Hudson <hudsonr at us.ibm.com>, oslc-core-bounces at open-services.net
> Date: 08/22/2011 06:20 PM
> Subject: Re: [oslc-core] OSLC Compact representation, titles with markup
> Sent by: oslc-core-bounces at open-services.net
> 
> The big concern to me is not the ability to process the RDF/XML with
> XPath, it's the ability to do so in a browser environment. Currently all
> implementations of all rich hovers in all Jazz based products encode any
> html tags in their dcterms:title attributes (and doubly encode special
> characters). For the consumer on the browser side, this means simply
> taking the content of the attribute, decoding it (which browsers are very
> good at) and slapping the result into the dom (which browsers are also
> very good at).
> 
> The alternative would be a total consumability nightmare from the point of
> view of a browser (which is the most important consumer of this entire
> spec). If the tags are actually child nodes in the xml representation, it
> means we will have child elements in the resulting document that we get
> back from the xml http request which means we will have to traverse a dom
> tree and recreate a structure which could easily be represented as an
> escaped string, like everyone is doing today.
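
The difference between the two representations can be sketched in Python with the standard library's ElementTree, standing in for the browser's XML DOM. This is only an illustration under assumptions: the two sample documents are simplified (no rdf:RDF wrapper), and the helper names `title_text` and `title_markup` are made up for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "oslc": "http://open-services.net/ns/core#",
    "dcterms": "http://purl.org/dc/terms/",
}
XMLNS = ('xmlns:oslc="http://open-services.net/ns/core#" '
         'xmlns:dcterms="http://purl.org/dc/terms/"')

# Markup escaped inside the title: the element has text content only.
ESCAPED = (f"<oslc:Compact {XMLNS}><dcterms:title>12345: "
           "&lt;s&gt;Null pointer exception during startup&lt;/s&gt;"
           "</dcterms:title></oslc:Compact>")

# Markup as real child elements, as an XML literal would carry it.
LITERAL = (f"<oslc:Compact {XMLNS}><dcterms:title>12345: "
           "<s>Null pointer exception during startup</s>"
           "</dcterms:title></oslc:Compact>")


def title_text(doc):
    """Read only the leading text node of dcterms:title."""
    return ET.fromstring(doc).find("dcterms:title", NS).text


def title_markup(doc):
    """Rebuild the markup string by walking and re-serializing child elements."""
    title = ET.fromstring(doc).find("dcterms:title", NS)
    parts = [title.text or ""]
    for child in title:
        # tostring() includes the child's tail text, so nothing is lost.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


print(title_text(ESCAPED))    # one decode step yields the full markup string
print(title_text(LITERAL))    # only the text before the first child element
print(title_markup(LITERAL))  # needs a tree walk to recover the same string
```

With the escaped form a single text read returns the markup ready to hand to a renderer, while the child-element form needs the extra walk in `title_markup`.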
> 
> I realize that implementation is not supposed to lead the spec, but I
> don't even think that would be the case here. The oslc compact spec grew
> organically out of the old jazz compact rendering spec which can be found
> here:
> 
> https://jazz.net/wiki/bin/view/Sandbox/CompactRenderingV1P1 
> 
> If we look at the semantic description of the dc:title and jp:abbreviation
> it states explicitly that the content MUST be escaped:
> 
> > The HTML markup MUST be escaped; for example, "<b>" as "&lt;b&gt;".
> 
> This decision was made consciously for very well defined technical reasons
> (discussed above) in the original spec. If that decision was reversed in
> the creation of the OSLC compact spec then I believe that to have been a
> huge mistake and would like to see the spec fixed rather than for all
> providers to have to change how their compact documents are served and all
> consumers to have to go to the trouble of walking the dom to determine
> what the provider is actually trying to show.
> 
> Adam Archer
> Jazz Developer
> IBM Toronto Lab 
> 
> 
> 
> From: Arthur Ryman/Toronto/IBM 
> To: Samuel Padgett <spadgett at us.ibm.com> 
> Cc: Adam Archer/Toronto/IBM at IBMCA, Randy Hudson <hudsonr at us.ibm.com>, 
> "oslc-core at open-services.net" <oslc-core at open-services.net>, 
> oslc-core-bounces at open-services.net 
> Date: 08/22/2011 04:40 PM 
> Subject: Re: [oslc-core] OSLC Compact representation, titles with markup 

> 
> 
> Sam, 
> 
> You wrote: 
> 
> It's very difficult to parse the former using XPath. For instance, the
> expression "/oslc:Compact/dcterms:title" takes out the "<s>" and "</s>"
> 
> I don't think problems using XPath are a valid reason to encode markup
> since RDF/XML itself is very difficult to process using XPath. At one
> point we tried to define an OSLC-variant of RDF/XML that looked like
> "normal" XML. However, we abandoned that and now require support for
> generic RDF/XML.
> 
> There are many equivalent ways to represent a given set of triples in
> RDF/XML. It would therefore be very problematic to use XPath, XSLT, or
> XQuery to process RDF/XML. The safe way to process RDF/XML is to use an
> RDF toolkit like Jena.
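
As a rough illustration of that point, the Python sketch below serializes the same two triples (an rdf:type of oslc:Compact and a dcterms:title) in two equivalent RDF/XML forms and applies one fixed path expression to both. The resource URI, the sample documents, and the helper name `find_title` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "oslc": "http://open-services.net/ns/core#",
    "dcterms": "http://purl.org/dc/terms/",
}
XMLNS = " ".join(f'xmlns:{prefix}="{uri}"' for prefix, uri in NS.items())

# Form 1: typed node element (the shape most examples show).
TYPED_NODE = f"""<rdf:RDF {XMLNS}>
  <oslc:Compact rdf:about="http://example.com/bugs/12345">
    <dcterms:title>12345: Null pointer exception during startup</dcterms:title>
  </oslc:Compact>
</rdf:RDF>"""

# Form 2: same triples, with the type stated as an explicit rdf:type property.
DESCRIPTION = f"""<rdf:RDF {XMLNS}>
  <rdf:Description rdf:about="http://example.com/bugs/12345">
    <rdf:type rdf:resource="http://open-services.net/ns/core#Compact"/>
    <dcterms:title>12345: Null pointer exception during startup</dcterms:title>
  </rdf:Description>
</rdf:RDF>"""

PATH = "oslc:Compact/dcterms:title"


def find_title(doc):
    """Apply the fixed path; return the title text or None when nothing matches."""
    node = ET.fromstring(doc).find(PATH, NS)
    return None if node is None else node.text


print(find_title(TYPED_NODE))   # the path matches this serialization
print(find_title(DESCRIPTION))  # same triples, but the path finds nothing
```

An RDF parser would report identical triples for both documents, which is why a path written against one serialization is fragile.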
> 
> Regards, 
> 
> Arthur Ryman
> DE, PPM Chief Architect
> IBM Software, Rational
> Toronto Lab | +1-905-413-3077
> 
> 
> From: Samuel Padgett <spadgett at us.ibm.com>
> To: "oslc-core at open-services.net" <oslc-core at open-services.net>
> Cc: Adam Archer/Toronto/IBM at IBMCA, Randy Hudson <hudsonr at us.ibm.com>
> Date: 08/07/2011 01:01 PM
> Subject: [oslc-core] OSLC Compact representation, titles with markup
> Sent by: oslc-core-bounces at open-services.net
> 
> I believe the spec is a bit confusing when it comes to titles with markup
> for UI Preview.
> 
> The Compact representation has a dcterms:title property. It's defined as
> an XML Literal that can contain XHTML markup [1]. My understanding of XML
> Literals as discussed in the RDF Primer [2] means a title with markup
> would look like this,
> 
> <dcterms:title>12345: <s>Null pointer exception during
> startup</s></dcterms:title>
> 
> The example [3] of this resource has a title like this, however,
> 
> <dcterms:title> 12345: &lt;s&gt;Null pointer exception during
> startup&lt;/s&gt; </dcterms:title>
> 
> The example doesn't seem to fit with the description.
> 
> It's very difficult to parse the former using XPath. For instance, the
> expression "/oslc:Compact/dcterms:title" takes out the "<s>" and "</s>".
> Most implementations I'm aware of also follow the example where markup is
> encoded. It means special characters need to be "double encoded." For
> instance, "12345: Values > 1000 incorrectly calculated" would be,
> 
> <dcterms:title>12345: Values &amp;gt; 1000 incorrectly
> calculated</dcterms:title>
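
A small Python sketch of the double encoding described above, assuming the title starts out as plain text: it is escaped once to become an HTML fragment and once more to become XML character data. The function name `encode_plain_title` is hypothetical, and `xml.sax.saxutils.escape` stands in for whatever escaping routine a provider actually uses.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

DCTERMS = "http://purl.org/dc/terms/"


def encode_plain_title(plain_text):
    """Return the character data to place inside <dcterms:title>."""
    html_fragment = escape(plain_text)  # plain text -> HTML: '>' becomes '&gt;'
    return escape(html_fragment)        # HTML -> XML data: '&gt;' becomes '&amp;gt;'


title = "12345: Values > 1000 incorrectly calculated"
encoded = encode_plain_title(title)
print(encoded)

# A consumer's XML parser undoes only the outer layer, leaving an HTML fragment.
element = ET.fromstring(
    f'<dcterms:title xmlns:dcterms="{DCTERMS}">{encoded}</dcterms:title>'
)
html_fragment = element.text
print(html_fragment)
```

The value the consumer reads back is still HTML-escaped, so it can be inserted as markup without the ">" being misread.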
> 
> I think we should add more clarity to the spec here, as getting this wrong
> can open up consumers to cross-site scripting attacks. I'd also suggest we
> say that providers MUST NOT use any markup with a <script> tag and
> consumers MUST NOT display any markup with a <script> tag to guard against
> this problem.
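
One way a consumer might check for the proposed rule is sketched below in Python, using the standard library's `html.parser`. The class and function names are illustrative assumptions. The check covers only the `<script>` tag named in the proposal; it does not catch other script vectors such as event-handler attributes or `javascript:` URLs, so it is not a complete cross-site scripting defense.

```python
from html.parser import HTMLParser


class ScriptDetector(HTMLParser):
    """Records whether any <script> start tag appears in the fed markup."""

    def __init__(self):
        super().__init__()
        self.found_script = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <SCRIPT> is caught as well.
        if tag == "script":
            self.found_script = True


def contains_script(html_fragment):
    """Return True if the decoded title markup contains a <script> tag."""
    detector = ScriptDetector()
    detector.feed(html_fragment)
    detector.close()
    return detector.found_script


print(contains_script("12345: <s>Null pointer exception during startup</s>"))
print(contains_script("12345: <script>alert(1)</script>"))
```

The function expects the title after the XML layer has been decoded, that is, the HTML fragment a consumer would otherwise display.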
> 
> Best Regards,
> Sam
> 
> 
> [1] http://open-services.net/bin/view/Main/OslcCoreUiPreview?sortcol=table;up=#Representation_Compact
> [2] http://www.w3.org/TR/rdf-syntax/#xmlliterals
> [3] http://open-services.net/bin/view/Main/OslcCoreUiPreview?sortcol=table;up=#XML_Representation_Format
> 
> 
> 
> 
> 
> _______________________________________________
> Oslc-Core mailing list
> Oslc-Core at open-services.net
> http://open-services.net/mailman/listinfo/oslc-core_open-services.net




