[oslc-core] Need for an XML literal value type

Arthur Ryman ryman at ca.ibm.com
Fri Mar 19 16:27:31 EDT 2010


Dave,

I think we should adopt XHTML as the recommended format for rich text at 
OSLC. I assume the XHTML spec will evolve too (e.g. to add elements from 
HTML 5). The key point is that it is valid XML and is a reasonable 
interchange format for rich text.

I think we should not restrict XML literals to XHTML. 

There may be cases when a property value should be from some other XML 
vocabulary, e.g. a diagram in SVG. I think it's up to the domain to decide 
which vocabulary to use, i.e. OSLC should reuse existing standard XML 
vocabularies where appropriate.

For example, the de facto standard for project plans is Microsoft Project, 
which has an XML syntax. Many programs understand that format so it is a 
good interchange format. We could refer to documents of that format via 
URLs, but in some cases we may prefer to include it in a resource, in 
which case treating it as an XML literal would be appropriate.

Also, I've been designing formats for expressing tabular estimation data. 
Using RDF makes them very verbose, e.g. simple features such as the order 
of elements are not significant in RDF and therefore have to be expressed 
explicitly. 

There is also a deeper issue about the meaning of literal values. In RDF, 
one has to introduce blank nodes to represent complex data values. 
However, two blank nodes in different parts of the graph may not be equal 
even though they are being used to represent the same literal value. This 
means that tests for equality are more complex. This could lead to very 
complex or error-prone queries.

For example, suppose I have a property that gives the location of 
something, e.g. the latitude and longitude of a city. The following design 
represents the latitude and longitude as an XML literal, so when I compare 
two locations I am comparing the entire contents of the <ex:Point> element, 
i.e. the whole thing is a single complex literal value:

<ex:City rdf:about="http://example.com/city/toronto">
        <ex:location rdf:parseType="Literal">
                <ex:Point>
                        <ex:latitude>43</ex:latitude>
                        <ex:longitude>79</ex:longitude>
                </ex:Point>
        </ex:location>
</ex:City>
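
As a rough illustration of what "comparing the whole literal" could look 
like in practice, here is a minimal Python sketch using only the standard 
library. It is not part of any OSLC specification. The namespace URI for 
the ex: prefix, the second city URI and the helper names city_doc and 
location_literal are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.com/ns#"  # hypothetical namespace for the ex: prefix


def city_doc(city_uri, lat, lon):
    """Build an RDF/XML document shaped like the example above."""
    return f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
  <ex:City rdf:about="{city_uri}">
    <ex:location rdf:parseType="Literal">
      <ex:Point>
        <ex:latitude>{lat}</ex:latitude>
        <ex:longitude>{lon}</ex:longitude>
      </ex:Point>
    </ex:location>
  </ex:City>
</rdf:RDF>"""


def location_literal(rdf_xml):
    """Return the content of ex:location as one canonical XML string."""
    root = ET.fromstring(rdf_xml)
    location = root.find(f"{{{EX_NS}}}City/{{{EX_NS}}}location")
    point = location[0]  # the single <ex:Point> child element
    # Canonicalize so that indentation differences do not affect equality.
    return ET.canonicalize(ET.tostring(point, encoding="unicode"),
                           strip_text=True)


toronto = location_literal(city_doc("http://example.com/city/toronto", 43, 79))
other = location_literal(city_doc("http://example.com/city/other", 43, 79))

# The two locations compare equal as single values.
print(toronto == other)
```

The comparison is a plain string equality on the canonical form, so no 
knowledge of the internal structure of <ex:Point> is needed.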

Now consider the pure RDF version where there are no XML literals. Now when I 
compare two ex:location property values I am comparing the bnodes 
associated with the <ex:Point> element (this is the object of the triple). 
Two bnodes could contain exactly the same latitude and longitude but would 
not be equal. Hence a query for finding cities at a given location would 
have to explicitly compare latitude and longitude.

<ex:City rdf:about="http://example.com/city/toronto">
        <ex:location>
                <ex:Point>
                        <ex:latitude>43</ex:latitude>
                        <ex:longitude>79</ex:longitude>
                </ex:Point>
        </ex:location>
</ex:City>
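
The bnode behaviour described above can be sketched with a toy in-memory 
triple set in Python. This is an illustrative model only, not a real RDF 
library. The BNode class, the helper names add_city, value and cities_at, 
and the shortened city identifiers are assumptions invented for this 
example.

```python
import itertools

_labels = itertools.count()


class BNode:
    """Toy blank node: equal only to itself, never by its contents."""

    def __init__(self):
        self.label = f"_:b{next(_labels)}"

    def __repr__(self):
        return self.label


def add_city(graph, city, lat, lon):
    """Add a city whose location is a fresh blank node of type ex:Point."""
    point = BNode()
    graph.add((city, "ex:location", point))
    graph.add((point, "rdf:type", "ex:Point"))
    graph.add((point, "ex:latitude", lat))
    graph.add((point, "ex:longitude", lon))
    return point


def value(graph, subject, predicate):
    """Return the object of the first triple matching subject and predicate."""
    return next(o for s, p, o in graph if s == subject and p == predicate)


def cities_at(graph, lat, lon):
    """Find cities by comparing latitude and longitude explicitly."""
    return sorted(
        s for s, p, o in graph
        if p == "ex:location"
        and value(graph, o, "ex:latitude") == lat
        and value(graph, o, "ex:longitude") == lon
    )


graph = set()
p1 = add_city(graph, "city/toronto", 43, 79)
p2 = add_city(graph, "city/elsewhere", 43, 79)

print(p1 == p2)                  # the two bnodes are distinct
print(cities_at(graph, 43, 79))  # matching requires looking inside each bnode
```

The query has to reach through each blank node and compare both 
coordinates, whereas the literal design compares one value.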

I don't think either design is right or wrong. I think the right design 
depends on what you are trying to accomplish. Therefore we should not 
exclude the literal design.

Regards, 
___________________________________________________________________________ 

Arthur Ryman, PhD, DE


Chief Architect, Project and Portfolio Management

IBM Software, Rational

Markham, ON, Canada | Office: 905-413-3077, Cell: 416-939-5063
Twitter | Facebook | YouTube







From:
Dave <snoopdave at gmail.com>
To:
oslc-core at open-services.net
Date:
03/19/2010 01:27 PM
Subject:
Re: [oslc-core] Need for an XML literal value type
Sent by:
oslc-core-bounces at open-services.net



Arthur, thanks for your patience and good answers. I think I'm convinced.

I do have a couple more questions for the WG:

- Should we offer any guidance to folks who wish to store HTML5 or
HTML 4.01 content as a property value?

- Instead of an "XML Literal" value-type, what if we allowed only
"XHTML Literal"? -- thus allowing literal XML only for rich text.

Thanks,
- Dave



On Fri, Mar 19, 2010 at 12:30 PM, Arthur Ryman <ryman at ca.ibm.com> wrote:
> Dave,
>
> The data model that OSLC is adopting is RDF. We therefore need to define
> how to represent RDF in various formats. There are some native RDF
> formats, like N3 and Turtle - no problem there. For putting RDF in XML we
> use RDF/XML, which is just as much a standard as RDF itself.
>
> In RDF/XML, the way you include XML literal values is via the attribute
> rdf:parseType="Literal" [1]. RDF defines a datatype for literal XML [2].
> This lets us put the angle brackets etc. in the RDF/XML document, and the
> result is a well-formed XML document which is also RDF/XML valid, and we
> can process the literal XML normally.
>
> When we define how to represent RDF as JSON, we need to define how to
> represent the literal values. An XML literal would be encoded in JSON as a
> string, but that string would be valid XML. What problem do you see in
> having a JSON value that contains a valid XML string?
>
> Your proposal to use xsd:string for literal XML would require us to escape
> all the parser-significant XML characters, which hides the fact that the
> content is XML, defeats processing by standard XML tools (e.g. XSLT), and
> would make integration with other RDF data problematic, i.e. how would
> other applications know that xsd:string values need to be unescaped? How
> would we distinguish strings that were plain text (and might contain angle
> brackets) from text that was really escaped XML?
>
> The background for rich text is that many tools allow users to enter rich
> text and include that in their resource representations. This is
> especially important for Requirements tools where people do spend effort
> to highlight text, e.g. in red, to indicate some semantics. The place
> XHTML comes in is as an interchange format, i.e. OSLC resources should use
> XHTML for rich text. Each tool must convert its native format to XHTML
> for purposes of interchange via OSLC resources. Users would not type in
> XHTML directly. They would use editors provided by the tools, and the
> tools would convert it to XHTML for interchange.
>
> In the specific cases of dc:title and dc:description, we should use <span>
> and <div> content respectively. A JSON client that received these literal
> values can simply set or get this as DOM element content via the
> innerHTML property.
>
> I'd be happy to have a telecon with you to discuss this further.
>
> [1] http://www.w3.org/TR/REC-rdf-syntax/#section-Syntax-XML-literals
> [2] http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#dfn-rdf-XMLLiteral
>
> Regards,
> ___________________________________________________________________________
>
> Arthur Ryman, PhD, DE
>
>
> Chief Architect, Project and Portfolio Management
>
> IBM Software, Rational
>
> Markham, ON, Canada | Office: 905-413-3077, Cell: 416-939-5063
> Twitter | Facebook | YouTube
>
>
>
>
>
>
>
> From:
> Dave <snoopdave at gmail.com>
> To:
> Arthur Ryman/Toronto/IBM at IBMCA
> Date:
> 03/19/2010 09:53 AM
> Subject:
> Re: [oslc-core] Need for an XML literal value type
>
>
>
> On Thu, Mar 18, 2010 at 10:40 PM, Arthur Ryman <ryman at ca.ibm.com> wrote:
>> Please explain why you are concerned. Literal XML is a standard feature
>> of RDF.
>
> Thanks for the helpful explanations, Steve, Ian and Arthur.
>
> I guess I still don't understand the need for XML literal in OSLC.
> XHTML can be stored as a string -- so, what are the advantages of
> storing it as literal XML? What specifically does XML literal buy us
> over using a simple string?
>
> I'm concerned here because I'd like to see non-XML representations on
> equal footing and I'd like to encourage folks to express things as
> property values and not as XML constructs. Allowing literal XML could
> open the door to XML content in places other than this XHTML case and
> that could be an issue for JavaScript developers expecting JSON and
> not wanting to ever have to parse XML.
>
> I also had concerns about embedding XML in property values because it
> would result in invalid RDF/XML, but apparently the parse-type literal
> makes this possible. Ian Green's comments seem to contradict this. Do
> we have a definitive answer here?
>
> Thanks,
> - Dave
>
>
>>
>> The pro for XML literal values is that we are using RDF in the way it was
>> designed. OSLC data may be combined with RDF data from other sources so we
>> should adhere to the standard. We do not want to create an OSLC dialect of
>> RDF.
>>
>> For rich text, we should adopt XHTML as the standard interchange format,
>> and we should transfer it in RDF/XML as literal XML, not obfuscate it by
>> turning it into a string.
>>
>> Many development tools capture rich text. Adopting a standard rich text
>> format for OSLC, i.e. XHTML, simplifies processing (e.g. inclusion of rich
>> text in UIs, documents and reports) and interchange (so tools only have to
>> understand one format as opposed to understanding RTF, HTML, etc.). XHTML
>> has a simpler syntax than HTML and is XML compliant so it can be readily
>> processed by many XML technologies.
>>
>> In addition, there are other good reasons for using XML in general as the
>> value of a property, e.g. when there is an existing XML format, or when
>> plain old XML is a more natural way to represent a literal value
>> (otherwise you get an explosion of blank nodes).
>>
>> Regards,
>>
>> ___________________________________________________________________________
>>
>> Arthur Ryman, PhD, DE
>>
>>
>> Chief Architect, Project and Portfolio Management
>>
>> IBM Software, Rational
>>
>> Markham, ON, Canada | Office: 905-413-3077, Cell: 416-939-5063
>> Twitter | Facebook | YouTube
>>
>>
>>
>>
>>
>>
>>
>> From:
>> Dave <snoopdave at gmail.com>
>> To:
>> oslc-core at open-services.net
>> Date:
>> 03/18/2010 09:26 AM
>> Subject:
>> [oslc-core] Need for an XML literal value type
>> Sent by:
>> oslc-core-bounces at open-services.net
>>
>>
>>
>> I'm still a little concerned about adding XML literal as a value type
>> and I'm trying to understand the pros and cons. The only justification
>> that we have so far for adding an XML literal value is for storing
>> XHTML data, which we need for rich text, but we can easily store XHTML
>> data as a string.
>>
>> What specifically do we gain by putting XHTML content in-line in our
>> RDF/XML and Atom XML representations?
>>
>> And conversely, what do we lose by not doing so?
>>
>> Also, does putting XHTML content in-line in RDF/XML result in valid
>> RDF/XML?
>>
>> Thanks,
>> - Dave
>>
>> _______________________________________________
>> Oslc-Core mailing list
>> Oslc-Core at open-services.net
>> http://open-services.net/mailman/listinfo/oslc-core_open-services.net
>>
>>
>>
>>
>
>
>
>

_______________________________________________
Oslc-Core mailing list
Oslc-Core at open-services.net
http://open-services.net/mailman/listinfo/oslc-core_open-services.net






