[oslc-core] XHTML vs simple text in OSLC Core's common properties
Arthur Ryman
ryman at ca.ibm.com
Thu Nov 24 16:40:29 EST 2011
John,
There are several issues here.
1. How To Represent Formatted Text
Many applications require the ability to use formatted text as the value
of properties. This is especially true in the Requirements space where
people often use textual formatting to signify meaning. Many formats are
used and have their advocates, e.g. RTF, HTML, and wiki text. In order to
promote interchange, we decided to use one format for formatted text,
namely XHTML. The reasons for choosing XHTML are that it's a W3C standard
(like RDF), it's relatively easy to parse (unlike HTML), and it can be
conveniently represented in RDF using the XMLLiteral datatype.
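
To illustrate the XMLLiteral representation, here is a minimal Python sketch using only the standard library. In RDF/XML, an XML literal is written with rdf:parseType="Literal". The resource URI http://example.com/requirements/1 and the sample wording are invented for the example and do not come from any OSLC specification.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical resource; the URI and the wording are illustrative only.
RDF_XML = f"""\
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dcterms="{DCTERMS_NS}">
  <rdf:Description rdf:about="http://example.com/requirements/1">
    <dcterms:description rdf:parseType="Literal"><div xmlns="{XHTML_NS}">The system <b>must</b> log every failure.</div></dcterms:description>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(RDF_XML)

# ElementTree addresses namespaced names as "{namespace}local-name".
description = root.find(f".//{{{DCTERMS_NS}}}description")
parse_type = description.get(f"{{{RDF_NS}}}parseType")

# The XHTML markup is ordinary XML nested inside the property element.
div = description.find(f"{{{XHTML_NS}}}div")
emphasized = [b.text for b in div.iter(f"{{{XHTML_NS}}}b")]

print(parse_type, emphasized)
```

Because the value is well-formed XML, the same parser that reads the RDF/XML also reads the formatted text, which is the convenience referred to above.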
2. <div> versus <span>
Some properties, like dcterms:description, are intended to contain
multi-line formatted text. They should contain valid <div> content. Other
properties, like dcterms:title, are intended to contain single-line text.
They should contain valid <span> content. The OSLC wiki unfortunately has
some errors caused by careless copy and paste. I've reported this and
Steve has committed to fixing the errors.
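
The <div>/<span> distinction can be approximated in code. The following Python sketch treats a value as single-line text if it contains no block-level elements. The names BLOCK_ELEMENTS and fits_in_span are hypothetical, and the element list is illustrative rather than exhaustive; a real check would validate against the XHTML DTD or schema.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative, not exhaustive, set of XHTML block-level elements that
# are not allowed inside a <span>.
BLOCK_ELEMENTS = {
    "div", "p", "ul", "ol", "li", "table", "pre", "blockquote",
    "h1", "h2", "h3", "h4", "h5", "h6", "hr",
}


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def fits_in_span(fragment: str) -> bool:
    """Return True if the fragment contains no block-level elements."""
    # Wrap the fragment so bare text and multiple top-level nodes parse.
    root = ET.fromstring(f'<span xmlns="{XHTML_NS}">{fragment}</span>')
    return all(local_name(el.tag) not in BLOCK_ELEMENTS for el in root.iter())


print(fits_in_span("Fix the <b>login</b> page"))
print(fits_in_span("<p>First paragraph</p><p>Second paragraph</p>"))
```

Under these assumptions the first value would be acceptable for dcterms:title, while the second is only suitable for a multi-line property such as dcterms:description.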
3. Formatted versus Plain Text
Plain text is a subset of formatted text. However, if the plain text
contains special XML characters, they need to be replaced by character
entities when used in XMLLiteral values.
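
As a small illustration of that replacement, the Python standard library already provides the two directions. The wrapper names to_xml_text and from_xml_text and the sample title are made up for this sketch.

```python
from xml.sax.saxutils import escape, unescape


def to_xml_text(plain: str) -> str:
    """Replace &, < and > with entities so the text is safe inside XML."""
    return escape(plain)


def from_xml_text(escaped: str) -> str:
    """Reverse to_xml_text, turning the entities back into characters."""
    return unescape(escaped)


title = "Reject if size < 5 & retries > 2"
encoded = to_xml_text(title)
decoded = from_xml_text(encoded)

print(encoded)
print(decoded == title)
```

An XML parser performs the unescaping step automatically when it reads element content, so the explicit call is only needed by code that handles the serialized text directly.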
An application that only handles plain text can easily escape and unescape
the special characters. The problem comes when a client POSTs or PUTs
formatted text. In this case, I think it is acceptable to discard the
markup if the loss of the formatting is not harmful. If the app does not
natively support formatting then it's hard to see where discarding
formatting would be harmful. Discarding formatting can easily be done
using standard XML parsing libs.
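
One way to discard the markup, sketched here in Python with the standard xml.etree.ElementTree module. The function name strip_markup is hypothetical, and collapsing whitespace is a choice made for this example, not something the specification asks for.

```python
import xml.etree.ElementTree as ET


def strip_markup(xhtml_fragment: str) -> str:
    """Return only the text of an XHTML fragment, dropping all elements."""
    # Wrap the fragment so bare text and multiple top-level nodes parse.
    root = ET.fromstring(f"<wrapper>{xhtml_fragment}</wrapper>")
    # itertext() yields the text of every node in document order.
    text = "".join(root.itertext())
    # Collapse runs of whitespace (including line breaks) to single spaces.
    return " ".join(text.split())


incoming = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    "Fix the <b>login</b> page &amp; retest</div>"
)
print(strip_markup(incoming))
```

This sketch has limits: adjacent block elements such as two <p> paragraphs run together without a separator, and input that is not well-formed XML (for example, an HTML named entity like &nbsp; with no DTD) raises a parse error, which the server would have to turn into a rejected request.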
The inability to handle formatting is like other inherent limitations in
the app. For example, there may be limits to string length, integer size,
or precision of floating point numbers. The app has to decide which types
of truncation are harmless and which might cause harm. If the truncation
is harmful then the app should reject the request and provide some useful
error message.
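
The decision described above could be expressed as a per-property policy. This Python sketch is only an illustration: the RejectedValue exception, the LIMITS table, and the specific length limits are all assumptions, not values taken from any OSLC specification or product.

```python
class RejectedValue(ValueError):
    """Raised when accepting a value would lose information that matters."""


# Hypothetical policy: property -> (maximum length, is truncation harmless?)
LIMITS = {
    "dcterms:description": (4000, True),
    "dcterms:title": (255, False),
}


def accept(prop: str, value: str) -> str:
    """Return the value to store, or raise RejectedValue with a useful message."""
    max_len, truncation_ok = LIMITS[prop]
    if len(value) <= max_len:
        return value
    if truncation_ok:
        # The app has decided that losing the tail of this property is harmless.
        return value[:max_len]
    raise RejectedValue(
        f"{prop} is {len(value)} characters long; "
        f"this server stores at most {max_len}"
    )


print(len(accept("dcterms:description", "x" * 5000)))
try:
    accept("dcterms:title", "x" * 300)
except RejectedValue as err:
    print(err)
```

A server would translate RejectedValue into an HTTP error response whose body carries the message, so that the client learns why the request was refused.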
Regards,
___________________________________________________________________________
Arthur Ryman
DE, PPM & Reporting Chief Architect
IBM Software, Rational
Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)
From: John Arwe <johnarwe at us.ibm.com>
To: oslc-core at open-services.net
Cc: Joe Ross <joeross at us.ibm.com>, Anamitra Bhattacharyya <abhattacharyya at us.ibm.com>, Robert Uthe <uthe at us.ibm.com>, Ken Parzygnat <kparzygn at us.ibm.com>
Date: 11/15/2011 05:40 PM
Subject: [oslc-core] XHTML vs simple text in OSLC Core's common properties
Sent by: oslc-core-bounces at open-services.net
A subset of common properties has value-type = XMLLiteral and a
description that says the content [should] include valid
XHTML-within-<div> [1]; description and title are examples.
Most of Tivoli's existing product set expects "just straight strings" (no
rich text/XHTML permitted; markup would be treated as strings, i.e. not
recognized as markup). I'm trying to avoid inventing anything new while still
enabling existing products that would not, as servers processing a
POST(Create)/PUT/PATCH, tolerate XHTML coming in. I see oslc:name (within
Resource Shapes) which is potentially re-usable (RS is part of Common as a
resource definition, but oslc:name is not listed in common properties).
I would be interested in thoughts from Core on how best to accomplish the
goal of enabling apps not yet ready to change to accept XHTML as described
above.
(1) One possibility would be to define a Common Property whose value is
simply "String"; one might re-use oslc:name for that purpose (to avoid
defining a new property) or simply define a new one.
(2) I wonder out loud about an alternative of using the existing common
properties with an explicit type of ^^String on the [RDF/XML at least...
:-( want JSON too though] serialized representations. But I find that not
so appealing, since (at a bare minimum) it imposes requirements on client
implementations to use serialization rules more restrictive than those
defined in Core in order for one of my servers to accept the data.
(3) Become even more of a language lawyer than usual and note that the
existing descriptions of the relevant properties use a conditional
(SHOULD), and that the domain specs (like CM 2.0) only impose normative
requirements on implementations (not representations). So compliant
service providers (should? must? not clear!) tolerate sans-XHTML values
(good for me), and compliant clients (should, by my reading) provide
with-XHTML values which my providers (may? should? must? not clear!)
accept... I choose the "should" reading, but do not implement that, so my
provider is compliant but less useful than ones that would accept
with-XHTML values. The reason I assert "not clear!" is: specs like CM 2.0
[2] based on Core say "OSLC CM consumers and service providers MUST be
compliant with both the core specification and this CM specification, and
SHOULD follow all the guidelines and recommendations in both these
specifications." I.e. they talk about compliance only in terms of
consumers and providers, not resources. In a case like [1]'s
oslc:shortTitle, whose entire description is "Shorter form of
dcterms:title for the resource represented as rich text in XHTML content.
SHOULD include only content that is valid inside an XHTML <div> element.",
it is left to the reader to decide the effects of the SHOULD. There is
no clear statement of responsibilities for service providers or consumers.
While my reading would be that compliant service providers MUST tolerate
sans-XHTML values (good for me), compliant clients should provide
with-XHTML values, and compliant service providers MUST accept with-XHTML
values, if my evil twin read the last MUST as a SHOULD and challenged me
to show which normative statement was violated then I would be hard
pressed to find one. If I change my goal to practical interop rather than
trying to minimize the cost of shoe-horning my existing implementation
within the letter of the spec, I have a reasonable case to argue for MUST.
[Aside and fair disclosure: [1] does in at least one place appear to
attempt to place normative restrictions on a resource - foaf:person. But
I find no place in Core that defines compliance, so we revert to domain
specs like CM 2.0 and the identical problem.]
(4) Clarify the meaning of "... SHOULD include only content that is valid
inside an XHTML <div> element." with respect to implementations, and then
see where I stand. The preceding seems ample evidence that the current
text is ambiguous.
(5) Define one or more extension properties that lack the XHTML restriction
and use those until my implementations learn to recognize XHTML as markup when
present. Which, in the case of a CM 2.0 ChangeRequest, means that it
would be a gating factor in becoming compliant (dcterms:title = 1:1) as a
service provider.
(6) Accept the with-XHTML values but do not render them in my UI. Seems
within my power at least, although not perfect. The horse/water meme :-)
One could draw the conclusion that OSLC assumes a Web-based UI when it requires
these XHTML-enabled fields. Is that an explicit intent of OSLC? If it is
UNintentional, 1:1 on XHTML-enabled strings would appear to be an
anti-pattern. Requiring a value and encouraging that value to contain
XHTML but then saying "well you don't have to display them ever" seems
incoherent - if they're not for display, why XHTML?
[1] http://open-services.net/bin/view/Main/OSLCCoreSpecAppendixA?sortcol=table;table=up#OSLC_Properties
[2] http://open-services.net/bin/view/Main/CmSpecificationV2?sortcol=table;table=up#Compliance
Best Regards, John
Voice US 845-435-9470 BluePages
Tivoli OSLC Lead - Show me the Scenario