-- IanGreen - 05 Oct 2009

Action taken at the bi-weekly meeting on this date: investigate Matt's performance needs, and a possible solution using paging of results.

Statement of concern:

Providers and consumers will suffer poor time/space performance when processing a RequirementCollection resource which contains very many requirements.

How a large collection can cause poor performance:

  1. the resource representation grows in size with the number of requirements it contains. a collection with a large number of requirements would be a large XML document.
    1. mitigation - don't put anything more than the URI of the requirement in the collection (this way, there are lots of things, but they are small). (At the time of writing this note, the specification does just that.)
  2. in certain scenarios the client will GET a collection, then iterate over the URIs, GETting each requirement resource. this requires N+1 GETs (1 for the collection and 1 for each of the N requirements), which incurs a high time penalty even on very low latency networks. it might be better in such scenarios to provide all the requirements "in one go", perhaps by inlining the requirements into the collection.
    1. contra- the web already deals with this case, at least to some extent. a caching browser or HTTP proxy will cache each of these requirement representations, so the need for "in one go" is lessened. indeed, "in one go" can hurt performance in some cases because it bypasses these per-requirement caches.
    2. contra-contra- this breaks down for representations which are declared as uncacheable by the originating server. the client then needs a round trip per requirement (at best a conditional GET), and so the latency problem comes back.
    3. contra- the inlined collection might be quite large, so space performance may be a concern.
    4. if the inlined collection is changing frequently, clients will need to repeatedly GET this large resource.
    5. we could allow inlining at the request of the client (say, GET ../requirementcollection24?inline=dc:description, or ...?inline=*) so that selected attributes are inlined in the collection representation. the client is then taking responsibility for dealing with these state management & caching issues.
  3. another approach is to allow paging (e.g., ATOM paging, RFC5005) so that large requirementcollections are retrieved in chunks. such an approach has been adopted for query results by both the CM and QM specs.
    1. contra- it can be tricky for implementations to selectively expire pages based on the changing parts of a collection. i'm not aware that this is done in practice.
    2. contra- a requirementcollection is different from a query result set. whilst it makes sense to abandon paging through a query result set (perhaps the user grows impatient), doing so on a requirementcollection seems less likely. there is a quality of atomicity about a requirementcollection. there are also consistency issues in paging through a requirementcollection. the spec would need to consider what happens when paging through a collection whose contents are changing (requirements added or removed). RFC5005 does not address this concern directly (the next/prev URIs could expire in such cases, but in highly concurrent environments this would make the ATOM paging useless). It might be acceptable to repeat a requirement, but omitting one could be unacceptable.
  4. It is accepted (by img) that paging makes good sense on query results, and we would need a paging mechanism when query is introduced.
    1. The RFC5005 spec is a custom XML vocabulary and is not RDF/XML. The RSS 1.0 spec is RDF, but doesn't seem to include pagination.
    2. We would look into this when query is admitted into the specification.
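
The consistency concern raised under paging (repeat versus omit) can be illustrated with a minimal sketch. It assumes the server produces pages by slicing an ordered list at a fixed offset, which is only one possible implementation and is not prescribed by RFC5005. The function names, the page size and the requirement identifiers are all hypothetical.

```python
# Hypothetical model: a collection is an ordered list of requirement URIs,
# and the server slices it by offset to produce pages (an assumption).
PAGE_SIZE = 2

def get_page(collection, page):
    """Return the URIs on the given zero-based page of the collection."""
    start = page * PAGE_SIZE
    return collection[start:start + PAGE_SIZE]

def traverse(collection, mutate_after_first_page):
    """Page through the collection, applying one change after page 0."""
    seen = []
    page = 0
    while True:
        uris = get_page(collection, page)
        if not uris:
            break
        seen.extend(uris)
        if page == 0:
            # Simulates another client changing the collection mid-traversal.
            mutate_after_first_page(collection)
        page += 1
    return seen

original = ["req1", "req2", "req3", "req4"]

# A requirement already seen is removed: later items shift down a slot.
after_removal = traverse(list(original), lambda c: c.remove("req1"))

# A requirement is inserted at the front: later items shift up a slot.
after_insert = traverse(list(original), lambda c: c.insert(0, "req0"))

print(after_removal)
print(after_insert)
```

Under these assumptions the first traversal never sees req3 (an omission), while the second sees req2 twice (a repeat) and never sees the newly added req0.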

Recommendation:

We discuss inlining resources in collections (requirementcollection, linkcollection), and we document the downsides for clients that request inlined representations. We continue to expose non-inlined collections so that clients can take advantage of standard web architecture infrastructure such as caching.
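
As a rough illustration of the trade-off, the sketch below builds a client-requested inlining URL and counts the GETs each approach needs. The `inline` query parameter is only the suggestion floated in these notes, not part of any specification. The host name, the function names and the comma-separated form for several properties are assumptions made for this example.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

def inline_url(collection_url, properties):
    """Append the hypothetical ?inline= parameter to a collection URL.

    Joining several properties with commas is an assumption; the notes
    only show a single property or the wildcard '*'.
    """
    parts = urlsplit(collection_url)
    # Keep ':', ',' and '*' unescaped so the URL matches the form in the notes.
    query = urlencode({"inline": ",".join(properties)}, safe="*:,")
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

def request_count(n_requirements, inlined):
    """GETs needed to obtain every requirement in a collection."""
    # Non-inlined: 1 GET for the collection plus 1 per requirement (N+1).
    return 1 if inlined else n_requirements + 1

base = "http://example.com/requirementcollection24"  # placeholder host
print(inline_url(base, ["dc:description"]))
print(inline_url(base, ["*"]))
print(request_count(100, inlined=False), request_count(100, inlined=True))
```

The request count ignores caching: with a warm cache the non-inlined case may need far fewer round trips, which is the argument for continuing to expose non-inlined collections.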

Topic revision: r2 - 05 Oct 2009 - 19:02:07 - IanGreen
 