[Oslc-Automation] Thoughts on teardown scenarios - increased number of resources & length of persistence

Thu May 23 09:20:36 EDT 2013

There is another issue with modelling other actions with the 
plan/request/result model.

Charles, you said "I think there are exactly 3 plans here...  Thus, it 
doesn't actually scale out based on the number of VM Instances. ...the 
plans are likely to not exist as real resources, but rather OSLC 
Automation facades to existing functionality".

However, the number of Automation Requests and Automation Results would 
scale out based on the number of VM instances. While these might not need 
to exist for as long as the plans, they still need to be available for 
some amount of time.

For example, the time needed to tear down a VM instance might be unknown. 
It could range from less than a second up to 5 or 10 minutes, depending on 
what is running on that VM and how carefully it (or its dependencies) 
needs to be torn down, and the automation provider might not know this 
value either. If the request and result were no longer available once the 
teardown had finished, the consumer could receive an HTTP 404 "not found" 
error when subsequently requesting the Automation Request, and no results 
when querying for the Automation Result. Is that enough to safely infer 
that the action completed successfully and that the resource was torn 
down? If a failed teardown would result in ongoing costs building up (e.g. 
per-minute costs for running a VM), and such a failure needs to be flagged 
promptly to a human user to deal with, I do not think the consumer could 
safely ignore such a response from the provider without possibly missing 
an error case that the human user would need to look into.

On the other hand, if the resources are persisted for any length of time 
beyond the completion of the action, then the claim that the resources 
"are just generated (or responded to) on the fly" is no longer true. They 
need to be persisted, perhaps for longer than they would need to be in the 
provider's native model (if the native interactions were synchronous).
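
To make the ambiguity concrete, here is a minimal sketch of how a consumer 
might classify what it sees on one polling round. The function name, the 
outcome labels and the "passed"/"failed" verdict strings are all 
hypothetical placeholders for this sketch; none of them come from the OSLC 
Automation spec.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    IN_PROGRESS = "in progress"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    UNKNOWN = "unknown"  # cannot be treated as success; flag to a human


def classify_poll(request_status: int, result_verdict: Optional[str]) -> Outcome:
    """Classify one polling round (illustrative only).

    request_status: HTTP status from a GET on the Automation Request URI.
    result_verdict: verdict found when querying for the Automation Result,
                    or None if the query returned nothing.
    """
    if result_verdict == "passed":
        return Outcome.SUCCEEDED
    if result_verdict == "failed":
        return Outcome.FAILED
    if request_status == 200:
        # The request still exists but no verdict is visible yet.
        return Outcome.IN_PROGRESS
    # e.g. 404 on the request and an empty result query: the teardown may
    # have succeeded, failed, or never started. The consumer cannot tell.
    return Outcome.UNKNOWN


vanished = classify_poll(404, None)
```

In this sketch the 404-plus-empty-query case maps to UNKNOWN rather than 
SUCCEEDED, which is the case I think a consumer could not safely ignore.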

If the action is performed very quickly, then it might have finished, and 
its result been removed, before the consumer even knows the result's URI. 
This is especially likely if the request was created from a delegate UI, 
which would mean that the Automation Result cannot be "included" in the 
response.
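
The race can be illustrated with a toy in-memory provider and a simulated 
clock. This is only a sketch under stated assumptions: the class, its 
methods, the retention parameter and the example URI are hypothetical, and 
it is not a real OSLC implementation.

```python
from typing import Dict, Optional, Tuple


class SketchProvider:
    """Toy provider that drops results after a retention window."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        # result URI -> (verdict, completion time on the simulated clock)
        self._results: Dict[str, Tuple[str, float]] = {}

    def complete(self, uri: str, verdict: str, now: float) -> None:
        """Record that the action behind `uri` finished at time `now`."""
        self._results[uri] = (verdict, now)

    def get(self, uri: str, now: float) -> Optional[str]:
        """Return the verdict, or None (think: 404) if absent or expired."""
        entry = self._results.get(uri)
        if entry is None:
            return None
        verdict, finished_at = entry
        if now - finished_at > self.retention_seconds:
            del self._results[uri]
            return None
        return verdict


uri = "http://example.org/results/1"  # hypothetical URI

# No retention: the result is gone by the time the consumer looks it up.
eager = SketchProvider(retention_seconds=0.0)
eager.complete(uri, "passed", now=0.0)
late_lookup = eager.get(uri, now=0.5)

# Some retention: the same lookup still finds the verdict.
patient = SketchProvider(retention_seconds=60.0)
patient.complete(uri, "passed", now=0.0)
timely_lookup = patient.get(uri, now=0.5)
```

With no retention, the lookup half a second after completion already comes 
back empty; with a retention window it returns the verdict, at the cost of 
the provider having to persist it.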

This is perhaps an issue with the words "Providers can persist automation 
results for as long as they deem reasonable" from the spec. When the first 
version of the spec was written, what thoughts were there about the 
problems that might arise from results disappearing before consumers 
expect? Does a 404 (or an empty query result) necessarily mean the action 
has finished? Or could it mean the action hasn't started yet?

Martin

From:   Charles Rankin <rankinc at us.ibm.com>
To:     Stephen Rowles/UK/IBM at IBMGB, 
Cc:     oslc-automation at open-services.net
Date:   22/05/2013 16:40
Subject:        Re: [Oslc-Automation] Thoughts on teardown scenarios
Sent by:        "Oslc-Automation" 
<oslc-automation-bounces at open-services.net>

"Oslc-Automation" <oslc-automation-bounces at open-services.net> wrote on 
05/22/2013 02:52:44 AM:

> From: Stephen Rowles <stephen.rowles at uk.ibm.com> 
> 
> I don't see why Automation resources are (or should be) any 
> different from the other resources defined in OSLC. When you look 
> at, for example, Quality Management, the spec doesn't expect a Test 
> Script to simply be a pointer to another sort of resource that 
> really contains the information needed, it is a representation of 
> that information. 
> 
> I think that Automation resources should be the same, they should be
> representing the information directly not being a pointer to yet 
> another resource. I think this is more in keeping with the way other
> OSLC resources are defined. 

I agree that an Automation resource should represent its resource 
directly, and I think the description I provided is in line with that. 

> If you look at the language as defined in the spec: 
> 
> Automation Plan - Defines the unit of automation which is available 
> for execution. 
> Test Script Resource - Defines a program or list of steps used to 
> conduct a test 
> 
> The definition of both of these resources doesn't give any 
> indication that they are simple pointers to something else (at least
> to my reading). 

My feeling is that the Automation Plan is a definition of the *action* 
that is to be taken, not of the resource on which the action is to be 
taken.  Typical OSLC resources describe some form of "object" (give me a 
touch of latitude here for the sake of an upcoming analogy).  And OSLC 
describes mechanisms to do basic CRUD (Create/Read/Update/Delete) 
operations on them (in OO parlance, OSLC would provide new/delete and 
getter/setter methods).  My view is that the OSLC Automation spec provides 
a means to define arbitrary "functions" or "methods" for OSLC "objects" 
(or "actions" on "resources" if you prefer). 

In the v2 version of the spec, I think we basically worked through the 
mechanics of how to execute/invoke actions in a standardized way.  Now, as 
we look to the v3 version of the spec, we are really starting to 
understand how to apply that mechanism to various tasks and/or domains.   

> Taking the VM example that you defined I can see that having many 
> Automation plans is nice because there is little understanding 
> required about each one. However what if the running instance of the
> VM is something created many times a day, the number of Automation 
> Plans will rapidly get large, consider a VM template that is turned 
> into a real VM 20 times a day (not unreasonable if you have a large 
> scale dynamic provisioning system). 
> 
> If there need to be 3 automation plans for each instance for 
> restart/start/stop that's 60 automation plans every day, this 
> rapidly will get out of hand. 

In the generic provider scenario, I think there are exactly 3 plans here, 
one for each of restart/start/stop.  One of the parameters into the plan 
would be the URL to the VM Instance resource upon which to act.  Thus, it 
doesn't actually scale out based on the number of VM Instances.  For the 
purpose-built provider, I could easily see the same mechanism being used, 
meaning the references to the restart/start/stop plans on the VM Instance 
are pointing to the "generic" versions, and you still pass the VM Instance 
URL as one of the parameters.  And, if it's truly purpose-built, then the 
plans are likely to not exist as real resources, but rather OSLC 
Automation facades to existing functionality.  So, the definitions are 
just generated (or responded to) on the fly. 
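
A rough sketch of that arrangement, purely for illustration: three fixed 
generic plans, with the VM Instance URL passed as a parameter on each 
request. The type and function names and the example.org URLs are 
assumptions made up for this sketch, not anything defined by the spec.

```python
from dataclasses import dataclass
from typing import List

# The three generic plans; their number is fixed regardless of VM count.
GENERIC_PLANS = ("restart", "start", "stop")


@dataclass(frozen=True)
class AutomationRequest:
    plan: str             # which generic plan to run
    vm_instance_url: str  # parameter: the VM Instance to act on


def build_requests(action: str, vm_urls: List[str]) -> List[AutomationRequest]:
    """Create one request per VM Instance against a single generic plan."""
    if action not in GENERIC_PLANS:
        raise ValueError(f"unknown plan: {action}")
    return [AutomationRequest(plan=action, vm_instance_url=u) for u in vm_urls]


# 20 VM Instances in a day (the figure from the quoted example).
vm_urls = [f"http://example.org/vms/{i}" for i in range(20)]
stop_requests = build_requests("stop", vm_urls)
plan_count = len(GENERIC_PLANS)
request_count = len(stop_requests)
```

Here the plan count stays at 3 however many VM Instances exist, while one 
request is created per instance acted upon.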

As an aside, if you take the viewpoint that the Plan/Result *is* the 
resource, I don't understand how you would otherwise account for these 
different actions.  You would invoke the Automation Plan (which would, I 
think, represent the VM Image) for instantiating the VM Instance, with, I 
presume, the Automation Result representing the actual VM Instance.  And, 
I get (I think) that the VM Instance would get deleted when the Automation 
Result goes away.  But, how do I restart/start/stop the instance in this 
scenario?   

Charles Rankin
Rational CTO Team -- Mobile Development Strategy
101/4L-002 T/L 966-2386