[Oslc-Automation] Thoughts on teardown scenarios - increased number of resources & length of persistence
David N Brauneis
brauneis at us.ibm.com
Tue May 28 10:11:52 EDT 2013
Martin,
I do not think that you can rely on a 404 to mean either that the
automation result has not finished yet or that it has not started.
(Actually, I think the result *should* be created when the automation
starts, so that it has a state that tells you where it is in the overall
progress.)
I guess I'm struggling to understand why an Automation Provider would
immediately delete the result (record of the automation run and
success/failure) - what would be the point? Do you have a concrete example
of this type of provider?
In my opinion, a 404 indicates that you have either a bad URL, the item
has already been deleted, or that the Automation Result has not yet been
created.
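As a rough illustration of those three readings, here is a minimal Python
sketch of how a consumer might classify a 404. The function name and the
`seen_before` flag are assumptions made for illustration; nothing like
them is defined by the spec.

```python
def interpret_404(seen_before: bool) -> list[str]:
    """Return the plausible meanings of a 404 for an Automation Result URL.

    seen_before: True if this consumer previously received a successful
    response for the same URL (hypothetical consumer-side bookkeeping).
    """
    if seen_before:
        # The resource existed once, so it has most likely been deleted.
        return ["deleted"]
    # Never seen: the consumer cannot tell these cases apart.
    return ["bad URL", "deleted", "not yet created"]


print(interpret_404(seen_before=False))
```

Without some prior successful response, the 404 alone does not narrow the
list down.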
Regards,
David
____________________________________________________________________
David Brauneis
STSM, Rational Software CTO Office, Advanced Technology & New Product
Incubation
email: brauneis at us.ibm.com | phone: 720-395-5659 | mobile: 919-656-0874
From: Martin P Pain <martinpain at uk.ibm.com>
To: David N Brauneis/Raleigh/IBM at IBMUS,
Cc: oslc-automation at open-services.net, Oslc-Automation
<oslc-automation-bounces at open-services.net>, Charles
Rankin/Austin/IBM at IBMUS
Date: 05/28/2013 04:25 AM
Subject: Re: [Oslc-Automation] Thoughts on teardown scenarios -
increased number of resources & length of persistence
David,
My point about the Automation Results not being available as soon as they
have completed is based on the situation where the provider does not want
to persist the results any longer than necessary. e.g. if it is an adapter
to an API that does not persist information about anything that has
already finished. In this case it would not be incompatible with the spec
to delete the result as soon as it completed. "Providers can persist
automation results for as long as they deem reasonable" does not state a
minimum "reasonable time", so a provider implementation could deem zero
time after completion to be "reasonable".
So, yes, the 404 or empty query set would be when the result was deleted,
but my question is "what about if the result is deleted as soon as it
completes"?
That is, when the OSLC automation resources "are just generated (or
responded to) on the fly" as Charles mentioned, then the deletion of an
Automation Result may not be an active operation - it may happen
implicitly if the data that it is generated from is no longer available
once the automation has completed.
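A minimal sketch of that situation, assuming a hypothetical adapter over a
native API that only knows about jobs that are still running. The job
table, the function name and the simplified "inProgress" state string are
all illustrative assumptions, not part of any real provider.

```python
# Hypothetical native API: it only knows about jobs that are still running.
native_running_jobs = {"job-1": {"progress": 40}}


def get_automation_result(job_id: str) -> tuple[int, dict]:
    """Build an Automation Result on the fly from native job data.

    Returns (HTTP status, body). The adapter itself stores nothing.
    """
    job = native_running_jobs.get(job_id)
    if job is None:
        # No native data left, so there is nothing to generate a result from.
        return 404, {}
    return 200, {"state": "inProgress", "progress": job["progress"]}


# While the job runs, the result can be generated.
print(get_automation_result("job-1")[0])  # 200

# The native API forgets the job when it finishes; nobody deletes the result.
del native_running_jobs["job-1"]
print(get_automation_result("job-1")[0])  # 404
```

The 404 here is a side effect of the native data going away, not an
explicit delete by the provider.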
So I'll reword my question to clarify:
From the writing of the first version of the spec, what thoughts were
there around the problems that might arise from results being deleted
before consumers expect? Does a 404 (or an empty query result) necessarily
mean it has finished? Or could that mean it hasn't started yet?
Martin
From: David N Brauneis <brauneis at us.ibm.com>
To: Martin P Pain/UK/IBM at IBMGB,
Cc: Charles Rankin <rankinc at us.ibm.com>,
oslc-automation at open-services.net, Oslc-Automation
<oslc-automation-bounces at open-services.net>, "Oslc-Automation"
<oslc-automation-bounces at open-services.net>
Date: 23/05/2013 17:57
Subject: Re: [Oslc-Automation] Thoughts on teardown scenarios -
increased number of resources & length of persistence
Martin,
As for the following question from your note on this subject:
> This is perhaps an issue with the words "Providers can persist
> automation results for as long as they deem reasonable" from the spec.
> From the writing of the first version of the spec, what thoughts were
> there around the problems that might arise from results disappearing
> before consumers expect? Does a 404 (or an empty query result) necessarily
> mean it has finished? Or could that mean it hasn't started yet?
The result is not a truly transient resource but a resource that can be
deleted - the request on the other hand is a transient resource and should
not be depended upon. A result is persistent so I'm not sure I understand
why you would think that the fact that it is completed/finished would cause
it to return a 404 or an empty query set - I would more likely expect that
to happen if the result had been deleted.
What I believe this was intended to mean is that a result exists for some
amount of time but not necessarily forever - for example, if the
automation plan is for continuous integration and they occur 10 times an
hour, that would mean 240 results per day (or 1680 per week or 87600 per
year)... keeping all of those results forever would eventually be costly
for most implementations of automation providers, both from a data/disk
usage and performance perspective. Most available automation providers
that we looked at had some ability to remove automation results via either
explicitly removing them or a policy to remove them after something occurs
(time based, number of results based, etc.).
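To make the volumes and the two kinds of policy concrete, here is a small
Python sketch. The `prune` function, its parameters and the sample data are
assumptions for illustration only; they do not describe any particular
provider's retention mechanism.

```python
from datetime import datetime, timedelta

RUNS_PER_HOUR = 10
per_day = RUNS_PER_HOUR * 24   # 240
per_week = per_day * 7         # 1680
per_year = per_day * 365       # 87600


def prune(results, now, max_age=None, max_count=None):
    """Return the results a provider would keep under a simple policy.

    results: list of (finished_at, result_id) tuples.
    max_age: drop results older than this timedelta (time based).
    max_count: keep only the newest N results (count based).
    """
    kept = sorted(results, reverse=True)  # newest first
    if max_age is not None:
        kept = [r for r in kept if now - r[0] <= max_age]
    if max_count is not None:
        kept = kept[:max_count]
    return kept


now = datetime(2013, 5, 23, 12, 0)
results = [(now - timedelta(hours=h), f"result-{h}") for h in range(5)]
print(len(prune(results, now, max_age=timedelta(hours=2))))  # 3
print([rid for _, rid in prune(results, now, max_count=2)])
```

Either policy means a result URL that once worked can later return a 404,
which is the "deleted" case rather than anything to do with completion.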
Regards,
David
From: Martin P Pain <martinpain at uk.ibm.com>
To: Charles Rankin/Austin/IBM at IBMUS,
Cc: oslc-automation at open-services.net, Oslc-Automation
<oslc-automation-bounces at open-services.net>
Date: 05/23/2013 09:21 AM
Subject: Re: [Oslc-Automation] Thoughts on teardown scenarios -
increased number of resources & length of persistence
Sent by: "Oslc-Automation"
<oslc-automation-bounces at open-services.net>
There is another issue with modelling other actions with the
plan/request/result model.
Charles, you said "I think there are exactly 3 plans here... Thus, it
doesn't actually scale out based on the number of VM Instances. ...the
plans are likely to not exist as real resources, but rather OSLC
Automation facades to existing functionality".
However, the number of Automation Requests and Automation Results would
scale out based on the number of VM instances. While these might not need
to exist for as long as the plans, they still need to be available for
some amount of time.
For example, with the teardown of a VM instance there might be cases where
the length of time that that teardown will take is unknown - it could
range from less than a second up to 5 or 10 minutes, depending on what's
running on that VM and how carefully it (or its dependencies) need to be
torn down - and this might be an unknown value to the automation provider.
As such, if the request and result were no longer available once the
teardown had finished, the consumer could receive an HTTP 404 "not found"
error when subsequently requesting the Automation Request, and no results
when querying for the Automation Result. Is that enough to safely infer
that the action completed successfully and that the resource was torn
down? If a failed teardown would
result in ongoing costs building up (e.g. per-minute costs for running a
VM) and such a failure needs to be flagged up promptly to a human user to
deal with, I do not think the consumer could safely ignore such a response
from the provider without possibly missing an error case that the human
user would need to look into.
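A minimal sketch of a consumer that takes this cautious view, assuming a
hypothetical `fetch_status` callable that returns the HTTP status together
with simplified state and verdict strings. The names and return values are
illustrative, and a real consumer would also wait between polls.

```python
def watch_teardown(fetch_status, max_polls=5):
    """Poll an Automation Result and classify the outcome.

    fetch_status: hypothetical callable returning (http_status, state,
    verdict) for the result URL; state and verdict are None on a 404.
    """
    for _ in range(max_polls):
        http_status, state, verdict = fetch_status()
        if http_status == 404:
            # The result vanished (or never appeared): success is NOT implied.
            return "unknown: escalate to a human"
        if state == "complete":
            if verdict == "passed":
                return "torn down"
            return "failed: escalate to a human"
    return "still running"


responses = iter([(200, "inProgress", None), (404, None, None)])
print(watch_teardown(lambda: next(responses)))
```

The only path that reports a successful teardown is one where the consumer
actually saw a completed result with a passing verdict.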
On the other hand if the resources are persisted for any length of time
beyond the completion of the action then the fact that the resources "are
just generated (or responded to) on the fly" is no longer true - they need
to be persisted for perhaps longer than they would need to be in the
provider's native model (if the native interactions were synchronous).
If the action is performed very quickly, then the result might have
finished and been removed before the consumer even knows its URI -
especially if the request was created from a delegate UI, which would mean
that the Automation Result cannot be "included" in the response.
This is perhaps an issue with the words "Providers can persist automation
results for as long as they deem reasonable" from the spec. From the
writing of the first version of the spec, what thoughts were there around
the problems that might arise from results disappearing before consumers
expect? Does a 404 (or an empty query result) necessarily mean it has
finished? Or could that mean it hasn't started yet?
Martin
From: Charles Rankin <rankinc at us.ibm.com>
To: Stephen Rowles/UK/IBM at IBMGB,
Cc: oslc-automation at open-services.net
Date: 22/05/2013 16:40
Subject: Re: [Oslc-Automation] Thoughts on teardown scenarios
Sent by: "Oslc-Automation"
<oslc-automation-bounces at open-services.net>
"Oslc-Automation" <oslc-automation-bounces at open-services.net> wrote on
05/22/2013 02:52:44 AM:
> From: Stephen Rowles <stephen.rowles at uk.ibm.com>
>
> I don't see why Automation resources are (or should be) any
> different from the other resources defined in OSLC. When you look
> at, for example, Quality Management, the spec doesn't expect a Test
> Script to simply be a pointer to another sort of resource that
> really contains the information needed, it is a representation of
> that information.
>
> I think that Automation resources should be the same, they should be
> representing the information directly not being a pointer to yet
> another resource. I think this is more in keeping with the way other
> OSLC resources are defined.
I agree that an Automation resource should represent its resource
directly, and I think the description I provided is in line with that.
> If you look at the language as defined in the spec:
>
> Automation Plan - Defines the unit of automation which is available
> for execution.
> Test Script Resource - Defines a program or list of steps used to
> conduct a test
>
> The definition of both of these resources doesn't give any
> indication that they are simple pointers to something else (at least
> to my reading).
My feeling is that the Automation Plan is a definition of the *action*
that is to be taken, not of the resource on which the action is to be
taken. Typical OSLC resources describe some form of "object" (give me a
touch of latitude here for the sake of an upcoming analogy). And OSLC
describes mechanisms to do basic CRUD (Create/Read/Update/Delete)
operations on them (in OO parlance, OSLC would provide new/delete and
getter/setter methods). My view is that the OSLC Automation spec provides
a means to define arbitrary "functions" or "methods" for OSLC "objects"
(or "actions" on "resources" if you prefer).
In the v2 version of the spec, I think we basically worked through the
mechanics of how to execute/invoke actions in a standardized way. Now, as
we look to the v3 version of the spec, we are really starting to
understand how to apply that mechanism to various tasks and/or domains.
> Taking the VM example that you defined I can see that having many
> Automation plans is nice because there is little understanding
> required about each one. However what if the running instance of the
> VM is something created many times a day, the number of Automation
> Plans will rapidly get large, consider a VM template that is turned
> into a real VM 20 times a day (not unreasonable if you have a large
> scale dynamic provisioning system).
>
> If there needs to be 3 automation plans for each instance for
> restart/start/stop that's 60 automation plans every day, this
> rapidly will get out of hand.
In the generic provider scenario, I think there are exactly 3 plans here,
one for each of restart/start/stop. One of the parameters into the plan
would be the URL to the VM Instance resource upon which to act. Thus, it
doesn't actually scale out based on the number of VM Instances. For the
purpose built provider, I could easily see the same mechanism being used,
meaning the references to the restart/start/stop plans on the VM Instance
are pointing to the "generic" versions, and you still pass the VM Instance
URL as one of the parameters. And, if it's truly purpose built, then the
plans are likely to not exist as real resources, but rather OSLC
Automation facades to existing functionality. So, the definitions are
just generated (or responded to) on the fly.
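A rough sketch of that shape in Python, using plain dictionaries rather
than real RDF. The URLs, the `vmInstance` parameter name and the
`build_request` helper are hypothetical; the dictionary keys only loosely
mirror the Automation Request properties.

```python
# Hypothetical URLs for the three generic plans; none of these are real.
PLANS = {
    "start": "https://provider.example/plans/start",
    "stop": "https://provider.example/plans/stop",
    "restart": "https://provider.example/plans/restart",
}


def build_request(action: str, vm_instance_url: str) -> dict:
    """Build a simplified Automation Request for one action on one VM."""
    return {
        "executesAutomationPlan": PLANS[action],
        # The target VM is an input parameter, not part of the plan.
        "inputParameter": {"name": "vmInstance", "value": vm_instance_url},
    }


vm_requests = [
    build_request("restart", f"https://provider.example/vms/{n}")
    for n in range(20)
]
print(len(PLANS), len(vm_requests))  # 3 20
```

The number of plans stays at three however many VM Instances exist; only
the requests (and their results) grow with the number of instances.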
As an aside, if you take the viewpoint that the Plan/Result *is* the
resource, I don't understand how you would otherwise account for these
different actions. You would invoke the Automation Plan (which would, I
think, represent the VM Image) for instantiating the VM Instance, with, I
presume, the Automation Result representing the actual VM Instance. And,
I get (I think) that the VM Instance would get deleted when the Automation
Result goes away. But, how do I restart/start/stop the instance in this
scenario?
Charles Rankin
Rational CTO Team -- Mobile Development Strategy
101/4L-002 T/L 966-2386
_______________________________________________
Oslc-Automation mailing list
Oslc-Automation at open-services.net
http://open-services.net/mailman/listinfo/oslc-automation_open-services.net
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU