2014 April 2 | 08:41 pm

Start and stop a workload

A consumer will be able to start and stop workloads using an automation product such as Tivoli System Automation for z/OS (SA z/OS). Consumers, for example Tivoli Workload Scheduler (TWS), can include the service to start and stop workloads within an orchestrated workflow. Both starting and stopping a workload can be complex, long-running flows, since they can involve multiple software components that may be spread across multiple servers and even different platforms. Workloads are Availability Resources whose desired status can be changed through automation.

Since the scenario looks much the same for start and stop, the description that follows uses ‘stop’ as an example. For a workload start scenario, the word ‘stop’ has to be replaced by ‘start’.

As a prerequisite, the consumer has obtained a list of all Availability Resources from the service provider, so that it knows which workloads can be automated and what the status of each workload is. See scenario “Obtain list of workloads” for details. From the observed (= current) status, the consumer can derive whether the Availability Resource is active or inactive. An active Availability Resource doesn’t have to be started again and an inactive Availability Resource doesn’t have to be stopped again. What actually happens when an Availability Resource is started or stopped again even though it is already in its desired status should be an implementation decision left to the service provider.

Generic flow

  1. The consumer is given the service provider to use.
  2. The consumer knows that the Availability Resource is active and therefore can be stopped.
  3. The consumer requests that the service provider stop the Availability Resource.
  4. After the service provider has stopped the Availability Resource, its desired status is unavailable.

Steps

  1. The consumer is given the service provider to use.
  2. The consumer knows that the Availability Resource is active and therefore can be stopped.
  3. The consumer requests an Automation Plan for this Availability Resource from the service provider. Automation Plans are used to change the desired status of an Availability Resource.
  4. The consumer creates an Automation Request for the Automation Plan obtained in the previous step and sets the input parameters such that the Availability Resource will be stopped. The Automation Request supports multiple parameters that can be interpreted by the service provider to further qualify this request. For example, one parameter could be a request priority distinguishing normal, high and force.
  5. Once created, the Automation Request is executed asynchronously by the service provider. The service provider creates an Automation Result and returns it to the consumer.
  6. The consumer periodically polls the Automation Result until the request has been fulfilled.
  7. The consumer queries the observed (= current) status of the Availability Resource from the service provider.
  8. The Availability Resource is in its desired unavailable status.
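
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FakeProvider` is an in-memory stand-in rather than a real service provider API, and all method names, the parameter names `desiredStatus` and `priority`, and the status strings `available`, `inProgress` and `complete` are hypothetical. Only the concepts Automation Plan, Automation Request and Automation Result come from the scenario itself.

```python
from dataclasses import dataclass


@dataclass
class AutomationResult:
    """Hypothetical result record that the consumer polls."""
    resource: str
    desired_status: str
    remaining_polls: int
    state: str = "inProgress"


class FakeProvider:
    """In-memory stand-in for a service provider (not a real API)."""

    def __init__(self, observed, polls_until_done=3):
        self.observed = dict(observed)            # resource name -> observed status
        self.polls_until_done = polls_until_done  # simulated duration of the flow

    def get_automation_plan(self, resource):
        # Step 3: the plan is what the consumer uses to change the desired status.
        return {"resource": resource, "parameters": ("desiredStatus", "priority")}

    def create_automation_request(self, plan, desired_status, priority="normal"):
        # Steps 4-5: the request runs asynchronously; a result is returned at once.
        return AutomationResult(plan["resource"], desired_status, self.polls_until_done)

    def refresh(self, result):
        # Step 6: each poll moves the simulated flow one tick closer to completion.
        if result.state == "inProgress":
            result.remaining_polls -= 1
            if result.remaining_polls <= 0:
                self.observed[result.resource] = result.desired_status
                result.state = "complete"
        return result

    def observed_status(self, resource):
        # Step 7: query the observed (= current) status.
        return self.observed[resource]


def stop_workload(provider, resource, max_polls=10):
    """Run steps 3 to 7 and return the observed status afterwards."""
    plan = provider.get_automation_plan(resource)
    result = provider.create_automation_request(plan, "unavailable")
    for _ in range(max_polls):
        if provider.refresh(result).state == "complete":
            break
    return provider.observed_status(resource)
```

A consumer would call `stop_workload(provider, "payroll")` and compare the returned status with the desired one (step 8). A real consumer would also wait between polls instead of polling in a tight loop.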

Variations

The following variation describes the situation where the workload is already in the desired unavailable status before the Automation Request is created.

  1. The consumer is given the service provider to use.
  2. The Availability Resource that the consumer wants to stop is already stopped. In this situation, no Automation Plan is needed and no further activity is necessary on the consumer’s side.
  3. The Availability Resource is in its desired unavailable status.
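
On the consumer's side, this variation amounts to a guard in front of the request. A minimal sketch, assuming the observed status is available as a string and that `submit_stop` is a hypothetical callable that would create the Automation Request:

```python
def stop_if_needed(observed_status, submit_stop):
    """Submit a stop request only when the resource is not already stopped.

    Returns True if a request was submitted, False if nothing had to be done.
    The status string "unavailable" is an assumed encoding of the inactive state.
    """
    if observed_status == "unavailable":
        # Already in the desired status: no Automation Plan or Request is needed.
        return False
    submit_stop()
    return True
```

Whether a provider accepts or rejects a redundant stop request remains its own implementation decision; the guard only saves the consumer a round trip.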

Another variation is the case where the request hasn’t completed even after polling the Availability Resource’s status multiple times.

  1. See main steps 1 to 5
  2. The consumer detects that the allotted time interval to wait before giving up on automation has expired.
  3. The Availability Resource is not in its desired unavailable status. The consumer can then do a problem analysis starting from this Availability Resource to investigate the reason(s) why the Automation Request could not succeed in time (see future scenario tbd).

Note that the allotted time interval may have been chosen too short, in which case the Automation Request continues to be executed in the background. Another reason why the Automation Request might not complete within its allotted time interval is that other, higher-priority Automation Requests may prevent the Availability Resource’s desired status from being changed.
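
The timeout variation can be sketched as a polling loop with a deadline. This is an illustrative helper, not part of any service provider interface: `is_complete` is a hypothetical callable that checks whether the request has been fulfilled, and the `clock` and `sleep` parameters exist only so that the loop can be exercised without real waiting.

```python
import time


def wait_for_completion(is_complete, timeout_s, interval_s=1.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll until is_complete() returns True or the allotted time has expired.

    Returns True on completion and False on timeout. A False result does not
    cancel anything: the request may still be running in the background.
    """
    deadline = clock() + timeout_s
    while True:
        if is_complete():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

When the function returns False, the consumer cannot tell from the timeout alone whether the interval was too short or whether higher-priority requests are blocking the change, which is why a separate problem analysis is the next step.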