Business Goal
Reduce the mean time to identify the root cause of a performance problem with an application middleware resource.
Technical Goals
- Performance Monitoring service providers will update, in real time, the health and performance of an application’s middleware resource components.
- Performance Monitoring consumers will query the service provider directly for data about the resource, or they may use a yellow pages-style search to find the resource of interest, for example through a central registry that provides lookup services and returns the locations of monitored resources and their providers.
- The consumers of the application’s health are able to dynamically determine the health of the components. Any time a user hovers over the icon representing the middleware resource, an OSLC client will dynamically determine the current providers and federate their data about the resource into a UI preview.
Preconditions
Postconditions
Steps
- An existing resource (e.g. web application server) is used to host an application.
- An application health consumer queries for monitoring information about the resource.
- The application health provider responds with a set of Best Practices metrics that summarize the health of the web application server and its applications.
- The end user visualizes the current health of the application server and its applications through the UI preview and uses that information to determine if the performance problem is a trend or an anomaly.
- Based on their determination, the user opens a ticket, launches into a deep-dive monitoring tool, or runs some automation to quickly resolve a known issue.
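The trend-versus-anomaly judgment in the steps above can be sketched numerically. This is only an illustration, assuming the preview exposes a short series of evenly spaced samples for one metric. The function name `classify_series` and both thresholds are hypothetical and are not part of any monitoring specification.

```python
from statistics import mean, pstdev


def classify_series(samples, slope_threshold=0.5, z_threshold=3.0):
    """Label a metric series as 'trend', 'anomaly', or 'normal'.

    Both thresholds are illustrative assumptions; slope_threshold is
    expressed in metric units per sample interval.
    """
    if len(samples) < 4:
        return "normal"  # too little data to say anything

    # Judge the trend on the history only, so that a single spike in the
    # newest sample cannot masquerade as a sustained slope.
    history, latest = samples[:-1], samples[-1]
    xs = list(range(len(history)))
    x_bar, y_bar = mean(xs), mean(history)

    # Ordinary least-squares slope of the history.
    denom = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history)) / denom
    if abs(slope) >= slope_threshold:
        return "trend"

    # Flat history: compare the newest sample with it using a z-score.
    spread = pstdev(history)
    if spread == 0:
        return "anomaly" if latest != y_bar else "normal"
    z_score = abs(latest - y_bar) / spread
    return "anomaly" if z_score >= z_threshold else "normal"


heap_usage = [100, 110, 121, 130, 142, 150]      # steadily climbing
response_ms = [50, 51, 49, 50, 51, 50, 49, 90]   # one sudden spike
print(classify_series(heap_usage), classify_series(response_ms))
```

In practice the thresholds would need tuning per metric, since a slope that is alarming for heap usage may be routine for request counts.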
Examples
UI Preview Shows | Implication(s)
From the UI preview, the app owner can see that the number of users connecting to a component and doing work is trending up | This might point to a capacity problem under peak usage or point to a tuning problem during normal usage of the application.
The number of outstanding connections between components is trending up over time | This points to a connection leak in the application. The app owner should open a defect against the app.
The heap usage of a software server is trending up over time | This points to a memory leak. The app owner should open a defect against the app.
If the resource is an operating system/computer system, the user is presented with a list of running agents and can see that a monitor that should be running is not running |
The user can see that available disk space is trending down or is simply lower than expected | This points to an application not cleaning up logs or files, or to a capacity problem beginning to form.
The user can see the Top 5 processes in terms of CPU utilization and one is pegging a CPU |
The user can see the Top 5 processes in terms of Memory utilization |
The user can see the database server’s operational status | If non-Active, may require administration.
The user can see that the percentage of used buffer pool is getting close to 100, and/or the number of connection entries is higher than normal | May point to a connection leak in an application or a capacity problem.
The user can see that a reorganization of a table is needed |
The user sees that a particular database has a high number of failed SQL statements | Could point to a poorly written application.
The garbage collection count is high, the percentage of time spent in garbage collection is large, and the amount of free memory available after a garbage collection is decreasing | Indicates a memory constraint, probably a leak in the Web Application Server.
Variations
- This scenario does not preclude other resource types besides applications (e.g. computer systems, network switches, etc.)
Detailed Steps
Assumptions: The consumer and provider have shared knowledge about a target resource. It could be a shared name or shared properties. In the example below, the consumer and provider have shared properties about a resource. The consumer does a yellow pages-style search against a central registry of resource names and locations to find target candidates.
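To make the assumption concrete, the sketch below models the yellow pages-style search as an in-memory lookup by shared properties. The record layout, the property names (`hostname`, `port`), and the example URLs are all hypothetical. A real registry would expose this through its own query interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ResourceRecord:
    """One registry entry: identifying properties plus registered providers."""
    properties: Dict[str, str]
    monitoring_provider_urls: List[str] = field(default_factory=list)


def find_provider_urls(registry: List[ResourceRecord],
                       shared_properties: Dict[str, str]) -> List[str]:
    """Return provider URLs for every record matching all shared properties."""
    urls: List[str] = []
    for record in registry:
        if all(record.properties.get(key) == value
               for key, value in shared_properties.items()):
            urls.extend(record.monitoring_provider_urls)
    return urls


# Hypothetical registry contents, for illustration only.
registry = [
    ResourceRecord({"hostname": "was01.example.com", "port": "9080"},
                   ["https://provider.example.com/resources/was01"]),
    ResourceRecord({"hostname": "db01.example.com", "port": "50000"}),
]

print(find_provider_urls(registry, {"hostname": "was01.example.com",
                                    "port": "9080"}))
```

An empty result corresponds to the case where the registry knows the resource but no monitoring service provider has registered for it.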
- Consumer queries the resource registry for a monitoring service provider URL for the selected resource
- Resource registry looks up resource record for resource and determines if any monitoring service provider URLs have been registered for it
- Resource registry finds monitoring service provider URL and returns the URL to the consumer as an RDF response
- Consumer invokes a GET method on monitoring service provider URL that was returned to it by the resource registry for the selected resource
- Consumer requests compact XML in the Accept header because the receiving client is a UI preview window/iframe
- Consumer connects to monitoring service provider and issues a GET request on its URL for the target resource
- Monitoring Service provider responds to the UI preview with an HTML page embedded in a compact XML document
- Monitoring Service provider maps OSLC resource to an internal resource name
- Monitoring Service provider gets Best Practices health metrics data for resource
- Monitoring Service provider builds an HTML page, embedding the data and any UI elements (e.g. chart, labels) needed to display it
- Monitoring Service provider encodes a response document as compact XML and returns it to the requesting consumer
- Consumer displays HTML page in UI preview window/iframe
- Based on the content returned, user takes appropriate action
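The exchange between consumer and provider described in these steps can be sketched end to end without a network. The request advertises the compact media type in the `Accept` header, which is how a GET request states the representation it wants. `application/x-oslc-compact+xml` is the media type OSLC Core 2.0 defines for compact representations. Everything else is an assumption made for illustration: the resource URL, the internal name mapping, the metric values, and the element names `compact`, `title`, and `previewHtml`, which are simplified placeholders rather than the OSLC Compact vocabulary.

```python
import html
import urllib.request
import xml.etree.ElementTree as ET

COMPACT_MEDIA_TYPE = "application/x-oslc-compact+xml"

# Hypothetical provider-side data: OSLC resource URL -> internal name -> metrics.
INTERNAL_NAMES = {"https://provider.example.com/resources/was01": "WAS_NODE_01"}
METRICS = {"WAS_NODE_01": {"heapUsedPercent": 72.5, "activeSessions": 140}}


def build_preview_request(provider_url):
    """Consumer side: prepare a GET that asks for the compact representation."""
    return urllib.request.Request(
        provider_url, headers={"Accept": COMPACT_MEDIA_TYPE}, method="GET")


def build_compact_response(resource_url):
    """Provider side: map the resource, fetch metrics, wrap an HTML page in XML."""
    internal_name = INTERNAL_NAMES[resource_url]
    metrics = METRICS[internal_name]
    rows = "".join(
        f"<tr><td>{html.escape(name)}</td><td>{value}</td></tr>"
        for name, value in sorted(metrics.items()))
    page = f"<html><body><table>{rows}</table></body></html>"

    root = ET.Element("compact")                    # placeholder element names
    ET.SubElement(root, "title").text = internal_name
    ET.SubElement(root, "previewHtml").text = page  # escaped on serialization
    return ET.tostring(root, encoding="unicode")


def extract_preview_html(compact_xml):
    """Consumer side: recover the HTML page to show in the preview iframe."""
    return ET.fromstring(compact_xml).findtext("previewHtml")


request = build_preview_request("https://provider.example.com/resources/was01")
compact_xml = build_compact_response(request.full_url)
print(extract_preview_html(compact_xml))
```

A real consumer would send the prepared request with `urllib.request.urlopen(request)` and pass the response body to `extract_preview_html`; the in-memory call here only keeps the sketch self-contained.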