## Investment Risk

Aside from using a highly simplified predictor model for effort, the scenario has so far omitted the important aspect of investment risk, which we'll now discuss. In general, investment risk is a measure of the uncertainty in realizing the predicted business value of an investment such as purchasing a mutual fund or, in our case, executing a software development project. Other terms such as estimate, plan, outlook, and forecast are also used to denote predictions. Any time one makes a prediction, there is some uncertainty, and this must be quantified in order to make informed business decisions.

As Yogi Berra observed, “It’s tough to make predictions, especially about the future”. While this statement is true, we can at least attempt to quantify our level of uncertainty about our predictions. We refer to this level of uncertainty as risk.

Although the predicted business value of the Tsunami 1.0 proposal looked good, it was based on several assumptions which were themselves predictions. Since the assumptions were uncertain, so is the predicted business value. In order for a portfolio manager to make sound investment decisions, the risk must be quantified. High risk is not necessarily bad, as long as the associated return on the investment is correspondingly high. Portfolio managers demand a higher rate of return for risky investments.

## Point Estimates

A quantity whose estimated value is given as a single number is called a point estimate. Point estimates give the expected value of a quantity, but do not convey the degree of uncertainty in the estimate. In order to convey the uncertainty, we need to use probability distributions.

## Probability Distributions

The usual approach to quantifying risk is to view predicted business value not as a single number, but as a probability distribution, and to use the variance of this distribution as the measure of risk. A probability distribution gives the likelihood that a measurement of some quantity, such as business value or ESLOC, will have a value in a given range. Variance measures how spread out the likely measurement values are. A wide spread means we can’t predict the measured value very accurately, so our prediction has high risk. If the variance is zero then there is no uncertainty or risk, i.e. the measurement always has the same value. A quantity whose measurements are described by a probability distribution is also referred to as a random variable.

## Triangular Distributions

For example, our ESLOC size estimate might have a probability distribution that gives positive values for measurements that range from 30,000 ESLOC to 100,000 ESLOC, peaking at 50,000 ESLOC. In this case the estimator believes that it is very unlikely that the product could have a size less than 30,000 ESLOC, that it is most likely that the product would be around 50,000 ESLOC, and that it is also very unlikely to have a size greater than 100,000 ESLOC.

A probability distribution defined by a triple of low, most likely, and high values is referred to as a triangular distribution. Triangular distributions are very intuitive and easy for estimators to use.
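As a rough illustration of what the triple of low, most likely, and high values implies, the following Python sketch computes the mean and standard deviation of a triangular distribution from the standard closed-form formulas. The function name `triangular_stats` is hypothetical and is not part of any estimation tool described here.

```python
import math


def triangular_stats(low, mode, high):
    """Return (mean, standard deviation) of a triangular distribution.

    Uses the standard closed-form results for a triangular distribution
    with minimum `low`, peak `mode`, and maximum `high`.
    """
    if not (low <= mode <= high) or low == high:
        raise ValueError("expected low <= mode <= high with low < high")
    mean = (low + mode + high) / 3.0
    variance = (low**2 + mode**2 + high**2
                - low * mode - low * high - mode * high) / 18.0
    return mean, math.sqrt(variance)


# Size estimate from the example: 30,000 / 50,000 / 100,000 ESLOC.
mean_size, sigma_size = triangular_stats(30_000, 50_000, 100_000)
print(f"mean size  = {mean_size:,.0f} ESLOC")
print(f"sigma size = {sigma_size:,.0f} ESLOC")
```

For the example values, the mean works out to 60,000 ESLOC, which is above the most likely value of 50,000 ESLOC because the distribution stretches further toward large sizes than toward small ones.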

The above scenario must therefore be modified to include probability distributions for the size, effort, and benefit. These then combine to produce a probability distribution for the business value. The uncertainty in the size estimate can be expressed as follows:

###### Listing of Scenario with TriangularDistribution
``````
<?xml version="1.0"?>
<ems:Scenario xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/" xmlns:ems="http://open-services.net/ns/ems#">
  <dcterms:title rdf:parseType="Literal">Deliver Tsunami 1.0 ASAP</dcterms:title>
  <dcterms:description rdf:parseType="Literal">
    Deliver Tsunami 1.0 as soon as possible so we can capture more of the
    market. Do whatever it takes to deliver it in 6 months. We think the
    minimum functional content will take around 50,000 ESLOC of Java to
    implement.
  </dcterms:description>
  <ems:project rdf:resource="http://braintwistors.example.com/ems10/Project/4201" />
  <ems:assumes>
    <ems:MeasureDistribution>
      <dcterms:title rdf:parseType="Literal">Total Duration in Months</dcterms:title>
      <ems:metric
          rdf:resource="http://open-services.net/ns/ems/metric#Duration" />
      <ems:unitOfMeasure
          rdf:resource="http://open-services.net/ns/ems/unit#Month" />
      <ems:distribution>
        <ems:PointEstimate>
          <ems:numericValue rdf:datatype="http://www.w3.org/2001/XMLSchema#double">6</ems:numericValue>
        </ems:PointEstimate>
      </ems:distribution>
    </ems:MeasureDistribution>
  </ems:assumes>
  <ems:assumes>
    <ems:MeasureDistribution>
      <dcterms:title rdf:parseType="Literal">Total Size in ESLOC</dcterms:title>
      <ems:metric rdf:resource="http://open-services.net/ns/ems/metric#Esloc" />
      <ems:unitOfMeasure
          rdf:resource="http://open-services.net/ns/ems/unit#Loc" />
      <ems:distribution>
        <ems:TriangularDistribution>
          <ems:low rdf:datatype="http://www.w3.org/2001/XMLSchema#double">30000</ems:low>
          <ems:mostLikely rdf:datatype="http://www.w3.org/2001/XMLSchema#double">50000</ems:mostLikely>
          <ems:high rdf:datatype="http://www.w3.org/2001/XMLSchema#double">100000</ems:high>
        </ems:TriangularDistribution>
      </ems:distribution>
    </ems:MeasureDistribution>
  </ems:assumes>
  <!--
    Other properties of this resource have been omitted for brevity.
  -->
</ems:Scenario>
``````

In the above XML, we have introduced the TriangularDistribution element to model the uncertainty, and have used this in place of the single numeric value for the size estimate. Similarly, there is some allowed variation in the assumed schedule. Suppose the allowed schedule variation is plus or minus one month, i.e. the low is 5 months and the high is 7 months.

## Quantile Functions

The estimation tool takes these probabilistic estimates as inputs, and produces the effort estimate as output. Since the effort estimate depends on the statistical model and historic database of projects, other sources of uncertainty are introduced. Now the estimate has a probability distribution that is more complex. One general way of describing probability distributions is to use a quantile function. Quantiles divide the probabilities into equal ranges and give the value of the metric that achieves the cumulative probability for each range.

For example, let’s consider quartiles. These divide the probabilities into 4 equal ranges whose upper limits are 25%, 50%, 75%, and 100%. The effort estimate might be expressed as follows. There is a 0% chance that the project can be completed in less than 18 person-months, a 25% chance that the project will require 20 person-months or less, a 50% chance that the project will require 25 person-months or less, a 75% chance that the project will require 35 person-months or less, and we are certain that the project can be completed with no more than 50 person-months. These values are assembled into a quantile function as follows:

###### Listing of Estimate with QuantileFunction
``````
<?xml version="1.0"?>
<ems:Estimate xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/" xmlns:ems="http://open-services.net/ns/ems#">
  <dcterms:title rdf:parseType="Literal">Deliver Tsunami 1.0 ASAP Estimate by Guestimator 101</dcterms:title>
  <dcterms:description rdf:parseType="Literal">
    Guestimator 101 predicts 25 person-months.
  </dcterms:description>
  <ems:scenario rdf:resource="http://braintwistors.example.com/ems10/Scenario/5721" />
  <ems:predicts>
    <ems:MeasureDistribution>
      <dcterms:title rdf:parseType="Literal">Total Effort</dcterms:title>
      <ems:metric rdf:resource="http://open-services.net/ns/ems/cost#effort" />
      <ems:unitOfMeasure
          rdf:resource="http://open-services.net/ns/ems/units#person-month" />
      <ems:distribution>
        <ems:QuantileFunction>
          <ems:numberOfQuantiles rdf:datatype="http://www.w3.org/2001/XMLSchema#int">4</ems:numberOfQuantiles>
          <ems:low rdf:datatype="http://www.w3.org/2001/XMLSchema#double">18</ems:low>
          <ems:quantile>
            <ems:Quantile>
              <ems:probability rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.25</ems:probability>
              <ems:numericValue rdf:datatype="http://www.w3.org/2001/XMLSchema#double">20</ems:numericValue>
            </ems:Quantile>
          </ems:quantile>
          <ems:quantile>
            <ems:Quantile>
              <ems:probability rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.50</ems:probability>
              <ems:numericValue rdf:datatype="http://www.w3.org/2001/XMLSchema#double">25</ems:numericValue>
            </ems:Quantile>
          </ems:quantile>
          <ems:quantile>
            <ems:Quantile>
              <ems:probability rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.75</ems:probability>
              <ems:numericValue rdf:datatype="http://www.w3.org/2001/XMLSchema#double">35</ems:numericValue>
            </ems:Quantile>
          </ems:quantile>
          <ems:high rdf:datatype="http://www.w3.org/2001/XMLSchema#double">50</ems:high>
        </ems:QuantileFunction>
      </ems:distribution>
    </ems:MeasureDistribution>
  </ems:predicts>
  <!--
    Other properties of this resource have been omitted for brevity.
  -->
</ems:Estimate>
``````

In the above XML, we have introduced the QuantileFunction element to model general probability distributions. The values of the metric are listed in sequence corresponding to the quantiles in ascending order.
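To make the meaning of a quantile function more concrete, here is a minimal Python sketch that evaluates the example quartiles at an arbitrary probability, and also answers the reverse question of how likely it is that the effort stays within a given budget. It assumes linear interpolation between the listed points, which the text above does not prescribe; the names `EFFORT_QUANTILES`, `quantile`, and `cumulative_probability` are hypothetical.

```python
from bisect import bisect_right

# (cumulative probability, person-months) pairs taken from the listing,
# with low and high treated as the 0% and 100% points.
EFFORT_QUANTILES = [(0.0, 18.0), (0.25, 20.0), (0.50, 25.0),
                    (0.75, 35.0), (1.0, 50.0)]


def quantile(points, p):
    """Metric value at cumulative probability p (linear interpolation)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    probs = [q for q, _ in points]
    # Index of the upper end of the segment containing p.
    i = min(bisect_right(probs, p), len(points) - 1)
    (p0, v0), (p1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (p - p0) / (p1 - p0)


def cumulative_probability(points, value):
    """Probability that the metric is at most `value`.

    Assumes the metric values in `points` are strictly increasing.
    """
    if value <= points[0][1]:
        return 0.0
    if value >= points[-1][1]:
        return 1.0
    values = [v for _, v in points]
    i = bisect_right(values, value)
    (p0, v0), (p1, v1) = points[i - 1], points[i]
    return p0 + (p1 - p0) * (value - v0) / (v1 - v0)


print(f"90% effort level: {quantile(EFFORT_QUANTILES, 0.9):.1f} person-months")
print(f"P(effort <= 30):  {cumulative_probability(EFFORT_QUANTILES, 30.0):.3f}")
```

Under this interpolation assumption, a budget of 30 person-months falls halfway between the 50% and 75% points, so it would be sufficient with a probability of about 62.5%.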

## Expected Value, Variance, and Standard Deviation

Given a probability distribution, whether expressed as a triangular distribution, a quantile function, or by some other means, one can compute its expected value and variance using standard statistical definitions. The variance of the business value probability distribution corresponds to the notion of investment risk. The expected business value and its variance are the key numbers used by portfolio managers when evaluating project proposals.

The expected value of a probability distribution is often denoted by the Greek letter mu. The dimensions of variance are the square of the dimensions of the expected value, which makes comparisons between the two difficult. It is therefore common to work instead with the standard deviation, which is the square root of the variance. The dimensions of the standard deviation are the same as the dimensions of the expected value. Standard deviation is often denoted by the Greek letter sigma.

We can compute the following statistics for the estimated effort from our example quantile function:

``````
mu(effort) = 28.5 person-months
sigma(effort) = 9.4 person-months
``````

Note that the expected effort of 28.5 person-months is higher than the 50% probability value of 25 person-months. This is because the probability distribution has a longer tail for high values of effort. The quantile function tells us that the low value is 18 person-months, which is only 7 person-months under the 50% value, but the high value is 50 person-months, which is 25 person-months over it.
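The effort statistics quoted above can be reproduced with a short Python sketch, under the assumption that the effort is uniformly distributed between each pair of adjacent quantile points. The text does not say how the values were actually computed, so this reading is an assumption, and the names `EFFORT_QUANTILES` and `quantile_stats` are hypothetical.

```python
import math

# (cumulative probability, person-months) pairs from the example
# quantile function, with low and high as the 0% and 100% points.
EFFORT_QUANTILES = [(0.0, 18.0), (0.25, 20.0), (0.50, 25.0),
                    (0.75, 35.0), (1.0, 50.0)]


def quantile_stats(points):
    """Return (mean, standard deviation) of a quantile function.

    Assumption: within each interval between adjacent points the value
    is uniformly distributed, so each interval [a, b] contributes its
    probability weight times the uniform moments on [a, b].
    """
    mean = 0.0
    second_moment = 0.0
    for (p0, a), (p1, b) in zip(points, points[1:]):
        weight = p1 - p0
        mean += weight * (a + b) / 2.0                      # E[X] on [a, b]
        second_moment += weight * (a * a + a * b + b * b) / 3.0  # E[X^2]
    variance = second_moment - mean ** 2
    return mean, math.sqrt(variance)


mu_effort, sigma_effort = quantile_stats(EFFORT_QUANTILES)
print(f"mu(effort)    = {mu_effort:.1f} person-months")
print(f"sigma(effort) = {sigma_effort:.1f} person-months")
```

With this assumption the sketch yields a mean of 28.5 and a standard deviation of about 9.4 person-months, matching the figures above. A different interpolation rule between the quantile points would give somewhat different values.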

Now let’s look at the business case given these uncertainties. The cost statistics are:

``````
mu(cost) = 28.5 * 150,000 / 12 USD = 356,250 USD
sigma(cost) = 9.40744 * 150,000 / 12 USD = 117,593 USD
``````

``````
mu(value) = 500,000 USD - 356,250 USD = 143,750 USD
sigma(value) = 117,593 USD
``````
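The cost and value figures follow from a general rule: scaling a random variable by a constant scales both its mean and its standard deviation, while adding a constant shifts only the mean. The Python sketch below applies that rule to the numbers above. The helper name `linear_transform` is hypothetical, and the effort variance of 88.5 is an assumption taken from reading the quantile function as piecewise uniform.

```python
import math

ANNUAL_COST_USD = 150_000   # cost per person-year used in the formulas above
BENEFIT_USD = 500_000       # point estimate of the benefit

MU_EFFORT = 28.5                 # person-months
SIGMA_EFFORT = math.sqrt(88.5)   # about 9.4 person-months (assumed variance)


def linear_transform(mu, sigma, scale, offset=0.0):
    """Return (mean, standard deviation) of scale * X + offset.

    The offset moves the mean but leaves the spread unchanged; the
    scale multiplies the spread by its absolute value.
    """
    return scale * mu + offset, abs(scale) * sigma


monthly_cost = ANNUAL_COST_USD / 12

# cost = monthly_cost * effort
mu_cost, sigma_cost = linear_transform(MU_EFFORT, SIGMA_EFFORT, monthly_cost)

# value = benefit - cost, with the benefit treated as a fixed number
mu_value, sigma_value = linear_transform(mu_cost, sigma_cost, -1.0, BENEFIT_USD)

print(f"mu(cost)     = {mu_cost:,.0f} USD")
print(f"sigma(cost)  = {sigma_cost:,.0f} USD")
print(f"mu(value)    = {mu_value:,.0f} USD")
print(f"sigma(value) = {sigma_value:,.0f} USD")
```

Because the benefit is a point estimate, it contributes no spread of its own, which is why the standard deviation of the value equals that of the cost. If the benefit were also given as a distribution, its uncertainty would have to be combined with that of the cost.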

The business case is still viable, although less attractive than before. Note that we are still using the point estimate of 500,000 USD for the benefit. In practice, it is very difficult to accurately predict the benefits of a new product. Portfolio managers therefore also consider other non-financial factors, such as market attractiveness and strategic alignment, when evaluating new product proposals. Nevertheless, having a good handle on the cost estimate helps to quantify the overall investment risk.