Robustness metrics measure how the performance of an alternative changes across different scenarios.
When to use
Alternatives can be ranked according to their robustness, including as part of Efficiency-Robustness trade-offs. This is a quantitative form of stress testing.
Evaluating robustness metrics requires performance measures for each alternative across multiple scenarios, e.g. produced by multiple model runs with different settings.
How
A wide range of robustness metrics are possible, depending on 1) how performance measures are transformed, 2) how scenarios are subsetted, and 3) how the subsetted performance measures are aggregated [1]. See Resources for further details.
- Performance measures can, for example, be transformed by calculating regret from the best alternative, or by evaluating whether performance satisfies constraints
- Subsetting scenarios might involve taking the best or worst case, or another percentile
- Aggregation might involve taking the mean, sum, variance or skew
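
The three steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not a reference implementation: the performance values, the alternative names A, B and C, the threshold of 6.0, and the helper names `regret`, `satisfies` and `robustness` are all made up for the example, and performance is assumed to be "higher is better".

```python
from statistics import mean

# Hypothetical data: performance[alternative][i] is the performance of that
# alternative in scenario i (assumed: higher is better).
performance = {
    "A": [10.0, 8.0, 2.0, 9.0],
    "B": [7.0, 7.0, 6.0, 7.0],
    "C": [9.0, 6.0, 5.0, 8.0],
}


def regret(performance):
    """Transform: shortfall from the best alternative in each scenario."""
    n_scenarios = len(next(iter(performance.values())))
    best = [max(v[i] for v in performance.values()) for i in range(n_scenarios)]
    return {
        name: [best[i] - v[i] for i in range(n_scenarios)]
        for name, v in performance.items()
    }


def satisfies(performance, threshold):
    """Transform: 1.0 where performance meets the constraint, else 0.0."""
    return {
        name: [1.0 if x >= threshold else 0.0 for x in v]
        for name, v in performance.items()
    }


def robustness(transformed, subset, aggregate):
    """Apply a scenario subset, then an aggregation, to each alternative."""
    return {name: aggregate(subset(v)) for name, v in transformed.items()}


# Maximin: no transformation, worst-case scenario only.
maximin = robustness(performance, subset=lambda v: [min(v)], aggregate=mean)

# Minimax regret: regret transformation, worst-case (largest) regret only.
max_regret = robustness(regret(performance), subset=lambda v: [max(v)], aggregate=mean)

# Satisficing: fraction of all scenarios in which the constraint is met.
satisficing = robustness(satisfies(performance, 6.0), subset=lambda v: v, aggregate=mean)

# Rank alternatives, most robust first, under two of the metrics.
rank_maximin = sorted(maximin, key=maximin.get, reverse=True)
rank_regret = sorted(max_regret, key=max_regret.get)

print(maximin, max_regret, satisficing)
print(rank_maximin, rank_regret)
```

With these made-up numbers, maximin ranks B first while minimax regret ranks C first, which illustrates that the choice of transformation, subset and aggregation can change which alternative appears most robust.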
Some robustness metrics specifically target adaptive decisions, e.g. measuring flexibility of an initial management action.