When to use
Scenario discovery is a form of vulnerability analysis. It is a participatory, computer-assisted approach to scenario development that summarises multiple plausible future scenarios and identifies those relevant to the design and choice of policies. Such policy-relevant scenarios consist of sets of future states of the world that represent vulnerabilities of a proposed policy, i.e. conditions in which the policy may fail to meet its performance goals or in which its performance deviates significantly from that of the optimum policy [1].
Scenario discovery algorithms aim to identify sets of uncertain model input parameter values that:
- capture a high proportion of all policy-relevant cases (high coverage),
- contain mostly policy-relevant cases (high density, i.e. few decision-irrelevant cases fall within the region), and
- are easy to understand (high interpretability, i.e. the region is described using only the input parameters that are influential for the output) [2].
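As a minimal sketch of the first two measures, the snippet below assumes the usual definitions: coverage is the share of all cases of interest that fall inside the candidate region, and density is the share of cases inside the region that are of interest. The function name and the toy flags are illustrative only.

```python
def coverage_and_density(in_box, of_interest):
    """Return (coverage, density) for one candidate scenario.

    in_box[i]      : True if case i falls inside the candidate region
    of_interest[i] : True if case i is policy-relevant
    """
    if len(in_box) != len(of_interest):
        raise ValueError("both flag lists must describe the same cases")
    hits = sum(1 for b, c in zip(in_box, of_interest) if b and c)
    total_interest = sum(of_interest)
    total_in_box = sum(in_box)
    coverage = hits / total_interest if total_interest else 0.0
    density = hits / total_in_box if total_in_box else 0.0
    return coverage, density


# Six hypothetical cases: three fall in the region, three are of interest,
# and two cases satisfy both conditions.
in_box = [True, True, True, False, False, False]
of_interest = [True, True, False, True, False, False]
coverage, density = coverage_and_density(in_box, of_interest)
```

Interpretability has no single formula; a common proxy is the number of input parameters the region restricts.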
Scenario discovery requires:
- One or more computer simulation models to generate a database of model runs over the uncertain input parameters,
- Statistical and/or data-mining algorithms to identify candidate scenarios,
- Diagnostic tools for evaluating the significance of the parameter constraints proposed by scenario discovery algorithms [1].
How
Three steps are involved in scenario discovery: database generation, algorithm selection, and scenario assessment and selection.
Database generation
A simulation model links policy actions to consequences of interest. Each scenario is described by a vector representing a particular point in the multi-dimensional space of uncertain model input parameters.
A threshold performance level is applied to select a set of cases of interest, using some policy-relevant criterion [1]. To avoid aggregating performance criteria, scenario discovery has also been applied to multiple performance metrics separately, yielding a set of scenarios for each metric and making explicit the sensitivity of scenario discovery results to the choice of metric [12].
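The sketch below illustrates this step under invented assumptions: `toy_model`, its two inputs and the cost threshold are hypothetical stand-ins for a real simulation model and a real policy-relevant criterion. Each case is one point in the space of uncertain inputs, and the threshold marks the cases of interest.

```python
def toy_model(demand_growth, unit_cost):
    """Hypothetical stand-in for a simulation model: returns a policy cost."""
    return 100.0 * unit_cost * (1.0 + demand_growth) ** 10


# Each case is one vector of uncertain input values (illustrative numbers).
cases = [
    {"demand_growth": 0.01, "unit_cost": 1.0},
    {"demand_growth": 0.03, "unit_cost": 1.2},
    {"demand_growth": 0.05, "unit_cost": 0.8},
    {"demand_growth": 0.05, "unit_cost": 1.5},
]

# Assumed performance threshold: the policy "fails" when cost exceeds it.
COST_THRESHOLD = 150.0

database = []
for inputs in cases:
    cost = toy_model(**inputs)
    database.append({**inputs, "cost": cost, "of_interest": cost > COST_THRESHOLD})
```

Applying the same loop with a different metric and threshold would produce a second set of flags, which is how separate scenario sets per metric can be obtained.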
The system model is sampled by applying an experimental design over the uncertain inputs for a candidate strategy, e.g. using the Exploratory Modelling & Analysis Workbench [11]. Two approaches to sampling are [13]:
- static input-oriented sampling (e.g. Monte Carlo, Latin Hypercube sampling, criterion-based sampling), which covers the input space without considering the resulting output space.
- adaptive output-oriented sampling, which takes the resulting output space into account and responds dynamically to complex behaviour or system non-linearities, e.g. by using a roughness measure to characterise the dynamic complexity of simulation runs and to direct the adaptive sampling process towards potential areas of interest in the output space [13].
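As an illustration of static input-oriented sampling, here is a minimal Latin Hypercube sketch using only the Python standard library. The parameter names and ranges are hypothetical; in practice a dedicated library or the workbench cited above would normally be used.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points so that each input has exactly one value
    in each of n_samples equal-width strata of its range."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One uniform draw inside each stratum of [0, 1) ...
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # ... shuffled so strata are paired randomly across inputs.
        rng.shuffle(strata)
        columns[name] = [low + u * (high - low) for u in strata]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


# Hypothetical uncertain inputs and their assumed ranges.
bounds = {"demand_growth": (0.0, 0.06), "unit_cost": (0.5, 2.0)}
samples = latin_hypercube(10, bounds, seed=42)
```

Each returned dictionary is one experiment to run through the simulation model. The design depends only on the input ranges, which is what distinguishes it from adaptive output-oriented sampling.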
To speed up database generation, scenario discovery algorithms can also be used with "surrogate models" or "meta-models", i.e. faster-running models trained to replicate the relationship between inputs and outputs, e.g. with PRIM using "rule extraction" [4].
Algorithm selection
One or more algorithms are applied to identify candidate scenarios. Each algorithm defines groups of scenarios in its own way.
- PRIM (Patient Rule Induction Method) identifies "boxes" in parameter space, i.e. bounds on parameter values within which scenarios meet some pre-defined criteria. PRIM is a bump-hunting (activity region finding) algorithm used to achieve a desired balance between coverage, density and interpretability.
- PCA-PRIM and CPCA-PRIM [3] don't use the original parameters, but instead identify boxes using orthogonal rotations obtained through principal component analysis (PCA), optionally with constraints to improve interpretability. This may improve density and coverage of boxes.
- CART (Classification and Regression Tree) is a classification algorithm that typically provides its output as a decision tree. The algorithm successively partitions the input space with a sequence of binary splits. Discovered scenarios are therefore described by a sequence of classification rules.
- Behaviour-based scenario discovery [10] considers model dynamics over time. It applies time series clustering during the identification stage to find common macro-level behaviours in the ensemble of output time series. The approach has two advantages:
  - it induces regions in the input parameter space associated with model behaviours over time, rather than with a static state of the model;
  - every output in the ensemble is assigned to a cluster and mapped to a region (box) in the input parameter space, rather than only a small subset.
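To make the idea of a PRIM "box" concrete, the following is a simplified sketch of the peeling phase only, assuming a greedy rule that removes a fraction `alpha` of the remaining cases from whichever face of the box gives the highest density. It omits the pasting phase and other refinements of published implementations, and the function name, defaults and synthetic data are assumptions made for illustration.

```python
import random


def prim_peel(X, y, alpha=0.1, min_support=0.05):
    """Greedy PRIM-style peeling.

    X: list of rows (lists of floats); y: list of 0/1 flags (1 = of interest).
    Returns the peeling trajectory as a list of (box, coverage, density),
    where box[j] = [lower, upper] bounds on input j.
    """
    n, d = len(X), len(X[0])
    total_interest = sum(y)
    if total_interest == 0:
        raise ValueError("no cases of interest in the database")
    inside = list(range(n))  # indices of cases currently in the box
    box = [[min(r[j] for r in X), max(r[j] for r in X)] for j in range(d)]
    trajectory = []
    while True:
        hits = sum(y[i] for i in inside)
        trajectory.append(
            ([b[:] for b in box], hits / total_interest, hits / len(inside))
        )
        k = max(1, int(alpha * len(inside)))  # cases removed by one peel
        if len(inside) - k < max(1, min_support * n):
            break  # the next peel would leave too few cases
        best = None
        for j in range(d):
            order = sorted(inside, key=lambda i: X[i][j])
            # Candidate peels: drop the k lowest or the k highest values of input j.
            for side, keep in (("low", order[k:]), ("high", order[:-k])):
                density = sum(y[i] for i in keep) / len(keep)
                if best is None or density > best[0]:
                    best = (density, j, side, keep)
        _, j, side, keep = best
        if side == "low":
            box[j][0] = min(X[i][j] for i in keep)
        else:
            box[j][1] = max(X[i][j] for i in keep)
        inside = keep
    return trajectory


# Synthetic database: cases are of interest when x0 > 0.6 and x1 < 0.4;
# the third input is irrelevant by construction.
rng = random.Random(0)
X = [[rng.random() for _ in range(3)] for _ in range(500)]
y = [1 if (row[0] > 0.6 and row[1] < 0.4) else 0 for row in X]

trajectory = prim_peel(X, y)
box, coverage, density = max(trajectory, key=lambda step: step[2])
```

The trajectory exposes the trade-off described above: early boxes have high coverage and low density, later boxes the reverse, and the analyst chooses a point along it rather than accepting a single answer.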
Assessment and selection of scenarios
The user assesses scenario quality using diagnostic tools, including measures of coverage, density and interpretability, to improve understanding of the identified scenarios and to evaluate how well the algorithm characterises the cases of interest in the database. Diagnostic tools include [1]:
- Resampling test, to evaluate the reproducibility of the scenario discovery algorithm's results.
- Quasi-p-value test, to estimate the likelihood that the algorithm constrains a given parameter purely by chance.
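One way to compute a quasi-p-value is sketched below. It assumes a one-sided binomial test in which the null hypothesis is the density of the box with the restriction on the parameter in question removed; this formulation is an assumption here, and the counts are invented.

```python
from math import comb


def binomial_upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def quasi_p_value(hits_in_box, cases_in_box, hits_without, cases_without):
    """Quasi-p-value for one restricted parameter.

    hits_in_box / cases_in_box : cases of interest and all cases in the box
    hits_without / cases_without : the same counts for the box with this
        parameter's restriction removed (assumed null density)
    """
    p_null = hits_without / cases_without
    return binomial_upper_tail(hits_in_box, cases_in_box, p_null)


# Restriction that sharply raises density: 45 of 50 versus 60 of 120 without it.
qp_strong = quasi_p_value(45, 50, 60, 120)
# Restriction that barely changes density: 26 of 50 versus 60 of 120 without it.
qp_weak = quasi_p_value(26, 50, 60, 120)
```

A small value suggests the restriction is unlikely to be an artefact of chance, while a large value marks the parameter as a candidate for removal from the scenario description, which also improves interpretability.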
Scenarios may be selected informally (including by drawing on diagnostic metrics) or formally, e.g. based on utility scores for failure scenarios [1].