A method for critical review of assumptions in model-based assessments

This method aims to systematically identify, prioritise and analyse the importance and strength of assumptions in the quantification of environmental indicators under various scenarios (such as in the Netherlands Environmental Outlook). These indicators are typically based on chains of soft-linked computer model calculations that start with scenarios for population and economic growth. The models in the chain vary in complexity. The calculation chains behind indicators often involve many analysts from several disciplines. Many assumptions have to be made in combining research results in these calculation chains, especially since the output of one computer model often does not match the input requirements of the next model (scales, aggregation levels). Assumptions are also frequently made to simplify parts of the calculations. Assumptions can be made explicitly or implicitly.

Assumptions can to some degree be value-laden. This method distinguishes four types of value-ladenness of assumptions: value-ladenness in a socio-political sense (e.g., assumptions may be coloured by the political preferences of the analyst), in a disciplinary sense (e.g., assumptions are coloured by the discipline in which the analyst was educated), in an epistemic sense (e.g., assumptions are coloured by the approach that the analyst prefers) and in a practical sense (e.g., the analyst is forced to make simplifying assumptions due to time constraints).
The method can be applied by the analysts carrying out the environmental assessment. However, each analyst has limited knowledge and perspectives with regard to the assessment topic, and will consequently have some ‘blind spots’. Therefore, other analysts (peers) are preferably involved in the method as well. Stakeholders, with their specific views and knowledge, can also be involved. This can, for instance, be organised in the form of a workshop. The group of persons involved in the assumption analysis will be referred to as ‘the participants’.
 
The Method
 
The method involves 7 steps:
 

1. Identify explicit and implicit assumptions in the calculation chain
2. Identify and prioritise key-assumptions in the chain
3. Assess the potential value-ladenness of the key-assumptions
4. Identify ‘weak’ links in the calculation chain
5. Further analyse potential value-ladenness of the key-assumptions
6. Revise/extend assessment
- sensitivity analysis of key-assumptions
- diversification of assumptions
- different choices in the chain
7. Communication
- key-assumptions
- alternatives and underpinning of the choices regarding assumptions made
- influence of key-assumptions on results
- implications in terms of robustness of the results

 

All steps will be elaborated on below.
 
Step 1 - Identify explicit and implicit assumptions in the calculation chain
In the first step implicit and explicit assumptions in the calculation chain are identified by the analyst by systematic mapping and deconstruction of the calculation chain, based on document analysis, interviews and critical review. The resulting list of assumptions is then reviewed and completed in a workshop.
The aggregation level of the assumptions on the assumption list may vary. An assumption can refer to a specific detail in the chain (‘The assumption that factor x remains constant’), as well as to a cluster of assumptions on a part of the chain (‘Assumptions regarding sub-model x’).
 
Step 2 - Identify and prioritise key-assumptions in the chain
In step 2 the participants identify the key-assumptions in the chain. The assumptions identified in step 1 are prioritised according to their influence on the end results of the assessment. Ideally, this selection is based on a quantitative sensitivity analysis. Since such an analysis will often not be attainable, the participants are asked to estimate the influence of the assumptions on the outcomes of interest of the assessment. An expert elicitation technique can be used in which the experts bring forward their opinions and argumentation on whether an assumption has a high or low influence on the outcome. Informed by this group discussion, each participant then indicates his or her personal estimate of the magnitude of the influence. A group ranking is established by aggregating the individual scores.
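As an illustration of how the individual estimates could be aggregated into a group ranking, the following sketch (in Python) ranks assumptions by the median of the participants' influence estimates. The assumption labels, the 0-4 estimation scale and the choice of the median are illustrative; the method does not prescribe a particular scale or aggregation rule.

from statistics import median

# Illustrative only: the assumption labels, the 0-4 influence scale and the
# choice of the median as aggregation rule are not prescribed by the method.
influence_estimates = {
    "constant emission factor": [4, 3, 4, 2],                  # one estimate per participant
    "linear extrapolation of population growth": [2, 2, 3, 1],
    "aggregation of land-use classes": [1, 0, 2, 1],
}

# Aggregate the individual estimates and rank the assumptions from the
# highest to the lowest estimated influence on the outcomes of interest.
group_ranking = sorted(
    influence_estimates,
    key=lambda name: median(influence_estimates[name]),
    reverse=True,
)

for rank, name in enumerate(group_ranking, start=1):
    print(rank, name, median(influence_estimates[name]))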
 
Step 3 - Assess the potential value-ladenness of the key-assumptions
To assess potential value ladenness of assumptions, a ‘pedigree matrix’ is used that contains criteria by which the potential value-ladenness of assumptions can be reviewed. The pedigree matrix is presented in Table 1 and will be discussed in detail later on.
For each key-assumption all pedigree criteria are scored by the participants. Here, again a group discussion takes place first, in order for the participants to remedy each other’s blind spots and exchange arguments.
 
The order in which the key-assumptions are discussed in the workshop is determined by the group ranking established in step 2 of the method, starting with the assumption with the highest rank.
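As an illustration of how the scoring in step 3 could be recorded and aggregated, the sketch below stores one score per participant per criterion and takes the median as the group score. The criteria names follow the pedigree matrix discussed in this document; the participants, their scores and the 0-2 scale are illustrative assumptions.

from statistics import median

# The criteria names follow the pedigree matrix described in this document;
# the participants, their scores and the 0-2 scale are illustrative assumptions.
criteria = [
    "influence of situational limitations",
    "plausibility",
    "choice space",
    "agreement among peers",
    "agreement among stakeholders",
    "sensitivity to views and interests of the analyst",
    "influence on results",
]

# scores[participant][criterion]: the score given after the group discussion
scores = {
    "participant A": dict(zip(criteria, [1, 2, 1, 2, 0, 1, 2])),
    "participant B": dict(zip(criteria, [1, 1, 0, 2, 1, 1, 2])),
    "participant C": dict(zip(criteria, [2, 1, 1, 1, 0, 2, 2])),
}

# Per-criterion group score, here taken as the median of the individual scores.
group_score = {c: median(p[c] for p in scores.values()) for c in criteria}
print(group_score)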
 
Step 4 - Identify ‘weak’ links in the calculation chain
The pedigree matrix is designed such that assumptions that score low on the pedigree criteria have a high potential for value-ladenness. Assumptions that, besides a low score on the criteria, also have a high estimated influence on the results of the assessment can be viewed as problematic weak links in the calculation chain.
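A simple way to flag such weak links is sketched below, assuming that each key-assumption has been given an aggregate pedigree score (e.g., the mean over the criteria) and an influence estimate; the assumption names, scores and threshold values are illustrative only, as the method leaves the actual cut-off to the participants.

# Illustrative flagging of 'weak links': assumptions that combine a low
# aggregate pedigree score with a high estimated influence on the results.
# The names, scores and threshold values are arbitrary choices for this sketch.
assumptions = {
    # name: (mean pedigree score over the criteria, median influence estimate)
    "constant emission factor": (0.8, 4),
    "linear extrapolation of population growth": (1.6, 2),
    "aggregation of land-use classes": (0.9, 1),
}

weak_links = [
    name
    for name, (pedigree, influence) in assumptions.items()
    if pedigree < 1.0 and influence >= 3
]
print("Potentially problematic weak links:", weak_links)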
 
Step 5 - Further analyse potential value-ladenness of the key-assumptions
In step 5, the nature of the potential value-ladenness of the individual key-assumptions is explored. Based on inspection of the diagrams visualising the pedigree scores (or the table of pedigree scores), the following can be analysed:
- what types of value-ladenness possibly play a role and to what extent
- to what extent there is disagreement on the pedigree scores among the participants (see the sketch below)
- whether changing assumptions is feasible and desirable
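For the second point, the disagreement among participants could, for instance, be summarised by the spread of the individual pedigree scores per criterion, as in the following sketch; the scores shown are illustrative only.

# Illustrative quantification of disagreement: the spread (max - min) of the
# individual pedigree scores per criterion; the scores shown are made up.
scores_per_criterion = {
    "plausibility": [2, 1, 1, 0],
    "choice space": [1, 1, 1, 1],
    "agreement among stakeholders": [0, 2, 0, 2],
}

disagreement = {
    criterion: max(values) - min(values)
    for criterion, values in scores_per_criterion.items()
}

# A large spread signals that participants judge the assumption differently,
# which may itself merit further discussion.
print(disagreement)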
 
Step 6 - Revise/extend assessment
Based on the analysis in step 5, it can be decided to change or broaden the assessment. As a minimum option, the assessment can be extended with a sensitivity analysis, which gives more information on the influence of weak links in the assessment.
Besides a sensitivity analysis, specific assumptions can be revised or diversified. When an assumption is revised, it is replaced by a different assumption. In some cases, however, it will be difficult or undesirable to choose between alternative assumptions, since there might be differing views on the issue. If such assumptions have a high influence on the assessment as a whole, it can be decided to diversify them: the calculation chain is run with several alternative assumptions in addition to the existing ones. In this way several variants of the assessment are produced, with outcomes that differ depending on which assumptions are chosen.
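A minimal sketch of such a diversification is given below. The calculation chain is represented by a toy placeholder function of a single hypothetical parameter (a population growth rate); the function, the parameter values and the time horizon are illustrative stand-ins for the real soft-linked model chain.

# Toy placeholder for the chain of soft-linked model calculations; the
# function, the growth-rate values and the 30-year horizon are illustrative.
def calculation_chain(population_growth_rate: float) -> float:
    baseline_emissions = 100.0
    return baseline_emissions * (1 + population_growth_rate) ** 30

# Diversified assumption: the chain is run once per alternative assumption,
# and the resulting outcomes are reported side by side.
alternative_assumptions = {
    "low growth": 0.002,
    "reference": 0.005,
    "high growth": 0.010,
}

outcomes = {
    label: calculation_chain(rate)
    for label, rate in alternative_assumptions.items()
}
for label, value in outcomes.items():
    print(f"{label}: indicator value {value:.1f}")

The same structure can be used for a simple one-at-a-time sensitivity analysis, by varying one assumption at a time while keeping the others fixed.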
 
Step 7 - Communication
It is important to be explicit about potential value-ladenness in the chain and the effects of potentially value-laden assumptions on the outcomes of the assessment. Analogous to a patient information leaflet, the presentation of the assessment results should be accompanied by information on:
- the key-assumptions in the calculation chain
- the weak links in the chain
- the alternatives that were available and the underpinning of the choices made regarding assumptions
- the robustness of the outcomes of interest in view of the key-assumptions
 
The Pedigree matrix
The pedigree matrix for assessing the potential value-ladenness of assumptions is presented in Table 1. For a general introduction to the concept of a pedigree matrix, we refer to the description of the NUSAP system in this tool catalogue. The criteria are discussed below.

Table 1. Pedigree matrix for the assessment of the potential value-ladenness of assumptions

Influence of situational limitations
The choice of an assumption can be influenced by situational limitations, such as the limited availability of data, money, time, software, tools, hardware and human resources. In the absence of these restrictions, the analyst would have made a different assumption.
Although these limitations might indirectly be of a socio-political nature (e.g., the institute the analyst works for has other priorities and has a limited budget for the analyst's work), from the analyst's point of view these limitations are given. This criterion can therefore be seen as primarily addressing value-ladenness in a practical sense.
 
Plausibility
Although it is often not possible to assess whether the approximation created by the assumption is in accordance with reality, in most cases an (intuitive) assessment can be made of the plausibility of the assumption.
If an analyst has to resort to fictive or speculative assumptions, the room for epistemic value-ladenness will often be larger. To some extent a fictive or speculative assumption also leaves room for potential disciplinary and socio-political value-ladenness. These are, however, dealt with primarily in the criteria ‘agreement among peers’ and ‘agreement among stakeholders’ respectively.
 
Choice space
The choice space indicates the degree to which alternatives were available to choose from when making the assumption. In general, a large choice space leaves more room for the epistemic preferences of the analyst, and thus a greater potential for value-ladenness in an epistemic sense. A large choice space will to some extent also leave more room for disciplinary and socio-political value-ladenness. These are, however, primarily dealt with in the criteria ‘agreement among peers’ and ‘agreement among stakeholders’ respectively.
 
Agreement among peers
An analyst chooses a certain assumption based on his or her knowledge and perspectives regarding the issue. Other analysts might have made different assumptions. The degree to which the choice of peers is likely to coincide with the analyst's choice is expressed in the criterion ‘agreement among peers’. These choices may be partly determined by the disciplinary training of the peers and by their epistemic preferences. This criterion can thus be seen as connected to value-ladenness in a disciplinary sense and in an epistemic sense.[1]
 
Agreement among stakeholders
Stakeholders, though mostly not actively involved in carrying out assessments, might also have chosen a different assumption had they been asked to give their view. The degree to which it is likely that stakeholders agree with the analyst's choice is expressed in the criterion ‘agreement among stakeholders’. This will often have to do with the socio-political perspective of the stakeholders on the issue at hand, and this criterion can therefore be seen as referring to value-ladenness in a socio-political sense.
 
Sensitivity to views and interests of the analyst
Some assumptions may be influenced, consciously or unconsciously, by the views and interests of the analyst making the assumption. The analyst's epistemic preferences and his or her cultural, disciplinary and personal background may influence the assumption that is eventually chosen. The influence of the analyst's disciplinary background and epistemic preferences on the choice of an assumption is taken into account in the criteria ‘agreement among peers’, ‘plausibility’ and ‘choice space’. In this criterion the focus is therefore on the room for value-ladenness in a socio-political sense.
 
Influence on results
In order to pinpoint important value-laden assumptions in the calculation chain, it is not only important to analyse the potential value-ladenness of the assumptions, but also to assess their influence on the end result of the assessment. Ideally, a sensitivity analysis is carried out to assess the influence of each assumption on the results. In most cases, however, this will not be attainable, because it requires the building of new models. This is why the pedigree matrix includes a column ‘influence on results’.

The modes for each criterion are arranged in such a way that the lower the score, the more value-laden the assumption potentially is.

Sorts and locations of uncertainty addressed
The method presented here for the critical review of assumptions in assessments typically addresses the value-ladenness of choices. The locations addressed by the method include, in principle, all locations that contain implicit or explicit assumptions.
 
Required resources
The time required for this method varies. Firstly, it depends on the number of calculation chains in the assessment that are analysed and on the complexity of the models in these chains. Secondly, it depends on whether the method is applied by the analysts carrying out the assessment alone, or also involves peers and stakeholders.
For the workshop, basic facilitator skills are needed.
 
Strengths and weaknesses

+ The method enables a well-structured discussion on potentially value-laden assumptions among scientists and stakeholders. In this discussion not only the politically controversial assumptions are addressed (as is often the case when assessment results are discussed in public), but also other assumptions that turn out to be important for assessment results.

+ The method acknowledges that pragmatic factors may also play a role in the colouring of assumptions.
- The results may be sensitive to the composition of the group of participants (both the number of persons and the persons' backgrounds).
- The results may be sensitive to procedure details as determined by the group facilitator.
- The method does not offer a clear answer to how to deal with extensive disagreement on the pedigree scores of assumptions.
 
Guidance on application
The method can be applied both while the environmental assessment is being carried out and ex post. Application during the assessment is preferable, since an iterative treatment of assumptions can improve the environmental assessment.
 
Pitfalls
· Potential value-ladenness should not be confused with actual value-ladenness. Assessing the actual value-ladenness of an assumption is impossible, since it would require exact and detailed knowledge of which factors contributed, and to what extent, to the analyst's choices.

· It is the facilitator's job to make sure that the discussions among the participants do not slide into a quick group consensus, but remain open discussions that promote critical review.

 
References
The method:
Kloprogge, P., J.P. van der Sluijs and A.C. Petersen (in press). A method for the analysis of assumptions in model-based environmental assessments. Environmental Modelling & Software.

Kloprogge, P., J.P. van der Sluijs and A. Petersen (2004). A method for critical review of potentially value-laden assumptions in environmental assessments. Utrecht University, Department of Science, Technology and Society.

Boone, I., Y. Van der Stede, J. Dewulf, W. Messens, M. Aerts, G. Daube and K. Mintiens (2010). NUSAP: a method to evaluate the quality of assumptions in quantitative microbial risk assessment. Journal of Risk Research, 13(3), 337-352.

Craye, M., E. Laes and J. van der Sluijs (2009). Re-negotiating the Role of External Cost Calculations in the Belgian Nuclear and Sustainable Energy Debate. In: A. Pereira Guimaraes and S. Funtowicz (eds.), Science for Policy, Oxford University Press, pp. 272-290.


[1] There is a link to controversy, as not all peers would agree with the same assumption if there were controversy regarding the issue of the assumption. However, if the majority of peers would choose the same assumption, the score would still be 2 (‘many peers would have made the same assumption’). The occurrence of controversies in the scientific field is thus not always visible in the score. Conversely, a score of 0 (‘few peers would have made the same assumption’) does not imply that there are controversies surrounding the assumption: it is possible that all peers agree on the issue, yet that the analyst for some reason has chosen a different assumption. The same applies to the criterion ‘agreement among stakeholders’.