Pedigree analysis: how to set up a NUSAP workshop
To provide guidance on setting up a NUSAP workshop, we describe as an example how one of the first NUSAP workshops was organised: the 2001 Loosdrecht workshop on the TIMER Energy Model used by the Netherlands Environmental Assessment Agency for the development of energy scenarios and greenhouse gas emission scenarios. The TIMER model is one of the models used in the IPCC Special Report on Emissions Scenarios.
The text is based on chapter 6 of Van der Sluijs et al. (2002). The pedigree matrix used was the Pedigree matrix to assess parameter strength.
On June 12 and 13, 2001, a 24-hour workshop was held in Loosdrecht, The Netherlands, in which 19 experts in the fields of energy economy and energy systems analysis (12 participants) and uncertainty assessment (7 participants) were brought together. Two of the participants were the current TIMER modellers, and one was a former TIMER modeller. Of the 19 experts, 8 took part in the organisation of the workshop. Participants came from 13 different leading research groups in the relevant fields of expertise, located in 5 different European countries. For the list of participants we refer to Appendix 6.3.
The primary goal of the workshop was to assess the strength of the input values for key variables. In preparation for the workshop, we sent the participants a briefing document and a paper with some background on the TIMER model.
The workshop was set up in three phases:
• a plenary session with a series of introductory lectures
• an expert elicitation session in three parallel groups
• a concluding plenary session
The purpose of the introductory lectures was to sketch the broader context of the project and to provide the experts with enough understanding of the TIMER model, the Morris sensitivity analysis and the NUSAP method to enable them to accomplish the elicitation exercise.
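For readers unfamiliar with the Morris method mentioned here: it screens model parameters by perturbing them one at a time along randomised trajectories and ranking them by their mean absolute "elementary effect". The following Python sketch illustrates that idea in generic form; the toy model and the parameter bounds are placeholders and not the actual TIMER model or the analysis used for the workshop.

import numpy as np

def morris_mu_star(model, bounds, r=10, p=4, seed=0):
    # model  : callable taking a 1-D array of k parameter values, returning a scalar output
    # bounds : list of (low, high) tuples, one per parameter
    # r      : number of random one-at-a-time trajectories
    # p      : number of grid levels
    rng = np.random.default_rng(seed)
    k = len(bounds)
    lows = np.array([b[0] for b in bounds])
    spans = np.array([b[1] - b[0] for b in bounds])
    delta = p / (2.0 * (p - 1))                      # standard Morris step on the unit cube
    effects = np.zeros((r, k))
    for t in range(r):
        x = rng.integers(0, p - 1, size=k) / (p - 1) # random grid point in [0, 1)
        y_prev = model(lows + x * spans)
        for i in rng.permutation(k):                 # perturb parameters in random order
            step = delta if x[i] + delta <= 1.0 else -delta
            x[i] += step
            y_new = model(lows + x * spans)
            effects[t, i] = (y_new - y_prev) / step  # elementary effect of parameter i
            y_prev = y_new
    return np.abs(effects).mean(axis=0)              # mu*: higher means more influential

# hypothetical toy model standing in for a single model output
mu_star = morris_mu_star(lambda v: v[0] ** 2 + 0.1 * v[1], [(0, 1), (0, 1)])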
For the expert elicitation session, we divided the participants into three parallel groups of six persons. The groups were composed in such a way that expertise in energy economy, energy systems analysis, the TIMER model and uncertainty assessment, as well as the organisers of the workshop, were distributed evenly over the three groups. We also balanced senior and junior experts evenly over the three groups. In each group one person took the role of group moderator and one person was assigned as rapporteur.
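As an illustration of how such a balanced division can be made, the following Python sketch deals participants out to groups in round-robin fashion within each expertise/seniority bucket; the participant records shown are invented placeholders, not the actual workshop list.

import random
from collections import defaultdict

def balanced_groups(participants, n_groups=3, seed=42):
    # participants: list of dicts with 'name', 'expertise' and 'seniority' keys
    random.seed(seed)
    buckets = defaultdict(list)
    for person in participants:
        # bucket by expertise and seniority so that both are spread evenly
        buckets[(person["expertise"], person["seniority"])].append(person)
    groups = [[] for _ in range(n_groups)]
    slot = 0
    for bucket in buckets.values():
        random.shuffle(bucket)
        for person in bucket:                        # deal each bucket out round-robin
            groups[slot % n_groups].append(person)
            slot += 1
    return groups

# hypothetical participant records, not the actual workshop list
people = [
    {"name": "A", "expertise": "energy economy", "seniority": "senior"},
    {"name": "B", "expertise": "energy systems analysis", "seniority": "junior"},
    {"name": "C", "expertise": "uncertainty assessment", "seniority": "senior"},
]
for i, group in enumerate(balanced_groups(people), start=1):
    print("Group", i, [p["name"] for p in group])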
Each participant received a set of all 18 cards containing the parameters to be reviewed. Parameter strength was assessed in a moderated group discussion of each parameter, addressing strengths and weaknesses in its underpinning, focussing on, but not restricted to, the five pedigree criteria, and eliciting the scores of each parameter on each of these criteria.
We gave the following instructions for the exercise:
• Do the pedigree assessment as an individual expert judgement; we do not want a group judgement
• The group works on one card at a time
• The group moderator determines the order in which the cards are discussed
• Discussion on each card starts with a group discussion aimed at clarification of concepts
• Then the likely range for each parameter is discussed in the group, after which every expert fills in her or his own judgement for this range.
• In the next phase the discussion moves to strengths and weaknesses of the underpinning of the parameter under consideration.
• After that, value ladenness and the pedigree criteria (proxy, empirical basis, theoretical understanding, methodological rigour and validation) are discussed one by one in the group, ending with every expert filling in her or his individual judgement of the score for each criterion. To assist in this task, each participant had a copy of the pedigree matrix as given in Table 6.1.
• If you feel you cannot judge one or more of the pedigree scores for a given parameter, leave it blank
• If you feel a certain criterion is not applicable for a given parameter, indicate so with "n.a."
• Indicate a self-assessment of your competence on each parameter on a scale from 0 to 4 and write it, together with your name, on the card (note that competence can differ between parameters for the same expert).
• Write your name on each card
• If a card contains more than one parameter and you want to differentiate the pedigree scores between these parameters, fill in the corresponding letter for that parameter (parameters are labelled a, b, c, ... on each card) in the appropriate box in the scoring part of the card. An example of a scoring card is presented below.
Example of a pedigree scoring card
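To make the structure of such a card concrete, the following Python sketch models one expert's card as a small data structure; the field names, the PedigreeCard class and the example values are our own illustration and do not reproduce the exact layout of the workshop cards.

from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# The five pedigree criteria elicited per parameter
# (scores 0-4; None = left blank; "n.a." = not applicable)
CRITERIA = ("proxy", "empirical basis", "theoretical understanding",
            "methodological rigour", "validation")

@dataclass
class PedigreeCard:
    expert: str                                         # name written on the card
    parameter: str                                      # parameter, or sub-parameter a, b, c, ...
    competence: int                                     # self-assessed competence, 0-4
    likely_range: Optional[Tuple[float, float]] = None  # elicited likely range for the parameter
    scores: Dict[str, object] = field(default_factory=dict)

    def set_score(self, criterion, value):
        assert criterion in CRITERIA, "unknown pedigree criterion: %s" % criterion
        self.scores[criterion] = value

# a hypothetical filled-in card for one expert and one (sub-)parameter
card = PedigreeCard(expert="expert 1", parameter="learning rate (a)", competence=3)
card.likely_range = (0.10, 0.25)
card.set_score("empirical basis", 2)
card.set_score("validation", "n.a.")                    # criterion judged not applicable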
We expected that time would be too limited to complete all 18 cards in each group. For that reason we had designed a procedure in advance to make sure that each group completed at least the six cards that we considered the most important ones in the set. The selection was based on our knowledge of the controversial nature of the concepts and on the high ranks for sensitivity in the Morris results for the parameters on these cards. This core set consisted of the following cards with parameters that turned out to be most sensitive in our sensitivity analysis of the TIMER model:
• Structural change / Growth Elasticity
• Autonomous energy efficiency improvement
• Price induced energy efficiency improvement
• Learning rates
• Resources of fossil fuels (supply cost curve)
The use of such a core set ensured that, at least for those cards, we had results to explore possible sensitivity of the method to group composition and group dynamics. If such sensitivity exists, we expect to see it most strongly for the most controversial concepts; conversely, large inter- and intra-group differences in the pedigree assessment of the same parameter may indicate that the concept is controversial. On top of that core set of six cards, we asked each group to do four more cards, by randomly distributing the remaining twelve cards over the three groups, to ensure that every card from the set was assessed by at least one of the groups. We used different colours for the cards so that the 10 cards (the core set of 6 plus 4 of the remaining 12) that had to be completed by each group had the same colour, and the eight remaining cards had a different colour in each set. So every participant in every group had the full set of cards, and the colours of the cards indicated which cards had to be done first. The group moderator was instructed to make sure that at least the core set of 6 cards was completed. On some core cards a large number of parameters were listed. For these cards we selected some key parameters from that list (the same selection in all groups) and omitted the rest for practical reasons.
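The card allocation described above (the core set for every group, the remaining cards dealt out randomly so that each card is covered by at least one group) can be summarised in a short Python sketch; the card names and the distribute_cards helper are placeholders of our own, not part of the original workshop materials.

import random

def distribute_cards(core_set, remaining, n_groups=3, seed=1):
    # Every group gets the core set plus an equal random share of the remaining cards,
    # so that each card in the full set is assessed by at least one group.
    assert len(remaining) % n_groups == 0
    random.seed(seed)
    shuffled = random.sample(remaining, len(remaining))
    share = len(remaining) // n_groups
    groups = []
    for g in range(n_groups):
        extra = shuffled[g * share:(g + 1) * share]
        groups.append({
            "must_do": core_set + extra,                      # printed in the group's own colour
            "optional": [c for c in remaining if c not in extra],
        })
    return groups

core = ["core card %d" % i for i in range(1, 7)]              # the 6 core cards
rest = ["card %d" % i for i in range(7, 19)]                  # the remaining 12 cards
for i, g in enumerate(distribute_cards(core, rest), start=1):
    print("Group", i, "must complete", len(g["must_do"]), "cards")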
Before we split up into subgroups, we did one card plenarily (resources of fossil fuels / supply cost curve) to make everyone familiar with the elicitation procedure and to create a shared understanding of the various pedigree criteria.
We concluded the workshop with a plenary session, reflecting on our experiences with the method during the workshop. Observations made by group rapporteurs and group members included:
• Participants were less shy in attributing high scores for value ladenness of parameters than we had expected;
• Even though the parameters were clustered, the participants perceived them as difficult to assess due to their rather specific nature. The card-by-card approach, starting with a brief discussion to clarify concepts and then focussing on each of the pedigree aspects of parameter strength, was therefore appreciated by participants. It brings a lot of interesting points to the table, and participants felt that it helped them to assign pedigree scores to the parameters. However, the group discussion led to convergence in assessment scores.
• Several participants had difficulties with specifying likely uncertainty ranges for many of the parameters because they lacked specific domain expertise to do so;
• Participants felt they were well able to fill in meaningful pedigree scores independently of the group discussions;
• Due to lack of time, some interesting in-depth discussions of individual parameters had to be cut off for the sake of completing the required set of cards;
• The majority of parameters scored relatively low on validation;
• The criteria methodological rigour and theoretical understanding were partly ambiguous;
• Some participants found the room for interpretation in attributing pedigree scores fairly large;
• One participant indicated that he was familiar with the concepts but not with the precise model implementation of these concepts. He recommended translating them into more generic concepts and doing the elicitation exercise at that level.
• One participant missed the larger picture of long-term dynamics. The focus on individual parameters loses some of the view of the system as a whole, whereas he believed that what really matters is the uncertainty with regard to that overall structure (note that model structure uncertainty is addressed in chapter 4 of this report).
• One participant pointed out that there is a difference between discussing whether a certain concept is a right or wrong representation of reality and discussing whether you have the right value for a parameter to represent that concept. This distinction was not made sufficiently clear in the parallel group discussions.
• One participant suggested that the workshop had been ambitious in trying to cover all aspects of this complex energy model. He considered it valuable to organise several workshops (or one workshop with parallel sessions) with a similar set-up but each focussed on specific issues within a much narrower subdomain of expertise, for instance the role of innovation in energy modelling, the factors determining energy demand, or resources.
• There was some discussion of whether the pedigree assessment could be extended along the lines proposed in Corral (2001), involving also criteria related to how the model is used in the societal debate and policy process. This would involve pedigree criteria such as accessibility and transparency of TIMER and the information generated with the model, extended peer acceptance, cultural and institutional dimensions, and the like. Doing so would also require the involvement of an extended peer community in an elicitation procedure for scoring these aspects. Some participants considered TIMER to be too remote from daily life to do such an exercise with citizens. On the other hand, the ULYSSES project (see e.g. http://www.zit.tu-darmstadt.de/ulysses/tutorial.htm) clearly demonstrated that citizens can engage in a meaningful way with complex environmental assessment models.
Overall there was a shared feeling amongst participants that the NUSAP method and the elicitation procedure with the cards, though not straightforward to complete, facilitate and structure a creative process of in-depth discussion on and assessment of uncertainty. The task of quality control in complex models is a complicated one, and the NUSAP method disciplines and supports this process.
References
Jeroen P. van der Sluijs, Jose Potting, James Risbey, Detlef van Vuuren, Bert de Vries, Arthur Beusen, Peter Heuberger, Serafin Corral Quintana, Silvio Funtowicz, Penny Kloprogge, David Nuijten, Arthur Petersen, Jerry Ravetz. Uncertainty assessment of the IMAGE/TIMER B1 CO2 emissions scenario, using the NUSAP method. Dutch National Research Program on Climate Change, Bilthoven, 2002, 225 pp.