Evaluations do not only present data about a programme or policy. They also make judgements on the extent to which measured performance is good, average, or poor. This means there needs to be an understanding of what constitutes good performance before judgements can be made. This is facilitated by defining judgement criteria based on the evaluation questions and an understanding of the programme’s objectives.
Judgement criteria are clear and measurable statements which define what constitutes good performance. They can be more or less open: sometimes they simply state the direction of change expected (an increase in X, a decrease in Y); in other cases they are defined as specific targets (an increase or decrease of x%).
An important consideration when making judgements is the comparison between results and costs. A small improvement achieved at very low cost can be of equal value to a greater improvement achieved at high cost. That is why it is worthwhile integrating elements of cost-effectiveness analysis into an evaluation.
Defining judgement criteria
Defining judgement criteria based on programme/policy objectives
Normally, programme objectives should be formulated in a way which lends itself to formulating judgement criteria, for example: ‘increase the number of schools which have action plans to tackle early school leaving’. According to this objective, an increase in the number of schools with such plans would constitute success.
The programme can have defined targets. These clearly set out what is considered good performance. Using the example above:
- increase the share of schools which have action plans to tackle early school leaving by 20%, or
- increase the share of schools which have action plans to tackle early school leaving to 50%.
The Danish initiative Retention Caravan established a quantitative target for participating schools: to improve the retention rate by 20% in total by the end of the project period.
2012 evaluation report (in Danish)
Defining judgement criteria based on stakeholders’ views
It is possible that the programme was designed without the clear formulation of targets or objectives that can be easily translated into judgement criteria.
In such cases, judgement criteria can be defined based on stakeholder interviews, asking stakeholders what performance they expect from the programme (direction of change, degree of change). The results of these interviews can then feed into the definition of the judgement criteria.
Asking experts/stakeholders to make the judgements
Alternatively, stakeholders or experts can be asked to make the judgements themselves. They can be presented with data on programme performance and, in consultation, judge that performance. The process can be facilitated by the Delphi technique, which involves several rounds of questionnaires until the expert panel converges on an opinion.
If you would like to learn more about the Delphi technique, we suggest that you consult the Better Evaluation website.
Another approach is to compare performance with that of similar programmes: does the evaluated programme perform better or worse than a comparable programme in another region, or than a preceding programme? However, this requires having the same data available for both programmes.
Programmes and policies imply certain costs. Some measures to address early leaving are much more costly than others because they necessitate significant changes to the status quo or a significant deployment of human resources. Looking at the costs alone does not provide much insight. There can be good reasons why one policy is more costly than another. That is why it is important to look at the costs in comparison to the outputs and results.
One very simple way of looking at this is to compare the total costs of the programme with the number of young people who have been reached. This can be complemented with information on the share of those who have seen positive results. This is not a full cost-effectiveness assessment, but it can give an initial idea of the effort invested compared to the results achieved.
A relatively simple metric to calculate in most evaluations is the cost per output: for example, the cost of involving one beneficiary in remedial training, or the cost of one mentoring session. This gives an idea of the efficiency of the implementation process. Evaluators can make comparisons within the programme (for example, the cost of training per person may be higher in one region than in another) or with a similar programme, although the latter requires making sure the costs are calculated using the same approach.
A more complex, but also more interesting, metric is the cost per result: for example, how much does it cost to reintegrate one young person into education and training and retain him or her? In this case, the overall costs are not distributed across all persons reached; the calculation takes into account only the share of beneficiaries who reintegrate into education and training and successfully qualify. Because the costs are distributed over this smaller group, the cost per result is higher than the cost per person reached. Again, such information can be compared within the programme, with other programmes, and also with the costs of non-action (i.e. how much a young unemployed person costs in terms of social security).
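The difference between cost per person reached and cost per result can be sketched with a short calculation. All figures below are invented for illustration; they do not come from any of the programmes mentioned here.

```python
# Hypothetical illustration of cost per output vs cost per result.
# All figures are assumed, not taken from a real evaluation.

total_cost = 500_000.0   # total programme cost in EUR (assumed)
persons_reached = 1_000  # beneficiaries who received any support (assumed)
reintegrated = 250       # of those, the number who re-entered education/training (assumed)

# Cost per output: total cost spread over everyone reached.
cost_per_output = total_cost / persons_reached   # 500.0 EUR per person reached

# Cost per result: total cost spread only over those who achieved the result,
# so it is necessarily higher than the cost per output.
cost_per_result = total_cost / reintegrated      # 2000.0 EUR per reintegration

print(f"Cost per person reached: {cost_per_output:.0f} EUR")
print(f"Cost per reintegration:  {cost_per_result:.0f} EUR")
```

With these assumed figures, each person reached costs 500 EUR, but each successful reintegration costs 2 000 EUR, because only a quarter of those reached achieve the result.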
A policy can be more expensive because it is much bigger and reaches many more people, or because its target group is hard to reach and requires intensive outreach and contact with qualified staff.
The evaluation of the UK measure ‘Youth Contract for 16-17 year olds not in education, employment or training’ included a cost-benefit analysis. It subtracted the estimated direct and indirect costs of the programme from the estimated long-term benefits of participating in it, looking at the impact of additional qualifications gained through the programme on increased lifetime earnings, improved health, and reduced criminal activity.
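The basic logic of such a cost-benefit calculation (benefits minus costs) can be sketched as follows. All figures and cost categories are invented for illustration; a real analysis, like the one cited, would estimate each component carefully and discount benefits that accrue far in the future.

```python
# Hypothetical cost-benefit sketch: net benefit = estimated benefits - estimated costs.
# All figures are assumed, not taken from the Youth Contract evaluation.

direct_costs = 300_000.0    # e.g. staff and delivery costs (assumed)
indirect_costs = 50_000.0   # e.g. administration, participant time (assumed)

# Assumed long-term benefits per participant who gains a qualification:
lifetime_earnings_gain = 20_000.0     # increased lifetime earnings
health_and_crime_savings = 5_000.0    # improved health, reduced criminal activity
participants_qualifying = 40          # number gaining additional qualifications

total_benefits = participants_qualifying * (lifetime_earnings_gain + health_and_crime_savings)
total_costs = direct_costs + indirect_costs

net_benefit = total_benefits - total_costs
print(f"Net benefit: {net_benefit:,.0f} EUR")  # positive => benefits exceed costs
```

Under these assumptions the net benefit is 650 000 EUR; a negative result would mean the estimated costs outweigh the estimated benefits.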
2014 evaluation report (in English)