2.3.3 Kullback-Leibler divergence

Course subject(s) Module 2. Calibration and Information score

Expert empirical probability vector and theoretical probability vector

We saw that an empirical probability vector for an expert can be obtained from the proportion of questions in which the realization falls into each of the 4 interquantile ranges. That is, s=(s_1,s_2,s_3,s_4), where

s_1 = the proportion of questions in which the realization falls in IQ1
s_2 = the proportion of questions in which the realization falls in IQ2
s_3 = the proportion of questions in which the realization falls in IQ3
s_4 = the proportion of questions in which the realization falls in IQ4

Hence the empirical probability vector estimates the probability that the realization falls into each of the four interquantile ranges.

But what do we actually expect these probabilities to be? According to the expert's own assessments:
– by stating a 5% quantile, the expert says that (s)he believes there is a 5% chance that the realization is lower than the 5% quantile
– by stating a 95% quantile, the expert says that (s)he believes there is a 5% chance that the realization is greater than the 95% quantile
– by stating the 5% and 50% quantiles, the expert says that (s)he believes there is a 45% chance that the realization lies between the 5% and 50% quantiles
– by stating the 50% and 95% quantiles, the expert says that (s)he believes there is a 45% chance that the realization lies between the 50% and 95% quantiles

This leads to the theoretical probability vector p=(p_1,p_2,p_3,p_4)=(0.05,0.45,0.45,0.05)
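
The same vector can be obtained mechanically from the quantile levels. The following is a minimal Python sketch; the function name `theoretical_vector` is illustrative and not part of the course material.

```python
def theoretical_vector(quantile_levels):
    """Probability mass of each interquantile range implied by the quantile levels."""
    # Pad with 0 and 1 so that the outer ranges (below the lowest quantile
    # and above the highest quantile) are included.
    edges = [0.0, *quantile_levels, 1.0]
    # Each range gets the difference between consecutive cumulative levels.
    # Rounding only hides floating-point noise such as 0.44999999999999996.
    return [round(hi - lo, 10) for lo, hi in zip(edges, edges[1:])]


p = theoretical_vector([0.05, 0.50, 0.95])
print(p)  # [0.05, 0.45, 0.45, 0.05]
```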

Kullback-Leibler divergence

We now want to measure how different an expert’s empirical probability vector s is from the theoretical probability vector p.

For this, we compute the Kullback-Leibler divergence of s and p, also called the relative information of s with respect to p:

**I(s,p)=s_1ln(s_1/p_1)+s_2ln(s_2/p_2)+s_3ln(s_3/p_3)+s_4ln(s_4/p_4)**

When s_i=0, the convention is that s_i*ln(s_i/p_i)=0. The divergence I(s,p) is never negative, and it equals 0 only when s=p.
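
As a minimal sketch, the formula and the zero convention could be coded in Python as follows. The function name `relative_information` and the vector `s_example` are illustrative assumptions; `s_example` is hypothetical and not taken from the course data.

```python
import math


def relative_information(s, p):
    """I(s,p) = sum_i s_i * ln(s_i / p_i), with terms for s_i = 0 set to 0."""
    if len(s) != len(p):
        raise ValueError("s and p must have the same length")
    total = 0.0
    for s_i, p_i in zip(s, p):
        if s_i == 0:
            continue  # convention: s_i * ln(s_i / p_i) = 0 when s_i = 0
        total += s_i * math.log(s_i / p_i)
    return total


p = (0.05, 0.45, 0.45, 0.05)

# Identical vectors give zero divergence.
print(relative_information(p, p))

# Hypothetical empirical vector with two empty interquantile ranges.
s_example = (0.5, 0.0, 0.5, 0.0)
print(relative_information(s_example, p))
```

Skipping the zero terms explicitly avoids evaluating ln(0), which would raise an error in Python.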

Let’s go back to the Dutch eating habits example and compute the Kullback-Leibler divergence for the three experts’ empirical probability vectors.

In this example, 3 experts have given their 5%, 50% and 95% quantiles for 5 different calibration questions.

A Dutch supermarket is interested in eating habits among Dutch adults.
For this purpose, three experts have been consulted. First, these experts need to be evaluated based on five calibration questions.

1) What percentage of Dutch adults eats fruit on a daily basis?
2) What percentage of Dutch adults eats fast food less than once a month?
3) Consider the caloric consumption of Dutch adults ten years ago. What is the caloric consumption today, compared to ten years ago? (here, 100% means there was no change)
4) How many liters of milk are consumed on a yearly basis by the average Dutch adult?
5) How many kilos of meat does the average Dutch adult consume in six months' time?

The answers of the experts are summarized in the table below. For example, for question 1, expert 1 estimates the percentage to be 46 (his 50% quantile). He also believes that there is a 90% chance that the percentage is between 44 and 49. The realization, based on actual research, turned out to be 50. (Note that the data are purely fictional.)

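| Question | Realization | Expert 1 (5% / 50% / 95%) | Expert 2 (5% / 50% / 95%) | Expert 3 (5% / 50% / 95%) |
|---|---|---|---|---|
| 1 | 50 | 44 / 46 / 49 | 30 / 40 / 55 | 38 / 47 / 55 |
| 2 | 7 | 9 / 12 / 15 | 1 / 15 / 20 | 2 / 8 / 17 |
| 3 | 108 | 102 / 106 / 110 | 60 / 80 / 95 | 91 / 99 / 106 |
| 4 | 66 | 55 / 59 / 64 | 53 / 70 / 80 | 58 / 68 / 75 |
| 5 | 24 | 28 / 31 / 35 | 10 / 19 / 30 | 26 / 35 / 43 |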
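
The computation for the three experts can be sketched in Python as follows. This is one possible implementation under stated assumptions, not the course's reference code: all names are illustrative, and a realization exactly equal to a quantile is assumed to count toward the lower interquantile range (no such tie occurs in this table).

```python
import math
from bisect import bisect_left

P = (0.05, 0.45, 0.45, 0.05)  # theoretical probability vector

REALIZATIONS = [50, 7, 108, 66, 24]

# (5%, 50%, 95%) quantiles per question, copied from the table.
ASSESSMENTS = {
    "Expert 1": [(44, 46, 49), (9, 12, 15), (102, 106, 110), (55, 59, 64), (28, 31, 35)],
    "Expert 2": [(30, 40, 55), (1, 15, 20), (60, 80, 95), (53, 70, 80), (10, 19, 30)],
    "Expert 3": [(38, 47, 55), (2, 8, 17), (91, 99, 106), (58, 68, 75), (26, 35, 43)],
}


def empirical_vector(quantiles_per_question, realizations):
    """Proportion of realizations falling in IQ1..IQ4."""
    counts = [0, 0, 0, 0]
    for quantiles, realization in zip(quantiles_per_question, realizations):
        # bisect_left returns 0..3, the index of the interquantile range.
        # Assumption: a realization equal to a quantile goes to the lower range.
        counts[bisect_left(quantiles, realization)] += 1
    n = len(realizations)
    return [c / n for c in counts]


def relative_information(s, p):
    """I(s,p) = sum_i s_i * ln(s_i / p_i), skipping terms with s_i = 0."""
    return sum(s_i * math.log(s_i / p_i) for s_i, p_i in zip(s, p) if s_i > 0)


results = {}
for name, quantiles_per_question in ASSESSMENTS.items():
    s = empirical_vector(quantiles_per_question, REALIZATIONS)
    results[name] = (s, relative_information(s, P))
    print(name, s, round(results[name][1], 4))
```

With this data the sketch gives s=(0.4, 0, 0.2, 0.4) for expert 1, s=(0, 0.4, 0.4, 0.2) for expert 2 and s=(0.2, 0.4, 0.2, 0.2) for expert 3, with I(s,p) of roughly 1.50, 0.18 and 0.35 respectively. Expert 2's empirical vector is therefore the closest to the theoretical one.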

Decision Making Under Uncertainty: Introduction to Structured Expert Judgment by TU Delft OpenCourseWare is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://online-learning.tudelft.nl/courses/decision-making-under-uncertainty-introduction-to-structured-expert-judgment//.
