Brier Score

Published by Mario Oettler on

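The Brier Score (BS) is a proper scoring rule that measures the accuracy of probabilistic predictions. For binary events, it is the mean squared difference between the forecast probability and the actual outcome:

$$BS = \frac{1}{N} \sum_{t=1}^{N} (f_t - o_t)^2$$

$f_t$: probability that was forecast for instance t

$o_t$: actual outcome of the event at instance t

N: number of forecasting instances

t: instance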
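The binary definition translates almost directly into code. The following is a minimal sketch in plain Python; the function name `brier_score` and the sample data are illustrative assumptions, not part of the original article.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("at least one forecasting instance is required")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical data: four forecasts and what actually happened.
score = brier_score([0.8, 0.3, 0.6, 0.1], [1, 0, 1, 0])
print(round(score, 3))  # prints 0.075
```

The score is rounded before printing only to hide floating-point noise; the computation itself is the plain average of the squared errors.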

The formula above applies to binary events, so $o_t$ is either 0 (the event does not happen) or 1 (it happens). The original formulation of the Brier score also covers multi-category forecasts:

$$BS = \frac{1}{N} \sum_{t=1}^{N} \sum_{i=1}^{R} (f_{ti} - o_{ti})^2$$

R: number of possible classes into which the event can fall. R = 2 for the case rain / no rain, R = 3 for the case long, normal, short.

N: total number of forecasting instances

$f_{ti}$: predicted probability of class i for instance t

$o_{ti}$: observation for instance t and class i. $o_{ti}$ = 1 if the outcome of instance t falls into the ith class, otherwise it is 0.
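The multi-category form can be sketched the same way: each instance contributes the sum of squared differences over all R classes. The function name `multiclass_brier_score` and the data below are hypothetical and only serve as an illustration.

```python
def multiclass_brier_score(forecasts, outcomes):
    """Multi-category Brier score.

    forecasts: one list of R class probabilities per instance
    outcomes:  one one-hot list of length R per instance (1 for the observed class)
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need the same non-zero number of forecasts and outcomes")
    total = 0.0
    for f_row, o_row in zip(forecasts, outcomes):
        if len(f_row) != len(o_row):
            raise ValueError("each forecast needs one probability per class")
        total += sum((f - o) ** 2 for f, o in zip(f_row, o_row))
    return total / len(forecasts)


# Hypothetical data: three instances, three classes (long, normal, short).
forecasts = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.3, 0.5]]
outcomes = [[1, 0, 0], [0, 1, 0], [0, 1, 0]]
print(round(multiclass_brier_score(forecasts, outcomes), 4))
```

Note that for R = 2 this form yields twice the value of the binary formula, because the error is counted once for each of the two classes. A rain forecast of 0.9 on a rainy day gives (0.9 - 1)² + (0.1 - 0)² = 0.02 instead of 0.01.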

Interpretation

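The Brier score measures the error of the prediction, that is, how far the predicted probabilities are from what actually happened. The lower the Brier score, the better the prediction. A Brier score of zero means that every prediction matched reality with full certainty (probability 1 for events that occurred, probability 0 for events that did not). In the binary form, the worst possible score is 1.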

Brier Skill Score (BSS)

To interpret the Brier score, it is helpful to compare it with the score of a reference method (baseline method):

$$BSS = 1 - \frac{BS}{BS_{ref}}$$

$BS_{ref}$: Brier score of the reference forecasting method.

A skill score below 0 means that the forecasting method is worse than the reference method. A skill score of 0 means that the two are equivalent, and a skill score above 0 means that the forecasting method is better than the reference method. The higher the skill score, the better the forecast; a perfect forecast (BS = 0) reaches the maximum of 1.

Often, $BS_{ref}$ is calculated for a naïve forecasting method that uses the average probability as its forecast for every instance. The average probability is the average of the outcomes o.
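As a rough sketch of the skill score with the naïve reference described above, the following Python snippet uses the average of the outcomes as the reference forecast unless another reference probability is passed in. The function names, the optional `reference` parameter, and the sample data are assumptions made for illustration.

```python
def brier_score(forecasts, outcomes):
    """Binary Brier score: mean squared error of the forecast probabilities."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


def brier_skill_score(forecasts, outcomes, reference=None):
    """BSS = 1 - BS / BS_ref, with a constant reference forecast.

    If no reference probability is given, the average of the outcomes
    (the naive forecast) is used.
    """
    if reference is None:
        reference = sum(outcomes) / len(outcomes)
    bs = brier_score(forecasts, outcomes)
    bs_ref = brier_score([reference] * len(outcomes), outcomes)
    if bs_ref == 0:
        raise ValueError("reference forecast is perfect; BSS is undefined")
    return 1 - bs / bs_ref


# Hypothetical data: the naive reference here is 0.5 (two events in four instances).
print(round(brier_skill_score([0.8, 0.3, 0.6, 0.1], [1, 0, 1, 0]), 3))  # prints 0.7
```

The guard against `bs_ref == 0` covers the corner case where all outcomes are identical and the naïve forecast is therefore already perfect.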

Example BSS

Index                      1     2     3     4     5
Event (o)                  0     0     1     1     1
Forecast probability (f)   0.1   0.2   0.6   0.8   0.9
Squared error (f - o)²     0.01  0.04  0.16  0.04  0.01

BS = 0.26 / 5 = 0.052

The reference model assumes a probability of 0.30 for every instance.

Hence, it yields the following squared errors:

Squared error (0.30 - o)²  0.09  0.09  0.49  0.49  0.49

BSref = 1.65 / 5 = 0.33

BSS = 1 - 0.052 / 0.33 ≈ 0.8424
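The numbers of this example can be checked with a few lines of Python. This is only a verification sketch; the variable names are illustrative.

```python
def brier_score(forecasts, outcomes):
    """Binary Brier score: mean squared error of the forecast probabilities."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


events = [0, 0, 1, 1, 1]
forecast = [0.1, 0.2, 0.6, 0.8, 0.9]
reference = [0.30] * len(events)  # constant reference probability from the example

bs = brier_score(forecast, events)
bs_ref = brier_score(reference, events)
bss = 1 - bs / bs_ref

print(round(bs, 3), round(bs_ref, 2), round(bss, 4))  # prints 0.052 0.33 0.8424
```

The reference used here is the fixed value of 0.30 stated in the example, not the average of the five outcomes (0.6). With 0.6 as the reference probability, the reference score would be 0.24 and the skill score about 0.783.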

Advantages and Disadvantages of the Brier Score

+ The Brier Score is easy to calculate

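– If the events are rare, the Brier score becomes less informative: a forecast that always predicts a probability close to zero achieves a score close to zero, even though it never anticipates the event.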
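The rare-event problem can be illustrated with a small, entirely hypothetical data set: 1000 days on which the event occurs only 10 times. The two forecasters below are invented for this sketch. One never expects the event; the other assigns a probability of 0.5 to every event day, at the cost of 30 false alarms.

```python
def brier_score(forecasts, outcomes):
    """Binary Brier score: mean squared error of the forecast probabilities."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


# Hypothetical data: the event occurs on the first 10 of 1000 days (base rate 1%).
outcomes = [1] * 10 + [0] * 990

# Forecaster A never expects the event.
always_zero = [0.0] * 1000

# Forecaster B gives 0.5 on all 10 event days and on 30 days without an event.
alert = [0.5] * 40 + [0.0] * 960

bs_zero = brier_score(always_zero, outcomes)
bs_alert = brier_score(alert, outcomes)
print(bs_zero, bs_alert)
```

Both forecasters end up with a Brier score of 0.01, although only the second one gives any warning of the event. Under these assumptions the score alone cannot tell the two apart, which is one way to read the disadvantage stated above.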

Example 1

Suppose we have a weather forecast that tries to predict the probability of rain. Since we are only interested in whether it will rain or not, we have a binary decision (rain, no rain). We are only interested in one day (N = 1).

The prediction we want to assess is a probability of rain of 0.9.

In reality, it rained.

Since we have a binary situation, we can use the simplified formula:

$$BS = (0.9 - 1)^2 = 0.01$$

The result of 0.01 is close to zero, so the forecast was pretty good.

Example 2

Suppose we have a weather forecast that tries to predict the probability of rain. Since we are only interested in whether it will rain or not, we have a binary decision (rain, no rain).  We are interested in five days (N = 5).

Day          1     2     3     4     5
Prediction   0.9   0.7   0.5   0.4   0.1
Reality      1     1     0     1     0
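Applying the binary formula to this table can be sketched in Python as follows. The squared errors per day are 0.01, 0.09, 0.25, 0.36, and 0.01; the variable names are illustrative.

```python
def brier_score(forecasts, outcomes):
    """Binary Brier score: mean squared error of the forecast probabilities."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


prediction = [0.9, 0.7, 0.5, 0.4, 0.1]
reality = [1, 1, 0, 1, 0]

bs = brier_score(prediction, reality)
print(round(bs, 3))  # prints 0.144
```

The resulting score of 0.72 / 5 = 0.144 is dominated by day 3 and day 4, where the forecast was undecided or leaned the wrong way.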