# Stefan Siegert

## A useful proof concerning the CRPS

The continuous ranked probability score (CRPS) is a proper scoring rule that measures how well a forecast, issued as a cumulative distribution function (cdf) $$F(x)$$, predicted the observed outcome $$y$$. It is defined as the integrated squared difference between the forecast cdf $$F(x)$$ and the hypothetical "perfect" forecast distribution for the outcome $$y$$, which is a Heaviside step function with its step at $$y$$: \begin{aligned} CRPS & = \int dt [F(t) - H(t - y)]^2. \end{aligned}

An analytical result that is sometimes found in the literature is \begin{aligned} CRPS & = E|X-y| - \frac12 E|X-X'| \end{aligned} where $$X$$ and $$X'$$ are independent random variables, each with distribution $$F(x)$$. A proof can be found in Baringhaus and Franz (2004), but the result might be older than that.

The key observation is that the absolute difference $$|x-y|$$ can be written as an integral over indicator functions as follows \begin{aligned} |x-y| & = \int dt [ I(x \le t \lt y) + I(y \le t \lt x) ] \\ & = \int dt [ I(x \le t) I(y \gt t) + I(y \le t) I(x \gt t) ]. \end{aligned} At most one of the two indicators is nonzero for any $$t$$, and the nonzero one equals 1 exactly on an interval of length $$|x-y|$$.
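The indicator identity above can be checked numerically with a simple Riemann sum. The following is only a sketch: the function name `abs_diff_via_indicators`, the integration range and the grid size are illustrative choices, and the result is accurate only up to the grid spacing.

```python
import numpy as np

def abs_diff_via_indicators(x, y, lo=-10.0, hi=10.0, n=200_001):
    """Approximate |x - y| by a Riemann sum of I(x <= t < y) + I(y <= t < x).

    Assumes both x and y lie inside [lo, hi].
    """
    t = np.linspace(lo, hi, n)
    dt = t[1] - t[0]
    first = (x <= t) & (t < y)   # nonzero only if x < y
    second = (y <= t) & (t < x)  # nonzero only if y < x
    integrand = first.astype(float) + second.astype(float)
    return float(integrand.sum() * dt)

x, y = 1.3, -2.1
approx = abs_diff_via_indicators(x, y)
print(approx, abs(x - y))  # the two numbers should agree up to the grid spacing
```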

Suppose $$X$$ and $$Y$$ are independent random variables with pdfs $$f(x)$$ and $$g(y)$$, and distribution functions $$F(x)$$ and $$G(y)$$. Then \begin{aligned} E|X-Y| & = \int dx \int dy \int dt [ I(x \le t) I(y \gt t) + I(y \le t) I(x \gt t) ] f(x) g(y)\\ & = \int dt \Big\{ \Big[ \int dx I(x \le t) f(x)\Big] \Big[ \int dy I(y \gt t) g(y) \Big] \\ & \quad \quad \quad + \Big[ \int dy I(y \le t) g(y)\Big] \Big[ \int dx I(x \gt t) f(x) \Big] \Big\}\\ & = \int dt \Big[ F(t)(1-G(t)) + G(t)(1-F(t)) \Big]. \end{aligned}
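The expression for $$E|X-Y|$$ can be verified on empirical distributions, where both sides can be computed exactly. In the sketch below, $$F$$ and $$G$$ are the empirical cdfs of two arbitrary samples (the derivation above uses pdfs, but the same identity holds for discrete distributions), the expectation is the average over all pairs, and the integral is evaluated exactly because the integrand is piecewise constant. The helper names `ecdf` and `integrate_steps` and the sample parameters are illustrative assumptions.

```python
import numpy as np

def ecdf(sample, t):
    """Empirical cdf of `sample` evaluated at the points `t`."""
    return np.searchsorted(np.sort(sample), t, side="right") / len(sample)

def integrate_steps(knots, values):
    """Exact integral of a step function equal to values[k] on [knots[k], knots[k+1])."""
    return float(np.sum(values[:-1] * np.diff(knots)))

rng = np.random.default_rng(0)
xs = rng.normal(0.0, 1.0, size=50)  # sample defining F
ys = rng.normal(0.5, 2.0, size=40)  # sample defining G

# Both cdfs only change at the sample points, so these are the only knots needed.
knots = np.sort(np.concatenate([xs, ys]))
F, G = ecdf(xs, knots), ecdf(ys, knots)

lhs = float(np.mean(np.abs(xs[:, None] - ys[None, :])))  # E|X - Y| over all pairs
rhs = integrate_steps(knots, F * (1 - G) + G * (1 - F))  # integral of F(1-G) + G(1-F)
print(lhs, rhs)  # should agree up to floating point error
```

Outside the range of the pooled sample the integrand vanishes, since there both cdfs are 0 or both are 1, so restricting the integral to the knots loses nothing.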

When $$X$$ and $$X'$$ are independent and identically distributed random variables with cdf $$F(x)$$, setting $$G = F$$ in the previous result gives $E|X-X'| = 2 \int dt [ F(t)(1-F(t)) ].$
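As a quick sanity check of this formula, take $$X$$ and $$X'$$ to be uniform on $$[0, 1]$$ (an example chosen here for illustration). Then $$F(t) = t$$ on $$[0, 1]$$, and the integrand vanishes outside that interval, so \begin{aligned} E|X-X'| & = 2 \int_0^1 dt \, t (1 - t) \\ & = 2 \Big( \frac12 - \frac13 \Big) = \frac13, \end{aligned} which is the known mean absolute difference of two independent standard uniform random variables.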

Therefore, with $$Y'$$ an independent copy of $$Y$$, \begin{aligned} & E|X-Y| - \frac12 E|X-X'| - \frac12 E|Y-Y'| \\ & = \int dt \Big[ F(t)(1-G(t)) + G(t) (1 - F(t)) \Big]\\ &\quad - \int dt \Big[ F(t)(1-F(t))\Big] - \int dt \Big[ G(t)(1-G(t))\Big]\\ & = \int dt \Big[ F(t)^2 - 2 F(t) G(t) + G(t)^2 \Big]\\ & = \int dt [ F(t) - G(t) ]^2. \end{aligned} If $$Y$$ is a constant $$y$$, we have $$G(t) = H(t - y)$$ and $$E|Y - Y'| = 0$$, and so \begin{aligned} CRPS & = \int dt [F(t) - H(t-y)]^2\\ & = E|X-y| - \frac12 E|X-X'|. \end{aligned}
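The final identity can be checked numerically for an ensemble forecast, taking $$F$$ to be the empirical cdf of the ensemble members. The following is a minimal sketch, not a reference implementation: the function names `crps_integral` and `crps_expectation` and the example values are illustrative, and the expectation over $$X$$ and $$X'$$ is taken over all pairs of members, including pairs of a member with itself.

```python
import numpy as np

def crps_integral(ens, y):
    """CRPS as the integral of [F(t) - H(t - y)]^2, with F the empirical cdf of `ens`."""
    knots = np.sort(np.append(ens, y))
    F = np.searchsorted(np.sort(ens), knots, side="right") / len(ens)
    H = (knots >= y).astype(float)  # step function with its step at y
    # The integrand is constant between knots and zero outside their range.
    return float(np.sum((F[:-1] - H[:-1]) ** 2 * np.diff(knots)))

def crps_expectation(ens, y):
    """CRPS as E|X - y| - 0.5 * E|X - X'| under the empirical distribution of `ens`."""
    e_xy = np.mean(np.abs(ens - y))
    e_xx = np.mean(np.abs(ens[:, None] - ens[None, :]))
    return float(e_xy - 0.5 * e_xx)

ens = np.array([0.2, 1.5, -0.7, 0.9, 2.3])  # hypothetical ensemble members
y = 1.0                                     # hypothetical observation
print(crps_integral(ens, y), crps_expectation(ens, y))  # should agree up to rounding
```

For the two-member ensemble `[0, 2]` and observation `1`, both forms can be evaluated by hand: $$E|X-y| = 1$$ and $$E|X-X'| = 1$$, so the CRPS is $$1/2$$, which matches the integral of $$(1/2)^2$$ over the interval $$[0, 2)$$.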