Chapter 2 Loss distributions

Learning Objectives

  1. Describe the properties of the statistical distributions which are suitable for modelling individual and aggregate losses.
  2. Explain the concepts of excesses, deductibles and retention limits.
  3. Describe the operation of proportional and excess of loss reinsurance.
  4. Derive the distribution and corresponding moments of the claim amounts paid by the insurer and the reinsurer in the presence of excesses (deductibles) and reinsurance.
  5. Estimate the parameters of a failure time or loss distribution when the data is complete, or when it is incomplete, using maximum likelihood and the method of moments.
  6. Fit a statistical distribution to a dataset and calculate appropriate goodness of fit measures.

Theory

R was designed for statistical computing, so it handles randomness well!

set.seed(42) # Fixes result
die_throws <- sample(1:6, 10000, replace = TRUE)
mean(die_throws)
## [1] 3.4627
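As a small illustration of what fixing the seed does, the sketch below draws the same short sample twice after re-seeding and confirms that the two runs are identical. The variable names first_run and second_run are hypothetical and used only for this example. The last line gives the theoretical mean of a fair die, which the simulated mean should approach as the number of throws grows.

```r
# Re-seeding with the same value reproduces exactly the same draws
set.seed(42)
first_run <- sample(1:6, 10, replace = TRUE)

set.seed(42)
second_run <- sample(1:6, 10, replace = TRUE)

identical(first_run, second_run) # TRUE

# Theoretical mean of a fair six-sided die, for comparison with mean(die_throws)
mean(1:6) # 3.5
```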

2.1 Probability distributions for modelling insurance losses

R has in-built functions for probability distributions:

  • d<distribution-name> \(:=\) density (PDF), i.e. \(f_X(x)\)
  • p<distribution-name> \(:=\) cumulative distribution function (CDF), i.e. \(F_X(x) =\boldsymbol{P}(X \leq x)\)
  • q<distribution-name> \(:=\) quantile function, i.e. return \(x\) such that \(\boldsymbol{P}(X \leq x) = p\)
  • r<distribution-name> \(:=\) random deviates, i.e. (pseudo) random number generator for a given distribution
  • Where <distribution-name> is R's abbreviation for the distribution, e.g. norm (normal), unif (uniform), lnorm (lognormal), t (Student’s t), pois (Poisson), binom (binomial), weibull (Weibull) … see ?Distributions for more information
R Code | Definition
rnorm(1) | Generates \(x_1\) where \(X \sim \mathcal{N}(0,\,1)\)
rnorm(y, mean=10, sd=2) | Generates \(\{y_1,\,y_2,\,\dots\}\) with \(Y \sim \mathcal{N}(10,\,2^2)\)
runif(3, min=5, max=10) | Generates \(\{z_1,\,z_2,\,z_3\}\) where \(Z \sim \mathcal{U}(5,\,10)\)
dbinom(4, size=5, prob=0.5) | Computes \(\boldsymbol{P}(X = 4)\) where \(X \sim \mathcal{Bin}(5,\,0.5)\)
pgamma(0.2, shape=2, rate=2) (*see footnote) | Computes \(F_Y(0.2)\) where \(Y \sim \mathcal{\Gamma}(2,\,2)\), i.e. \(\boldsymbol{P}(Y\leq 0.2)\)
qexp(0.5, rate = 2) | Determines smallest value of \(z\) for \(\boldsymbol{P}(Z \leq z) = 0.5\) where \(Z \sim Exp(2)\)
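The relationships between the d, p and q functions can be checked numerically. The following is a minimal sketch using the same calls as in the table; the names binom_check, gamma_check, median_z and exp_check are hypothetical and introduced only for this illustration.

```r
# d-function: dbinom(4, 5, 0.5) should equal choose(5, 4) * 0.5^4 * 0.5^1
binom_check <- all.equal(dbinom(4, size = 5, prob = 0.5), choose(5, 4) * 0.5^5)

# p-function: pgamma is the integral of dgamma from 0 up to the argument
area <- integrate(dgamma, lower = 0, upper = 0.2, shape = 2, rate = 2)$value
gamma_check <- all.equal(pgamma(0.2, shape = 2, rate = 2), area, tolerance = 1e-6)

# q-function: qexp inverts pexp, so the median of Exp(2) is log(2) / 2
median_z <- qexp(0.5, rate = 2)
exp_check <- all.equal(pexp(median_z, rate = 2), 0.5)

c(binom_check, gamma_check, exp_check)
```

Each check should come back TRUE: the density matches the closed-form probability, the CDF matches the integrated density, and the quantile function undoes the CDF.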

2.2 Mechanisms for limiting insurance losses

2.3 Proportional and Excess of Loss reinsurance

2.4 Estimating parameters of loss distributions with complete data

2.5 Estimating parameters of loss distributions with incomplete data

R Practice

We investigate a reinsurance arrangement applied to 1,000 simulated insurance claims, stored in a vector X, with the following characteristics:

  • \(X \sim Exp(0.01)\)
  • Unlimited excess of loss reinsurance, with a retention level of \(M = 400\)
library(dplyr) # Data manipulation

set.seed(42) # Fix result
X <- rexp(
  n = 1000,
  rate = 0.01
)

M <- 400

summary(X)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##   0.0281  33.4354  75.5324 107.7943 145.8514 846.2336

We want to determine the proportion of claims that are fully covered by the insurer:

Proportion <- sum(X <= M) / length(X)

The proportion of claims that are fully covered by the insurer is 97.8%.
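The simulated proportion can be compared with its theoretical counterpart, \(\boldsymbol{P}(X \leq M) = 1 - e^{-\lambda M}\) with \(\lambda = 0.01\) and \(M = 400\). The sketch below repeats the simulation so that it runs on its own; the names lambda, theoretical and empirical are hypothetical and used only here.

```r
lambda <- 0.01 # Rate of the exponential claims distribution
M <- 400       # Retention level

# Theoretical probability that a claim does not exceed the retention
theoretical <- pexp(M, rate = lambda) # Same as 1 - exp(-lambda * M)

# Empirical proportion from the simulated claims
set.seed(42)
X <- rexp(n = 1000, rate = lambda)
empirical <- mean(X <= M)

c(theoretical = theoretical, empirical = empirical)
```

The theoretical value is \(1 - e^{-4} \approx 0.9817\), so a simulated proportion close to 98% is what we would expect from a sample of this size.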

Next, for each claim, we want to calculate the net (of reinsurance) amount paid by the insurer. We will record this in a vector called Y:

Y <- ifelse(X > M, M, X)

summary(Y)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##   0.0281  33.4354  75.5324 105.0863 145.8514 400.0000

Likewise, for each claim, we want to calculate the amount paid by the reinsurer. We will record this in a vector called Z:

Z <- ifelse(X > M, X - M, 0)

summary(Z)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   0.000   0.000   2.708   0.000 446.234
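For an exponential claim with rate \(\lambda\) and retention \(M\), the expected amounts paid by each party have closed forms: \(E[Y] = (1 - e^{-\lambda M})/\lambda\) for the insurer and \(E[Z] = e^{-\lambda M}/\lambda\) for the reinsurer, and the two add up to \(E[X] = 1/\lambda\). The sketch below evaluates these and checks \(E[Y]\) numerically by integrating the survival function over \([0, M]\). The names expected_Y, expected_Z and numeric_Y are hypothetical.

```r
lambda <- 0.01 # Rate of the exponential claims distribution
M <- 400       # Retention level

# Closed-form expected payments under unlimited excess of loss reinsurance
expected_Y <- (1 - exp(-lambda * M)) / lambda # Insurer: E[min(X, M)]
expected_Z <- exp(-lambda * M) / lambda       # Reinsurer: E[max(X - M, 0)]

# Numerical check: E[min(X, M)] is the integral of P(X > x) over [0, M]
survival <- function(x) pexp(x, rate = lambda, lower.tail = FALSE)
numeric_Y <- integrate(survival, lower = 0, upper = M)$value

c(expected_Y = expected_Y, numeric_Y = numeric_Y, expected_Z = expected_Z)
```

These theoretical means (about 98.17 and 1.83) can be set against the sample means of Y and Z above; some gap is expected because the sample contains only 1,000 claims.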

Now let us assume that the underlying gross claims distribution follows an exponential distribution of some unknown rate \(\lambda\). We will estimate \(\lambda\) using only the retained claim amounts which we have recorded in vector Y.

First let’s calculate the log-likelihood as a function of the parameter \(\lambda\) and claims data Y:

#TO DO
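One possible way to fill in this step, offered as a sketch rather than as the intended solution: because \(Y = \min(X, M)\), a claim with \(Y_i < M\) is observed exactly and contributes the density \(\lambda e^{-\lambda Y_i}\), whereas a claim with \(Y_i = M\) only tells us that \(X_i \geq M\) and contributes the survival probability \(e^{-\lambda M}\). Writing \(n_u\) for the number of uncensored claims, the log-likelihood is \(\ell(\lambda) = n_u \log \lambda - \lambda \sum_i Y_i\). The function name log_likelihood and its arguments are assumptions made for this example, and the first lines repeat the simulation so the block runs on its own.

```r
# Recreate the data used in this section
set.seed(42)
X <- rexp(n = 1000, rate = 0.01)
M <- 400
Y <- ifelse(X > M, M, X)

# Log-likelihood of a single rate lambda given retained claims y censored at m
# (hypothetical helper, not vectorised over lambda)
log_likelihood <- function(lambda, y, m) {
  uncensored <- y < m
  # Exactly observed claims contribute the log-density
  sum(dexp(y[uncensored], rate = lambda, log = TRUE)) +
    # Claims capped at m contribute log P(X > m)
    sum(pexp(y[!uncensored], rate = lambda, lower.tail = FALSE, log.p = TRUE))
}

log_likelihood(0.01, Y, M)
```

The value returned should agree with the closed form \(n_u \log \lambda - \lambda \sum_i Y_i\) evaluated at the same \(\lambda\).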

We now determine the value of \(\lambda\) at which the log-likelihood function reaches its maximum:

#TO DO
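A sketch of one way to carry out this step, assuming the censored exponential log-likelihood \(\ell(\lambda) = n_u \log \lambda - \lambda \sum_i Y_i\), where \(n_u\) is the number of claims below the retention. Setting the derivative to zero gives the closed-form estimate \(\hat{\lambda} = n_u / \sum_i Y_i\), and the numerical search with optimize() is included only as a cross-check. The names n_u, lambda_closed and fit, the search interval and the tolerance are all choices made for this illustration.

```r
# Recreate the data used in this section
set.seed(42)
X <- rexp(n = 1000, rate = 0.01)
M <- 400
Y <- ifelse(X > M, M, X)

n_u <- sum(Y < M) # Number of uncensored (fully retained) claims

# Closed-form maximum likelihood estimate
lambda_closed <- n_u / sum(Y)

# Numerical maximisation of the same log-likelihood as a cross-check
fit <- optimize(
  f = function(lambda) n_u * log(lambda) - lambda * sum(Y),
  interval = c(1e-4, 0.1), # Assumed search range for lambda
  maximum = TRUE,
  tol = 1e-10
)

c(closed_form = lambda_closed, numerical = fit$maximum)
```

Using the summary figures reported above (97.8% of 1,000 claims uncensored and a mean of Y of 105.0863), the closed form gives roughly \(978 / 105086.3 \approx 0.0093\), reasonably close to the true rate of 0.01.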

Finally let’s plot the results:

library(ggplot2)
#TO DO
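A possible plot for this step, sketched under the assumption that the log-likelihood is \(\ell(\lambda) = n_u \log \lambda - \lambda \sum_i Y_i\) with maximum at \(\hat{\lambda} = n_u / \sum_i Y_i\). The grid range, the data frame name curve_data and the plot styling are arbitrary choices for illustration. The block repeats the simulation so that it runs on its own.

```r
library(ggplot2)

# Recreate the data used in this section
set.seed(42)
X <- rexp(n = 1000, rate = 0.01)
M <- 400
Y <- ifelse(X > M, M, X)

n_u <- sum(Y < M)          # Number of uncensored claims
lambda_hat <- n_u / sum(Y) # Maximum likelihood estimate

# Evaluate the log-likelihood on a grid of candidate rates (assumed range)
lambda_grid <- seq(0.005, 0.02, length.out = 200)
curve_data <- data.frame(
  lambda = lambda_grid,
  loglik = n_u * log(lambda_grid) - lambda_grid * sum(Y)
)

# Log-likelihood curve with a dashed line at the estimate
p <- ggplot(curve_data, aes(x = lambda, y = loglik)) +
  geom_line() +
  geom_vline(xintercept = lambda_hat, linetype = "dashed") +
  labs(
    x = expression(lambda),
    y = "Log-likelihood",
    title = "Censored exponential log-likelihood"
  )

p
```

The dashed vertical line should sit at the peak of the curve, which gives a visual confirmation of the estimate.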