The sum of two independent normals. The distribution law of the sum of two random variables

Sums of independent random variables are an extremely important object of probability theory: it was the study of the distributions of sums of independent random variables that laid the foundation for the development of the analytical methods of probability theory.

Distribution of the sum of independent random variables

In this section, we will obtain a general formula that allows us to calculate the distribution function of the sum of independent random variables, and consider several examples.

Distribution of the sum of two independent random variables. Convolution formula

Let X and Y be independent random variables with distribution functions F_X and F_Y, respectively. Then the distribution function F of the sum of the random variables, F(t) = P(X + Y ≤ t), can be calculated using the following formula (the convolution formula):

$$F(t) = \int_{-\infty}^{\infty} F_X(t - y)\,dF_Y(y) = \int_{-\infty}^{\infty} F_Y(t - x)\,dF_X(x).$$

To prove the first equality, we write F(t) = P(X + Y ≤ t) as an integral of the joint distribution of (X, Y) over the half-plane {(x, y): x + y ≤ t} and apply Fubini's theorem.

The second part of the formula is proved similarly.

Distribution density of the sum of two independent random variables

If the distributions of both random variables have densities f_X and f_Y, then the density of the sum of these random variables can be calculated by the formula

$$f_{X+Y}(t) = \int_{-\infty}^{\infty} f_X(t - y)\,f_Y(y)\,dy.$$

If at least one of the random variables (say, X) has a density, then the sum also has a density, which can be calculated by the formula

$$f_{X+Y}(t) = \int_{-\infty}^{\infty} f_X(t - y)\,dF_Y(y).$$

To prove these assertions, it suffices to use the definition of density.

Multiple convolutions

The distribution of the sum of a finite number of independent random variables is calculated by applying the convolution formula repeatedly. The distribution function of the sum of k independent identically distributed random variables with common distribution function F is called the k-fold convolution (convolution power) of the distribution function F and is denoted F^{*k}.
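As a quick illustration, the k-fold convolution can be evaluated numerically on a grid. The sketch below assumes Python with NumPy; the Exp(1) density and the grid step are illustrative choices, not part of the text. The 3-fold convolution is compared against the known Erlang density of order 3.

```python
import numpy as np

def kfold_density(pdf_vals, k, dx):
    """Density of the sum of k i.i.d. terms via repeated convolution."""
    out = pdf_vals
    for _ in range(k - 1):
        out = np.convolve(out, pdf_vals) * dx  # one convolution per extra term
    return out

dx = 0.01
x = np.arange(0, 20, dx)
expo = np.exp(-x)                    # Exp(1) density sampled on the grid
g3 = kfold_density(expo, 3, dx)      # density of X1 + X2 + X3
x3 = np.arange(len(g3)) * dx
# for Exp(1) the 3-fold sum is Erlang(3): x^2 e^{-x} / 2
err = np.abs(g3[:2000] - x3[:2000]**2 * np.exp(-x3[:2000]) / 2)
print(err.max())                     # small (~1e-2) discretization error
```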

Examples of calculating the distribution of sums of independent random variables

This subsection presents examples of situations in which the form of the distribution is preserved under summation. The proofs are exercises in summation and the calculation of integrals.

Sums of independent random variables. Normal distribution

Sums of independent random variables. Binomial distribution

Sums of independent random variables. Poisson distribution

Sums of independent random variables. Gamma distribution

Poisson process

Let ξ1, ξ2, … be a sequence of independent identically distributed random variables having an exponential distribution with parameter λ, and let

$$S_k = \xi_1 + \xi_2 + \dots + \xi_k, \qquad k = 1, 2, \dots$$

The random sequence of points S1, S2, … on the non-negative semi-axis is called a Poisson (point) process.

Let us calculate the distribution of the number of points N(t) of the Poisson process in the interval (0, t). The events {N(t) ≥ k} and {S_k ≤ t} are equivalent, so

$$P(N(t) \ge k) = P(S_k \le t).$$

But the distribution of the random variable S_k is an Erlang distribution of order k, so

$$P(N(t) = k) = P(S_k \le t) - P(S_{k+1} \le t) = \frac{(\lambda t)^k}{k!}\,e^{-\lambda t}, \qquad k = 0, 1, 2, \dots$$

Thus, the distribution of the number of points of the Poisson process in the interval (0, t) is a Poisson distribution with parameter λt.

The Poisson process is used to model the moments of occurrence of random events: the instants of radioactive decays, the arrival times of calls at a telephone exchange, the arrival times of customers in a service system, the moments of equipment failures.
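The construction above is easy to check by simulation. The following sketch (NumPy assumed; parameter values are illustrative) builds the points S_k as partial sums of exponential interarrival times and verifies that the count of points in (0, t) has the Poisson mean and variance λt.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t, n_trials = 2.0, 3.0, 100_000

# each row: cumulative arrival times S_1, S_2, ... for one realization;
# 50 interarrivals is ample, since P(N(3) > 50) is negligible for lam*t = 6
arrivals = np.cumsum(rng.exponential(1 / lam, size=(n_trials, 50)), axis=1)
counts = (arrivals < t).sum(axis=1)     # number of points in (0, t)

print(counts.mean(), lam * t)   # both close to 6.0
print(counts.var(), lam * t)    # Poisson: variance equals the mean
```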

In practice, it often becomes necessary to find the distribution law for the sum of random variables.

Let there be a system (X1, X2) of two continuous random variables and their sum Y = X1 + X2.

Let us find the distribution density of the r.v. Y. In accordance with the general solution of the previous section, we integrate the joint density over the region of the plane where x1 + x2 < y (Fig. 9.4.1):

$$G(y) = \iint_{x_1 + x_2 < y} f(x_1, x_2)\,dx_1\,dx_2.$$

Differentiating this expression with respect to y, we obtain the density of the random variable Y = X1 + X2:

$$g(y) = \int_{-\infty}^{\infty} f(x_1, y - x_1)\,dx_1. \qquad (9.4.2)$$

Since the function φ(x1, x2) = x1 + x2 is symmetric with respect to its arguments,

$$g(y) = \int_{-\infty}^{\infty} f(y - x_2, x_2)\,dx_2. \qquad (9.4.3)$$

If the r.v.'s X1 and X2 are independent, then formulas (9.4.2) and (9.4.3) take the form

$$g(y) = \int_{-\infty}^{\infty} f_1(x_1)\,f_2(y - x_1)\,dx_1; \qquad (9.4.4)$$

$$g(y) = \int_{-\infty}^{\infty} f_1(y - x_2)\,f_2(x_2)\,dx_2. \qquad (9.4.5)$$


In the case when the r.v.'s X1 and X2 are independent, one speaks of the composition of distribution laws. To produce the composition of two distribution laws means to find the distribution law of the sum of two independent random variables distributed according to these laws. The symbolic notation

$$g = g_1 * g_2$$

is used to designate the composition of distribution laws; in essence it denotes formula (9.4.4) or (9.4.5).

Example 1. The operation of two technical devices (TDs) is considered. TD1 operates first; after its failure (breakdown), TD2 is switched into operation. The failure-free operation times of TD1 and TD2, X1 and X2, are independent and distributed according to exponential laws with parameters λ1 and λ2. Therefore, the failure-free operation time Y of the device consisting of TD1 and TD2 is determined by the formula

$$Y = X_1 + X_2.$$

It is required to find the density of the random variable Y, i.e., the composition of two exponential laws with parameters λ1 and λ2.

Solution. By formula (9.4.4) we get, for y > 0,

$$g(y) = \int_0^y \lambda_1 e^{-\lambda_1 x}\,\lambda_2 e^{-\lambda_2 (y - x)}\,dx = \frac{\lambda_1 \lambda_2}{\lambda_1 - \lambda_2}\left(e^{-\lambda_2 y} - e^{-\lambda_1 y}\right). \qquad (9.4.8)$$

If there is a composition of two exponential laws with the same parameter (λ1 = λ2 = λ), then expression (9.4.8) yields an indeterminacy of the type 0/0; resolving it, we get

$$g(y) = \lambda^2 y\,e^{-\lambda y}. \qquad (9.4.9)$$

Comparing this expression with expression (6.4.8), we see that the composition of two identical exponential laws (λ1 = λ2 = λ) is the second-order Erlang law (9.4.9). The composition of two exponential laws with different parameters λ1 and λ2 gives the second-order generalized Erlang law (9.4.8).
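A simulation sketch of the last conclusion, with illustrative parameters: a histogram of X1 + X2 is compared against the generalized Erlang density (9.4.8).

```python
import numpy as np

rng = np.random.default_rng(1)
lam1, lam2 = 1.0, 3.0
y = rng.exponential(1/lam1, 200_000) + rng.exponential(1/lam2, 200_000)

hist, edges = np.histogram(y, bins=60, range=(0, 6), density=True)
mid = (edges[:-1] + edges[1:]) / 2
dens = lam1*lam2/(lam1 - lam2) * (np.exp(-lam2*mid) - np.exp(-lam1*mid))
print(np.max(np.abs(hist - dens)))   # small sampling/binning error
```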

Problem 1. The distribution law of the difference of two random variables. The system of r.v.'s (X1, X2) has joint density f(x1, x2). Find the density of their difference Y = X1 − X2.

Solution. For the system (X1, −X2) the joint density is f(x1, −x2); that is, we have replaced the difference by a sum. Therefore, the density of the random variable Y has the form (see (9.4.2), (9.4.3)):

$$g(y) = \int_{-\infty}^{\infty} f(x_1, x_1 - y)\,dx_1 = \int_{-\infty}^{\infty} f(y + x_2, x_2)\,dx_2. \qquad (9.4.10)$$

If the r.v.'s X1 and X2 are independent, then

$$g(y) = \int_{-\infty}^{\infty} f_1(x_1)\,f_2(x_1 - y)\,dx_1 = \int_{-\infty}^{\infty} f_1(y + x_2)\,f_2(x_2)\,dx_2. \qquad (9.4.11)$$

Example 2. Find the density of the difference of two independent exponentially distributed r.v.'s with parameters λ1 and λ2.

Solution. By formula (9.4.11) we get

$$g(y) = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2}\,e^{-\lambda_1 y} \quad (y \ge 0); \qquad g(y) = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2}\,e^{\lambda_2 y} \quad (y < 0).$$

Fig. 9.4.2. Fig. 9.4.3.

Figure 9.4.2 shows the density g(y). If we consider the difference of two independent exponentially distributed r.v.'s with the same parameter (λ1 = λ2 = λ), then

$$g(y) = \frac{\lambda}{2}\,e^{-\lambda |y|}$$

is the already familiar Laplace law (Fig. 9.4.3).
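A quick numerical check of the equal-parameter case (λ is an illustrative choice): the histogram of X1 − X2 should match the Laplace density (λ/2)e^{−λ|y|}.

```python
import numpy as np

rng = np.random.default_rng(2)
lam = 1.5
y = rng.exponential(1/lam, 300_000) - rng.exponential(1/lam, 300_000)

hist, edges = np.histogram(y, bins=80, range=(-4, 4), density=True)
mid = (edges[:-1] + edges[1:]) / 2
print(np.max(np.abs(hist - lam/2 * np.exp(-lam*np.abs(mid)))))  # small
```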

Example 3. Find the distribution law of the sum of two independent r.v.'s X1 and X2 distributed according to the Poisson law with parameters a1 and a2.

Solution. Find the probability of the event (X1 + X2 = m) (m = 0, 1, …):

$$P(X_1 + X_2 = m) = \sum_{k=0}^{m} P(X_1 = k)\,P(X_2 = m - k) = \sum_{k=0}^{m} \frac{a_1^k e^{-a_1}}{k!}\,\frac{a_2^{m-k} e^{-a_2}}{(m-k)!} = \frac{(a_1 + a_2)^m}{m!}\,e^{-(a_1 + a_2)}.$$

Therefore, the r.v. Y = X1 + X2 is distributed according to the Poisson law with parameter a12 = a1 + a2.
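The same fact can be checked by discretely convolving the two Poisson probability functions (a1, a2 are illustrative):

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(a, n_max):
    return np.array([exp(-a) * a**m / factorial(m) for m in range(n_max)])

a1, a2 = 1.3, 2.1
p = np.convolve(poisson_pmf(a1, 40), poisson_pmf(a2, 40))[:40]
print(np.max(np.abs(p - poisson_pmf(a1 + a2, 40))))   # ~0 (float error)
```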

Example 4. Find the distribution law of the sum of two independent r.v.'s X1 and X2 distributed according to binomial laws with parameters (n1, p) and (n2, p), respectively.

Solution. Represent the r.v. X1 as

$$X_1 = \sum_{i=1}^{n_1} X_i^{(1)},$$

where X_i^{(1)} is the indicator of the event A in the i-th trial: it equals 1 if A occurs and 0 otherwise. The distribution series of X_i^{(1)} has the form

value: 0, 1; probability: q, p (where q = 1 − p).

We make a similar representation for the r.v. X2: X2 = Σ_{j=1}^{n2} X_j^{(2)}, where X_j^{(2)} is the indicator of the event A in the j-th trial of the second series.

Consequently,

$$Y = X_1 + X_2 = \sum_{i=1}^{n_1} X_i^{(1)} + \sum_{j=1}^{n_2} X_j^{(2)},$$

i.e., Y is the sum of (n1 + n2) indicators of the event A, whence it follows that Y is distributed according to the binomial law with parameters (n1 + n2, p).

Note that if the probabilities p in the two series of trials are different, then the result of adding two independent binomially distributed r.v.'s is not distributed according to a binomial law.

Examples 3 and 4 are easily generalized to an arbitrary number of terms. The composition of Poisson laws with parameters a1, a2, …, am again yields a Poisson law with parameter a^(m) = a1 + a2 + … + am.

The composition of binomial laws with parameters (n1, p), (n2, p), …, (nm, p) again yields a binomial law with parameters (n^(m), p), where n^(m) = n1 + n2 + … + nm.

We have proved important properties of the Poisson law and the binomial law: the "stability property." A distribution law is called stable if the composition of two laws of the same type results in a law of the same type (the laws differ only in their parameters). In Subsection 9.7 we will show that the normal law has the same stability property.
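A small sketch of the stability property for the binomial law (SciPy assumed; parameters are illustrative): with a common p, the convolution of two binomial probability functions is again binomial; with different p's, already the second moment fails to match any binomial law.

```python
import numpy as np
from scipy.stats import binom

n1, n2, p = 5, 7, 0.3
conv = np.convolve(binom.pmf(np.arange(n1 + 1), n1, p),
                   binom.pmf(np.arange(n2 + 1), n2, p))
print(np.max(np.abs(conv - binom.pmf(np.arange(n1 + n2 + 1), n1 + n2, p))))
# ~0: the sum is Bin(n1 + n2, p)

conv2 = np.convolve(binom.pmf(np.arange(n1 + 1), n1, 0.3),
                    binom.pmf(np.arange(n2 + 1), n2, 0.8))
k = np.arange(n1 + n2 + 1)
m = k @ conv2                      # mean of the sum
v = (k**2 @ conv2) - m**2          # its variance
print(v, m * (1 - m/(n1 + n2)))    # variance differs from the binomial value
```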

THEME 3

concept of the distribution function

mathematical expectation and variance

uniform (rectangular) distribution

normal (Gaussian) distribution

chi-square distribution

Student's t-distribution

F-distribution

distribution of the sum of two independent random variables

example: distribution of the sum of two independent uniformly distributed quantities

transformation of a random variable

example: distribution of a harmonic oscillation with random phase

central limit theorem

moments of a random variable and their properties

PURPOSE OF THE LECTURE CYCLE: TO PRESENT INITIAL INFORMATION ABOUT THE MOST IMPORTANT DISTRIBUTION FUNCTIONS AND THEIR PROPERTIES

DISTRIBUTION FUNCTIONS

Let x(k) be some random variable. Then for any fixed value x the random event {x(k) ≤ x} is defined as the set of all possible outcomes k such that x(k) ≤ x. In terms of the original probability measure given on the sample space, the distribution function P(x) is defined as the probability assigned to the set of points k for which x(k) ≤ x. Note that the set of points k satisfying the inequality x(k) ≤ x1 is a subset of the set of points satisfying the inequality x(k) ≤ x2 whenever x1 ≤ x2. Formally,

$$P(x) = \text{Prob}\left[x(k) \le x\right].$$

It is obvious that

$$P(-\infty) = 0, \qquad P(+\infty) = 1.$$

If the range of values of the random variable is continuous, which is assumed below, then the (one-dimensional) probability density p(x) is determined by the differential relation

$$p(x) = \frac{dP(x)}{dx}. \qquad (4)$$

Consequently,

$$P(x) = \int_{-\infty}^{x} p(\xi)\,d\xi. \qquad (6)$$

In order to be able to consider discrete cases, it is necessary to admit the presence of delta functions in the composition of the probability density.

EXPECTED VALUE

Let the random variable x(k) take values in the range from −∞ to +∞. The mean (otherwise, the expected value, or mathematical expectation) of x(k) is calculated using the corresponding passage to the limit in the sum of products of the values of x(k) by the probabilities of their occurrence:

$$m_x = E[x(k)] = \int_{-\infty}^{\infty} x\,p(x)\,dx, \qquad (8)$$

where E denotes the mathematical expectation of the expression in square brackets over the index k. The mathematical expectation of a real single-valued continuous function g(x) of the random variable x(k) is defined similarly:

$$E[g(x(k))] = \int_{-\infty}^{\infty} g(x)\,p(x)\,dx, \qquad (9)$$

where p(x) is the probability density of the random variable x(k). In particular, taking g(x) = x², we get the mean square of x(k):

$$E[x^2(k)] = \int_{-\infty}^{\infty} x^2\,p(x)\,dx. \qquad (10)$$

The variance of x(k) is defined as the mean square of the difference between x(k) and its mean value, i.e., in this case g(x) = (x − m_x)², and

$$\sigma_x^2 = E\left[(x(k) - m_x)^2\right] = \int_{-\infty}^{\infty} (x - m_x)^2\,p(x)\,dx. \qquad (11)$$

By definition, the standard deviation of the random variable x(k), denoted σ_x, is the positive square root of the variance. The standard deviation is measured in the same units as the mean.

MOST IMPORTANT DISTRIBUTION FUNCTIONS

UNIFORM (RECTANGULAR) DISTRIBUTION.

Let us assume that the experiment consists in a random selection of a point from the interval [a, b], including its endpoints. In this example, the numeric value of the selected point can be taken as the value of the random variable x(k). The corresponding distribution function has the form

$$P(x) = \begin{cases} 0, & x < a, \\ (x - a)/(b - a), & a \le x \le b, \\ 1, & x > b. \end{cases}$$

Therefore, the probability density is given by the formula

$$p(x) = \begin{cases} 1/(b - a), & a \le x \le b, \\ 0, & \text{otherwise}. \end{cases}$$

In this example, the calculation of the mean and variance using formulas (9) and (11) gives

$$m_x = \frac{a + b}{2}, \qquad \sigma_x^2 = \frac{(b - a)^2}{12}.$$

NORMAL (GAUSSIAN) DISTRIBUTION

The normal (Gaussian) distribution has the probability density

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left(-\frac{(x - m)^2}{2\sigma^2}\right),$$

where m is the arithmetic mean and σ is the root-mean-square (standard) deviation.

The value z_α is the value corresponding to the probability P(z_α) = 1 − α, i.e., the 100α-percent point of the distribution.

CHI-SQUARE DISTRIBUTION

Let ξ1, …, ξn be n independent random variables, each of which has a normal distribution with zero mean and unit variance. The random variable

$$\chi^2 = \xi_1^2 + \xi_2^2 + \dots + \xi_n^2$$

is called a chi-squared random variable with n degrees of freedom.

Its probability density is

$$p(x) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\,x^{n/2 - 1}\,e^{-x/2}, \qquad x > 0.$$

DF: the 100α-percentage points of the χ²-distribution are denoted χ²_α(n), i.e.

$$P\!\left(\chi^2 > \chi^2_\alpha(n)\right) = \alpha.$$

The mean and variance are equal to

$$E[\chi^2] = n, \qquad \mathrm{Var}[\chi^2] = 2n.$$

STUDENT'S t-DISTRIBUTION

Let y and z be independent random variables, where y has a χ²-distribution with n degrees of freedom and z is normally distributed with zero mean and unit variance.

The value

$$t = \frac{z}{\sqrt{y/n}}$$

has Student's t-distribution with n degrees of freedom.

DF: the 100α-percentage point of the t-distribution is denoted t_α(n).

The mean and variance are equal to

$$E[t] = 0 \ (n > 1), \qquad \mathrm{Var}[t] = \frac{n}{n - 2} \ (n > 2).$$

F - DISTRIBUTION

Let y and z be independent random variables; y has a χ²-distribution with n1 degrees of freedom, and z has a χ²-distribution with n2 degrees of freedom. The random variable

$$F = \frac{y/n_1}{z/n_2}$$

is an F-distributed random variable with n1 and n2 degrees of freedom.

DF: the 100α-percentage point is denoted F_α(n1, n2).

The mean and variance are equal to

$$E[F] = \frac{n_2}{n_2 - 2} \ (n_2 > 2), \qquad \mathrm{Var}[F] = \frac{2 n_2^2 (n_1 + n_2 - 2)}{n_1 (n_2 - 2)^2 (n_2 - 4)} \ (n_2 > 4).$$

DISTRIBUTION OF THE SUM OF TWO RANDOM VARIABLES

Let x(k) and y(k) be random variables having a joint probability density p(x, y). Let us find the probability density of the sum of the random variables

$$z(k) = x(k) + y(k).$$

At a fixed x we have y = z − x. Therefore,

$$P(z) = \text{Prob}[x + y \le z] = \int_{-\infty}^{\infty} dx \int_{-\infty}^{z - x} p(x, y)\,dy.$$

At a fixed z, the values of x run over the interval from −∞ to +∞. Therefore,

$$p(z) = \int_{-\infty}^{\infty} p(x, z - x)\,dx, \qquad (37)$$

whence it can be seen that in order to calculate the desired density of the sum, one must know the original joint probability density. If x(k) and y(k) are independent random variables having densities p1(x) and p2(y), respectively, then p(x, z − x) = p1(x) p2(z − x) and

$$p(z) = \int_{-\infty}^{\infty} p_1(x)\,p_2(z - x)\,dx. \qquad (38)$$

EXAMPLE: THE SUM OF TWO INDEPENDENT, UNIFORMLY DISTRIBUTED RANDOM VARIABLES.

Let two independent random variables have densities of the form

$$p_1(x) = \begin{cases} 1, & 0 \le x \le 1, \\ 0, & \text{otherwise}; \end{cases} \qquad p_2(y) = \begin{cases} 1, & 0 \le y \le 1, \\ 0, & \text{otherwise}. \end{cases}$$

Let us find the probability density p(z) of their sum z = x + y.

The integrand p1(x) p2(z − x) in (38) is non-zero only when 0 ≤ x ≤ 1 and z − 1 ≤ x ≤ z; in particular, x cannot exceed z. By formula (38) we find that

$$p(z) = \begin{cases} z, & 0 \le z \le 1, \\ 2 - z, & 1 \le z \le 2, \\ 0, & \text{otherwise}. \end{cases}$$

Illustration: the probability density of the sum of two independent, uniformly distributed random variables is triangular on (0, 2).
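A numerical sketch of formula (38) for this example (NumPy assumed): convolving the two unit-interval densities on a grid reproduces the triangular density.

```python
import numpy as np

dx = 0.001
x = np.arange(0, 1, dx)
p1 = np.ones_like(x)                 # U(0,1) density on the grid
pz = np.convolve(p1, p1) * dx        # density of z = x + y
z = np.arange(len(pz)) * dx
tri = np.where(z <= 1, z, 2 - z)     # triangular density on (0, 2)
print(np.max(np.abs(pz - tri)))      # small (~1e-3) discretization error
```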

TRANSFORMATION OF A RANDOM VARIABLE

Let x(k) be a random variable with probability density p(x), and let g(x) be a single-valued real continuous function of x. Consider first the case when the inverse function x(g) is also a single-valued continuous function of g. The probability density p(g) corresponding to the random variable g(k) = g(x(k)) can be determined from the probability density p(x) of the random variable x(k) and the derivative dg/dx, under the assumption that the derivative exists and is different from zero, namely:

$$p(g)\,|dg| = p(x)\,|dx|. \qquad (12)$$

Therefore, in the limit, for dg/dx ≠ 0,

$$p(g) = \frac{p(x)}{|dg/dx|}. \qquad (13)$$

When using this formula, the variable x on its right-hand side must be replaced by the value x(g) corresponding to g.

Consider now the case when the inverse function x(g) is a real n-valued function of g, where n is an integer and all n values x1(g), …, xn(g) are equally probable. Then

$$p(g) = \sum_{i=1}^{n} \left.\frac{p(x)}{|dg/dx|}\right|_{x = x_i(g)}. \qquad (14)$$
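As a concrete check of formula (14), take g(x) = x² with x(k) standard normal (an illustrative choice): the inverse has the two branches ±√g, and the two-branch sum reproduces the χ² density with one degree of freedom.

```python
import numpy as np
from scipy.stats import norm, chi2

g = np.linspace(0.05, 6, 200)
x = np.sqrt(g)                                   # one of the two branches
# formula (14): sum p(x_i)/|dg/dx| over the branches x = +sqrt(g), -sqrt(g)
p_g = (norm.pdf(x) + norm.pdf(-x)) / (2 * x)     # |dg/dx| = 2|x|
print(np.max(np.abs(p_g - chi2.pdf(g, df=1))))   # ~0 (float error)
```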

EXAMPLE: DISTRIBUTION OF A HARMONIC FUNCTION WITH RANDOM PHASE.

A harmonic function with fixed amplitude X and frequency f becomes a random variable if its initial phase angle φ = φ(k) is a random variable. In particular, let t be fixed and equal to t0, and let the harmonic random variable have the form

$$x(k) = X \sin\!\left(2\pi f t_0 + \varphi(k)\right).$$

Let us suppose that φ(k) has a uniform probability density p(φ) of the form

$$p(\varphi) = \begin{cases} 1/2\pi, & 0 \le \varphi \le 2\pi, \\ 0, & \text{otherwise}. \end{cases}$$

Let us find the probability density p(x) of the random variable x(k).

In this example, the direct function x(φ) is single-valued, while the inverse function φ(x) is multivalued.
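For this example, the known answer is the arcsine density p(x) = 1/(π√(X² − x²)) for |x| < X. A simulation sketch (amplitude and sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
X = 2.0
x = X * np.sin(rng.uniform(0, 2*np.pi, 500_000))

# compare away from the endpoints +-X, where the density diverges
hist, edges = np.histogram(x, bins=50, range=(-1.9, 1.9), density=True)
mid = (edges[:-1] + edges[1:]) / 2
print(np.max(np.abs(hist - 1/(np.pi*np.sqrt(X**2 - mid**2)))))  # small
```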

The decision maker may use insurance to mitigate the adverse financial impact of certain types of random events.

But this discussion is very general, since a decision maker could be either an individual seeking protection against damage to property, savings, or income, or an organization seeking protection against the same kinds of damage.

In fact, such an organization may be an insurance company that is looking for ways to protect itself from financial losses due to too many insured events occurring with an individual client or with its entire insurance portfolio. This protection is called reinsurance.

Let us consider one of the two models (namely, the individual risk model) widely used in determining insurance rates and reserves, as well as in reinsurance.

Denote by S the amount of accidental losses of the insurance company on some part of its risks. In this case S is a random variable for which we have to determine the probability distribution. Historically, there were two sets of postulates for the distribution of the r.v. S. The individual risk model defines S in the following way:

$$S = X_1 + X_2 + \dots + X_n, \qquad (1.1)$$

where the r.v. X_i denotes the losses caused by the insurance object with number i, and n denotes the total number of insurance objects.

It is usually assumed that X_1, …, X_n are independent random variables, since in this case the mathematical calculations are simpler and no information about the nature of the dependence between them is required. The second model is the collective risk model.

The individual risk model considered here does not reflect changes in the value of money over time. This simplifies the model, and it is also why the title refers to a short time interval.

We will consider only closed models, i.e. those in which the number of insurance objects n in formula (1.1) is known and fixed at the very beginning of the considered time interval. If we introduce assumptions about the presence of migration from or to the insurance system, then we get an open model.

Random variables describing individual payouts

First, let us recall the main provisions regarding life insurance.

In death insurance for a period of one year, the insurer undertakes to pay the amount b if the policyholder dies within a year from the date of conclusion of the insurance contract, and pays nothing if the policyholder survives this year.

The probability of the insured event occurring during the specified year is denoted by q.

The random variable X describing the insurance payments has a distribution that can be specified either by the probability function

$$P(X = 0) = 1 - q, \qquad P(X = b) = q, \qquad (2.1)$$

or by the corresponding distribution function

$$F(x) = \begin{cases} 0, & x < 0, \\ 1 - q, & 0 \le x < b, \\ 1, & x \ge b. \end{cases} \qquad (2.2)$$

From formula (2.1) and from the definition of moments, we obtain

$$E[X] = bq, \qquad \mathrm{Var}[X] = b^2 q(1 - q). \qquad (2.4)$$

These formulas can also be obtained by writing X as

$$X = Ib, \qquad (2.5)$$

where b is the constant amount paid in case of death, and I is a random variable that takes the value 1 upon death and 0 otherwise.

Thus, E[I] = q and Var[I] = q(1 − q): the mean value and variance of the r.v. I are q and q(1 − q), respectively, so the mean value and variance of the r.v. X are bq and b²q(1 − q), which coincides with the formulas above.

A random variable taking only the values 0 and 1 is widely used in actuarial models.

In textbooks on probability theory, it is called an indicator, a Bernoulli random variable, or a binomial random variable in a single-trial design.

We will call it an indicator, for reasons of brevity and also because it indicates the occurrence, or non-occurrence, of the event in question.

Let us move on to more general models, in which the value of the insurance payment is also a random variable and several insured events can occur in the considered time interval.

Health insurance, automobile and other property insurance, and liability insurance immediately provide many examples. Generalizing formula (2.5), we set

$$X = IB,$$

where X is the random variable describing the insurance payments in the considered time interval, the r.v. B denotes the total amount of payments in this interval, and the r.v. I is the indicator of the event that at least one insured event has occurred.

Being the indicator of such an event, the r.v. I records the presence (I = 1) or absence (I = 0) of insured events in this time interval, but not the number of insured events in it.

The probability P(I = 1) will continue to be denoted by q.

Let us discuss several examples and determine the distributions of the random variables I and B in some models.

Let us first consider death insurance for one year, with an additional benefit if the death is accidental.

For definiteness, let us assume that if death occurs as a result of an accident, the amount of the payment will be 50,000; if death occurs from other causes, the amount of the payment will be 25,000.

Let us assume that for a person of a given age, state of health, and profession, the probability of dying as a result of an accident during the year is 0.0005, and the probability of dying from other causes is 0.0020. In formula form:

$$P(I = 1,\ B = 50{,}000) = 0.0005, \qquad P(I = 1,\ B = 25{,}000) = 0.0020.$$

Summing over all possible values of B, we obtain

$$q = P(I = 1) = 0.0025.$$

The conditional distribution of the r.v. B under the condition I = 1 has the form

$$P(B = 25{,}000 \mid I = 1) = 0.8, \qquad P(B = 50{,}000 \mid I = 1) = 0.2.$$

Let us now consider car collision insurance (compensation paid to the owner of the car for damage caused to his car) with a deductible of 250 and a maximum payment of 2,000.

For clarity, we assume that the probability of exactly one insured event in the considered period of time for an individual is 0.15, and the probability of more than one collision is equal to zero:

$$P(I = 1) = q = 0.15, \qquad P(\text{more than one insured event}) = 0.$$

The unrealistic assumption that no more than one insured event can occur during one period is made in order to simplify the distribution of the r.v. B.

We will drop this assumption in the next section after we consider the distribution of the sum of several insurance claims.

Since B is the amount of the insurer's payments, and not the damage caused to the car, we can note two features of its distribution.

First, collisions in which the damage is less than the deductible of 250 lead to no payment at all.

Second, the distribution of the r.v. B will have a "clump" of probability mass at the point equal to the maximum amount of insurance payments, which is 2,000.

Assume that the probability mass concentrated at this point is 0.1. Further, suppose that the values of the insurance payments in the interval from 0 to 2,000 can be modeled by a continuous distribution. (In practice, the continuous curve that is chosen to represent the distribution of payments is the result of studies of payments in the previous period.)

Summing up these assumptions about the conditional distribution of the r.v. B under the condition I = 1, we arrive at a distribution of mixed type that has a positive density in the range from 0 to 2,000 and a certain "clump" of probability mass at the point 2,000. The graph of the corresponding conditional distribution function is shown in Fig. 2.1.

Fig. 2.1. Distribution function of the r.v. B under the condition I = 1

We calculate the mathematical expectation and the variance in the considered car insurance example in two ways.

First, we write out the distribution of the r.v. X = IB and use it to calculate E[X] and Var[X]. Denoting by F(x) the distribution function of the r.v. X, we have: F(x) = 0 for x < 0; a jump at x = 0 of size P(X = 0); a continuous rise on the interval (0, 2000); and a jump at x = 2000.

This is a mixed distribution: as shown in Fig. 2.2, it has both a discrete part (a "clump" of probability mass at the point 2000) and a continuous part. Such a distribution function corresponds to a combination of a probability function (the masses at the points 0 and 2000) and a density function (on the interval from 0 to 2000).

Fig. 2.2. Distribution function of the r.v. X = IB

In particular, P(X = 2000) = q·P(B = 2000 | I = 1); the moments E[X] and Var[X] are then computed directly from this mixed distribution.

There are a number of formulas that relate the moments of random variables to conditional mathematical expectations. For the mathematical expectation and for the variance, these formulas have the form

$$E[X] = E\big[E[X \mid I]\big], \qquad (2.10)$$

$$\mathrm{Var}[X] = \mathrm{Var}\big(E[X \mid I]\big) + E\big[\mathrm{Var}[X \mid I]\big]. \qquad (2.11)$$

It is assumed that the expressions on the left-hand sides of these equalities are calculated directly from the distribution of the r.v. X. When calculating the expressions on the right-hand sides, namely E[X | I] and Var[X | I], the conditional distribution of the r.v. X at a fixed value of the r.v. I is used.

These expressions are, therefore, functions of the r.v. I, and we can calculate their moments using the distribution of the r.v. I.

Conditional distributions are used in many actuarial models, and this allows the formulas above to be applied directly. In our model X = IB. Let us introduce the notation

$$\mu = E[B \mid I = 1], \qquad (2.14)$$

$$\sigma^2 = \mathrm{Var}[B \mid I = 1], \qquad (2.15)$$

and consider the conditional mathematical expectations

$$E[X \mid I = 0] = 0, \qquad (2.16)$$

$$E[X \mid I = 1] = E[B \mid I = 1] = \mu. \qquad (2.17)$$

Formulas (2.16) and (2.17) define E[X | I] as a function of the r.v. I, which can be written as the following formula:

$$E[X \mid I] = \mu I. \qquad (2.20)$$

Since X = 0 for I = 0, we have Var[X | I = 0] = 0; (2.21)

for I = 1 we have X = B, and Var[X | I = 1] = σ². (2.22)

Formulas (2.21) and (2.22) can be combined:

$$\mathrm{Var}[X \mid I] = \sigma^2 I. \qquad (2.23)$$

Thus,

$$E\big[E[X \mid I]\big] = \mu q, \qquad \mathrm{Var}\big(E[X \mid I]\big) = \mu^2 q(1 - q), \qquad E\big[\mathrm{Var}[X \mid I]\big] = \sigma^2 q. \qquad (2.24)$$

Substituting (2.20), (2.23), and (2.24) into (2.10) and (2.11), we get

$$E[X] = \mu q, \qquad (2.25)$$

$$\mathrm{Var}[X] = \mu^2 q (1 - q) + \sigma^2 q. \qquad (2.26)$$

Let us apply the obtained formulas to calculate E[X] and Var[X] in the car insurance example (Fig. 2.2). Using the assumed conditional density of the r.v. B given I = 1 on the interval (0, 2000) together with P(B = 2000 | I = 1) = 0.1, we compute μ = E[B | I = 1] and σ² = Var[B | I = 1].

Finally, taking q = 0.15, from formulas (2.25) and (2.26) we obtain E[X] = μq and Var[X] = μ²q(1 − q) + σ²q.
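The numerical values depend on the conditional density of B, whose formula is garbled in this copy. The sketch below therefore assumes, purely for illustration, a linearly decreasing density on (0, 2000) carrying mass 0.9, plus the stated point mass 0.1 at 2000, and then applies (2.25) and (2.26) with q = 0.15.

```python
import numpy as np

q, L, mass_at_L = 0.15, 2000.0, 0.1
n_grid = 200_000
dx = L / n_grid
x = (np.arange(n_grid) + 0.5) * dx        # midpoints of (0, L)
# assumed continuous part: proportional to (1 - x/L), total mass 0.9;
# (1 - x/L) integrates to L/2 over (0, L)
dens = 0.9 * (1 - x / L) / (L / 2)

mu = np.sum(x * dens) * dx + L * mass_at_L           # E[B | I = 1]
m2 = np.sum(x**2 * dens) * dx + L**2 * mass_at_L     # E[B^2 | I = 1]
var_b = m2 - mu**2

ex = mu * q                                  # formula (2.25)
var_x = mu**2 * q * (1 - q) + var_b * q      # formula (2.26)
print(mu, var_b, ex, var_x)                  # ~800, 360000, 120, 135600
```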

To describe other insurance situations, other models for the r.v. B can be proposed.

Example: model for the number of deaths due to aviation accidents

As an example, consider a model for the number of deaths due to aviation accidents over a one-year period of an airline's operation.

We can start with a random variable that describes the number of deaths for one flight, and then sum these random variables over all flights in a year.

For one flight, the indicator I marks the occurrence of an air crash. The number of deaths entailed by this catastrophe is represented as the product of two random variables: the aircraft load, i.e., the number of persons on board at the time of the crash, and the proportion of deaths among the persons on board.

The number of deaths is represented in this way because separate statistics for these two quantities are more accessible than statistics for their product. So, although the proportion of deaths among persons on board and the number of persons on board are probably related, as a first approximation these two random variables can be assumed independent.

Sums of independent random variables

In the individual risk model, insurance payments made by an insurance company are presented as the sum of payments to many individuals.

Let us recall two methods for determining the distribution of the sum of independent random variables. Consider first the sum of two random variables, S = X + Y, whose sample space is shown in Fig. 3.1.

Fig. 3.1. The event {X + Y ≤ s}

The line x + y = s and the region below this line represent the event {S = X + Y ≤ s}. Therefore, the distribution function of the r.v. S has the form

$$F_S(s) = P(X + Y \le s). \qquad (3.1)$$

For two discrete non-negative random variables, we can use the total probability formula and write (3.1) as

$$F_S(s) = \sum_{y \le s} P(X \le s - y \mid Y = y)\,P(Y = y). \qquad (3.2)$$

If X and Y are independent, the last sum can be rewritten as

$$F_S(s) = \sum_{y \le s} F_X(s - y)\,P(Y = y). \qquad (3.3)$$

The probability function corresponding to this distribution function can be found by the formula

$$P(S = s) = \sum_{y \le s} P(X = s - y)\,P(Y = y). \qquad (3.4)$$

For continuous non-negative random variables, the formulas corresponding to formulas (3.2), (3.3) and (3.4) have the form

$$F_S(s) = \int_0^s F_{X \mid Y}(s - y \mid y)\,f_Y(y)\,dy, \qquad (3.5)$$

$$F_S(s) = \int_0^s F_X(s - y)\,f_Y(y)\,dy, \qquad (3.6)$$

$$f_S(s) = \int_0^s f_X(s - y)\,f_Y(y)\,dy. \qquad (3.7)$$

When either one or both of the random variables X and Y have a mixed-type distribution (which is typical for individual risk models), the formulas are similar but more cumbersome. For random variables that can also take negative values, the sums and integrals in the above formulas are taken over all values of y from −∞ to +∞.

In probability theory, the operation in formulas (3.3) and (3.6) is called the convolution of the two distribution functions F_X and F_Y and is denoted F_X * F_Y. The convolution operation can also be defined for a pair of probability functions or density functions, using formulas (3.4) and (3.7).

To determine the distribution of the sum of more than two random variables, we can use iterations of the convolution process. For S = X1 + X2 + … + Xn, where X1, …, Xn are independent random variables, let F^(k) denote the distribution function of X1 + … + Xk and let F_{X_k} be the distribution function of X_k; we get

$$F^{(k)} = F^{(k-1)} * F_{X_k}, \qquad k = 2, 3, \dots, n.$$

Example 3.1 illustrates this procedure for three discrete random variables.

Example 3.1. The random variables X1, X2, and X3 are independent and have distributions defined by columns (1), (2) and (3) of the table below.

Let us write out the probability function and the distribution function of the r.v. S = X1 + X2 + X3.

Solution. The table uses the notation introduced before the example:

Columns (1)-(3) contain the available information.

Column (4) is obtained from columns (1) and (2) using (3.4).

Column (5) is obtained from columns (3) and (4) using (3.4).

The determination of column (5) completes the determination of the probability function of the r.v. S. Its distribution function in column (8) is the set of partial sums of column (5), starting from the top.

For clarity, we have also included column (6), the distribution function for column (1); column (7), which can be obtained directly from columns (2) and (6) using (3.3); and column (8), determined similarly from columns (3) and (7). Column (5) can be recovered from column (8) by successive subtraction.
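Since the example's table has not survived in this copy, the sketch below runs the same procedure on three illustrative probability functions; the variables f12, f123 and F correspond to columns (4), (5) and (8) described above.

```python
import numpy as np

f1 = np.array([0.4, 0.3, 0.2, 0.1])   # P(X1 = 0..3), column (1) stand-in
f2 = np.array([0.5, 0.5])             # P(X2 = 0..1), column (2) stand-in
f3 = np.array([0.6, 0.3, 0.1])        # P(X3 = 0..2), column (3) stand-in

f12 = np.convolve(f1, f2)             # column (4): pmf of X1 + X2
f123 = np.convolve(f12, f3)           # column (5): pmf of X1 + X2 + X3
F = np.cumsum(f123)                   # column (8): distribution function
print(f123, F)                        # F ends at 1.0
```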

Let us turn to the consideration of two examples with continuous random variables.

Example 3.2. Let the r.v. X have a uniform distribution on the interval (0, 2), and let the r.v. Y be independent of X and have a uniform distribution on the interval (0, 3). Let us determine the distribution function of the r.v. S = X + Y.

Solution. Since the distributions of the r.v.'s X and Y are continuous, we use formula (3.6):

$$F_S(s) = \int_{-\infty}^{\infty} F_X(s - y)\,f_Y(y)\,dy = \frac{1}{3}\int_0^3 F_X(s - y)\,dy.$$

The sample space of the r.v.'s X and Y is illustrated in Fig. 3.2. The rectangular region contains all possible values of the pair (X, Y). The event of interest to us, {X + Y ≤ s}, is depicted in the figure for five values of s.

For each value, the line x + y = s intersects the y-axis at the point s and the line x = 2 at the point (2, s − 2). The values of the function F_S for these five cases are described by the following formula:

$$F_S(s) = \begin{cases} 0, & s < 0, \\ s^2/12, & 0 \le s < 2, \\ (s - 1)/3, & 2 \le s < 3, \\ 1 - (5 - s)^2/12, & 3 \le s < 5, \\ 1, & s \ge 5. \end{cases}$$

Fig. 3.2. Convolution of two uniform distributions

Example 3.3. Let us consider three independent r.v.'s X1, X2, X3, where the r.v. X_i has an exponential distribution with parameter β_i, i = 1, 2, 3. Let us find the density function of the r.v. S = X1 + X2 + X3 by applying the convolution operation.

Solution. We have

$$f_{X_i}(x) = \beta_i\,e^{-\beta_i x}, \qquad x > 0, \quad i = 1, 2, 3.$$

Applying formula (3.7) successively, we obtain the density of S.

Another method for determining the distribution of the sum of independent random variables is based on the uniqueness of the moment generating function, which for the r.v. X is determined by the relation

$$M_X(t) = E\left[e^{tX}\right].$$

If this mathematical expectation is finite for all t from some open interval containing the origin, then M_X(t) is the only moment generating function of the distribution of the r.v. X, in the sense that there is no other distribution with the same moment generating function.

This uniqueness can be used as follows: for the sum S = X1 + … + Xn,

$$M_S(t) = E\left[e^{tS}\right] = E\left[e^{t(X_1 + \dots + X_n)}\right] = E\left[\prod_{k=1}^{n} e^{tX_k}\right]. \qquad (3.8)$$

If X1, …, Xn are independent, then the expectation of the product in formula (3.8) is equal to the product of the expectations, so that

$$M_S(t) = \prod_{k=1}^{n} M_{X_k}(t). \qquad (3.9)$$

Finding an explicit expression for the unique distribution corresponding to the moment generating function (3.9) would complete the determination of the distribution of the r.v. S. If it cannot be identified explicitly, it can be sought by numerical methods.

Example 3.4. Let us consider the random variables of Example 3.3. Let us determine the density function of the r.v. S = X1 + X2 + X3 using the moment generating function of S.

Solution. By equality (3.9), M_S(t) is the product of the three exponential moment generating functions β_i/(β_i − t) and can be decomposed into simple (partial) fractions of the same form. But β/(β − t) is the moment generating function of the exponential distribution with parameter β, so the density function of the r.v. S is the corresponding linear combination of exponential densities.

Example 3.5. In the study of random processes, the inverse Gaussian distribution was introduced. It is used as the distribution of the r.v. B, the amount of insurance payments. The density function and the moment generating function of the inverse Gaussian distribution have known closed forms (not reproduced in this copy).

Let us find the distribution of the r.v. S = X1 + … + Xn, where the r.v.'s X1, …, Xn are independent and have the same inverse Gaussian distribution.

Solution. Using formula (3.9), we obtain the moment generating function of the r.v. S as the n-th power of the moment generating function of a single term. The moment generating function corresponds to a unique distribution, and it can be seen that S again has an inverse Gaussian distribution, with the first parameter equal to n times that of each term and the second parameter unchanged.

Approximations for the distribution of the sum

The central limit theorem gives a method for finding numerical values for the distribution of the sum of independent random variables. Usually this theorem is formulated for the sum S = X1 + … + Xn of independent and identically distributed random variables with E[X_i] = μ and Var[X_i] = σ².

For any n, the distribution of the r.v. (S − nμ)/(σ√n) has mathematical expectation 0 and variance 1. As is known, the sequence of these distributions (for n = 1, 2, …) tends to the standard normal distribution. When n is large, the theorem is applied to approximate the distribution of the arithmetic mean of the terms by a normal distribution with mean μ and variance σ²/n. Similarly, the distribution of the sum of n random variables is approximated by a normal distribution with mean nμ and variance nσ².

The efficiency of such an approximation depends not only on the number of terms, but also on the closeness of the distribution of the terms to the normal one. Many elementary statistics courses state that n must be at least 30 for the approximation to be reasonable.

However, one of the programs for generating normally distributed random variables used in simulation modeling implements a normal random variable as the sum of 12 independent random variables uniformly distributed on the interval (0,1), minus 6.

In many individual risk models, the random variables included in the sums are not identically distributed. This will be illustrated by examples in the next section.

The central limit theorem also extends to sequences of non-identically distributed random variables.

To illustrate some applications of the individual risk model, we will use the normal approximation of the distribution of the sum of independent random variables to obtain numerical solutions. If S = X1 + … + Xn, then

$$E[S] = \sum_{k=1}^{n} E[X_k],$$

and further, if the r.v.'s X1, …, Xn are independent, then

$$\mathrm{Var}[S] = \sum_{k=1}^{n} \mathrm{Var}[X_k].$$

For the application in question, we only need to:

  • find the means and variances of the random variables modeling individual losses,
  • sum them to obtain the mean and variance of the losses of the insurance company as a whole,
  • use the normal approximation.

Below we illustrate this sequence of actions.
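As a preliminary sketch of these three steps (Python with SciPy assumed; the function and argument names are ours, not the source's), one can package the procedure as follows. The examples of the next section supply the per-class means and variances.

```python
import numpy as np
from scipy.stats import norm

def total_mean_var(means, variances, counts):
    """Mean and variance of S, the sum of independent individual losses."""
    means, variances, counts = map(np.asarray, (means, variances, counts))
    return float(counts @ means), float(counts @ variances)

def required_premium(means, variances, counts, level=0.95):
    """Approximate `level`-percentile of S via the normal approximation."""
    m, v = total_mean_var(means, variances, counts)
    return m + norm.ppf(level) * float(np.sqrt(v))
```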

Applications for insurance

This section illustrates the use of the normal approximation with four examples.

Example 5.1. A life insurance company offers one-year death insurance contracts with payments of 1 and 2 units to persons whose probabilities of an insured event are 0.02 or 0.10. The table below shows the number of persons n_k in each of the four classes formed in accordance with the payment b_k and the probability of an insured event q_k:

k    q_k     b_k    n_k
1    0.02    1      500
2    0.02    2      500
3    0.10    1      300
4    0.10    2      500

The insurance company wants to collect from this group of 1,800 individuals an amount equal to the 95th percentile of the distribution of total insurance payments for the group. In addition, it wants each person's share of that amount to be proportional to that person's expected insurance payment.

The share of the person with number j, whose average payment is E[X_j], should be (1 + θ)E[X_j]. The requirement of the 95th percentile gives P(S ≤ (1 + θ)E[S]) = 0.95. The excess value, θE[X_j], is the risk premium, and θ is called the relative risk premium. Let us calculate θ.

Solution. The value of θ is determined by the relation P(S ≤ (1 + θ)E[S]) = 0.95, where S = X1 + X2 + … + X1800. This probability statement is equivalent to the following:

$$P\!\left(\frac{S - E[S]}{\sqrt{\mathrm{Var}[S]}} \le \frac{\theta\,E[S]}{\sqrt{\mathrm{Var}[S]}}\right) = 0.95.$$

In accordance with what was said about the central limit theorem in Section 4, we approximate the distribution of the r.v. (S − E[S])/√Var[S] by the standard normal distribution and use its 95th percentile, 1.645, from which we get:

$$\frac{\theta\,E[S]}{\sqrt{\mathrm{Var}[S]}} = 1.645.$$

For the four classes into which the policyholders are divided, we obtain the following results:

k    q_k     b_k    Mean b_k q_k    Variance b²_k q_k(1 − q_k)    n_k
1    0.02    1      0.02            0.0196                        500
2    0.02    2      0.04            0.0784                        500
3    0.10    1      0.10            0.0900                        300
4    0.10    2      0.20            0.3600                        500

In this way,

$$E[S] = \sum_k n_k b_k q_k = 160, \qquad \mathrm{Var}[S] = \sum_k n_k b_k^2 q_k (1 - q_k) = 256.$$

Therefore, the relative risk premium is

$$\theta = \frac{1.645 \cdot \sqrt{256}}{160} = \frac{1.645 \cdot 16}{160} = 0.1645.$$
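A numerical check of this example (SciPy assumed):

```python
import numpy as np
from scipy.stats import norm

q = np.array([0.02, 0.02, 0.10, 0.10])
b = np.array([1, 2, 1, 2])
n = np.array([500, 500, 300, 500])

mean_s = float(n @ (b * q))                    # 160.0
var_s = float(n @ (b**2 * q * (1 - q)))        # 256.0
theta = norm.ppf(0.95) * np.sqrt(var_s) / mean_s
print(mean_s, var_s, round(theta, 4))          # 160.0 256.0 0.1645
```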

Example 5.2. The customers of a car insurance company are divided into two classes. In each class the value of an insurance payment follows a truncated exponential distribution with parameters λ and L:

Class k    Number in class n_k    Probability of an insured event q_k    λ    L
1          500                    0.10                                   1    2.5
2          2000                   0.05                                   2    5.0

The truncated exponential distribution is defined by the distribution function

$$F(x) = \begin{cases} 0, & x < 0, \\ 1 - e^{-\lambda x}, & 0 \le x < L, \\ 1, & x \ge L. \end{cases}$$

This is a mixed-type distribution with density function λe^{−λx} on (0, L) and a "clump" of probability mass e^{−λL} at the point L. The graph of this distribution function is shown in Fig. 5.1.

Fig. 5.1. Truncated exponential distribution

As before, the probability that the total amount of insurance payments exceeds the amount collected from policyholders should be equal to 0.05. We will assume that the relative risk premium θ should be the same in each of the two classes under consideration. Let us calculate θ.

Solution. This example is very similar to the previous one. The only difference is that the values of the insurance payments are now random variables.

First, we obtain expressions for the moments of the truncated exponential distribution. This is a preparatory step for applying formulas (2.25) and (2.26):

$$\mu = E[B \mid I = 1] = \int_0^L x\,\lambda e^{-\lambda x}\,dx + L\,e^{-\lambda L} = \frac{1 - e^{-\lambda L}}{\lambda},$$

$$E[B^2 \mid I = 1] = \frac{2\left(1 - e^{-\lambda L}\right)}{\lambda^2} - \frac{2 L\,e^{-\lambda L}}{\lambda}, \qquad \sigma^2 = E[B^2 \mid I = 1] - \mu^2.$$

Using the parameter values given in the condition and applying formulas (2.25) and (2.26), we obtain the following results:

k    q_k     μ_k       σ²_k      Mean q_k μ_k    Variance μ²_k q_k(1 − q_k) + σ²_k q_k    n_k
1    0.10    0.9179    0.5828    0.09179         0.13411                                   500
2    0.05    0.5000    0.2498    0.02500         0.02436                                   2000

So S, the total amount of insurance payments, has moments

$$E[S] = \sum_k n_k q_k \mu_k = 500 \cdot 0.09179 + 2000 \cdot 0.02500 \approx 95.89,$$

$$\mathrm{Var}[S] = \sum_k n_k \left(\mu_k^2 q_k (1 - q_k) + \sigma_k^2 q_k\right) = 500 \cdot 0.13411 + 2000 \cdot 0.02436 \approx 115.78.$$

The condition for determining θ remains the same as in Example 5.1, namely

$$P\left(S \le (1 + \theta)E[S]\right) = 0.95.$$

Using the normal distribution approximation again, we get

$$\theta = \frac{1.645 \cdot \sqrt{115.78}}{95.89} \approx 0.1846.$$
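A sketch reproducing the moment table and θ for this example, using the truncated-exponential moment formulas above:

```python
import numpy as np
from scipy.stats import norm

def trunc_exp_moments(lam, L):
    """Mean and variance of the exponential distribution censored at L."""
    mu = (1 - np.exp(-lam * L)) / lam
    m2 = 2 * (1 - np.exp(-lam * L)) / lam**2 - 2 * L * np.exp(-lam * L) / lam
    return mu, m2 - mu**2

n = np.array([500, 2000]); q = np.array([0.10, 0.05])
lam = np.array([1.0, 2.0]); L = np.array([2.5, 5.0])

mu, var = trunc_exp_moments(lam, L)
mean_s = float(n @ (q * mu))                              # ~95.89
var_s = float(n @ (mu**2 * q * (1 - q) + var * q))        # ~115.78
theta = norm.ppf(0.95) * np.sqrt(var_s) / mean_s
print(round(mean_s, 2), round(var_s, 2), round(theta, 4)) # ~0.1846
```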

Example 5.3. The portfolio of the insurance company includes 16,000 one-year death insurance contracts, grouped into classes by the amount of the benefit.

The probability of an insured event q for each of the 16,000 clients (these events are assumed to be mutually independent) is 0.02. The company wants to set its retention level. For each policyholder, the retention level is the amount below which this company (the ceding company) makes payments on its own; payments exceeding this amount are covered under a reinsurance agreement by another company (the reinsurer).

For example, if the retention level is 20,000, then the company retains coverage of up to 20,000 for each insured and buys reinsurance covering the difference between the benefit and 20,000 for each of the 4,500 policyholders whose insurance benefits exceed 20,000.

The company chooses as its decision criterion the minimization of the probability that the insurance claims retained on its own account, plus the amount paid for reinsurance, will exceed 8,250,000. Reinsurance costs 0.025 per unit of coverage (i.e., 125% of the expected value of insurance payments per unit, 0.02).

We consider the portfolio in question to be closed: new insurance contracts concluded during the current year will not be taken into account in the described decision-making process.

Partial solution. Let us first perform all the calculations taking 10,000 as the unit of payment. As an illustration, suppose that the r.v. S, the amount of claims retained on the company's own account, corresponds to the retention level 2 (i.e., 20,000).

To these retained insurance payments S, the cost of reinsurance is added. In total, the total amount of coverage in the portfolio under this scheme is 35,000 units, of which the amount retained on the company's own account is 24,000 units.

Thus, the total reinsured amount is 35,000 − 24,000 = 11,000 units, and the cost of reinsurance is

$$11{,}000 \cdot 0.025 = 275 \text{ units}.$$

Hence, at the retention level equal to 2, the insurance payments retained on the company's own account plus the cost of reinsurance total S + 275. The decision criterion is based on the probability that this total will exceed 825:

$$P(S + 275 > 825) = P(S > 550).$$

Using the normal distribution approximation, we find that this value is approximately equal to 0.0062.

The average value of insurance payments under an excess-of-loss (stop-loss) reinsurance contract, as one of the types of reinsurance, can be approximated by using the normal distribution as the distribution of total insurance payments.

Let the total insurance payments X have a normal distribution with mean μ and variance σ². Then the expected payment under a stop-loss contract with deductible d is

$$E\left[(X - d)_+\right] = \sigma\,\varphi\!\left(\frac{d - \mu}{\sigma}\right) - (d - \mu)\left[1 - \Phi\!\left(\frac{d - \mu}{\sigma}\right)\right], \qquad (5.2)$$

where φ and Φ are the density and the distribution function of the standard normal law.

Example 5.4. Let us consider the insurance portfolio of Example 5.3. Let us find the mathematical expectation of the amount of insurance payments under an excess-of-loss reinsurance contract if

(a) there is no individual reinsurance and the deductible is set at 7,500,000;

(b) an individual retention of 20,000 is set on the individual insurance contracts and the deductible for the portfolio is 5,300,000.

Solution.

(a) In the absence of individual reinsurance, and taking 10,000 as the monetary unit, applying formula (5.2) with d = 750 gives

$$E\left[(S - 750)_+\right] \approx 4.377,$$

which is a sum of 43,770 in the original units.

(b) In Example 5.3 we obtained the mean and variance of the total payments at the individual retention level of 20,000 as 480 and 784, respectively, using 10,000 as the unit. Thus, σ = 28.

Applying formula (5.2) with d = 530 gives

$$E\left[(S - 530)_+\right] = 28\,\varphi\!\left(\frac{50}{28}\right) - 50\left[1 - \Phi\!\left(\frac{50}{28}\right)\right] \approx 0.414,$$

which is a sum of 4,140 in the original units.
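A one-line check of part (b) via formula (5.2) (SciPy assumed):

```python
from scipy.stats import norm

def stop_loss_normal(mu, sigma, d):
    """E[(X - d)_+] for X ~ N(mu, sigma^2), i.e. formula (5.2)."""
    z = (d - mu) / sigma
    return sigma * norm.pdf(z) - (d - mu) * (1 - norm.cdf(z))

print(stop_loss_normal(480, 28, 530) * 10_000)  # ~4,140 in original units
```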

Definition. The random variables X1, X2, …, Xn are called independent if for any x1, x2, …, xn the events

{ω: X1(ω) < x1}, {ω: X2(ω) < x2}, …, {ω: Xn(ω) < xn}

are independent.

It follows directly from the definition that for independent random variables X1, X2, …, Xn the distribution function of the n-dimensional random variable X = (X1, X2, …, Xn) is equal to the product of the distribution functions of the random variables X1, X2, …, Xn:

$$F(x_1, x_2, \dots, x_n) = F(x_1)\,F(x_2) \cdots F(x_n). \qquad (1)$$

Differentiating equality (1) n times with respect to x1, x2, …, xn, we get

$$p(x_1, x_2, \dots, x_n) = p(x_1)\,p(x_2) \cdots p(x_n). \qquad (2)$$

Another definition of the independence of random variables can be given.

If the distribution law of one random variable does not depend on which possible values the other random variables have taken, then such random variables are called independent in the aggregate.

For example, two lottery tickets of different draws are purchased. Let X be the amount of winnings on the first ticket and Y the amount of winnings on the second. The random variables X and Y are independent, since a win on one ticket does not affect the distribution law of the other. But if the tickets are of the same draw, then X and Y are dependent.

Two random variables are called independent if the distribution law of one of them does not change depending on which possible values the other variable has taken.

Theorem 1 (the convolution theorem, or "the theorem on the density of the sum of two random variables").

Let X = (X1, X2) be a continuous two-dimensional random variable with independent components, and let Y = X1 + X2. Then the distribution density of Y is

$$p_Y(t) = \int_{-\infty}^{\infty} p_{X_1}(x)\,p_{X_2}(t - x)\,dx.$$

Proof. It can be shown that if Y = φ(X), where X = (X1, X2, …, Xn), then

$$F_Y(t) = \int_{\{x:\,\varphi(x) < t\}} p(x)\,dx.$$

Then, for X = (X1, X2), the distribution function of Y = X1 + X2 can be written as follows (Fig. 1):

$$F_Y(t) = \iint_{x_1 + x_2 < t} p(x_1, x_2)\,dx_1\,dx_2 = \int_{-\infty}^{\infty} p_{X_1}(x_1)\left(\int_{-\infty}^{t - x_1} p_{X_2}(x_2)\,dx_2\right) dx_1.$$

Differentiating this expression with respect to t, in accordance with the definition, we obtain the distribution density of the random variable Y = X1 + X2:

$$p_Y(t) = \int_{-\infty}^{\infty} p_{X_1}(x_1)\,p_{X_2}(t - x_1)\,dx_1,$$

which was to be proved.

Let us derive a formula for finding the probability distribution of the sum of two independent discrete random variables.

Theorem 2. Let X1, X2 be independent discrete random variables. Then

$$P(X_1 + X_2 = x) = \sum_i P(X_1 = x_i)\,P(X_2 = x - x_i).$$

Proof. Represent the event A_x = {X1 + X2 = x} as a union of incompatible events:

$$A_x = \bigcup_i \{X_1 = x_i;\ X_2 = x - x_i\}.$$

Since X1, X2 are independent, P(X1 = x_i; X2 = x − x_i) = P(X1 = x_i)·P(X2 = x − x_i), so

$$P(A_x) = P\left(\bigcup_i \{X_1 = x_i;\ X_2 = x - x_i\}\right) = \sum_i P(X_1 = x_i)\,P(X_2 = x - x_i),$$

which was to be proved.

Example 1. Let X1, X2 be independent random variables having a normal distribution with parameters N(0; 1): X1, X2 ~ N(0; 1).

Let us find the distribution density of their sum Y = X1 + X2 (we denote the integration variable by x):

$$p_Y(t) = \int_{-\infty}^{\infty} \frac{1}{2\pi}\,e^{-x^2/2}\,e^{-(t - x)^2/2}\,dx = \frac{1}{2\sqrt{\pi}}\,e^{-t^2/4} \int_{-\infty}^{\infty} \frac{1}{\sqrt{\pi}}\,e^{-(x - t/2)^2}\,dx.$$

It is easy to see that the integrand is the distribution density of a normal random variable with parameters a = t/2, σ = 1/√2, i.e., the integral equals 1.

The function p_Y(t) = (1/(2√π))·e^{−t²/4} is the density of the normal distribution with parameters a = 0, σ = √2. Thus, the sum of independent normal random variables with parameters (0, 1) has a normal distribution with parameters (0, √2), i.e., Y = X1 + X2 ~ N(0; √2).
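A simulation check of this example (sample size is illustrative):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
y = rng.normal(size=400_000) + rng.normal(size=400_000)
hist, edges = np.histogram(y, bins=60, range=(-5, 5), density=True)
mid = (edges[:-1] + edges[1:]) / 2
print(np.max(np.abs(hist - norm.pdf(mid, scale=np.sqrt(2)))))  # small
```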

Example 2. Let two discrete independent random variables X1, X2 with Poisson distributions be given:

$$P(X_1 = k) = \frac{\lambda_1^k}{k!}\,e^{-\lambda_1}, \qquad P(X_2 = m) = \frac{\lambda_2^m}{m!}\,e^{-\lambda_2},$$

where k, m, n = 0, 1, 2, ….

By Theorem 2 we have:

$$P(X_1 + X_2 = n) = \sum_{k=0}^{n} \frac{\lambda_1^k}{k!}\,e^{-\lambda_1}\,\frac{\lambda_2^{n-k}}{(n-k)!}\,e^{-\lambda_2} = \frac{(\lambda_1 + \lambda_2)^n}{n!}\,e^{-(\lambda_1 + \lambda_2)},$$

i.e., Y = X1 + X2 has a Poisson distribution with parameter λ1 + λ2.
Example 3. Let X1, X2 be independent random variables with exponential distribution, p(x) = λe^{−λx}, x ≥ 0. Let us find the density of Y = X1 + X2.

Denote x = x1. Since X1, X2 are independent random variables, we use the "convolution theorem":

$$p_Y(t) = \int_0^t \lambda e^{-\lambda x}\,\lambda e^{-\lambda(t - x)}\,dx = \lambda^2 t\,e^{-\lambda t}, \qquad t \ge 0.$$

It can be shown that the sum Y = X1 + … + Xn (where each X_i has an exponential distribution with parameter λ) has a distribution called the Erlang distribution of order (n − 1). This law was obtained by modeling the operation of telephone exchanges in the first works on queueing theory.

In mathematical statistics one often uses the distribution laws of random variables that are functions of independent normal random variables. Let us consider the three laws most frequently encountered in modeling random phenomena.

Theorem 3. If the random variables X1, …, Xn are independent, then the functions of these random variables Y1 = f1(X1), …, Yn = fn(Xn) are also independent.

The Pearson distribution (χ²-distribution). Let X1, …, Xn be independent normal random variables with parameters a = 0, σ = 1. Compose the random variable

$$\chi^2 = X_1^2 + X_2^2 + \dots + X_n^2.$$

It can be shown that its density for x > 0 has the form

$$p(x) = k_n\,x^{n/2 - 1}\,e^{-x/2},$$

where k_n is the coefficient chosen so that the normalization condition is satisfied. As n → ∞, the Pearson distribution tends to the normal distribution.

Let X1, X2, …, Xn ~ N(a, σ). Then the random variables (X_i − a)/σ ~ N(0, 1), and therefore the random variable

$$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - a}{\sigma}\right)^2$$

has a χ²-distribution with n degrees of freedom.

The Pearson distribution is tabulated and used in various applications of mathematical statistics (for example, in testing the goodness-of-fit hypothesis that data agree with a hypothesized distribution law).
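A short sketch of the construction, together with reading a percentage point from SciPy's tabulated χ²-distribution (n = 4 is an illustrative choice):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(5)
n = 4
samples = (rng.normal(size=(200_000, n))**2).sum(axis=1)  # sum of n squares
hist, edges = np.histogram(samples, bins=60, range=(0, 16), density=True)
mid = (edges[:-1] + edges[1:]) / 2
print(np.max(np.abs(hist - chi2.pdf(mid, df=n))))   # small sampling error
# tabulated percentage point, e.g. the 5% point chi^2_{0.05}(4):
print(chi2.ppf(0.95, df=4))                          # ~9.488
```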