
Taken from Mike Puddephat's PhD, this article introduces probability theory and random variables.

For more detail than this article provides, the book "*Probability and Random Processes*" by Grimmett & Stirzaker (1982) is a good place to start.

Suppose a needle is dropped onto a floor made up of planks of wood. The needle may or may not intersect one of the joints between the planks. A single throw of the needle is called an *experiment* or *trial*. There are two possible *outcomes*: either the needle intersects one or more joints, or it lands entirely between them. By repeating the experiment a large number of times, the probability **P** of a particular *outcome* or *event* can be estimated. Let *A* be the event "needle intersects a joint" and let *N*(*A*) be the number of occurrences of *A* over *n* trials. As *n* tends to infinity, *N*(*A*)/*n* converges to **P**(*A*), the probability that *A* occurs on any particular trial. Sometimes a probability can be assigned to an outcome without experiment; Buffon (1777) found this to be the case for the needle-throwing experiment. The set of all possible outcomes of an experiment is called the *sample space* and is denoted by Ω. A *random variable* is a function *X* : Ω → ℜ.
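The limiting frequency *N*(*A*)/*n* described above can be illustrated with a short simulation. The sketch below drops needles of length 1 onto planks of width 1 (both values are assumptions chosen for illustration) and compares the observed frequency with Buffon's closed-form answer 2*l*/(π*d*), valid when the needle is no longer than a plank is wide:

```python
import math
import random

def estimate_crossing_probability(n_trials, needle_len=1.0, plank_width=1.0, seed=0):
    """Estimate P(A) = P(needle crosses a joint) as N(A)/n."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_trials):
        # Distance from the needle's centre to the nearest joint,
        # and the needle's angle relative to the joints.
        y = rng.uniform(0.0, plank_width / 2.0)
        theta = rng.uniform(0.0, math.pi / 2.0)
        if y <= (needle_len / 2.0) * math.sin(theta):
            crossings += 1
    return crossings / n_trials

# For needle_len <= plank_width, Buffon's answer is P(A) = 2*l / (pi*d),
# which is 2/pi ≈ 0.6366 for the default values above.
print(estimate_crossing_probability(100_000))
```

With 100,000 trials the estimate typically agrees with 2/π to two decimal places, illustrating the convergence of *N*(*A*)/*n*.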

Uppercase letters will be used to represent generic random variables, while lowercase letters will be used to represent possible numerical values of these variables. To describe the probability of possible values of *X*, consider the following definition. The *distribution function* of a random variable *X* is the function *F _{X}* : ℜ → [0, 1] given by

*F _{X}*(*x*) = **P**(*X* ≤ *x*)

for each *x* ∈ ℜ.

The random variable *X* is *discrete* if it takes values only in some countable subset {*x*_{1}, *x*_{2}, …} of ℜ. The distribution function of such a random variable has jump discontinuities at the values *x*_{1}, *x*_{2}, … and is constant in between. The function *f _{X}* : ℜ → [0, 1] given by

*f _{X}*(*x*) = **P**(*X* = *x*)

is called the (probability) *mass function* of *X*. The *expected value* (or *expectation*) of the discrete random variable *X* is

**E**(*X*) = Σ_{*x*} *x* *f _{X}*(*x*)

(1)

whenever this sum is absolutely convergent. The expected value of *X* is often written as *μ*.

It is often of great interest to measure the extent to which a random variable *X* is dispersed about its mean. The *variance* of *X*, written Var(*X*), is defined as follows:

Var(*X*) = **E**((*X* − *μ*)^{2})

(2)

The variance of *X* is often written as σ^{2}, while its positive square root σ is called the *standard deviation*. Since *X* is discrete, (2) can be re-expressed accordingly:

Var(*X*) = Σ_{*x*} (*x* − *μ*)^{2} *f _{X}*(*x*)

(3)
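Definitions (1) and (3) translate directly into code. A minimal sketch, using a fair six-sided die as the discrete random variable (the die is an illustrative assumption, not part of the text above), with exact rational arithmetic:

```python
from fractions import Fraction

def expectation(mass):
    """E(X) = sum of x * f_X(x) over the support, as in (1)."""
    return sum(x * p for x, p in mass.items())

def variance(mass):
    """Var(X) = sum of (x - mu)^2 * f_X(x), as in (3)."""
    mu = expectation(mass)
    return sum((x - mu) ** 2 * p for x, p in mass.items())

# A fair six-sided die: f_X(x) = 1/6 for x = 1, ..., 6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(expectation(die))   # 7/2
print(variance(die))      # 35/12
```

Using `Fraction` keeps the probabilities exact, so the results are μ = 7/2 and σ^{2} = 35/12 rather than floating-point approximations.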

In the special case where the mass function is uniform over *n* values, i.e. *f _{X}*(*x _{i}*) = 1/*n* for *i* = 1, …, *n*, (3) reduces to

Var(*X*) = (1/*n*) Σ_{*i*=1}^{*n*} (*x _{i}* − *μ*)^{2}

(4)

Events *A* and *B* are said to be *independent* if and only if the occurrence of *A* does not change the probability of *B* occurring. An equivalent statement is **P**(*A* ∩ *B*) = **P**(*A*)**P**(*B*). Similarly, the discrete random variables *X* and *Y* are called *independent* if the numerical value of *X* does not affect the distribution of *Y*. In other words, the events {*X* = *x*} and {*Y* = *y*} are independent for all *x* and *y*. The *joint distribution function* *F _{X, Y}* of *X* and *Y* is given by *F _{X, Y}*(*x*, *y*) = **P**(*X* ≤ *x*, *Y* ≤ *y*); if *X* and *Y* are independent, then *F _{X, Y}*(*x*, *y*) = *F _{X}*(*x*)*F _{Y}*(*y*) for all *x* and *y*.
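The defining property of independence can be checked exhaustively for a small sample space. The sketch below uses two rolls of a fair four-sided die (an illustrative assumption) and verifies that the joint mass function factorizes into the product of the marginal mass functions:

```python
from itertools import product
from fractions import Fraction

# Sample space of two rolls of a fair four-sided die: 16 equally likely pairs.
omega = list(product(range(1, 5), repeat=2))
p = Fraction(1, len(omega))
joint = {pair: p for pair in omega}   # joint mass function P(X = x, Y = y)

# Marginal mass functions of X (first roll) and Y (second roll).
px = {x: sum(q for (a, _), q in joint.items() if a == x) for x in range(1, 5)}
py = {y: sum(q for (_, b), q in joint.items() if b == y) for y in range(1, 5)}

# Independence: P(X = x, Y = y) = P(X = x) P(Y = y) for every pair of values.
assert all(joint[(x, y)] == px[x] * py[y] for x, y in joint)
```

Because the rolls are independent, every one of the 16 factorization checks passes exactly.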

Consider an archer, shooting arrows at the target shown in Figure 1. Suppose the archer is a very poor shot and hits the target randomly; in other words, target regions of equal area have the same probability of being hit. For simplicity, it is assumed the archer always hits the target. If the archer is allowed to fire two arrows, the sample space is

Ω = { AA, AB, AC, AD, AE, BA, BB, …, DD, DE, EA, EB, EC, ED, EE }.

**Figure 1:** An archery target. A hit in region A scores 4 points, B scores 3 points, C scores 2 points, D scores 1 point and E scores nothing.

Let the variable *X*(ω) represent the score of a particular outcome. The scoring guidelines outlined in Figure 1 imply

*X*(AA) = 8, *X*(AB) = *X*(BA) = 7, *X*(AC) = *X*(BB) = *X*(CA) = 6, …, *X*(CE) = *X*(DD) = *X*(EC) = 2, *X*(DE) = *X*(ED) = 1, *X*(EE) = 0.

Clearly *X* is a discrete random variable, mapping the sample space Ω to scores (real numbers).

The probability that an arrow hits a target region is directly proportional to the area of the region. The regions A to E are annuli with inner and outer radii as shown in Figure 1. The probabilities of hitting A to E are 1/25, 3/25, 5/25, 7/25 and 9/25 respectively. Since the two arrows are independent, the mass function of *X*, *f _{X}*, is

*f _{X}*(0) = 81/625, *f _{X}*(1) = 126/625, *f _{X}*(2) = 139/625, *f _{X}*(3) = 124/625, *f _{X}*(4) = 85/625, *f _{X}*(5) = 44/625, *f _{X}*(6) = 19/625, *f _{X}*(7) = 6/625, *f _{X}*(8) = 1/625.

From (1), the expected value of *X* is **E**(*X*) = 0⋅81/625 + 1⋅126/625 + 2⋅139/625 + 3⋅124/625 + … = 2.4. From (3), the variance of *X* is Var(*X*) = (2.4^{2})⋅81/625 + (1.4^{2})⋅126/625 + (0.4^{2})⋅139/625 + (0.6^{2})⋅124/625 + … = 2.72. The distribution function of *X*, *F _{X}*, is

*F _{X}*(*x*) = 0 for *x* < 0, 81/625 for 0 ≤ *x* < 1, 207/625 for 1 ≤ *x* < 2, 346/625 for 2 ≤ *x* < 3, 470/625 for 3 ≤ *x* < 4, 555/625 for 4 ≤ *x* < 5, 599/625 for 5 ≤ *x* < 6, 618/625 for 6 ≤ *x* < 7, 624/625 for 7 ≤ *x* < 8, and 1 for *x* ≥ 8.
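These numbers can be reproduced by enumerating the 25 outcomes for two independent arrows, using the region scores and hit probabilities from Figure 1:

```python
from fractions import Fraction
from itertools import product

# Region scores and hit probabilities from Figure 1.
score = {'A': 4, 'B': 3, 'C': 2, 'D': 1, 'E': 0}
prob = {'A': Fraction(1, 25), 'B': Fraction(3, 25), 'C': Fraction(5, 25),
        'D': Fraction(7, 25), 'E': Fraction(9, 25)}

# Two independent arrows: X(w1 w2) = score(w1) + score(w2).
mass = {}
for r1, r2 in product(score, repeat=2):
    x = score[r1] + score[r2]
    mass[x] = mass.get(x, Fraction(0)) + prob[r1] * prob[r2]

mu = sum(x * p for x, p in mass.items())               # E(X), as in (1)
var = sum((x - mu) ** 2 * p for x, p in mass.items())  # Var(X), as in (3)
print(mass[0], mu, var)   # 81/625 12/5 68/25
```

The exact fractions 12/5 and 68/25 agree with the decimal values 2.4 and 2.72 quoted above.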

The distribution function *F _{X}* is shown graphically in Figure 2.

**Figure 2:** The distribution function *F _{X}* of *X*.

The random variable *X* is *continuous* if its distribution function can be expressed as

*F _{X}*(*x*) = ∫_{−∞}^{*x*} *f _{X}*(*u*) d*u*

(5)

for some integrable function *f _{X}* : ℜ → [0, ∞). In this case, *f _{X}* is called the (probability) *density function* of *X*. Wherever *F _{X}* is differentiable, the fundamental theorem of calculus gives

*f _{X}*(*x*) = d*F _{X}*(*x*)/d*x*.

(6)

*f*(*x*)δ*x* can be thought of as the element of probability **P**(*x* ≤ *X* ≤ *x* + δ*x*), since for small δ*x*, **P**(*x* ≤ *X* ≤ *x* + δ*x*) = *F _{X}*(*x* + δ*x*) − *F _{X}*(*x*) ≈ *f _{X}*(*x*)δ*x*.

If *B*_{1} is a measurable subset of ℜ (such as a line segment or a union of line segments) then

**P**(*X* ∈ *B*_{1}) = ∫_{*B*_{1}} *f _{X}*(*x*) d*x*

(7)

where **P**(*X* ∈ *B*_{1}) is the probability that the outcome of this random choice lies in *B*_{1}. The expected value (or *expectation*) of *X* with density function *f _{X}* is

**E**(*X*) = ∫_{−∞}^{∞} *x* *f _{X}*(*x*) d*x*

(8)

whenever this integral exists. The *variance* of *X*, Var(*X*), is defined by the already familiar (2). Since *X* is continuous, (2) can be re-expressed accordingly:

Var(*X*) = ∫_{−∞}^{∞} (*x* − *μ*)^{2} *f _{X}*(*x*) d*x*

(9)

The *joint distribution function* of the continuous random variables *X* and *Y* is the function *F _{X, Y}* : ℜ^{2} → [0, 1] given by

*F _{X, Y}*(*x*, *y*) = **P**(*X* ≤ *x*, *Y* ≤ *y*)

for each *x*, *y* ∈ ℜ. The fundamental theorem of calculus suggests the following result: the *joint density function* of *X* and *Y* is

*f _{X, Y}*(*x*, *y*) = ∂^{2}*F _{X, Y}*(*x*, *y*)/∂*x*∂*y*

wherever the derivative exists.

If *B*_{2} is a measurable subset of ℜ^{2} (such as a rectangle or a union of rectangles) then

**P**((*X*, *Y*) ∈ *B*_{2}) = ∫∫_{*B*_{2}} *f _{X, Y}*(*x*, *y*) d*x* d*y*

where **P**((*X*, *Y*) ∈ *B*_{2}) is the probability that the outcome of this random choice lies in *B*_{2}. *X* and *Y* are *independent* if and only if {*X* ≤ *x*} and {*Y* ≤ *y*} are independent events for all *x*, *y* ∈ ℜ. If *X* and *Y* are independent, *F _{X, Y}*(*x*, *y*) = *F _{X}*(*x*)*F _{Y}*(*y*) (and hence *f _{X, Y}*(*x*, *y*) = *f _{X}*(*x*)*f _{Y}*(*y*)) for all *x*, *y* ∈ ℜ.

An example of a continuous random variable can be found in the needle throwing described earlier. A needle is thrown onto the floor and lands with random angle ω relative to some fixed axis, so the sample space is Ω = [0, 2π). Each angle ω in [0, 2π) is equally likely; therefore, the probability that the angle lies in some interval is directly proportional to the length of the interval. Consider the continuous random variable *X*(ω) = ω. The distribution function of *X*, shown graphically in Figure 3, is

*F _{X}*(*x*) = 0 for *x* < 0, *x*/2π for 0 ≤ *x* < 2π, and 1 for *x* ≥ 2π.

The density function, *f _{X}*, of *X* is *f _{X}*(*x*) = 1/2π for 0 ≤ *x* < 2π, and 0 otherwise.
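As a numerical check, the integrals in (8) and (9) can be approximated for this density with a midpoint Riemann sum (the sum itself is an illustrative numerical method, not part of the text above); analytically, a uniform angle on [0, 2π) has **E**(*X*) = π and Var(*X*) = (2π)^{2}/12 = π^{2}/3:

```python
import math

# Density of the needle angle: f_X(x) = 1/(2*pi) on [0, 2*pi), 0 elsewhere.
def f(x):
    return 1.0 / (2.0 * math.pi) if 0.0 <= x < 2.0 * math.pi else 0.0

# Approximate the integrals in (8) and (9) with a midpoint Riemann sum.
n = 100_000
dx = 2.0 * math.pi / n
xs = [(k + 0.5) * dx for k in range(n)]
mu = sum(x * f(x) * dx for x in xs)                # E(X), should be pi
var = sum((x - mu) ** 2 * f(x) * dx for x in xs)   # Var(X), should be pi^2/3
print(mu, var)
```

With 100,000 subintervals the two sums agree with π and π^{2}/3 to well within three decimal places.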

**Figure 3:** The distribution function *F _{X}* of *X*.

BUFFON, G. L. L. Comte de. Essai d'Arithmétique Morale. In: Supplément à l'Histoire Naturelle, v. 4. Paris: Imprimerie Royale (1777).

GRIMMETT, G. and STIRZAKER, D. Probability and Random Processes. Oxford: Clarendon Press (1982).

Posted: 14 February, 2012
