An introduction to random variables (part 1)

Taken from Mike Puddephat's PhD, this article introduces probability theory and random variables.

If more information is required, the book "Probability and Random Processes" by Grimmett and Stirzaker (1982) is a good place to start.

Suppose a needle is dropped onto a floor made up of planks of wood. The needle may or may not intersect one of the joints between the planks. A single throw of the needle is called an experiment or trial. There are two possible outcomes: either the needle intersects one or more joints, or it lands between them. By repeating the experiment a large number of times, the probability P of a particular outcome or event can be estimated. Let A be the event "needle intersects a joint" and let N(A) be the number of occurrences of A over n trials. As n tends to infinity, N(A)/n converges to the probability that A occurs on any particular trial, P(A). On occasion, a probability can be assigned to an outcome without experiment; Buffon (1777) found this to be the case for the needle-throwing experiment. The set of all possible outcomes of an experiment is called the sample space and is denoted by Ω. A random variable is a function X : Ω → ℜ.
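The relative-frequency definition of probability lends itself to simulation. The following sketch, written in Python, throws a needle n times and reports N(A)/n; the plank width, needle length and function names are illustrative assumptions rather than part of Buffon's treatment. For a needle of length l no greater than the plank width t, Buffon showed that P(A) = 2l/πt.

import math
import random

def estimate_crossing_probability(n, plank_width=1.0, needle_length=1.0):
    """Estimate P(A), where A is the event 'needle intersects a joint', as N(A)/n."""
    crossings = 0
    for _ in range(n):
        # The needle's centre lands a uniformly random distance from the
        # nearest joint, at a uniformly random angle to the joints.
        d = random.uniform(0.0, plank_width / 2.0)
        theta = random.uniform(0.0, math.pi)
        # The needle crosses a joint when the projection of its half-length
        # across the planks reaches the nearest joint.
        if d <= (needle_length / 2.0) * math.sin(theta):
            crossings += 1
    return crossings / n

for n in (100, 10_000, 1_000_000):
    print(n, estimate_crossing_probability(n))
print("exact:", 2.0 / math.pi)  # Buffon's value 2l/(pi*t) with l = t = 1

As n increases, the printed estimates settle towards 2/π ≈ 0.6366, exactly the convergence of N(A)/n described above.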

Uppercase letters will be used to represent generic random variables, whilst lowercase letters will be used to represent possible numerical values of these variables. To describe the probability of possible values of X, consider the following definition. The distribution function of a random variable X is the function FX : ℜ → [0, 1] given by FX(x) = P(X ≤ x).

Discrete Random Variables

The random variable X is discrete if it takes values only in some countable subset {x₁, x₂, …} of ℜ. The distribution function of such a random variable has jump discontinuities at the values x₁, x₂, … and is constant in between. The function fX : ℜ → [0, 1] given by fX(x) = P(X = x) is called the (probability) mass function of X. The mean value, or expectation, or expected value of X with mass function fX, is defined to be

E(X) = \sum_{x} x f_X(x)    (1)

The expected value of X is often written as μ.

It is often of great interest to measure the extent to which a random variable X is dispersed. The variance of X or Var(X) is defined as follows:

\mathrm{Var}(X) = E\left( (X - \mu)^2 \right)    (2)

The variance of X is often written as σ², while its positive square root is called the standard deviation. Since X is discrete, (2) can be re-expressed accordingly:

\mathrm{Var}(X) = \sum_{x} (x - \mu)^2 f_X(x)    (3)

In the special case where the mass function is constant, i.e. fX(x) = 1/n when X takes n real values x₁, …, xₙ, (3) reduces to the well-known equation determining the variance of a set of n numbers:

\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2    (4)
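Equations (1), (3) and (4) translate directly into a few lines of code. The following Python sketch evaluates the mean and variance of a discrete random variable from its mass function stored as a dictionary; the fair die is an assumed example of the constant mass function case.

from fractions import Fraction

def mean(mass):
    """Expected value E(X) = sum over x of x * f_X(x), as in equation (1)."""
    return sum(x * p for x, p in mass.items())

def variance(mass):
    """Var(X) = sum over x of (x - mu)^2 * f_X(x), as in equation (3)."""
    mu = mean(mass)
    return sum((x - mu) ** 2 * p for x, p in mass.items())

# A fair six-sided die: constant mass function f_X(x) = 1/6, the case behind (4).
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(mean(die))      # 7/2
print(variance(die))  # 35/12

Fractions are used so that the results are exact rather than floating-point approximations.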

Events A and B are said to be independent if and only if the occurrence of A does not change the probability of B occurring. An equivalent statement is P(A ∩ B) = P(A)P(B). Similarly, the discrete random variables X and Y are called independent if the numerical value of X does not affect the distribution of Y. In other words, the events {X = x} and {Y = y} are independent for all x and y. The joint distribution function FX, Y : ℜ2 → [0, 1] of X and Y is given by FX, Y(x, y) = P(X ≤ x and Y ≤ y). Their joint mass function fX, Y : ℜ2 → [0, 1] is given by fX, Y(x, y) = P(X = x and Y = y). X and Y are independent if and only if fX, Y(x, y) = fX(x)fY(y) for all x, y ∈ ℜ.
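The condition fX, Y(x, y) = fX(x)fY(y) can be tested mechanically. In the Python sketch below, the joint mass function of two fair dice is an assumed example; the marginal mass functions are recovered by summing the joint mass over the other variable, and the factorisation is checked at every point.

from fractions import Fraction
from itertools import product

# Joint mass function of two fair dice (an assumed example):
# every pair (x, y) has probability 1/36.
joint = {(x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)}

# Marginal mass functions, obtained by summing the joint mass over the other variable.
fx = {x: sum(p for (a, _), p in joint.items() if a == x) for x in range(1, 7)}
fy = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(1, 7)}

# X and Y are independent iff the joint mass factorises at every (x, y).
print(all(joint[x, y] == fx[x] * fy[y] for x, y in joint))  # True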

Consider an archer, shooting arrows at the target shown in Figure 1. Suppose the archer is a very poor shot and hits the target randomly; in other words, target regions of equal area have the same probability of being hit. For simplicity, it is assumed the archer always hits the target. If the archer is allowed to fire two arrows, the sample space is

Ω = { AA, AB, AC, AD, AE, BA, BB, …, DD, DE, EA, EB, EC, ED, EE }.

Figure 1

Figure 1: An archery target. A hit in region A scores 4 points, B scores 3 points, C scores 2 points, D scores 1 point and E scores nothing.

Let the variable X(ω) represent the score of a particular outcome. The scoring guidelines outlined in Figure 1 imply

X(AA) = 8, X(AB) = X(BA) = 7, X(AC) = X(BB) = X(CA) = 6, …,
X(CE) = X(DD) = X(EC) = 2, X(DE) = X(ED) = 1, X(EE) = 0.

Clearly X is a discrete random variable, mapping the sample space Ω to scores (real numbers).

The probability that an arrow hits a target region is directly proportional to the area of the region. The regions A to E are bounded by concentric circles with the radii shown in Figure 1. The probabilities of hitting A to E are 1/25, 3/25, 5/25, 7/25 and 9/25 respectively. The mass function of X, fX(x), is then

fX(0) = P(X = 0) = P(Hit E)P(Hit E) = 81/625,
fX(1) = P(X = 1) = 2·P(Hit D)P(Hit E) = 126/625,
fX(2) = P(X = 2) = 2·P(Hit C)P(Hit E) + P(Hit D)P(Hit D) = 139/625 and so on.

From (1), the expected value of X is E(X) = 0·81/625 + 1·126/625 + 2·139/625 + 3·124/625 + … = 2.4. From (3), the variance of X is Var(X) = (0 − 2.4)²·81/625 + (1 − 2.4)²·126/625 + (2 − 2.4)²·139/625 + (3 − 2.4)²·124/625 + … = 2.72. The distribution function of X, FX(x), is then

FX(0) = P(X ≤ 0) = fX(0),
FX(1) = P(X ≤ 1) = fX(1) + fX(0),
FX(2) = P(X ≤ 2) = fX(2) + fX(1) + fX(0) and so on.

The distribution function FX(x) is shown in Figure 2.

Figure 2

Figure 2: The distribution function FX of X for the archery target.
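With only 25 equally structured outcomes in Ω, these figures are easily verified by exhaustive enumeration. The Python sketch below rebuilds the mass function, expectation, variance and distribution function from the region scores and probabilities given above; the variable names are illustrative.

from fractions import Fraction
from itertools import product

score = {"A": 4, "B": 3, "C": 2, "D": 1, "E": 0}
prob  = {"A": Fraction(1, 25), "B": Fraction(3, 25), "C": Fraction(5, 25),
         "D": Fraction(7, 25), "E": Fraction(9, 25)}

# Mass function of X: accumulate the probability of every two-arrow outcome
# under its total score.
f = {}
for r1, r2 in product(score, repeat=2):
    x = score[r1] + score[r2]
    f[x] = f.get(x, 0) + prob[r1] * prob[r2]

print(f[0], f[1], f[2])  # 81/625, 126/625, 139/625

mu = sum(x * p for x, p in f.items())
var = sum((x - mu) ** 2 * p for x, p in f.items())
print(mu, var)           # 12/5 (= 2.4) and 68/25 (= 2.72)

# Distribution function F_X(x) = P(X <= x), by accumulating the mass function.
F, total = {}, Fraction(0)
for x in sorted(f):
    total += f[x]
    F[x] = total
print(F[2])              # 346/625, matching f(0) + f(1) + f(2)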

Continuous Random Variables

The random variable X is continuous if its distribution function can be expressed as

F_X(x) = \int_{-\infty}^{x} f_X(u)\,du    (5)

for some integrable function fX : ℜ → [0, ∞). In this case, fX is called the (probability) density function of X. The fundamental theorem of calculus and (5) imply

f_X(x) = \frac{d F_X(x)}{dx}

fX(x)δx can be thought of as the element of probability P(x ≤ X ≤ x + δx), where

P(x \le X \le x + \delta x) = F_X(x + \delta x) - F_X(x) \approx f_X(x)\,\delta x    (6)

If B1 is a measurable subset of ℜ (such as a line segment or union of line segments) then

P(X \in B_1) = \int_{B_1} f_X(x)\,dx    (7)

where P(XB1) is the probability that the outcome of this random choice lies in B1. The expected value (or expectation) of X with density function fX is

E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx    (8)

whenever this integral exists. The variance of X or Var(X) is defined by the already familiar (2). Since X is continuous, (2) can be re-expressed accordingly:

\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\,dx    (9)
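A rough numerical illustration of (5), (8) and (9) is sketched below in Python, approximating each integral by a midpoint rule; the density fX(x) = x/2 on [0, 2] is an assumed example, not one from the article.

def f(x):
    # An assumed density: f_X(x) = x/2 on [0, 2], zero elsewhere.
    return x / 2.0 if 0.0 <= x <= 2.0 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 0.0, 2.0)                             # 1, so f is a density
mu = integrate(lambda x: x * f(x), 0.0, 2.0)               # E(X) = 4/3, equation (8)
var = integrate(lambda x: (x - mu) ** 2 * f(x), 0.0, 2.0)  # Var(X) = 2/9, equation (9)
print(total, mu, var)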

The joint distribution function of the continuous random variables X and Y is the function FX, Y : ℜ2 → [0, 1] given by FX, Y(x, y) = P(X ≤ x, Y ≤ y). X and Y are (jointly) continuous with joint (probability) density function fX, Y : ℜ2 → [0, ∞) if

F_{X,Y}(x, y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f_{X,Y}(u, v)\,du\,dv

for each x, y ∈ ℜ. The fundamental theorem of calculus suggests the following result:

f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\,\partial y}

fX, Y(x, y)δxδy can be thought of as the element of probability P(x ≤ X ≤ x + δx, y ≤ Y ≤ y + δy), where

P(x \le X \le x + \delta x,\; y \le Y \le y + \delta y) \approx f_{X,Y}(x, y)\,\delta x\,\delta y    (10)

If B2 is a measurable subset of ℜ2 (such as a rectangle or union of rectangles and so on) then

P((X, Y) \in B_2) = \iint_{B_2} f_{X,Y}(x, y)\,dx\,dy    (11)

where P((X, Y) ∈ B2) is the probability that the outcome of this random choice lies in B2. X and Y are independent if and only if {X ≤ x} and {Y ≤ y} are independent events for all x, y ∈ ℜ. If X and Y are independent, FX, Y(x, y) = FX(x)FY(y) for all x, y ∈ ℜ. An equivalent condition is fX, Y(x, y) = fX(x)fY(y) whenever FX, Y is differentiable at (x, y).
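This factorisation can be observed empirically. The Python sketch below estimates the joint and marginal distribution functions of two independent uniform random variables on [0, 1] (an assumed example) by Monte Carlo, and compares FX, Y(x, y) with FX(x)FY(y) at one test point.

import random

# Independent X and Y, each uniform on [0, 1]; their true joint CDF is x*y.
n = 200_000
pairs = [(random.random(), random.random()) for _ in range(n)]

x0, y0 = 0.3, 0.7
joint = sum(1 for x, y in pairs if x <= x0 and y <= y0) / n
fx = sum(1 for x, _ in pairs if x <= x0) / n
fy = sum(1 for _, y in pairs if y <= y0) / n
print(joint, fx * fy)  # both close to 0.3 * 0.7 = 0.21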

An example of a continuous random variable can be found in the needle throwing described earlier. A needle is thrown onto the floor and lands at a random angle ω relative to some fixed axis, so the sample space Ω = [0, 2π). Every angle in [0, 2π) is equally likely; therefore, the probability that the angle lies in some interval is directly proportional to the length of the interval. Consider the continuous random variable X(ω) = ω. The distribution function of X, shown graphically in Figure 3, is

FX(x) = P(X ≤ x) = 0, (x ≤ 0)
FX(x) = P(X ≤ x) = x/2π, (0 ≤ x < 2π)
FX(x) = P(X ≤ x) = 1. (x ≥ 2π)

The density function, fX, of FX is as follows:

f_X(x) = \begin{cases} \dfrac{1}{2\pi}, & 0 \le x < 2\pi \\ 0, & \text{otherwise} \end{cases}

Figure 3

Figure 3: The distribution function FX of X for the needle.
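This straight-line distribution function is easily checked by simulation. The following Python sketch compares the empirical proportion of angles not exceeding x with x/2π; the sample size and test points are arbitrary choices.

import math
import random

# Empirical check of F_X(x) = x/(2*pi) for a uniformly random needle angle.
n = 100_000
angles = [random.uniform(0.0, 2.0 * math.pi) for _ in range(n)]

for x in (1.0, math.pi, 5.0):
    empirical = sum(1 for a in angles if a <= x) / n
    print(x, empirical, x / (2.0 * math.pi))  # the two values should nearly agree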

References

BUFFON, G. L. L. Comte de. Essai d'Arithmétique Morale. In: Supplément à l'Histoire Naturelle, v. 4. Paris: Imprimerie Royale (1777).

GRIMMETT, G. and STIRZAKER, D. Probability and Random Processes, Clarendon Press, Oxford (1982).
