# Chapter 13: Markov Chains
“Do the difficult things while they are easy and do the great things while they are small. A journey of a thousand miles must begin with a single step.”
—Lao Tzu
A stochastic process \(\{X_{t},t \in T\}\) is a collection of random variables, in which the index \(t\) usually represents time. We refer to \(X_{t}\) as the state of the process at time \(t\). The set \(T\) is called the index set of the process. When \(T\) is a countable set (indexed by integers), the process is said to be a discrete time process. If \(T\) is an interval of the real line, the process is said to be a continuous time process.
Example 73
\(X_{t}\) is the total number of customers who have entered a supermarket by time \(t\). This is a discrete state continuous time stochastic process. \(X_{t} = 0,1,\ldots,\) and \(t \ge 0\).
Example 74
\(X_{t}\) is the number of individuals in a population in each generation \(t\), i.e., \(X_{t} = 0,1,2,\ldots\), and \(t = 0, 1, 2, \dots\) This is a discrete state discrete time stochastic process.
## Discrete time Markov chains
Markov property
Given the current state \(X_{n}\), the future state \(X_{n + 1}\) does not depend on the past states \((X_{n - 1},\ X_{n - 2},\ldots,X_{0})\), i.e., \(P\left( X_{n + 1} \middle| X_{n},\ldots,X_{0} \right) = P(X_{n + 1}|X_{n})\).
Definition 28 (Markov chain)
A discrete time Markov chain \(\{ X_{n},n = 0,1,2,\ldots\}\) is a discrete state discrete time stochastic process that satisfies the Markov property.
Suppose that \(\{X_{n}\}\) can take on non-negative integers, i.e., the state space is \(\{0, 1, 2, \dots\}\). If \(X_{n} = i\), we say that the process is in state \(i\) at time \(n\).
How do we define a DTMC?
In the context of a discrete-time Markov chain (DTMC) denoted as \(\{X_{n}\}\), the objective is to determine the probability distribution \(\pi_n\) of the random variable \(X_n\) at time \(n\). The specification of a DTMC requires knowledge of the initial distribution \(\pi_0\) at time 0 and the transition probabilities \(P(X_{n+1}|X_n)\) from time \(n\) to time \(n+1\), which describe how the chain begins and how it moves.
### One-step transition probability matrix
The one-step (from time \(n\) to time \(n+1\)) transition probability from state \(i\) to state \(j\) is defined as

\[
P_{ij}^{(n)} = P\left( X_{n+1} = j \mid X_{n} = i \right).
\]

Note that

\[
P_{ij}^{(n)} \geq 0 \quad \text{and} \quad \sum_{j} P_{ij}^{(n)} = 1 \ \text{for every } i.
\]

If the one-step transition probabilities are independent of the time index \(n\), i.e.,

\[
P\left( X_{n+1} = j \mid X_{n} = i \right) = P\left( X_{1} = j \mid X_{0} = i \right) \ \text{for all } n,
\]

they are called time homogeneous transition probabilities and denoted by \(P_{ij}\). Note that

\[
P_{ij} \geq 0 \quad \text{and} \quad \sum_{j} P_{ij} = 1 \ \text{for every } i.
\]
We use \(P\) to denote the matrix of time homogeneous transition probabilities \(P_{ij}\), where \(i, j = 0,1,\dots\).
A discrete time Markov chain is fully determined once the initial distribution of \(X_0\) and the homogeneous transition probability matrix are given: the initial distribution defines how the chain starts, and the transition probability matrix \(P\) determines how the chain moves, as the simulation sketch below illustrates.
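To make "starts and moves" concrete, here is a minimal R sketch that simulates a trajectory of a DTMC from an initial distribution and a transition matrix. The function name `simulate_dtmc` and the two-state matrix are illustrative, not from the text:

```r
# simulate n steps of a DTMC: x[1] ~ pi_0, then x[k+1] ~ P[x[k], ]
simulate_dtmc <- function(pi_0, P, n) {
  states <- seq_along(pi_0)
  x <- integer(n + 1)
  x[1] <- sample(states, 1, prob = pi_0)             # how the chain starts
  for (k in 1:n) {
    x[k + 1] <- sample(states, 1, prob = P[x[k], ])  # how the chain moves
  }
  x
}

# illustrative two-state chain (states labeled 1 and 2)
P <- matrix(c(0.8, 0.2,
              0.1, 0.9), 2, 2, byrow = TRUE)
simulate_dtmc(pi_0 = c(0.5, 0.5), P = P, n = 10)
```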
Example 75
Suppose that a gene has two alleles \(A\) and \(a\). Each allele may mutate to the other type in the next generation: the chance that \(A\) mutates to \(a\) is \(\alpha\), and the chance that \(a\) mutates to \(A\) is \(\beta\). The evolution of alleles \(A\) and \(a\) over generations is a Markov chain with one-step transition probabilities

\[
P_{aa} = 1 - \beta, \quad P_{aA} = \beta, \quad P_{Aa} = \alpha, \quad P_{AA} = 1 - \alpha.
\]

The transition probabilities are independent of time \(n\). Thus, they are time homogeneous transition probabilities. With states ordered \((a, A)\), the one-step transition probability matrix is

\[
P = \begin{pmatrix} 1-\beta & \beta \\ \alpha & 1-\alpha \end{pmatrix}.
\]
Example 76
We consider a population of \(N\) individuals. Let \(X_{n}\) denote the number of individuals with allele \(A\) in generation \(n\), and suppose \(X_{0} = k\), \(0 \leq k \leq N\). Given \(X_{n}\) (the number of \(A\) alleles in generation \(n\)), \(X_{n+1}\) has the Binomial\((N, X_n/N)\) distribution. The transition probabilities are independent of time \(n\). Thus, they are time homogeneous transition probabilities. The one-step transition probability matrix is given by

\[
P = \begin{pmatrix}
B\left(0,\frac{0}{N}\right) & B\left(1,\frac{0}{N}\right) & \cdots & B\left(N,\frac{0}{N}\right) \\
B\left(0,\frac{1}{N}\right) & B\left(1,\frac{1}{N}\right) & \cdots & B\left(N,\frac{1}{N}\right) \\
\vdots & \vdots & \ddots & \vdots \\
B\left(0,\frac{N}{N}\right) & B\left(1,\frac{N}{N}\right) & \cdots & B\left(N,\frac{N}{N}\right)
\end{pmatrix},
\]

where \(B\left(i,\frac{j}{N}\right)\) is the probability \(P(X = i)\) when \(X\) has the Binomial\(\left(N, p = \frac{j}{N}\right)\) distribution; row \(j\) contains the transition probabilities out of state \(j\).
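As a concrete sketch, this transition matrix can be built in R with `dbinom`; the population size \(N = 4\) below is an arbitrary illustrative choice:

```r
# Wright-Fisher transition matrix: the entry in row j+1, column i+1 is
# B(i, j/N), the Binomial(N, j/N) probability of i successes
N <- 4
states <- 0:N
P <- outer(states, states,
           function(j, i) dbinom(i, size = N, prob = j / N))
rowSums(P)   # each row sums to 1
```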
Example 77
A gambling model. A gambler either wins \$1 with probability \(p\) or loses \$1 with probability \(1-p\) on each play. If the gambler quits playing either when he goes broke or when he attains a fortune of \(\$N\), the one-step transition probabilities are \(P_{i,i + 1} = p\) and \(P_{i,i - 1} = 1 - p\), for \(i = 1,\ldots,N-1\). In addition, \(P_{00} = P_{NN} = 1\). The one-step transition probability matrix is

\[
P = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1-p & 0 & p & \cdots & 0 & 0 & 0 \\
0 & 1-p & 0 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1-p & 0 & p \\
0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}.
\]
Important
In this example, states \(0\) and \(N\) are absorbing states: once the process enters either of them, it never leaves.
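A minimal R sketch of this matrix, assuming the illustrative values \(N = 5\) and \(p = 0.4\):

```r
# gambler's ruin transition matrix; row i+1 corresponds to state i
N <- 5
p <- 0.4
P <- matrix(0, N + 1, N + 1)
P[1, 1] <- 1                  # state 0 (broke) is absorbing
P[N + 1, N + 1] <- 1          # state N (target fortune) is absorbing
for (i in 2:N) {              # interior states 1, ..., N - 1
  P[i, i + 1] <- p            # win $1
  P[i, i - 1] <- 1 - p        # lose $1
}
P
```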
Example 78
In Example 75, if the initial distribution is \(P(X_0) = (0.1, 0.9)\) (states ordered \((a, A)\)), \(\alpha = 0.1\), and \(\beta=0.2\), find \(P(X_1)\).
The one-step transition probability matrix is

\[
P = \begin{pmatrix} 1-\beta & \beta \\ \alpha & 1-\alpha \end{pmatrix} = \begin{pmatrix} 0.8 & 0.2 \\ 0.1 & 0.9 \end{pmatrix}.
\]

Next, we find the probability distribution \(P(X_1)\):

\[
P(X_1 = a) = P(X_0 = a)P_{aa} + P(X_0 = A)P_{Aa} = 0.1 \times 0.8 + 0.9 \times 0.1 = 0.17,
\]
\[
P(X_1 = A) = P(X_0 = a)P_{aA} + P(X_0 = A)P_{AA} = 0.1 \times 0.2 + 0.9 \times 0.9 = 0.83.
\]

We have a matrix representation for the probability distribution of \(X_{1}\):

\[
P(X_1) = P(X_0)\,P = (0.1, 0.9)\begin{pmatrix} 0.8 & 0.2 \\ 0.1 & 0.9 \end{pmatrix} = (0.17, 0.83).
\]
Important
In general, the probability distribution of \(X_n\) is determined by the initial distribution \(P(X_0)\) and the one-step transition probability matrix \(P\), i.e.,

\[
P(X_n) = P(X_0)\,P^{n}.
\]
```r
# one-step transition probability matrix of Example 75,
# states ordered (a, A), with alpha = 0.1 and beta = 0.2
p = matrix(c(0.8,0.1,0.2,0.9),2,2)
# initial distribution P(X_0)
x_0 = c(0.1,0.9)
# P(X_1) = P(X_0) P
x_1 = x_0 %*% p
print(paste("the probability distribution of x_1 is", x_1))
# the expm package provides the matrix power operator %^%
library(expm)
# P(X_10) = P(X_0) P^10
x_10 = x_0 %*% (p %^% 10)
print(paste("the probability distribution of x_10 is",x_10))
```
[1] "the probability distribution of x_1 is 0.17"
[2] "the probability distribution of x_1 is 0.83"
Loading required package: Matrix
Attaching package: ‘expm’
The following object is masked from ‘package:Matrix’:
expm
[1] "the probability distribution of x_10 is 0.32674224419"
[2] "the probability distribution of x_10 is 0.67325775581"
### Limiting probability distribution
Another major goal of Markov chain theory is to determine the limiting probabilities \(\lim_{n \rightarrow \infty}{P(X_{n})}\) as time \(n\) goes to infinity.
Theorem 21
If the Markov chain is ergodic (irreducible, aperiodic, and positive recurrent), the limiting probabilities \(\pi = (\pi_{j})\) exist and satisfy the equations

\[
\pi = \pi P, \qquad \sum_{j}\pi_{j} = 1.
\]
```r
# P(X_n) = P(X_0) P^n settles down as n grows
library(expm)
print(paste("the probability for time = 10", round(x_0 %*% (p %^% 10),5)))
print(paste("the probability for time = 20", round(x_0 %*% (p %^% 20),5)))
print(paste("the probability for time = 30", round(x_0 %*% (p %^% 30),5)))
print(paste("the probability for time = 40", round(x_0 %*% (p %^% 40),5)))
print(paste("the probability for time = 50", round(x_0 %*% (p %^% 50),5)))
print(paste("the probability for time = 60", round(x_0 %*% (p %^% 60),5)))
```
[1] "the probability for time = 10 0.32674"
[2] "the probability for time = 10 0.67326"
[1] "the probability for time = 20 0.33315"
[2] "the probability for time = 20 0.66685"
[1] "the probability for time = 30 0.33333"
[2] "the probability for time = 30 0.66667"
[1] "the probability for time = 40 0.33333"
[2] "the probability for time = 40 0.66667"
[1] "the probability for time = 50 0.33333"
[2] "the probability for time = 50 0.66667"
[1] "the probability for time = 60 0.33333"
[2] "the probability for time = 60 0.66667"
It is important to note that once the chain attains the limiting probability distribution, it remains in that distribution thereafter. Consequently, the limiting probability distribution is also referred to as the stationary distribution.
```r
# once at the limiting distribution, one more step leaves it unchanged
x_lim = x_0 %*% (p %^% 60)
print(paste("the limit probability is",round(x_lim,5)))
print(paste("the probability after one more move is", round(x_lim %*% p,5)))
```
[1] "the limit probability is 0.33333" "the limit probability is 0.66667"
[1] "the probability after one more move is 0.33333"
[2] "the probability after one more move is 0.66667"
An irreducible finite-state discrete-time Markov chain is automatically positive recurrent, so if it is also aperiodic, the ergodic conditions are fulfilled. Therefore, we can determine the limiting probabilities by solving the equations

\[
\pi = \pi P, \qquad \sum_{j}\pi_{j} = 1.
\]
Example 79
Consider a Markov chain of two alleles \((a, A)\) with the one-step transition probability matrix \(P=\begin{pmatrix} 0.2 & 0.8 \\ 0.1 & 0.9 \\ \end{pmatrix}\). What are the limiting probabilities of \(a\) and \(A\)?
From the theorem, we know that the limiting probabilities \(\pi_{a}\) and \(\pi_{A}\) must satisfy the following equations:

\[
\pi_{a} = 0.2\,\pi_{a} + 0.1\,\pi_{A}, \qquad \pi_{A} = 0.8\,\pi_{a} + 0.9\,\pi_{A}, \qquad \pi_{a} + \pi_{A} = 1.
\]

The first equation gives \(0.8\,\pi_{a} = 0.1\,\pi_{A}\), i.e., \(\pi_{A} = 8\pi_{a}\). Thus, \(\pi_{a} = \frac{1}{9}\) and \(\pi_{A} = \frac{8}{9}\). In addition, if the Markov chain starts with the limiting probabilities, i.e., \(P( X_{0}) = \left(\frac{1}{9},\frac{8}{9}\right)\), then \(P( X_{n}) = \left(\frac{1}{9},\frac{8}{9}\right)\) for all \(n\).
```r
# transition matrix of Example 79, states ordered (a, A)
x = matrix(c(0.2,0.1,0.8,0.9),2,2)
# the limiting distribution is a left eigenvector of x for eigenvalue 1,
# i.e., an eigenvector of t(x); normalize it to sum to 1
eigen(t(x))
print(paste("the limiting probability is", eigen(t(x))$vectors[,1]/sum(eigen(t(x))$vectors[,1])))
```
```
eigen() decomposition
$values
[1] 1.0 0.1

$vectors
           [,1]       [,2]
[1,] -0.1240347 -0.7071068
[2,] -0.9922779  0.7071068

[1] "the limiting probability is 0.111111111111111"
[2] "the limiting probability is 0.888888888888889"
```
## Continuous time Markov chains
Definition 29 (continuous time Markov chain)
A continuous time Markov chain \(\{ X_{t}\}\) is a discrete state continuous time stochastic process that satisfies the Markov property.
Markov property for continuous time

For all \(s,t \geq 0\) and nonnegative integers \(i, j\),

\[
P\left( X_{t+s} = j \mid X_{s} = i,\ X_{u} = x_{u},\ 0 \leq u < s \right) = P\left( X_{t+s} = j \mid X_{s} = i \right).
\]
Suppose \(\{ X_{t},t \geq 0\}\) is a continuous time stochastic process taking on values in the set of nonnegative integers \(\{0, 1, 2, \ldots\}\). In addition, if \(P\left( X_{t + s} = j \middle| X_{s} = i \right)\) is independent of \(s\), the continuous time Markov chain is said to have stationary or homogeneous transition probabilities, denoted by \(P_{ij}(t)\), the probability of transition from state \(i\) to state \(j\) in time \(t\).
### Finite state space
Suppose there are \(n\) states. The transition probabilities \(P_{ij}(t)\) can be represented by an \(n\times n\) matrix \(P(t)\). Given the transition probability matrix \(P(t)\), the probability distribution \(\pi_t\) of \(X_{t}\) is equal to the initial distribution \(\pi_0\) at time 0 multiplied by the transition probability matrix \(P(t)\):

\[
\pi_t = \pi_0\,P(t).
\]
The transition probability matrix \(P(t)\) can be derived from the transition rate matrix \(Q\). The transition rate matrix \(Q\) plays the same role for a continuous time chain that the one-step transition probability matrix plays for a discrete time chain: the continuous time Markov chain is determined by the initial distribution \(\pi_0\) and the transition rate matrix \(Q\).
From the Kolmogorov backward equations, the transition probability matrix \(P(t)\) satisfies the differential equation

\[
P'(t) = Q\,P(t), \qquad P(0) = I.
\]

The solution to this differential equation is the matrix exponential

\[
P(t) = e^{Qt}.
\]

Thus,

\[
\pi_t = \pi_0\,P(t) = \pi_0\,e^{Qt}.
\]
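As a concrete sketch, \(P(t) = e^{Qt}\) can be evaluated in R with the `expm` package used earlier; the two-state rate matrix \(Q\) below, with rates 1 and 2, is purely illustrative:

```r
library(expm)
# illustrative 2-state rate matrix: rows sum to 0,
# off-diagonal entries are the transition rates
Q <- matrix(c(-1,  1,
               2, -2), 2, 2, byrow = TRUE)
pi_0 <- c(1, 0)          # start in state 1
t <- 0.5
P_t <- expm(Q * t)       # transition probability matrix P(t) = e^{Qt}
pi_t <- pi_0 %*% P_t     # distribution of X_t
pi_t
```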
Theorem 22
The limiting probabilities \(\pi_\infty=\lim_{t \rightarrow \infty}{P\left( X_{t} = j \middle| X_{0} = i \right)}\), if they exist, are independent of the initial state \(X_{0}\) and must satisfy the equation

\[
\pi_\infty\,Q = \mathbf{0}.
\]
From the theorem, the limiting probabilities \(\pi_\infty\), if they exist, can be found by solving the equations

\[
\pi_\infty\,Q = \mathbf{0}, \qquad \sum_{j} (\pi_\infty)_{j} = 1.
\]
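As in the discrete time case, these equations can be solved numerically: \(\pi_\infty\) is a left null vector of \(Q\), normalized to sum to 1. A minimal sketch, reusing the illustrative \(Q\) above:

```r
# pi Q = 0 means pi is an eigenvector of t(Q) with eigenvalue 0
e <- eigen(t(Q))
v <- Re(e$vectors[, which.min(abs(e$values))])
v / sum(v)               # limiting distribution, here (2/3, 1/3)
```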