5. The Kolmogorov Forward Equation#
In addition to what’s in Anaconda, this lecture will need the following libraries:
!pip install quantecon
5.1. Overview#
In this lecture we approach continuous time Markov chains from a more analytical perspective.
The emphasis will be on describing distribution flows through vector-valued differential equations and their solutions.
These distribution flows show how the time $t$ distribution of the chain evolves.

Distribution flows will be identified with initial value problems generated by autonomous linear ordinary differential equations (ODEs) in vector space.
We will see that the solutions of these flows are described by Markov semigroups.
This leads us back to the theory we have already constructed – some care will be taken to clarify all the connections.
In order to avoid being distracted by technicalities, we continue to defer our
treatment of infinite state spaces, assuming throughout this lecture that the
state space $S$ is finite.

As before, $\mathcal D$ denotes the set of all distributions on $S$, understood as row vectors.
We will use the following imports
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import quantecon as qe
from numba import njit
from scipy.linalg import expm
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
5.2. From Difference Equations to ODEs#
Previously we generated this figure, which shows how distributions evolve over time for the inventory model under a certain parameterization:
(Hot colors indicate early dates and cool colors denote later dates.)
We also learned how this flow is related to the Kolmogorov backward equation, which is an ODE.
In this section we examine distribution flows and their connection to ODEs and continuous time Markov chains more systematically.
5.2.1. Review of the Discrete Time Case#
Let $P$ be a Markov matrix on the finite state space $S$.

Recall that, in the discrete time case, the distribution $\psi_t$ of the chain at time $t$ updates according to

$$\psi_{t+1} = \psi_t P, \tag{5.1}$$

where distributions are understood as row vectors.

Here's a visualization for the case $|S| = 3$, where distributions are points in the unit simplex.

The initial condition is $\psi_0 = (0, 0, 1)$ and the Markov matrix is

P = ((0.9, 0.1, 0.0),
     (0.4, 0.4, 0.2),
     (0.1, 0.1, 0.8))
def unit_simplex(angle):

    fig = plt.figure(figsize=(8, 6))
    ax = fig.add_subplot(111, projection='3d')

    vtx = [[0, 0, 1],
           [0, 1, 0],
           [1, 0, 0]]

    tri = Poly3DCollection([vtx], color='darkblue', alpha=0.3)
    tri.set_facecolor([0.5, 0.5, 1])
    ax.add_collection3d(tri)

    ax.set(xlim=(0, 1), ylim=(0, 1), zlim=(0, 1),
           xticks=(1,), yticks=(1,), zticks=(1,))

    ax.set_xticklabels(['$(1, 0, 0)$'], fontsize=12)
    ax.set_yticklabels(['$(0, 1, 0)$'], fontsize=12)
    ax.set_zticklabels(['$(0, 0, 1)$'], fontsize=12)

    ax.xaxis.majorTicks[0].set_pad(15)
    ax.yaxis.majorTicks[0].set_pad(15)
    ax.zaxis.majorTicks[0].set_pad(35)

    ax.view_init(30, angle)

    # Move axis to origin
    ax.xaxis._axinfo['juggled'] = (0, 0, 0)
    ax.yaxis._axinfo['juggled'] = (1, 1, 1)
    ax.zaxis._axinfo['juggled'] = (2, 2, 0)

    ax.grid(False)

    return ax


def convergence_plot(ψ, n=14, angle=50):

    ax = unit_simplex(angle)

    P = np.array(((0.9, 0.1, 0.0),
                  (0.4, 0.4, 0.2),
                  (0.1, 0.1, 0.8)))

    colors = cm.jet_r(np.linspace(0.0, 1, n))

    x_vals, y_vals, z_vals = [], [], []
    for t in range(n):
        x_vals.append(ψ[0])
        y_vals.append(ψ[1])
        z_vals.append(ψ[2])
        ψ = ψ @ P

    ax.scatter(x_vals, y_vals, z_vals, c=colors, s=50, alpha=0.7, depthshade=False)

    return ψ


ψ = convergence_plot((0, 0, 1))

plt.show()
There’s a sense in which a discrete time Markov chain “is” a homogeneous linear difference equation in distribution space.
To clarify this, suppose we take (5.1) as given, with $\psi_0 \in \mathcal D$ fixed, and ask which matrices $P$ keep the resulting sequence $(\psi_t)$ inside the set of distributions $\mathcal D$.

Because $\psi_{t+1} = \psi_t P$ for every $t$, this holds precisely when the update map $\psi \mapsto \psi P$ sends $\mathcal D$ into itself.

Moreover, a matrix $P$ maps $\mathcal D$ into itself if and only if $P$ is a Markov matrix (nonnegative entries and unit row sums).

So, under the stated conditions, our difference equation (5.1) uniquely
identifies a Markov matrix, along with an initial condition $\psi_0 \in \mathcal D$.

Together, these objects identify the joint distribution of a discrete time Markov chain, as previously described.
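As a quick numerical check of this identification, the sketch below iterates (5.1) with the Markov matrix $P$ from the figure above and confirms that every iterate remains a distribution:

```python
import numpy as np

P = np.array([[0.9, 0.1, 0.0],
              [0.4, 0.4, 0.2],
              [0.1, 0.1, 0.8]])

ψ = np.array([0.0, 0.0, 1.0])      # initial condition from the figure
for t in range(200):
    ψ = ψ @ P                      # discrete time update ψ_{t+1} = ψ_t P
    # each iterate stays in the unit simplex because P is a Markov matrix
    assert (ψ >= 0).all() and abs(ψ.sum() - 1) < 1e-9

print(ψ)  # after many iterations, ψ is close to a stationary distribution
```

The final iterate is (approximately) invariant under $P$, which is the discrete time analogue of the convergence visible in the simplex plot.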
5.2.2. Shifting to Continuous Time#
We have just argued that a discrete time Markov chain can be identified with a
linear difference equation evolving in $\mathcal D$.

This strongly suggests that a continuous time Markov chain can be identified
with a linear ODE evolving in $\mathcal D$.
This intuition is correct and important.
The rest of the lecture maps out the main ideas.
5.3. ODEs in Distribution Space#
Consider the linear differential equation given by

$$\psi_t' = \psi_t Q, \qquad \psi_0 \in \mathcal D \text{ given}, \tag{5.2}$$

where

- $Q$ is an $n \times n$ matrix,
- distributions are again understood as row vectors, and
- derivatives are taken element by element, so that $\psi_t'(x)$ is the derivative of the scalar map $t \mapsto \psi_t(x)$ at each state $x$.
5.3.1. Solutions to Linear Vector ODEs#
Using the matrix exponential, the unique solution to the initial value problem (5.2) is

$$\psi_t = \psi_0 e^{tQ}. \tag{5.3}$$

To check that (5.3) is a solution, we use (4.7) again to get

$$\frac{d}{dt} e^{tQ} = Q e^{tQ} = e^{tQ} Q.$$

The first equality can be written as $P_t' = Q P_t$ with $P_t := e^{tQ}$.

The second equality can be written as

$$P_t' = P_t Q$$

and is called the Kolmogorov forward equation.

Applying the Kolmogorov forward equation, we obtain

$$\frac{d}{dt} \psi_t = \psi_0 \frac{d}{dt} e^{tQ} = \psi_0 e^{tQ} Q = \psi_t Q.$$

This confirms that (5.3) solves (5.2).
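The claim that (5.3) solves (5.2) can also be verified numerically. Here is a minimal sketch that compares the matrix-exponential solution with a crude Euler discretization of the ODE (the matrix $Q$, the initial distribution, and the step size are illustrative choices):

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-3.0,  2.0,   1.0],
              [ 3.0, -5.0,   2.0],
              [ 4.0,  6.0, -10.0]])
ψ_0 = np.array([0.2, 0.3, 0.5])
t = 0.75

# closed-form solution ψ_t = ψ_0 e^{tQ} from (5.3)
ψ_exact = ψ_0 @ expm(t * Q)

# independent check: explicit Euler steps on ψ' = ψ Q
n_steps = 7500
h = t / n_steps
ψ = ψ_0.copy()
for _ in range(n_steps):
    ψ = ψ + h * (ψ @ Q)

print(np.max(np.abs(ψ - ψ_exact)))  # small discretization error
```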
Here’s an example of three distribution flows with dynamics generated by (5.2), one starting from each vertex.
The code uses (5.3) with the matrix $Q$ set to

Q = ((-3, 2, 1),
     (3, -5, 2),
     (4, 6, -10))
Q = np.array(Q)

ψ_00 = np.array((0.01, 0.01, 0.99))
ψ_01 = np.array((0.01, 0.99, 0.01))
ψ_02 = np.array((0.99, 0.01, 0.01))

ax = unit_simplex(angle=50)

def flow_plot(ψ, h=0.001, n=400, angle=50):
    colors = cm.jet_r(np.linspace(0.0, 1, n))

    x_vals, y_vals, z_vals = [], [], []
    for t in range(n):
        x_vals.append(ψ[0])
        y_vals.append(ψ[1])
        z_vals.append(ψ[2])
        ψ = ψ @ expm(h * Q)

    ax.scatter(x_vals, y_vals, z_vals, c=colors, s=20, alpha=0.2, depthshade=False)

flow_plot(ψ_00)
flow_plot(ψ_01)
flow_plot(ψ_02)

plt.show()
(Distributions cool over time, so initial conditions are hot colors.)
5.3.2. Forwards vs Backwards Equations#
As the above discussion shows, we can take the Kolmogorov forward equation $P_t' = P_t Q$ and premultiply both sides by an initial distribution $\psi_0$ to obtain the law of motion $\psi_t' = \psi_t Q$ for the distribution path $\psi_t = \psi_0 P_t$.

In this sense, we can understand the Kolmogorov forward equation as pushing distributions forward in time.

Analogously, we can take the Kolmogorov backward equation $P_t' = Q P_t$ and postmultiply both sides by a fixed column vector $h$ to obtain a law of motion for $t \mapsto P_t h$.

Recalling that $(P_t h)(x)$ can be interpreted as the expectation of $h(X_t)$ given $X_0 = x$, we see that the backward equation describes the evolution of expected values.

Both the forward and the backward equations uniquely pin down the same solution $P_t = e^{tQ}$.
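Since $P_t = e^{tQ}$, both equations can be checked numerically with a finite-difference approximation to $P_t'$. A minimal sketch (the matrix $Q$, the date $t$, and the step $h$ are illustrative choices):

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-3.0,  2.0,   1.0],
              [ 3.0, -5.0,   2.0],
              [ 4.0,  6.0, -10.0]])

t, h = 0.5, 1e-6
P_t = expm(t * Q)

# centered finite-difference approximation to the derivative P_t'
dP = (expm((t + h) * Q) - expm((t - h) * Q)) / (2 * h)

err_forward  = np.max(np.abs(dP - P_t @ Q))   # forward:  P_t' = P_t Q
err_backward = np.max(np.abs(dP - Q @ P_t))   # backward: P_t' = Q P_t
print(err_forward, err_backward)              # both errors are tiny
```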
5.3.3. Matrix- vs Vector-Valued ODEs#
The ODE $\psi_t' = \psi_t Q$ in (5.2) is sometimes called the Fokker–Planck equation.

It is a vector-valued ODE that describes the evolution of a particular distribution path.
By comparison, the Kolmogorov forward equation is (like the backward equation) a differential equation in matrices.
(And matrices are really maps, which send vectors into vectors.)
Operating at this level is less intuitive and more abstract than working with the Fokker–Planck equation.
But, in the end, the object that we want to describe is a Markov semigroup.
The Kolmogorov forward and backward equations are the ODEs that define this fundamental object.
5.3.4. Preserving Distributions#
In the simulation above, each distribution flow remained in the unit simplex; that is, each $\psi_t$ remained a distribution.

What are the exact properties we require on $Q$ such that $\psi_t = \psi_0 e^{tQ}$ is always a distribution when $\psi_0$ is?

This is an important question, because we are setting up an exact
correspondence between linear ODEs that evolve in $\mathcal D$ and continuous time Markov chains.

Recall that the linear update rule $\psi \mapsto \psi P$ maps $\mathcal D$ into itself if and only if $P$ is a Markov matrix.

So now we can rephrase our key question regarding invariance on $\mathcal D$:

What properties do we need to impose on $Q$ so that $P_t = e^{tQ}$ is a Markov matrix for all $t \geq 0$?

A square matrix $Q$ is called an intensity matrix if $Q$ has zero row sums and its off-diagonal elements are nonnegative.

Theorem 5.1. If $Q$ is a square matrix, then the following statements are equivalent:

1. $e^{tQ}$ is a Markov matrix for all $t \geq 0$.
2. $Q$ is an intensity matrix.

The proof is related to that of Lemma 4.2 and is found as a solved exercise below.

If $Q$ is an intensity matrix, then, by Theorem 5.1, the family $(P_t) := (e^{tQ})_{t \geq 0}$ is a Markov semigroup.

We call $(P_t)$ the Markov semigroup generated by $Q$.

Later we will see that this result extends to the case where the state space is infinite, provided that mild restrictions are imposed.
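Theorem 5.1 is easy to probe numerically. The sketch below takes the intensity matrix used earlier in this lecture and checks that $e^{tQ}$ is a Markov matrix at several dates:

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-3.0,  2.0,   1.0],
              [ 3.0, -5.0,   2.0],
              [ 4.0,  6.0, -10.0]])

# Q is an intensity matrix: zero row sums and nonnegative off-diagonals
assert np.allclose(Q.sum(axis=1), 0)
assert all(Q[x, y] >= 0 for x in range(3) for y in range(3) if x != y)

is_markov = []
for t in (0.01, 0.1, 1.0, 10.0):
    P_t = expm(t * Q)
    unit_row_sums = np.allclose(P_t.sum(axis=1), 1)
    nonnegative = (P_t >= -1e-12).all()   # allow tiny numerical noise
    is_markov.append(unit_row_sums and nonnegative)

print(is_markov)  # every entry should be True
```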
5.4. Jump Chains#
Let's return to the chain constructed from a constant jump rate $\lambda > 0$ and a Markov matrix $\Pi$: the chain waits for an exponential time with rate $\lambda$ and then jumps to a new state drawn from $\Pi$.

We found that the semigroup is given by

$$P_t = e^{-\lambda t} \sum_{k \geq 0} \frac{(\lambda t)^k}{k!} \Pi^k.$$

Using the fact that $\lambda t I$ and $\lambda t \Pi$ commute, we can combine the exponentials to write this as $P_t = e^{\lambda t (\Pi - I)}$.

Hence $P_t = e^{tQ}$ when we set $Q := \lambda (\Pi - I)$.

We can differentiate with respect to $t$ to obtain the Kolmogorov forward equation $P_t' = P_t Q$.

We can then premultiply by $\psi_0$ to get the Fokker–Planck equation $\psi_t' = \psi_t Q$.

More explicitly, for given $y \in S$, the Fokker–Planck equation states that

$$\psi_t'(y) = \lambda \sum_x \psi_t(x) \Pi(x, y) - \lambda \psi_t(y).$$

The rate of probability flow into $y$ is $\lambda \sum_x \psi_t(x) \Pi(x, y)$, while the rate of flow out of $y$ is $\lambda \psi_t(y)$, so the equation balances inflows against outflows.
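The flow interpretation can be illustrated directly in code. This sketch assumes a constant jump rate $\lambda$ and a jump matrix $\Pi$ with illustrative values, builds $Q = \lambda (\Pi - I)$, and checks the inflow-minus-outflow decomposition at one state:

```python
import numpy as np

λ = 0.5                           # constant jump rate (illustrative value)
Π = np.array([[0.0, 0.5, 0.5],    # jump matrix (illustrative values)
              [0.3, 0.0, 0.7],
              [0.6, 0.4, 0.0]])

Q = λ * (Π - np.identity(3))      # intensity matrix of the jump chain
assert np.allclose(Q.sum(axis=1), 0)

ψ = np.array([0.5, 0.3, 0.2])     # an arbitrary current distribution
flow = ψ @ Q                      # right-hand side of ψ' = ψ Q

# at state y, the derivative equals inflow minus outflow
y = 1
inflow  = λ * sum(ψ[x] * Π[x, y] for x in range(3))
outflow = λ * ψ[y]
print(flow[y], inflow - outflow)  # the two numbers agree
```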
5.5. Summary#
We have seen that any intensity matrix $Q$ on the finite state space $S$ generates a Markov semigroup via $P_t = e^{tQ}$.

Henceforth, we will say that $(P_t)$ is the Markov semigroup generated by $Q$.

While our discussion has been in the context of a finite state space, later we will see that these ideas carry over to an infinite state setting under mild restrictions.

We have also hinted at the fact that every continuous time Markov chain
is generated, in this sense, by some intensity matrix $Q$.

Later we will prove this to be universally true when $S$ is finite.

Intensity matrices are important because

- they are the natural infinitesimal descriptions of Markov semigroups,
- they are often easy to write down in applications, and
- they provide an intuitive description of dynamics.

Later, we will see that, for a given intensity matrix $Q$ and states $x \neq y$, the value $Q(x, y)$ is the "rate of leaving $x$ for $y$", while $-Q(x, x)$ is the "rate of leaving $x$".
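This rate interpretation can be sanity-checked on the intensity matrix used earlier: for each state $x$, the total exit rate $-Q(x, x)$ equals the sum of the individual transition rates $Q(x, y)$ over $y \neq x$, as a consequence of the zero row sums.

```python
import numpy as np

Q = np.array([[-3.0,  2.0,   1.0],
              [ 3.0, -5.0,   2.0],
              [ 4.0,  6.0, -10.0]])
n = Q.shape[0]

# -Q(x, x) is the total rate of leaving x, and Q(x, y) with y ≠ x
# is the rate of leaving x for y; zero row sums tie the two together
total_exit_rates   = [-Q[x, x] for x in range(n)]
sums_of_jump_rates = [sum(Q[x, y] for y in range(n) if y != x) for x in range(n)]
print(total_exit_rates, sums_of_jump_rates)  # [3.0, 5.0, 10.0] twice
```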
5.6. Exercises#
Exercise 5.1. Let $(P_t)$ be a Markov semigroup on the finite state space $S$.

(The derivative at $t = 0$ is, as usual, interpreted as the one-sided limit.)

Define (pointwise, at each element of the matrix)

$$Q := P_0' = \lim_{h \downarrow 0} \frac{P_h - I}{h}.$$

Assuming that this limit exists, and hence $Q$ is well defined, show that

$$P_t' = P_t Q \quad \text{and} \quad P_t' = Q P_t$$

both hold. (These are the Kolmogorov forward and backward equations.)
Exercise 5.2. Recall our model of jump chains with state-dependent jump
intensities given by rate function $x \mapsto \lambda(x)$ and jump matrix $\Pi$.

After a wait time with exponential rate $\lambda(x)$, the state jumps from $x$ to a new state drawn from $\Pi(x, \cdot)$.

We found that the associated semigroup $(P_t)$ has derivative at zero given by

$$Q(x, y) = \lambda(x) (\Pi(x, y) - I(x, y)).$$

Show that $Q$ is an intensity matrix.
Exercise 5.3. Prove Theorem 5.1 by adapting the arguments in Lemma 4.2. (This is nontrivial but worth at least trying.)

Hint: The constant $\lambda := \max_x |Q(x, x)|$ is useful, since $\hat P := I + Q / \lambda$ is then a Markov matrix whenever $Q$ is an intensity matrix.
5.7. Solutions#
Solution to Exercise 5.1
Let $(P_t)$ be a Markov semigroup with $Q := P_0'$, as in the statement of the exercise.

Fix $t \geq 0$ and $h > 0$.

Combining the semigroup property and linearity with the restriction $P_0 = I$, we get

$$P_{t+h} - P_t = P_t P_h - P_t P_0 = P_t (P_h - I).$$

Dividing by $h$ and taking $h \downarrow 0$, using the definition of $Q$, gives the forward equation $P_t' = P_t Q$.

For the backward equation we observe that

$$P_{t+h} - P_t = (P_h - I) P_t$$

also holds. Dividing by $h$ and taking $h \downarrow 0$ again gives the backward equation $P_t' = Q P_t$.
Solution to Exercise 5.2
Let $\lambda$, $\Pi$ and $Q$ be as described in the exercise.

We need to show that the off-diagonal elements of $Q$ are nonnegative and that each row of $Q$ sums to zero.

The first assertion is immediate from nonnegativity of the values $\lambda(x)$ and $\Pi(x, y)$, since $Q(x, y) = \lambda(x) \Pi(x, y)$ whenever $x \neq y$.

For the second, we use the fact that $\Pi$ has unit row sums to obtain

$$\sum_y Q(x, y) = \lambda(x) \Big( \sum_y \Pi(x, y) - 1 \Big) = 0.$$
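Here is a numerical version of the same argument, assuming an illustrative rate function $\lambda$ and jump matrix $\Pi$:

```python
import numpy as np

# state-dependent jump rates λ(x) and a jump matrix Π (illustrative values)
λ = np.array([1.0, 2.0, 0.5])
Π = np.array([[0.0, 0.6, 0.4],
              [0.5, 0.0, 0.5],
              [0.2, 0.8, 0.0]])

# Q(x, y) = λ(x) (Π(x, y) - I(x, y))
Q = np.diag(λ) @ (Π - np.identity(3))

off_diag_nonneg = all(Q[x, y] >= 0 for x in range(3) for y in range(3) if x != y)
zero_row_sums = np.allclose(Q.sum(axis=1), 0)
print(off_diag_nonneg, zero_row_sums)  # True True
```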
Solution to Exercise 5.3
Suppose that $Q$ is an intensity matrix.

The proof from Lemma 4.2 that $P_t := e^{tQ}$ has unit row sums carries over to the present setting essentially unchanged.

The proof of nonnegativity of $P_t$ requires some modification, since $Q$ is not assumed to have the form $\lambda (\hat P - I)$ for a Markov matrix $\hat P$.

To this end, set $\lambda := \max_x |Q(x, x)|$ and $\hat P := I + Q / \lambda$.

You can check that $\hat P$ is a Markov matrix and that $Q = \lambda (\hat P - I)$.

The rest of the proof of nonnegativity of $P_t$ now proceeds as in Lemma 4.2, since

$$e^{tQ} = e^{t \lambda (\hat P - I)} = e^{-t \lambda} e^{t \lambda \hat P} \geq 0.$$

We conclude that $e^{tQ}$ is a Markov matrix for all $t \geq 0$.

Regarding the converse implication, suppose that $e^{tQ}$ is a Markov matrix for all $t \geq 0$.

Because $e^{tQ}$ has unit row sums for every $t$, we can differentiate $e^{tQ} \mathbf{1} = \mathbf{1}$ at $t = 0$ to obtain $Q \mathbf{1} = 0$.

Hence $Q$ has zero row sums.

We can use the definition of the matrix exponential to obtain,
for any $x \neq y$ and small $h > 0$,

$$P_h(x, y) = h Q(x, y) + o(h).$$

From this equality and the assumption that $P_h(x, y) \geq 0$ for all such $h$, we see that $Q(x, y) \geq 0$.

Hence $Q$ is an intensity matrix.
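The small-$h$ expansion used in the converse direction can be observed numerically: $(P_h - I)/h$ approaches $Q$ as $h \downarrow 0$. A minimal sketch using the intensity matrix from earlier in the lecture:

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-3.0,  2.0,   1.0],
              [ 3.0, -5.0,   2.0],
              [ 4.0,  6.0, -10.0]])

errors = []
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = (expm(h * Q) - np.identity(3)) / h   # finite-difference generator
    errors.append(np.max(np.abs(approx - Q)))

print(errors)  # errors shrink roughly linearly in h
```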