4. The Kolmogorov Backward Equation#

In addition to what’s in Anaconda, this lecture will need the following libraries:

!pip install quantecon

4.1. Overview#

As models become more complex, deriving analytical representations of the Markov semigroup $(P_t)$ becomes harder.

This is analogous to the fact that continuous time models often lack analytical solutions.

For example, when studying deterministic paths in continuous time, infinitesimal descriptions (ODEs and PDEs) are often more intuitive and easier to write down than the associated solutions.

(This is one of the shining insights of mathematics, beginning with the work of great scientists such as Isaac Newton.)

We will see in this lecture that the same is true for continuous time Markov chains.

To help us focus on intuition in this lecture, rather than technicalities, the state space is assumed to be finite, with $|S| = n$.

Later we will investigate the case where $|S| = \infty$.

We will use the following imports

import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import quantecon as qe
from numba import njit

from scipy.linalg import expm
from scipy.stats import binom

4.2. State Dependent Jump Intensities#

As we have seen, continuous time Markov chains jump between states, and hence can be represented in the form

$$
X_t = \sum_{k \geq 0} Y_k \mathbb{1}\{J_k \leq t < J_{k+1}\}
\qquad (t \geq 0)
$$

where $(J_k)$ are jump times and $(Y_k)$ are the states at each jump.

(We are assuming that $J_k \to \infty$ with probability one, so that $X_t$ is well defined for all $t \geq 0$; this is always true when holding times are exponential and the state space is finite.)

In the previous lecture,

  • the sequence $(Y_k)$ was drawn from a Markov matrix $K$ and called the embedded jump chain, while

  • the holding times $W_k := J_k - J_{k-1}$ were IID and $\operatorname{Exp}(\lambda)$ for some constant jump intensity $\lambda$.

In this lecture, we will generalize by allowing the jump intensity to vary with the state.

This difference sounds minor but in fact it will allow us to reach full generality in our description of continuous time Markov chains, as clarified below.

4.2.1. Motivation#

As a motivating example, recall the inventory model, where we assumed that the wait time for the next customer was equal to the wait time for new inventory.

This assumption was made purely for convenience and seems unlikely to hold true.

When we relax it, the jump intensities depend on the state.

4.2.2. Jump Chain Algorithm#

We start with three primitives

  1. an initial condition $\psi$,

  2. a Markov matrix $K$ on $S$ satisfying $K(x, x) = 0$ for all $x \in S$, and

  3. a function $\lambda$ mapping $S$ to $(0, \infty)$.

The process $(X_t)$

  • starts at a state $x$ drawn from $\psi$,

  • waits there for an exponential time $W$ with rate $\lambda(x)$, and then

  • updates to a new state $y$ drawn from $K(x, \cdot)$.

Now we take $y$ as the new state for the process and repeat.

Here is the same algorithm written more explicitly:

Algorithm 4.1 (Jump Chain Algorithm)

Inputs: $\psi \in \mathcal{D}$, rate function $\lambda$, Markov matrix $K$

Outputs: Markov chain $(X_t)$

  1. Draw $Y_0$ from $\psi$, set $J_0 = 0$ and $k = 1$.

  2. Draw $W_k$ independently from $\operatorname{Exp}(\lambda(Y_{k-1}))$.

  3. Set $J_k = J_{k-1} + W_k$.

  4. Set $X_t = Y_{k-1}$ for $t \in [J_{k-1}, J_k)$.

  5. Draw $Y_k$ from $K(Y_{k-1}, \cdot)$.

  6. Set $k = k + 1$ and go to step 2.

The holding times $(W_k)$ are drawn independently across $k$, conditional on the jump chain $(Y_k)$, and the randomness driving $(W_k)$ is independent of that driving $(Y_k)$.

The restriction $K(x, x) = 0$ for all $x$ implies that $(X_t)$ actually jumps at each jump time.
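To make Algorithm 4.1 concrete, here is a minimal sketch of a direct implementation. It is illustrative only: the function name simulate_jump_chain and its interface (an initial distribution ψ, an array rate with rate[x] equal to $\lambda(x)$, and a Markov matrix K) are our own choices rather than part of the lecture's library code.

def simulate_jump_chain(ψ, rate, K, T, seed=None):
    """
    Simulate a path of (X_t) on [0, T] via Algorithm 4.1.

    Returns the jump times (J_k) and the states (Y_k) at each jump.
    """
    rng = np.random.default_rng(seed)
    n = len(ψ)
    J, Y = [0.0], [rng.choice(n, p=ψ)]        # J_0 = 0 and Y_0 ~ ψ
    while J[-1] < T:
        x = Y[-1]
        W = rng.exponential(scale=1/rate[x])  # W_k ~ Exp(λ(Y_{k-1}))
        J.append(J[-1] + W)                   # J_k = J_{k-1} + W_k
        Y.append(rng.choice(n, p=K[x, :]))    # Y_k ~ K(Y_{k-1}, ·)
    return np.array(J), np.array(Y)

The path is then recovered via $X_t = Y_k$ whenever $J_k \leq t < J_{k+1}$.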

4.3. Computing the Semigroup#

For the jump process $(X_t)$ with state dependent jump intensities described in the jump chain algorithm, calculating the Markov semigroup is not a trivial exercise.

The approach we adopt is

  1. Use probabilistic reasoning to obtain an integral equation that the semigroup must satisfy.

  2. Convert the integral equation into a differential equation that is easier to work with.

  3. Solve this differential equation to obtain the Markov semigroup $(P_t)$.

The differential equation in question has a special name: the Kolmogorov backward equation.

4.3.1. An Integral Equation#

Here is the first step in the sequence listed above.

Lemma 4.1 (An Integral Equation)

The semigroup $(P_t)$ of the jump chain with rate function $\lambda$ and Markov matrix $K$ obeys the integral equation

$$
P_t(x, y) = e^{-t \lambda(x)} I(x, y)
+ \lambda(x) \int_0^t (K P_{t - \tau})(x, y) \, e^{-\tau \lambda(x)} \, d\tau
\tag{4.1}
$$

for all $t \geq 0$ and $x, y$ in $S$.

Here $(P_t)$ is the Markov semigroup of $(X_t)$, the process constructed via Algorithm 4.1, while $K P_{t - \tau}$ is the matrix product of $K$ and $P_{t - \tau}$.

Proof. Conditioning implicitly on $X_0 = x$, the semigroup $(P_t)$ must satisfy

$$
P_t(x, y)
= \mathbb{P}\{X_t = y\}
= \mathbb{P}\{X_t = y, \; J_1 > t\}
+ \mathbb{P}\{X_t = y, \; J_1 \leq t\}
\tag{4.2}
$$

Regarding the first term on the right hand side of (4.2), we have

$$
\mathbb{P}\{X_t = y, \; J_1 > t\}
= I(x, y) \, \mathbb{P}\{J_1 > t\}
= I(x, y) \, e^{-t \lambda(x)}
\tag{4.3}
$$

where $I(x, y) = \mathbb{1}\{x = y\}$.

For the second term on the right hand side of (4.2), we have

$$
\mathbb{P}\{X_t = y, \; J_1 \leq t\}
= \mathbb{E}\left[ \mathbb{1}\{J_1 \leq t\} \, \mathbb{P}\{X_t = y \,|\, W_1, Y_1\} \right]
= \mathbb{E}\left[ \mathbb{1}\{J_1 \leq t\} \, P_{t - J_1}(Y_1, y) \right]
$$

Evaluating the expectation and using the independence of $J_1$ and $Y_1$, this becomes

$$
\mathbb{P}\{X_t = y, \; J_1 \leq t\}
= \int_0^\infty \mathbb{1}\{\tau \leq t\}
\sum_z K(x, z) P_{t - \tau}(z, y) \, \lambda(x) e^{-\tau \lambda(x)} \, d\tau
= \lambda(x) \int_0^t \sum_z K(x, z) P_{t - \tau}(z, y) \, e^{-\tau \lambda(x)} \, d\tau
$$

Combining this result with (4.2) and (4.3) gives (4.1).
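Although the proof is complete, it is reassuring to check (4.1) numerically. The sketch below does this for a small arbitrary example, anticipating the exponential representation of $(P_t)$ derived later in the lecture; the rate vector and jump matrix are illustrative values, not part of any model above.

# Arbitrary three-state example (illustrative values only)
rates = np.array([0.5, 1.0, 2.0])              # λ(x) for x = 0, 1, 2
K_ex = np.array([[0.0, 0.5, 0.5],
                 [0.3, 0.0, 0.7],
                 [0.6, 0.4, 0.0]])             # Markov matrix, zero diagonal
Q_ex = np.diag(rates) @ (K_ex - np.identity(3))
P = lambda s: expm(s * Q_ex)                   # candidate semigroup

t, x, y = 1.2, 0, 1
τ_grid = np.linspace(0, t, 2001)
vals = [(K_ex @ P(t - τ))[x, y] * np.exp(-τ * rates[x]) for τ in τ_grid]
lhs = P(t)[x, y]
rhs = np.exp(-t * rates[x]) * float(x == y) + rates[x] * np.trapz(vals, τ_grid)
print(np.allclose(lhs, rhs))                   # expect True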

4.3.2. Kolmogorov’s Differential Equation#

We have now confirmed that the semigroup $(P_t)$ associated with the jump chain process $(X_t)$ satisfies (4.1).

Equation (4.1) is important but we can simplify it further without losing information by taking the time derivative.

This leads to our main result for the lecture

Theorem 4.1 (Kolmogorov Backward Equation)

The semigroup $(P_t)$ of the jump chain with rate function $\lambda$ and Markov matrix $K$ satisfies the Kolmogorov backward equation

$$
P_t' = Q P_t
\quad \text{where} \quad
Q(x, y) := \lambda(x) (K(x, y) - I(x, y))
\tag{4.4}
$$

The derivative on the left hand side of (4.4) is taken element by element, with respect to $t$, so that

$$
P_t'(x, y) = \frac{d}{dt} P_t(x, y)
\qquad ((x, y) \in S \times S)
$$

The proof that differentiating (4.1) yields (4.4) is an important exercise (see below).

4.3.3. Exponential Solution#

The Kolmogorov backward equation is a matrix-valued differential equation.

Recall that, for a scalar differential equation $y_t' = a y_t$ with constant $a$ and initial condition $y_0$, the solution is $y_t = e^{ta} y_0$.

This, along with $P_0 = I$, encourages us to guess that the solution to Kolmogorov's backward equation (4.4) is

$$
P_t = e^{tQ}
\tag{4.5}
$$

where the right hand side is the matrix exponential, with definition

$$
e^{tQ} = \sum_{k \geq 0} \frac{1}{k!} (tQ)^k
= I + tQ + \frac{t^2}{2!} Q^2 + \cdots
\tag{4.6}
$$

Working element by element, it is not difficult to confirm that the derivative of the exponential function $t \mapsto e^{tQ}$ is

$$
\frac{d}{dt} e^{tQ} = Q e^{tQ} = e^{tQ} Q
\tag{4.7}
$$

Hence, differentiating (4.5) gives $P_t' = Q e^{tQ} = Q P_t$, which convinces us that the exponential solution satisfies (4.4).

Notice that our solution

$$
P_t = e^{tQ}
\quad \text{where} \quad
Q(x, y) := \lambda(x) (K(x, y) - I(x, y))
\tag{4.8}
$$

for the semigroup of the jump process $(X_t)$ associated with the jump matrix $K$ and the jump intensity function $\lambda \colon S \to (0, \infty)$ is consistent with our earlier result.

In particular, we showed that, for the model with constant jump intensity $\lambda$, we have $P_t = e^{t \lambda (K - I)}$.

This is obviously a special case of (4.8).
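To see (4.5) at work numerically, here is a small sketch using arbitrary illustrative primitives (the rates and jump matrix below are our own choices). It checks the backward equation (4.4) via a central finite difference, and also the semigroup property $P_{s+t} = P_s P_t$.

# Arbitrary illustrative example: Q = diag(λ) (K - I)
rates = np.array([0.5, 1.0, 2.0])
K_ex = np.array([[0.0, 0.5, 0.5],
                 [0.3, 0.0, 0.7],
                 [0.6, 0.4, 0.0]])
Q_ex = np.diag(rates) @ (K_ex - np.identity(3))

t, s, h = 1.0, 0.7, 1e-5
P_prime = (expm((t + h) * Q_ex) - expm((t - h) * Q_ex)) / (2 * h)  # dP_t/dt
print(np.allclose(P_prime, Q_ex @ expm(t * Q_ex)))    # backward equation (4.4)
print(np.allclose(expm((s + t) * Q_ex),
                  expm(s * Q_ex) @ expm(t * Q_ex)))   # semigroup property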

4.4. Properties of the Solution#

Let’s investigate further the properties of the exponential solution.

4.4.1. Checking the Transition Semigroup Properties#

While we have confirmed that $P_t = e^{tQ}$ solves the Kolmogorov backward equation, we still need to check that this solution is a Markov semigroup.

Lemma 4.2 (From Jump Chain to Semigroup)

Let $\lambda$ map $S$ to $\mathbb{R}_+$ and let $K$ be a Markov matrix on $S$. If $P_t = e^{tQ}$ for all $t \geq 0$, where $Q(x, y) = \lambda(x) (K(x, y) - I(x, y))$, then $(P_t)$ is a Markov semigroup on $S$.

Proof. Observe first that $Q$ has zero row sums, since

$$
\sum_y Q(x, y) = \lambda(x) \sum_y (K(x, y) - I(x, y)) = 0
$$

As a small exercise, you can check that, with $\mathbb{1}$ representing a column vector of ones, the following is true:

$$
Q \text{ has zero row sums}
\iff
Q^k \mathbb{1} = 0 \text{ for all } k \geq 1
\tag{4.9}
$$

This implies that $Q^k \mathbb{1} = 0$ for all $k$ and, as a result, for any $t \geq 0$,

$$
P_t \mathbb{1}
= e^{tQ} \mathbb{1}
= I \mathbb{1} + tQ \mathbb{1} + \frac{t^2}{2!} Q^2 \mathbb{1} + \cdots
= I \mathbb{1}
= \mathbb{1}
$$

In other words, each $P_t$ has unit row sums.

Next we check nonnegativity of all elements of $P_t$ (which can easily fail for matrix exponentials).

To this end, adopting an argument from [Stroock, 2013], we set $m := \max_x \lambda(x)$ and $\hat P := I + Q / m$.

It is not difficult to check that $\hat P$ is a Markov matrix and $Q = m (\hat P - I)$.

Recalling that, for matrix exponentials, $e^{A + B} = e^A e^B$ whenever $AB = BA$, we have

$$
e^{tQ}
= e^{tm (\hat P - I)}
= e^{-tm I} \, e^{tm \hat P}
= e^{-tm} \left( I + tm \hat P + \frac{(tm)^2}{2!} \hat P^2 + \cdots \right)
$$

It is clear from this representation that all entries of $e^{tQ}$ are nonnegative.

Finally, we need to check the continuity condition $P_t(x, y) \to I(x, y)$ as $t \to 0$, which is also part of the definition of a Markov semigroup. This is immediate, in the present case, because the exponential function is continuous, and hence $P_t = e^{tQ} \to e^0 = I$.

We can now be reassured that our solution to the Kolmogorov backward equation is indeed a Markov semigroup.
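The following sketch confirms these properties numerically for an arbitrary illustrative example of the same form, including the Markov matrix $\hat P = I + Q/m$ from the nonnegativity argument:

# Arbitrary illustrative example
rates = np.array([0.5, 1.0, 2.0])
K_ex = np.array([[0.0, 0.5, 0.5],
                 [0.3, 0.0, 0.7],
                 [0.6, 0.4, 0.0]])
Q_ex = np.diag(rates) @ (K_ex - np.identity(3))

for t in (0.0, 0.1, 1.0, 10.0):
    P = expm(t * Q_ex)
    assert np.allclose(P.sum(axis=1), 1.0)   # unit row sums
    assert (P > -1e-12).all()                # nonnegativity, up to rounding

m = rates.max()
P_hat = np.identity(3) + Q_ex / m            # the matrix P̂ from the proof
assert np.allclose(P_hat.sum(axis=1), 1.0) and (P_hat >= 0).all()

print(np.allclose(expm(1e-10 * Q_ex), np.identity(3)))  # continuity at t = 0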

4.4.2. Uniqueness#

Might there be another, entirely different Markov semigroup that also satisfies the Kolmogorov backward equation?

The answer is no: linear ODEs in a finite dimensional vector space with constant coefficients and a fixed initial condition (in this case $P_0 = I$) have unique solutions.

In fact it’s not hard to supply a proof — see the exercises.

4.5. Application: The Inventory Model#

Let us look at a modified version of the inventory model where jump intensities depend on the state.

In particular, the wait time for new inventory will now be exponential at rate γ.

The arrival rate for customers will still be denoted by λ and allowed to differ from γ.

For parameters we take

α = 0.6
λ = 0.5
γ = 0.1
b = 10

Our plan is to investigate the distribution $\psi_T$ of $X_T$ at $T = 30$.

We will do this by simulating many independent draws of $X_T$ and histogramming them.

(In the exercises you are asked to calculate $\psi_T$ a different way, via (4.8).)

@njit
def draw_X(T, X_0, max_iter=5000):
    """
    Generate one draw of X_T given X_0.
    """

    J, Y = 0, X_0
    m = 0

    while m < max_iter:
        s = 1/γ if Y == 0 else 1/λ          # mean holding time 1/λ(Y)
        W = np.random.exponential(scale=s)  # W ~ Exp(λ(Y))
        J += W
        if J >= T:
            return Y
        # Otherwise update Y
        if Y == 0:
            Y = b                           # restock when inventory is empty
        else:
            U = np.random.geometric(α)      # geometric demand size
            Y = Y - min(Y, U)
        m += 1

    return Y  # fallback in the unlikely event that max_iter is reached


@njit
def independent_draws(T=10, num_draws=100):
    "Generate a vector of independent draws of X_T."

    draws = np.empty(num_draws, dtype=np.int64)

    for i in range(num_draws):
        X_0 = np.random.binomial(b+1, 0.25)
        draws[i] = draw_X(T, X_0)

    return draws
T = 30
n = b + 1
draws = independent_draws(T, num_draws=100_000)
fig, ax = plt.subplots()

ax.bar(range(n), [np.mean(draws == i) for i in range(n)], width=0.8, alpha=0.6)
ax.set_xlabel("inventory", fontsize=14)

plt.show()
[Figure: histogram of the simulated draws of $X_T$, with inventory on the horizontal axis]

If you experiment with the code above, you will see that the large amount of mass on zero is due to the low arrival rate γ for inventory.

4.6. Exercises#

Exercise 4.1

In the discussion above, we generated an approximation of $\psi_T$ when $T = 30$, the initial condition is $\operatorname{Binomial}(n, 0.25)$ and parameters are set to

α = 0.6
λ = 0.5
γ = 0.1
b = 10

The calculation was done by simulating independent draws and histogramming.

Try to generate the same figure using (4.8) instead, modifying code from our lecture on the Markov property.

Exercise 4.2

Prove that differentiating (4.1) at each $(x, y)$ yields (4.4).

Exercise 4.3

We claimed above that the solution $P_t = e^{tQ}$ is the unique Markov semigroup satisfying the backward equation $P_t' = Q P_t$.

Try to supply a proof.

(This is not an easy exercise but worth thinking about in any case.)

4.7. Solutions#

Note

Code cells are currently not supported in sphinx-exercise, so the code-cell solutions appear immediately after each solution block.

Solution to Exercise 4.1

Here is one solution:

α = 0.6
λ = 0.5
γ = 0.1
b = 10

states = np.arange(n)
I = np.identity(n)

# Embedded jump chain matrix
K = np.zeros((n, n))
K[0, -1] = 1
for i in range(1, n):
    for j in range(0, i):
        if j == 0:
            K[i, j] = (1 - α)**(i-1)
        else:
            K[i, j] = α * (1 - α)**(i-j-1)

# Jump intensities as a function of the state
r = np.ones(n) * λ
r[0] = γ

# Q matrix
Q = np.empty_like(K)
for i in range(n):
    for j in range(n):
        Q[i, j] = r[i] * (K[i, j] - I[i, j])

def P_t(ψ, t):
    return ψ @ expm(t * Q)

ψ_0 = binom.pmf(states, n, 0.25)
ψ_T = P_t(ψ_0, T)

fig, ax = plt.subplots()

ax.bar(range(n), ψ_T, width=0.8, alpha=0.6)
ax.set_xlabel("inventory", fontsize=14)

plt.show()
[Figure: histogram of $\psi_T$ computed via the matrix exponential, with inventory on the horizontal axis]

Solution to Exercise 4.2

One can easily verify that, when $f$ is a differentiable function and $\alpha > 0$, we have

$$
g(t) = e^{-t \alpha} f(t)
\implies
g'(t) = e^{-t \alpha} f'(t) - \alpha g(t)
\tag{4.10}
$$

Note also that, with the change of variable $s = t - \tau$, we can rewrite (4.1) as

$$
P_t(x, y) = e^{-t \lambda(x)}
\left\{
I(x, y) + \lambda(x) \int_0^t (K P_s)(x, y) \, e^{s \lambda(x)} \, ds
\right\}
\tag{4.11}
$$

Applying (4.10) yields

$$
P_t'(x, y)
= e^{-t \lambda(x)} \left\{ \lambda(x) (K P_t)(x, y) \, e^{t \lambda(x)} \right\}
- \lambda(x) P_t(x, y)
$$

After minor rearrangements this becomes

$$
P_t'(x, y) = \lambda(x) [(K - I) P_t](x, y)
$$

which is identical to (4.4).

Solution to Exercise 4.3

Here is one proof of uniqueness.

Suppose that $(\hat P_t)$ is another Markov semigroup satisfying $\hat P_t' = Q \hat P_t$.

Fix $t > 0$ and let $V_s$ be defined by $V_s = P_s \hat P_{t-s}$ for all $0 \leq s \leq t$.

Note that $V_0 = \hat P_t$ and $V_t = P_t$.

Note also that $s \mapsto V_s$ is differentiable, with derivative

$$
V_s'
= P_s' \hat P_{t-s} - P_s \hat P_{t-s}'
= P_s Q \hat P_{t-s} - P_s Q \hat P_{t-s}
= 0
$$

where, in the second last equality, we used (4.7).

Hence $V_s$ is constant, so our previous observations $V_0 = \hat P_t$ and $V_t = P_t$ now yield $\hat P_t = P_t$.

Since $t$ was arbitrary, the proof is now done.
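For intuition, here is a small numerical illustration of the constancy of $s \mapsto V_s$, with an arbitrary generator and with $\hat P_t$ taken to be the matrix exponential itself (so this illustrates the commutation step in the argument, rather than testing uniqueness):

# Arbitrary illustrative example
rates = np.array([0.5, 1.0, 2.0])
K_ex = np.array([[0.0, 0.5, 0.5],
                 [0.3, 0.0, 0.7],
                 [0.6, 0.4, 0.0]])
Q_ex = np.diag(rates) @ (K_ex - np.identity(3))

t = 2.0
V = lambda s: expm(s * Q_ex) @ expm((t - s) * Q_ex)   # V_s = P_s P̂_{t-s}
print(all(np.allclose(V(s), V(0.0)) for s in (0.5, 1.0, 1.7, 2.0)))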