Markov chains
Consider the set

$$X = \{ x \in \{0,1\}^{\mathbb{N}} : x_n x_{n+1} = 0 \text{ for all } n \in \mathbb{N} \}$$

of sequences in which every 1 is followed by a zero. This is certainly invariant for the shift map $T$: if $x \in X$ then $Tx$ belongs to $X$ as well.
To study the dynamics of $T$ on $X$ using ergodic theory we want a measure on $X$ that is $T$ invariant. For each $q \in [0,1]$ we have the coin measure $\mu_q$ on $\{0,1\}^{\mathbb{N}}$ and each is $T$ invariant. However, only one - the point measure concentrated on the sequence $000\cdots$ - gives positive measure to $X$. For example, since the number $N(n)$ of cylinders of length $n$ that intersect $X$ satisfies the recurrence

$$N(n) = N(n-1) + N(n-2)$$

for all $n \ge 3$ we have

$$\mu_{1/2}(X) \le \frac{N(n)}{2^n} \to 0$$

as $n \to \infty$. How can we equip $X$ with a probability measure that is $T$ invariant?
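To see concretely how quickly the cylinders meeting $X$ become negligible for the fair coin, here is a small numerical sketch (the helper name admissible_count is ours, not from the text) that iterates the recurrence and prints the bound $N(n)/2^n$.

```python
# Count the cylinders of length n that intersect X (no two consecutive 1s)
# using N(1) = 2, N(2) = 3 and N(n) = N(n-1) + N(n-2), then bound the fair
# coin measure of X by N(n) / 2^n.

def admissible_count(n):
    a, b = 2, 3          # N(1), N(2)
    if n == 1:
        return a
    for _ in range(n - 2):
        a, b = b, a + b
    return b

for n in (5, 10, 20, 40):
    N = admissible_count(n)
    print(n, N, N / 2 ** n)   # the bound shrinks to zero as n grows
```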
Walks on directed graphs
The coin measures $\mu_q$ with $0 < q < 1$ were inappropriate because they see all cylinders: they assign $[10]$ and $[11]$ positive measure whereas only one of those cylinders intersects $X$. We can think of the points in $X$ as the results of all possible infinite walks on the directed graph with vertices labelled 0 and 1, and directed edges as follows.
- An edge from 0 to 0
- An edge from 0 to 1
- An edge from 1 to 0
To any endless journey on the graph one associates a sequence in $\{0,1\}^{\mathbb{N}}$ by recording the labels of the visited vertices. As the vertex labelled 1 cannot be visited twice in succession, and as there are no other restrictions, we get exactly the sequences in $X$ this way.
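As a quick check of this correspondence, the sketch below (the adjacency map and the helper walks are ours) enumerates the finite walks on the graph and confirms that they are exactly the admissible words: none of them reads 11 and their count matches the recurrence above.

```python
# Walks on the directed graph with edges 0 -> 0, 0 -> 1 and 1 -> 0,
# recording the label of each visited vertex.
edges = {0: (0, 1), 1: (0,)}

def walks(n):
    out = [(a,) for a in edges]              # walks of length 1
    for _ in range(n - 1):
        out = [w + (b,) for w in out for b in edges[w[-1]]]
    return out

ws = walks(6)
print(len(ws))                                        # 21 walks, matching N(6)
print(all((1, 1) not in zip(w, w[1:]) for w in ws))   # True: no walk reads 11
```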
Let us assign probabilities to each traversal. We encode these in a matrix

$$P = \begin{pmatrix} P(0,0) & P(0,1) \\ P(1,0) & P(1,1) \end{pmatrix}$$

with $P(i,j)$ representing the probability that one moves in a single step from vertex $i$ to vertex $j$. We must have

$$P(0,0) + P(0,1) = 1 \qquad P(1,0) + P(1,1) = 1$$

and we must also have $P(1,1) = 0$ as we forbid consecutive ones. To entirely determine the measure we also need the probability of the starting location. Fix $p(0), p(1) \ge 0$ with

$$p(0) + p(1) = 1$$

where $p(i)$ is the probability that one begins at vertex $i$. With this information - values $p(i)$ for all $i$ and $P(i,j)$ for all $i,j$ - we define

$$\mu([\epsilon_1 \cdots \epsilon_n]) = p(\epsilon_1) P(\epsilon_1,\epsilon_2) P(\epsilon_2,\epsilon_3) \cdots P(\epsilon_{n-1},\epsilon_n)$$

on all cylinder sets $[\epsilon_1 \cdots \epsilon_n]$. For example

$$\mu([01]) = p(0) P(0,1)$$

is the probability of beginning at 0 multiplied by the probability that the first step is from 0 to 1.
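As a small illustration of the cylinder formula, here is a sketch (the helper name cylinder_measure and the particular parameter values are ours) that evaluates $\mu$ on a couple of cylinders.

```python
import numpy as np

# mu([e_1 ... e_n]) = p(e_1) P(e_1, e_2) ... P(e_{n-1}, e_n)
def cylinder_measure(word, p, P):
    m = p[word[0]]
    for a, b in zip(word, word[1:]):
        m *= P[a, b]
    return m

# Example parameters: rows of P sum to 1 and P(1,1) = 0.
P = np.array([[0.5, 0.5],
              [1.0, 0.0]])
p = np.array([2/3, 1/3])

print(cylinder_measure([0, 1], p, P))   # p(0) P(0,1) = 1/3
print(cylinder_measure([1, 1], p, P))   # 0.0: the forbidden cylinder gets no mass
```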
Note that if $\mu$ is to be a $T$ invariant measure then we must have

$$\mu([0]) = \mu(T^{-1}[0]) = \mu([00]) + \mu([10]) \qquad \mu([1]) = \mu(T^{-1}[1]) = \mu([01]) + \mu([11])$$

which is to say that

$$p(0) = p(0) P(0,0) + p(1) P(1,0) \qquad p(1) = p(0) P(0,1) + p(1) P(1,1)$$

holds, so we assume this of our parameters.
We will take for granted that the above formula defines a measure $\mu$ on $\{0,1\}^{\mathbb{N}}$. Let us verify that it is $T$ invariant. Recall that it suffices to check that $\mu(C)$ and $\mu(T^{-1}C)$ agree for all cylinder sets $C$ because such sets form a π system. Write $C = [\epsilon_1 \cdots \epsilon_n]$. We calculate

$$\begin{aligned} \mu(T^{-1}C) &= \mu([0 \epsilon_1 \cdots \epsilon_n]) + \mu([1 \epsilon_1 \cdots \epsilon_n]) \\ &= \big( p(0) P(0,\epsilon_1) + p(1) P(1,\epsilon_1) \big) P(\epsilon_1,\epsilon_2) \cdots P(\epsilon_{n-1},\epsilon_n) \\ &= p(\epsilon_1) P(\epsilon_1,\epsilon_2) \cdots P(\epsilon_{n-1},\epsilon_n) = \mu(C) \end{aligned}$$

so $\mu$ is $T$ invariant.
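The same calculation can be checked numerically; the following sketch (with parameter values of our choosing satisfying the stationarity equations) verifies $\mu(T^{-1}C) = \mu(C)$ over all cylinders of length 4.

```python
import numpy as np
from itertools import product

P = np.array([[0.7, 0.3],
              [1.0, 0.0]])
p = np.array([1 / 1.3, 0.3 / 1.3])   # stationary: p(0) = p(0)P(0,0) + p(1)P(1,0)

def mu(word):
    m = p[word[0]]
    for a, b in zip(word, word[1:]):
        m *= P[a, b]
    return m

for w in product((0, 1), repeat=4):
    # T^{-1}[w] is the disjoint union of the cylinders [0w] and [1w]
    assert np.isclose(mu((0,) + w) + mu((1,) + w), mu(w))
print("mu(T^{-1}C) = mu(C) on every cylinder of length 4")
```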
We have total freedom in the parameters $P(0,0)$ and $p(0)$. Fixing their values then determines $P(0,1)$ and $p(1)$ by the laws of total probability. What is the best way to choose their values? Absent any other information about the dynamics, or any other quantity that we might be interested in, it is often reasonable to choose the values that maximize the entropy.
Proposition
For the above measure $\mu$ the quantity

$$-p(0) \big( P(0,0) \log P(0,0) + P(0,1) \log P(0,1) \big)$$

is the entropy of $T$ with respect to $\mu$.
Proof:
As the partition $\xi = \{[0],[1]\}$ is a generator for the Borel σ algebra on $\{0,1\}^{\mathbb{N}}$ the limit

$$h(T,\mu) = \lim_{n \to \infty} \frac{1}{n} H\!\left( \bigvee_{i=0}^{n-1} T^{-i} \xi \right) = \lim_{n \to \infty} -\frac{1}{n} \sum_{\epsilon_1,\dots,\epsilon_n} \mu([\epsilon_1 \cdots \epsilon_n]) \log \mu([\epsilon_1 \cdots \epsilon_n])$$

is the entropy we want to calculate. We have

$$-\sum_{\epsilon_1,\dots,\epsilon_n} \mu([\epsilon_1 \cdots \epsilon_n]) \log \mu([\epsilon_1 \cdots \epsilon_n]) = -\sum_i p(i) \log p(i) - (n-1) \sum_i \sum_j p(i) P(i,j) \log P(i,j)$$

by writing $\log \mu([\epsilon_1 \cdots \epsilon_n])$ as a sum of logarithms and then using repeatedly both

$$\sum_j P(i,j) = 1 \qquad \sum_i p(i) P(i,j) = p(j)$$

for $i,j \in \{0,1\}$. Dividing by $n$ and taking the limit as $n \to \infty$ gives the desired result, as $P(1,0) = 1$ and $P(1,1) = 0$ in our special case. ▮
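One can also watch the convergence numerically; the sketch below (parameter values ours) compares the cylinder sums divided by $n$ with the closed form from the proposition.

```python
import numpy as np
from itertools import product

P = np.array([[0.5, 0.5],
              [1.0, 0.0]])
p = np.array([2/3, 1/3])   # stationary for P

def mu(word):
    m = p[word[0]]
    for a, b in zip(word, word[1:]):
        m *= P[a, b]
    return m

def plogp(x):
    return 0.0 if x == 0 else x * np.log(x)

# closed form: -sum_i p(i) sum_j P(i,j) log P(i,j)
closed = -sum(p[i] * plogp(P[i, j]) for i in range(2) for j in range(2))

for n in (2, 4, 8, 16):
    H_n = -sum(plogp(mu(w)) for w in product((0, 1), repeat=n))
    print(n, H_n / n, closed)   # H_n / n approaches the closed form as n grows
```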
The Parry measure
We would like to maximize

$$-p(0) \big( P(0,0) \log P(0,0) + P(0,1) \log P(0,1) \big)$$

for values $p(0), p(1), P(0,0), P(0,1)$ in $[0,1]$ subject to

$$p(0) + p(1) = 1 \qquad P(0,0) + P(0,1) = 1 \qquad p(0) P(0,1) = p(1)$$

which is not so simple an optimization problem.
If we attempt to be as unbiased as possible in our walk on the graph, choosing which edge to traverse from vertex 0 each time by a fair coin toss, then we can assert

$$P(0,0) = P(0,1) = \tfrac{1}{2}$$

which then forces $p(0) = \tfrac{2}{3}$ and $p(1) = \tfrac{1}{3}$. For these particular choices we get an entropy value of

$$-\tfrac{2}{3} \left( \tfrac{1}{2} \log \tfrac{1}{2} + \tfrac{1}{2} \log \tfrac{1}{2} \right) = \tfrac{2}{3} \log 2 \approx 0.4621$$

but it is not clear that this is maximal.
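In fact a quick numerical scan already suggests it is not; the sketch below (the parametrization by the single value $t = P(0,1)$ is ours, obtained from the constraints above) evaluates the entropy on a grid.

```python
import numpy as np

# Entropy as a function of t = P(0,1): the constraints give P(0,0) = 1 - t,
# p(1) = p(0) t and p(0) + p(1) = 1, hence p(0) = 1 / (1 + t).
def plogp(x):
    return 0.0 if x == 0 else x * np.log(x)

def entropy(t):
    p0 = 1.0 / (1.0 + t)
    return -p0 * (plogp(1 - t) + plogp(t))

ts = np.linspace(0.0, 1.0, 100001)
vals = np.array([entropy(t) for t in ts])
best = int(np.argmax(vals))
print("t = 0.5      :", entropy(0.5))                     # (2/3) log 2 ~ 0.4621
print("best t =", round(ts[best], 4), ":", vals[best])    # ~ 0.382, entropy ~ 0.4812
```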
Theorem (Parry)
Let $A$ be a $k \times k$ matrix with entries from $\{0,1\}$. Suppose there is $N \in \mathbb{N}$ with all entries of $A^N$ positive. Let $\lambda$ be the largest positive eigenvalue of $A$. Fix left and right eigenvectors $u$ and $v$ of $A$ respectively, with positive entries and with

$$\sum_{i} u_i v_i = 1$$

such that both have eigenvalue $\lambda$. With

$$p(i) = u_i v_i$$

and

$$P(i,j) = \frac{A(i,j)\, v_j}{\lambda\, v_i}$$

the corresponding Markov measure maximizes the entropy for the shift map $T$ on the set

$$X_A = \{ x : A(x_n, x_{n+1}) = 1 \text{ for all } n \}$$

where transitions are determined by $A$.
Proof:
This is Theorem 8.10 in Walters. ▮
The resulting measure is the Parry measure on the Markov chain. For our example we have

$$A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$$

with eigenvalues

$$\frac{1 - \sqrt{5}}{2} \qquad \frac{1 + \sqrt{5}}{2}$$

and the latter must be $\lambda$. The vectors

$$u = \begin{pmatrix} \lambda & 1 \end{pmatrix} \qquad v = \begin{pmatrix} \lambda \\ 1 \end{pmatrix}$$

are left and right eigenvectors respectively of $A$ with eigenvalue $\lambda$. Since the ratios of the entries of the eigenvectors are unchanged by scaling we conclude that

$$P(0,0) = \frac{v_0}{\lambda v_0} = \frac{1}{\lambda} \qquad P(0,1) = \frac{v_1}{\lambda v_0} = \frac{1}{\lambda^2} \qquad P(1,0) = 1 \qquad P(1,1) = 0$$

and, after rescaling so that $u_0 v_0 + u_1 v_1 = 1$, that

$$p(0) = \frac{\lambda^2}{\lambda^2 + 1} = \frac{5 + \sqrt{5}}{10} \qquad p(1) = \frac{1}{\lambda^2 + 1} = \frac{5 - \sqrt{5}}{10}$$

are the values that will maximize the entropy. As

$$\lambda^2 = \lambda + 1$$

we conclude that

$$-p(0) \left( \frac{1}{\lambda} \log \frac{1}{\lambda} + \frac{1}{\lambda^2} \log \frac{1}{\lambda^2} \right) = \frac{\lambda^2}{\lambda + 2} \cdot \frac{\lambda + 2}{\lambda^2} \log \lambda = \log \lambda$$

giving an entropy value of

$$\log \frac{1 + \sqrt{5}}{2} \approx 0.4812$$

which is indeed larger than the entropy $\tfrac{2}{3} \log 2 \approx 0.4621$ one gets from our naive guess $P(0,0) = P(0,1) = \tfrac{1}{2}$. In fact, the entropy of the Parry measure is always $\log \lambda$.
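To close the loop, here is a sketch (using numpy's eigendecomposition; the variable names are ours) that builds the Parry measure for this $A$ directly from the theorem's formulas and confirms the entropy value $\log \lambda$.

```python
import numpy as np

# Parry measure for A = [[1,1],[1,0]]: P(i,j) = A(i,j) v_j / (lam v_i) and
# p(i) = u_i v_i, with u, v the left/right Perron eigenvectors scaled so that
# sum_i u_i v_i = 1.
A = np.array([[1.0, 1.0],
              [1.0, 0.0]])

w_r, V = np.linalg.eig(A)
k = int(np.argmax(w_r.real))
lam = w_r.real[k]
v = np.abs(V.real[:, k])          # right Perron eigenvector, made positive

w_l, U = np.linalg.eig(A.T)       # left eigenvectors of A are right eigenvectors of A^T
u = np.abs(U.real[:, int(np.argmax(w_l.real))])

P = A * v[np.newaxis, :] / (lam * v[:, np.newaxis])
p = u * v / np.dot(u, v)

entropy = -sum(p[i] * P[i, j] * np.log(P[i, j])
               for i in range(2) for j in range(2) if P[i, j] > 0)

print("lambda  =", lam)                        # (1 + sqrt(5)) / 2
print("P       =", P)                          # [[1/lam, 1/lam^2], [1, 0]]
print("p       =", p)                          # [(5 + sqrt(5))/10, (5 - sqrt(5))/10]
print("entropy =", entropy, "~ log(lambda) =", np.log(lam))
```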