Full shifts

In the previous two sections we studied the irrational rotation $T(x) = x + \alpha \bmod 1$ on $[0,1)$ where $\alpha$ was a fixed irrational number. We proved that all orbits are dense, and that all orbits are uniformly distributed in the sense that
\[
\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} 1_{[a,b)}(T^n(x)) = b - a
\]
for all $0 \le a < b \le 1$ and all $x \in [0,1)$.

The full shift on two symbols

We are next going to study the same concepts for a different dynamical system. We will take $X = \{0,1\}^{\mathbb{N}}$ and study the map $T : X \to X$ defined by $(T(x))(n) = x(n+1)$ for all $x \in X$ and all $n \in \mathbb{N}$. A point $x \in X$ is an infinite sequence of zeroes and ones. We can represent such sequences as infinite strings
\[
\begin{aligned}
x &= 01101101001010101100010101000101001001010\cdots \\
T(x) &= 1101101001010101100010101000101001001010\cdots \\
T^2(x) &= 101101001010101100010101000101001001010\cdots \\
T^3(x) &= 01101001010101100010101000101001001010\cdots
\end{aligned}
\]
and in that representation the effect of $T$ is to discard the first term.
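The action of the shift can be sketched on finite prefixes of a point. The following is only an illustration: a computer cannot store an infinite sequence, so a Python string stands in for the first few terms, and the names `shift` and `orbit_prefixes` are hypothetical helpers introduced here.

```python
def shift(prefix):
    """Apply T to a finite prefix of a point of {0,1}^N by discarding the first symbol."""
    return prefix[1:]


def orbit_prefixes(prefix, steps):
    """Return the known prefixes of x, T(x), ..., T^steps(x), given a prefix of x."""
    out = [prefix]
    for _ in range(steps):
        prefix = shift(prefix)
        out.append(prefix)
    return out


# The prefix of x displayed in the text above.
x = "01101101001010101100010101000101001001010"
for n, s in enumerate(orbit_prefixes(x, 3)):
    print(f"T^{n}(x) = {s}...")
```

Each application of the shift shortens the known prefix by one symbol, which reflects the fact that determining the first $r$ terms of $T^n(x)$ requires the first $n + r$ terms of $x$.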

In comparison with irrational rotations, the qualitative behaviour of the orbits of T can vary dramatically. All of the following are possible.

The behaviour of empirical averages is also much more difficult to control. When working with irrational rotations we studied the frequencies with which orbit segments visited $[a,b)$ in the long term by considering the quantity
\[
\frac{1}{N} \sum_{n=0}^{N-1} 1_{[a,b)}(T^n(x)) = \frac{|\{ 0 \le n \le N-1 : a \le T^n(x) < b \}|}{N}
\]
in the limit $N \to \infty$. We want to do the same thing for our shift map $T$ on $\{0,1\}^{\mathbb{N}}$. What do we use instead of intervals?

Definition

By a cylinder set we mean any set of the form
\[
\{ x \in \{0,1\}^{\mathbb{N}} : x(i_1) = \epsilon(i_1), \dots, x(i_r) = \epsilon(i_r) \}
\]
where $i_1 < \cdots < i_r$ are natural numbers and each $\epsilon(i_j)$ is either $0$ or $1$.

A cylinder set is a subset of $\{0,1\}^{\mathbb{N}}$ defined by specifying the values to be taken by sequences in $\{0,1\}^{\mathbb{N}}$ at finitely many indices.

Example

If $r = 1$, $i_1 = 2$ and $\epsilon(2) = 0$ then the corresponding cylinder is the set of all sequences in $X$ that have a zero in the second position.

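We will use a special notation for cylinder sets with $i_1 = 1, \dots, i_r = r$ as these are the cylinder sets we will work with most often. Write
\[
[\epsilon(1)\epsilon(2)\cdots\epsilon(r)] = \{ x \in \{0,1\}^{\mathbb{N}} : x(1) = \epsilon(1), \dots, x(r) = \epsilon(r) \}
\]
for any $\epsilon(1), \dots, \epsilon(r)$ in $\{0,1\}$. So, for example, we have
\[
\begin{aligned}
[0] &= \{ x \in \{0,1\}^{\mathbb{N}} : x(1) = 0 \} \\
[11] &= \{ x \in \{0,1\}^{\mathbb{N}} : x(1) = 1, x(2) = 1 \} \\
[01] &= \{ x \in \{0,1\}^{\mathbb{N}} : x(1) = 0, x(2) = 1 \} \\
[101] &= \{ x \in \{0,1\}^{\mathbb{N}} : x(1) = 1, x(2) = 0, x(3) = 1 \}
\end{aligned}
\]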

The cylinder sets will be our analogues in $\{0,1\}^{\mathbb{N}}$ of the intervals $[a,b)$ in $[0,1)$. Topologically, cylinder sets are slightly better behaved than intervals. For one thing, every cylinder set is both open and closed with respect to the metric
\[
d(x,y) = \sum_{n=1}^{\infty} \frac{|x(n) - y(n)|}{2^n}
\]
on $\{0,1\}^{\mathbb{N}}$. For another, we can cover $\{0,1\}^{\mathbb{N}}$ by cylinders without overlap. For example
\[
\{0,1\}^{\mathbb{N}} = [0] \cup [10] \cup [110] \cup [111]
\]
is a cover of $\{0,1\}^{\mathbb{N}}$ by pairwise disjoint sets that are both open and closed.
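That the four cylinders $[0]$, $[10]$, $[110]$, $[111]$ cover the space without overlap can be checked mechanically, since membership in each of them depends only on the first three terms of a sequence. The sketch below assumes cylinders are written as strings of `0` and `1`; `in_cylinder` and `cover_counts` are hypothetical names chosen for this illustration.

```python
from itertools import product


def in_cylinder(prefix, word):
    """Decide whether every sequence beginning with `prefix` lies in the cylinder [word].

    The prefix must be at least as long as the word, otherwise membership
    is not determined by the prefix alone.
    """
    if len(prefix) < len(word):
        raise ValueError("prefix too short to decide membership")
    return prefix.startswith(word)


def cover_counts(cover):
    """For each prefix of length max(len(word)), count how many cylinders contain it."""
    depth = max(len(word) for word in cover)
    counts = {}
    for bits in product("01", repeat=depth):
        prefix = "".join(bits)
        counts[prefix] = sum(in_cylinder(prefix, word) for word in cover)
    return counts


counts = cover_counts(["0", "10", "110", "111"])
# Every prefix of length 3 is hit exactly once: the cylinders cover and are disjoint.
print(all(c == 1 for c in counts.values()))
```

A count of zero for some prefix would mean the cylinders fail to cover, and a count above one would mean two of them overlap.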

What we want to do is to investigate the extent to which
\[
\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} 1_C(T^n(x))
\]
exists for points $x \in \{0,1\}^{\mathbb{N}}$ and cylinders $C \subset \{0,1\}^{\mathbb{N}}$.
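For a cylinder of the special form $C = [\epsilon(1)\cdots\epsilon(r)]$ the finite averages can be computed from a long enough prefix of $x$, because $T^n(x) \in C$ exactly when the word $\epsilon(1)\cdots\epsilon(r)$ occurs in $x$ starting at position $n+1$. A minimal sketch follows, with the hypothetical helper `empirical_average` and a periodic point chosen purely as an example.

```python
def empirical_average(prefix, word, N):
    """Compute (1/N) * #{0 <= n <= N-1 : T^n(x) in [word]} from a finite prefix of x.

    T^n(x) lies in [word] exactly when `word` occurs in x starting at
    0-based string position n, so the prefix needs length >= N - 1 + len(word).
    """
    if len(prefix) < N - 1 + len(word):
        raise ValueError("prefix too short to decide all N visits")
    visits = sum(1 for n in range(N) if prefix.startswith(word, n))
    return visits / N


# Example point: the periodic sequence x = 010101..., truncated to 100 terms.
x = "01" * 50
print(empirical_average(x, "01", 98))  # T^n(x) is in [01] exactly when n is even
print(empirical_average(x, "00", 98))  # the word 00 never occurs in x
```

For this periodic point the averages settle down immediately. A finite computation of this kind can only suggest, not establish, whether the limit exists for a general point.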

A measure on the full shift

When analyzing irrational rotations we deduced that the Lebesgue measure was responsible for the limiting values of the empirical averages we were interested in. To analyze the shift map $T$ on $\{0,1\}^{\mathbb{N}}$ we similarly need a measure on the Borel subsets of $\{0,1\}^{\mathbb{N}}$. Fix $0 \le p \le 1$ and let us declare that
\[
P([\epsilon(1)\cdots\epsilon(r)]) = p^{|\{ 1 \le i \le r : \epsilon(i) = 1 \}|} (1-p)^{|\{ 1 \le i \le r : \epsilon(i) = 0 \}|}
\]
so that, for example,
\[
P([1]) = p \qquad P([0]) = 1 - p \qquad P([101]) = p^2 (1-p)
\]
and then put
\[
\Xi(E) = \inf \left\{ \sum_{n=1}^{\infty} P(C_n) : E \subset \bigcup_{n=1}^{\infty} C_n \text{ for } C_1, C_2, \dots \text{ cylinders} \right\}
\]
which defines an outer measure on $\{0,1\}^{\mathbb{N}}$. The Carathéodory construction we used to construct Lebesgue measure can be applied in the same way to construct a measure $\mu_p$ on the Borel subsets of $\{0,1\}^{\mathbb{N}}$ with the property that $\mu_p(C) = P(C)$ for every cylinder $C$. We take the existence of such measures for granted without going through the details again.
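The formula for $P$ is easy to evaluate exactly. The sketch below uses Python's `fractions.Fraction` to avoid rounding, with a hypothetical helper `cylinder_prob` and the arbitrary choice $p = 1/3$. It checks two consistency properties one would expect of a measure: the disjoint cover $[0], [10], [110], [111]$ has total mass $1$, and splitting a cylinder $[w]$ into $[w0]$ and $[w1]$ preserves mass.

```python
from fractions import Fraction


def cylinder_prob(word, p):
    """Return P([word]) = p^(number of ones) * (1 - p)^(number of zeros)."""
    ones = word.count("1")
    zeros = word.count("0")
    if ones + zeros != len(word):
        raise ValueError("word must consist only of the symbols 0 and 1")
    return p**ones * (1 - p) ** zeros


p = Fraction(1, 3)  # an arbitrary illustrative value of p

print(cylinder_prob("101", p))  # p^2 * (1 - p)

# The disjoint cover [0], [10], [110], [111] has total mass 1.
print(sum(cylinder_prob(w, p) for w in ["0", "10", "110", "111"]))

# Splitting [w] into [w0] and [w1] preserves mass.
w = "101"
print(cylinder_prob(w + "0", p) + cylinder_prob(w + "1", p) == cylinder_prob(w, p))
```

These checks are only sanity tests on finitely many cylinders. They do not replace the Carathéodory construction that the text takes for granted.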

We call the resulting measure $\mu_p$ the $(p, 1-p)$ coin measure and we call the $(\tfrac{1}{2}, \tfrac{1}{2})$ coin measure the fair coin measure. Our main goal is to use these measures to prove the following theorem.

Theorem

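Fix $0 \le p \le 1$. The set
\[
\left\{ x \in \{0,1\}^{\mathbb{N}} : \lim_{N \to \infty} \frac{|\{ 1 \le n \le N : x(n) = 1 \}|}{N} = p \right\}
\]
has full measure with respect to $\mu_p$.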

Taking $C = [1]$ and writing
\[
\frac{|\{ 1 \le n \le N : x(n) = 1 \}|}{N} = \frac{1}{N} \sum_{n=1}^{N} x(n) = \frac{1}{N} \sum_{n=0}^{N-1} 1_C(T^n(x))
\]
gives us the same perspective, that of averaging functions along orbit segments, as was fruitful when working with irrational rotations. The second equality holds because $1_C(T^n(x)) = 1$ exactly when $(T^n(x))(1) = x(n+1) = 1$. However, we will not be able to proceed as smoothly because we do not have an analogue of the functions $\psi_k$ whose average over the orbit of an irrational rotation we were able to calculate relatively easily. Instead we will take a more probabilistic approach.
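The theorem can be illustrated numerically, though not proved, by sampling. Under $\mu_p$ the coordinates $x(1), x(2), \dots$ behave like independent tosses of a coin that shows $1$ with probability $p$, so a pseudo-random generator gives an approximate sample of a $\mu_p$-typical point. The helper `empirical_frequency`, the seed and the value $p = 0.3$ below are assumptions made for this sketch.

```python
import random


def empirical_frequency(p, N, seed=0):
    """Sample x(1), ..., x(N) as independent p-coin tosses and return the fraction of ones.

    This equals (1/N) * sum_{n=0}^{N-1} 1_C(T^n(x)) with C = [1] for the sampled point x.
    """
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    ones = 0
    for _ in range(N):
        if rng.random() < p:  # this toss gives x(n) = 1, which happens with probability p
            ones += 1
    return ones / N


p = 0.3
for N in (100, 10_000, 1_000_000):
    print(N, empirical_frequency(p, N))
```

The printed frequencies should approach $p$ as $N$ grows, which is consistent with the theorem but says nothing about the exceptional set of measure zero.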