Over the previous four sections we covered two specific examples of dynamical systems: the irrational rotation and the full shift on two symbols. In both examples we were able to identify a measure that controlled the limiting behaviour of empirical averages. In this section we will begin to develop the abstract setting into which both examples fit: that of measure-preserving maps between finite measure spaces. This is the beginning of ergodic theory.
A measure space $(X,\B,\mu)$ is a probability space if $\mu(X) = 1$. We will only study measure-preserving maps of probability spaces in this course. The subject of infinite ergodic theory is concerned with the more general setting of measure-preserving transformations on infinite measure spaces.
Fix a probability space $(X,\B,\mu)$. A measurable map $T : X \to X$ is measure-preserving if \[ \mu(T^{-1}(B)) = \mu(B) \] for all $B \in \B$.
Fix a measurable space $(X,\B)$ and a measurable map $T : X \to X$. A measure $\mu$ on $(X,\B)$ is invariant for $T$ if \[ \mu(T^{-1}(B)) = \mu(B) \] for all $B \in \B$.
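If one takes the view that points of $X$ are the possible states of a system, that sets in $\B$ are events, that $\mu(B)$ is the likelihood of the event $B$, and that $T$ advances the system by one moment in time,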
then the assumption that $T$ is measure-preserving says roughly that events are no more or less likely from moment to moment. For example, if $X = [0,1)$ and $B = [0,\log(2))$ then $\mu(B)$ is the likelihood that a random point in $X$ is less than $\log(2)$. The set \[ T^{-1}(B) = \{ x \in X : T(x) \in B \} \] then consists of those points $x \in X$ that will be in $B$ one moment later, and if $T$ is measure-preserving then it is as likely now as later that a random point will be smaller than $\log(2)$.
In drawing on the above analogy we must be careful: what is meant by "random" is specified by $\mu$, and this puts the cart before the horse to some extent. In applications one often starts with interesting dynamics in the form of a measurable map $T : X \to X$ and seeks a measure that is invariant for $T$ in order to apply the tools of ergodic theory, rather than beginning with a probability space $(X,\B,\mu)$ and attempting to understand its measure-preserving transformations.
We need, before going much further, to be able to deduce whether a given measurable map is measure-preserving. One way of achieving this is via the π-λ theorem, which tells us we only need to check that two measures agree on special collections of sets in $\B$ known as π-systems.
Fix a set $X$. A collection $\mathcal{D} \subset \P(X)$ is a π-system if $\mathcal{D}$ is non-empty and whenever $A,B$ belong to $\mathcal{D}$ one has $A \cap B$ in $\mathcal{D}$ as well.
Two particular π-systems will be important for us later on.
Take $X = [0,1)$. The collection \[ \{ [a,b) : 0 \le a < b \le 1 \} \cup \{ \emptyset \} \] is a π-system on $X$.
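The closure under intersection claimed above can be checked mechanically. The following is a minimal sketch, assuming endpoints are rational so that exact arithmetic is available; the name `intersect` and the convention that `None` stands for the empty set are illustrative choices, not part of the notes.

```python
from fractions import Fraction
from typing import Optional, Tuple

# The pair (a, b) stands for the half-open interval [a, b) with 0 <= a < b <= 1.
Interval = Tuple[Fraction, Fraction]


def intersect(i: Interval, j: Interval) -> Optional[Interval]:
    """Return the intersection of [a,b) and [c,d), or None for the empty set."""
    a, b = i
    c, d = j
    lo, hi = max(a, c), min(b, d)
    # A non-empty intersection is again a half-open interval [lo, hi).
    return (lo, hi) if lo < hi else None


half = Fraction(1, 2)
print(intersect((Fraction(0), half), (Fraction(1, 4), Fraction(1))))  # the interval [1/4, 1/2)
print(intersect((Fraction(0), half), (half, Fraction(1))))            # empty, so None
```

Either outcome lies in the collection, which is why the empty set has to be included alongside the intervals.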
Take $X = \{0,1\}^\N$. The collection \[ \{ C \subset X : C \textsf{ is a cylinder} \} \cup \{ \emptyset \} \] is a π-system on $X$.
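To see why cylinders are closed under intersection, here is a small sketch. It assumes that a cylinder $[\epsilon(1) \cdots \epsilon(r)]$ is the set of sequences beginning with the finite word $\epsilon(1) \cdots \epsilon(r)$, and represents it by that word; the function name `intersect_cylinders` is hypothetical.

```python
from typing import Optional, Tuple

# The word (e1, ..., er) stands for the cylinder [e1 ... er] of all
# sequences in {0,1}^N that begin with e1, ..., er (an assumed convention).
Word = Tuple[int, ...]


def intersect_cylinders(u: Word, v: Word) -> Optional[Word]:
    """Return the word describing the intersection of [u] and [v], or None if it is empty."""
    short, long_ = (u, v) if len(u) <= len(v) else (v, u)
    # The two cylinders meet exactly when the shorter word is a prefix of
    # the longer one, in which case the intersection is the smaller cylinder.
    return long_ if long_[:len(short)] == short else None


print(intersect_cylinders((0, 1), (0, 1, 1)))  # the cylinder [011]
print(intersect_cylinders((0, 1), (1,)))       # empty, so None
```

Two cylinders are therefore either nested or disjoint, so their intersection is a cylinder or the empty set.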
The following theorem - a special case of the π-λ theorem suited to our needs - will simplify the task of verifying a measurable map is measure-preserving.
Fix a measurable space $(X,\B)$. Let $\mu$ and $\nu$ be probability measures on $(X,\B)$. If $\mathcal{D}$ is a π-system with $\sigma(\mathcal{D}) \supset \B$ and \[ D \in \mathcal{D} \Rightarrow \mu(D) = \nu(D) \] then $\mu = \nu$.
We will take this for granted. ▮
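Fix a probability space $(X,\B,\mu)$ and a π-system $\mathcal{D}$ with $\sigma(\mathcal{D}) \supset \B$. A measurable map $T : X \to X$ is measure-preserving if \[ \mu(D) = \mu(T^{-1}(D)) \] for all $D \in \mathcal{D}$.
Since $T$ is measurable the map $\nu : \B \to [0,1]$ defined by $\nu(B) = \mu(T^{-1}(B))$ is a probability measure on $(X,\B)$: countable additivity holds because preimages respect disjoint unions, and $\nu(X) = \mu(T^{-1}(X)) = \mu(X) = 1$. By hypothesis $\mu$ and $\nu$ agree on $\mathcal{D}$, so the π-λ theorem gives $\mu = \nu$, which is exactly the statement that $T$ is measure-preserving. ▮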
Let us check that our two foundational examples are measure-preserving transformations.
Take $X = [0,1)$ and fix $\alpha \in \R$. Define $T : X \to X$ by \[ T(x) = x + \alpha \bmod 1 \] for all $x \in X$. Let $\B$ be the Borel σ-algebra on $X$ and let $\mu$ be the restriction of Lebesgue measure to $\B$. Let us check that $\mu$ is an invariant measure for $T$. It suffices, by the π-λ theorem, to check that \[ \mu([a,b)) = \mu(T^{-1}([a,b))) \] for all $0 \le a < b \le 1$. But $T^{-1}([a,b))$ is either an interval of length $b-a$ or a union of two disjoint intervals whose lengths sum to $b-a$. In either case, the total length is unchanged.
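The two cases in this argument can be made concrete. The sketch below computes the pieces of $T^{-1}([a,b))$ exactly, under the assumption that $a$, $b$ and $\alpha$ are rational so that `fractions.Fraction` applies (an irrational $\alpha$ would need a different representation); the name `rotation_preimage` is illustrative.

```python
from fractions import Fraction
from typing import List, Tuple

Interval = Tuple[Fraction, Fraction]  # (lo, hi) stands for [lo, hi)


def rotation_preimage(a: Fraction, b: Fraction, alpha: Fraction) -> List[Interval]:
    """Pieces of T^{-1}([a,b)) for T(x) = x + alpha mod 1, with 0 <= a < b <= 1."""
    lo = (a - alpha) % 1   # left endpoint of the shifted interval, reduced mod 1
    hi = lo + (b - a)      # right endpoint before wrapping around
    if hi <= 1:
        return [(lo, hi)]  # a single interval of length b - a
    # The shifted interval wraps past 1 and splits into two disjoint pieces.
    return [(lo, Fraction(1)), (Fraction(0), hi - 1)]


pieces = rotation_preimage(Fraction(1, 4), Fraction(3, 4), Fraction(1, 2))
total_length = sum(hi - lo for lo, hi in pieces)
print(pieces)        # the two pieces [3/4, 1) and [0, 1/4)
print(total_length)  # 1/2, the length of [1/4, 3/4)
```

Whichever branch is taken, the lengths of the pieces sum to $b - a$, matching the claim that the total length is unchanged.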
Take $X = \{0,1\}^\N$ and define $T : X \to X$ by \[ (T(x))(n) = x(n+1) \] for all $x \in X$. Let $\B$ be the Borel σ-algebra on $X$ and let $\mu$ be the $(p,1-p)$ coin measure on $(X,\B)$. Let us check that $\mu$ is an invariant measure for $T$. It suffices, by the π-λ theorem, to check that \[ \mu([\epsilon(1) \cdots \epsilon(r)]) = \mu(T^{-1}([\epsilon(1) \cdots \epsilon(r)])) \] for all $r \in \N$ and all $\epsilon(1),\dots,\epsilon(r) \in \{0,1\}$. But \[ T^{-1}([\epsilon(1) \cdots \epsilon(r)]) = [0\epsilon(1) \cdots \epsilon(r)] \cup [1\epsilon(1) \cdots \epsilon(r)] \] is a disjoint union of two cylinders whose measures are $p$ and $1-p$ times $\mu([\epsilon(1) \cdots \epsilon(r)])$, so it has the same measure as $[\epsilon(1) \cdots \epsilon(r)]$.
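The direct calculation can also be carried out in exact arithmetic. This is a sketch only: it assumes that the coin measure gives the symbol $0$ weight $p$ and the symbol $1$ weight $1-p$ (the opposite convention works identically), and the function names are hypothetical.

```python
from fractions import Fraction
from typing import Sequence


def cylinder_measure(word: Sequence[int], p: Fraction) -> Fraction:
    """Coin measure of the cylinder [word]; assumes 0 has weight p and 1 has weight 1 - p."""
    measure = Fraction(1)
    for symbol in word:
        measure *= p if symbol == 0 else 1 - p
    return measure


def shift_preimage_measure(word: Sequence[int], p: Fraction) -> Fraction:
    """Measure of T^{-1}([word]), the disjoint union of [0 word] and [1 word]."""
    return cylinder_measure((0, *word), p) + cylinder_measure((1, *word), p)


p = Fraction(1, 3)
word = (0, 1, 1)
print(cylinder_measure(word, p))        # 4/27
print(shift_preimage_measure(word, p))  # 4/27 again, since p + (1 - p) = 1
```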
We finish with our first result about measure-preserving dynamical systems. It states that an event $B \in \B$ with positive measure must recur as one continues to iterate the dynamics: a positive-measure set of points of $B$ returns to $B$ at some later time.
Let $(X,\B,\mu,T)$ be a system. For every $B \in \B$ with $\mu(B) > 0$ there is $n \in \N$ with $\mu(B \cap (T^n)^{-1}(B)) > 0$.
Suppose to the contrary that $\mu(B \cap (T^n)^{-1}(B)) = 0$ for all $n \in \N$. Since \[ (T^k)^{-1}(B) \cap (T^{k+n})^{-1}(B) = (T^k)^{-1}\Big( B \cap (T^n)^{-1}(B) \Big) \] and $\mu$ is $T$-invariant, we deduce that \[ \mu \Big( (T^k)^{-1}(B) \cap (T^{k+n})^{-1}(B) \Big) = 0 \] for all $k \ge 0$ and all $n \in \N$. The sets $(T^n)^{-1}(B)$ in $\B$ are therefore pairwise disjoint up to sets of measure zero, and each has measure $\mu(B)$. But then we have for all $N \in \N$ that \[ 1 = \mu(X) \ge \mu\Big( B \cup T^{-1}(B) \cup \cdots \cup (T^{N-1})^{-1}(B) \Big) = N \mu(B) \] yielding a contradiction if one chooses $N > \mu(B)^{-1}$. ▮
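The counting argument in the proof also bounds how long one must wait: among $B, T^{-1}(B), \dots, (T^{N-1})^{-1}(B)$ with $N > \mu(B)^{-1}$ two sets must overlap in positive measure, so some $n \le \mu(B)^{-1}$ works. The sketch below illustrates this for the rotation with $B = [0,c)$. It assumes $c$ and $\alpha$ are rational, which keeps the arithmetic exact and guarantees the search loop terminates; all function names are hypothetical.

```python
from fractions import Fraction


def overlap(i, j):
    """Length of the intersection of the half-open intervals i and j."""
    return max(Fraction(0), min(i[1], j[1]) - max(i[0], j[0]))


def return_measure(c, alpha, n):
    """mu(B intersected with (T^n)^{-1}(B)) for B = [0, c) and T(x) = x + alpha mod 1."""
    lo = (-n * alpha) % 1  # (T^n)^{-1}(B) is the arc of length c starting at -n*alpha mod 1
    hi = lo + c
    pieces = [(lo, hi)] if hi <= 1 else [(lo, Fraction(1)), (Fraction(0), hi - 1)]
    return sum(overlap((Fraction(0), c), piece) for piece in pieces)


def first_return(c, alpha):
    """Smallest n >= 1 with positive return measure (terminates for rational alpha, c > 0)."""
    n = 1
    while return_measure(c, alpha, n) == 0:
        n += 1
    return n


c, alpha = Fraction(1, 5), Fraction(3, 8)
n = first_return(c, alpha)
print(n)                            # 3
print(return_measure(c, alpha, n))  # 3/40
print(n <= 1 / c)                   # True: the bound from the proof, here 3 <= 5
```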