When $a \le b$ the interval $(a,b)$ has a length of $b-a$. In this section we investigate the extent to which we can assign a length to other subsets of $\R$.
If a set $B \subset \R$ is contained within a union of intervals, it seems entirely reasonable that any notion of length we might come up with would say the following: the length of $B$ is not larger than the sum of the lengths of the intervals. That is, if \[ B \subset (a_1,b_1) \cup (a_2,b_2) \cup \cdots \cup (a_N,b_N) \] then \[ \textsf{Length}(B) \le \sum_{n=1}^N b_n - a_n \] holds.
Our first major idea is that, if we choose the intervals $(a_1,b_1),\dots,(a_N,b_N)$ wisely, then there may be very little discrepancy between the set $B$ and the union of the intervals; in that case the sum \[ \sum_{n=1}^N b_n - a_n \] could be considered very close to the length of $B$.
The above idea is formalized mathematically as follows. We define the outer measure of a set $B \subset \R$ by finding the most efficient way of covering it by a union of intervals. Thus we define \[ \Lambda(B) = \inf \left\{ \sum_{n=1}^\infty b_n - a_n : B \subset \bigcup_{n=1}^\infty (a_n,b_n) \right\} \] for all sets $B \subset \R$. It is the infimum that picks out the most efficient way of covering $B$ by intervals. Notice that, in the definition, we allow not only unions of a finite number of intervals, but unions of an infinite number as well.
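As a rough illustration of the definition, the following Python sketch builds one particular cover of a finite list of points and adds up the interval lengths. The helper names `cover_points`, `total_length` and `is_covered` are assumptions made for this example only. The computation gives an upper bound for $\Lambda$ of the set and not $\Lambda$ itself, because the infimum ranges over all covers and not just the one constructed here.

```python
from fractions import Fraction

def cover_points(points, eps):
    """Cover the n-th point x (n = 1, 2, ...) by the open interval
    centred at x whose length is eps / 2**n."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 1)  # half of the length eps / 2**n
        intervals.append((Fraction(x) - half, Fraction(x) + half))
    return intervals

def total_length(intervals):
    """Sum of b - a over all intervals (a, b) in the cover."""
    return sum((b - a for a, b in intervals), Fraction(0))

def is_covered(x, intervals):
    """True if x lies in at least one of the open intervals."""
    return any(a < x < b for a, b in intervals)

# A hypothetical five-point set B and a tolerance eps.
points = [Fraction(0), Fraction(1, 2), Fraction(1, 3), Fraction(2, 3), Fraction(1)]
eps = Fraction(1, 100)
cover = cover_points(points, eps)
print(total_length(cover))
```

Since `eps` can be taken as small as we like, covers of this kind show that the sum of lengths for this finite set can be pushed below any positive number.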
The function $\Lambda$ is defined on $\mathcal{P}(\R)$. It assigns a value in $[0,\infty]$ to every subset of $\R$. It has the following properties, which are very reasonable when thinking of $\Lambda$ as assigning length to subsets of $\R$.
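The map $\Lambda : \mathcal{P}(\R) \to [0,\infty]$ has the following properties.
(1) $\Lambda(\emptyset) = 0$.
(2) If $A \subset B \subset \R$ then $\Lambda(A) \le \Lambda(B)$.
(3) $\Lambda \left( \, \bigcup_{i=1}^\infty A_i \right) \le \sum_{i=1}^\infty \Lambda(A_i)$ for every sequence $i \mapsto A_i$ of subsets of $\R$.
(4) $\Lambda(A - t) = \Lambda(A)$ for all $A \subset \R$ and all $t \in \R$.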
First we prove (1). We must show that \[ \left\{ \sum_{n=1}^\infty b_n - a_n : \emptyset \subset \bigcup_{n=1}^\infty (a_n,b_n) \right\} \] has an infimum of zero. Every member of the set is non-negative, so it suffices to show that for every $\epsilon > 0$ the set has a member no larger than $\epsilon$. Since any sequence of intervals covers the empty set, we have complete freedom to choose the endpoints, and can for example take $a_n = 0$ and $b_n = \epsilon/2^n$ for all $n \in \N$. For this choice we have \[ \sum_{n=1}^\infty b_n - a_n = \sum_{n=1}^\infty \dfrac{\epsilon}{2^n} = \epsilon \] and therefore $\epsilon$ belongs to the above set. Since $\epsilon > 0$ was arbitrary, the infimum must be zero.
For (2) fix $A \subset \R$ and $B \subset \R$ with $A \subset B$. To prove that one infimum is no larger than another, it suffices to prove that the first is taken over a larger set: if $S \supset T$ then $\inf S \le \inf T$. Thus, if we can show that \[ \begin{align*} & \left\{ \sum_{n=1}^\infty b_n - a_n : A \subset \bigcup_{n=1}^\infty (a_n,b_n) \right\} \\ & \quad \supset \left\{ \sum_{n=1}^\infty b_n - a_n : B \subset \bigcup_{n=1}^\infty (a_n,b_n) \right\} \end{align*} \] then we will immediately have $\Lambda(A) \le \Lambda(B)$. But this containment is immediate as any sequence of intervals that covers $B$ must cover the smaller set $A$.
Next we prove (3). Fix $\epsilon > 0$. For each $i \in \N$ let \[ (a^i_1,b^i_1),(a^i_2,b^i_2),\dots,(a^i_n,b^i_n),\dots \] be a cover of $A_i$ with \[ \Lambda(A_i) + \dfrac{\epsilon}{2^i} \ge \sum_{n=1}^\infty b^i_n - a^i_n \] and let $(c_n,d_n)$ be a sequence of intervals that enumerates all the intervals $(a^i_n,b^i_n)$ for all $i,n \in \N$. Such covers exist by the definition of the infimum (if $\Lambda(A_i) = \infty$ any cover will do). Certainly the intervals $(c_n,d_n)$ cover the union of the $A_i$ and therefore \[ \begin{align*} \Lambda \left( \, \bigcup_{i=1}^\infty A_i \right) & \le \sum_{n=1}^\infty d_n - c_n \\ &= \sum_{i=1}^\infty \sum_{n=1}^\infty b^i_n - a^i_n \le \sum_{i=1}^\infty \left( \Lambda(A_i) + \dfrac{\epsilon}{2^i} \right) = \epsilon + \sum_{i=1}^\infty \Lambda(A_i) \end{align*} \] which gives what we want as $\epsilon > 0$ was arbitrary.
Lastly, for (4) note that the translate of a cover of $A$ is a cover of $A-t$ and vice versa.▮
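Two bookkeeping steps in the proof of (3) can be made concrete: listing the doubly indexed intervals $(a^i_n,b^i_n)$ as a single sequence, and checking that the slack terms $\epsilon/2^i$ never add up to more than $\epsilon$. The Python sketch below shows one possible enumeration, along anti-diagonals. The proof itself does not fix any particular enumeration, and the names `diagonal_pairs` and `budget_partial_sum` are hypothetical.

```python
from fractions import Fraction
from itertools import islice

def diagonal_pairs():
    """Yield every pair (i, n) of positive integers exactly once,
    one anti-diagonal i + n = s at a time."""
    s = 2
    while True:
        for i in range(1, s):
            yield (i, s - i)
        s += 1

def budget_partial_sum(eps, N):
    """Sum of eps / 2**i for i = 1..N: the total slack allowed
    for the covers of A_1, ..., A_N."""
    return sum((Fraction(eps) / 2 ** i for i in range(1, N + 1)), Fraction(0))

# The first six index pairs; the k-th interval (c_k, d_k) would be
# the interval with indices given by the k-th pair.
first_six = list(islice(diagonal_pairs(), 6))
print(first_six)
print(budget_partial_sum(1, 10))
```

Every pair $(i,n)$ appears after finitely many steps, which is all the proof needs from the enumeration.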
The properties we have proved for $\Lambda$ are reasonable if we think of $\Lambda$ as assigning a length. Here is a further reasonable property that we should expect from a device that assigns length.
Countably additive. Say that $\Xi : \mathcal{P}(\R) \to [0,\infty]$ is countably additive if one has \[ \Xi \left( \; \bigcup_{n=1}^\infty A_n \right) = \sum_{n=1}^\infty \Xi(A_n) \] whenever $n \mapsto A_n$ is a sequence of subsets of $\R$ that is pairwise disjoint.
Recall that a sequence $n \mapsto A_n$ of sets is pairwise disjoint if $A_i \cap A_j = \emptyset$ for all $i \ne j$. Informally, a mapping $\Xi$ from $\mathcal{P}(\R)$ to $[0,\infty]$ is countably additive if we can calculate $\Xi(A)$ by breaking $A$ up into countably many pieces $A_1,A_2,\dots$ and summing up the values $\Xi(A_1),\Xi(A_2),\dots$
We are now faced with a central question: is $\Lambda$ countably additive?
Define an equivalence relation $\sim$ on $[0,1]$ by $x \sim y$ if and only if $x-y \in \Q$. Let $V \subset [0,1]$ be a set that contains exactly one member of every equivalence class. Any such set is called a Vitali set.
We can use Vitali sets to prove that $\Lambda$ is not countably additive!
The map $\Lambda : \P(\R) \to [0,\infty]$ is not countably additive.
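Fix a Vitali set $V$. First we show that $V - q$ and $V - r$ are disjoint for all distinct $q, r \in \Q$. Indeed, suppose that $t \in V-q$ and $t \in V-r$. Then $t+q \in V$ and $t+r \in V$. But $t+q \sim t+r$ because $(t+q)-(t+r) = q-r \in \Q$, and $t+q \ne t+r$ because $q \ne r$. So $V$ contains two distinct members of one equivalence class, which contradicts the defining property of $V$.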
Now suppose that $\Lambda$ is countably additive. We want to reach a contradiction. First we verify that \[ [0,1] \subset \bigcup_{q \in \Q \cap [-1,1]} V-q \subset [-1,2] \] holds. For the second containment, since $V \subset [0,1]$ we certainly have $V - q \subset [-1,2]$ for every $q \in [-1,1]$. For the first containment fix $t \in [0,1]$. There is by definition of $V$ some $v \in V$ with $v-t \in \Q$. Since $v$ and $t$ both lie in $[0,1]$, we have $v-t=q$ for some $q \in \Q \cap [-1,1]$. We therefore have $t = v - q \in V-q$ as desired.
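Applying $\Lambda$ to the above gives, by monotonicity together with the values $\Lambda([0,1]) = 1$ and $\Lambda([-1,2]) = 3$, \[ 1 \le \Lambda \bigg( \; \bigcup_{q \in \Q \cap [-1,1]} V-q \bigg) \le 3 \] and since $\Lambda$ is assumed countably additive and the sets $V-q$ are pairwise disjoint we have \[ 1 \le \sum_{q \in \Q \cap [-1,1]} \Lambda(V-q) \le 3 \] as well. Finally \[ 1 \le \sum_{q \in \Q \cap [-1,1]} \Lambda(V) \le 3 \] because $\Lambda(V) = \Lambda(V-q)$ for all $q$ by translation invariance. But the above is impossible no matter the purported value of $\Lambda(V)$: the sum has infinitely many equal terms, so it is $0$ if $\Lambda(V) = 0$ and $\infty$ if $\Lambda(V) > 0$, and neither value lies in $[1,3]$.▮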
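The last step of the proof uses two facts: the index set $\Q \cap [-1,1]$ can be listed as an infinite sequence, and a sum of infinitely many copies of a positive number exceeds any bound. The Python sketch below illustrates both under stated assumptions. It does not construct a Vitali set, and the names `rationals_in_minus_one_one` and `terms_needed_to_exceed` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import islice
from math import floor

def rationals_in_minus_one_one():
    """Yield each rational in [-1, 1] exactly once, ordered by its
    denominator in lowest terms."""
    seen = set()
    d = 1
    while True:
        for p in range(-d, d + 1):
            q = Fraction(p, d)
            if q not in seen:
                seen.add(q)
                yield q
        d += 1

def terms_needed_to_exceed(c, bound):
    """Smallest N with N * c > bound, for c > 0. After N equal terms c
    the partial sum has already passed the bound."""
    if c <= 0:
        raise ValueError("c must be positive")
    return floor(Fraction(bound) / Fraction(c)) + 1

first_terms = list(islice(rationals_in_minus_one_one(), 6))
print(first_terms)

# If Lambda(V) had the (purely hypothetical) value 1/1000, this many
# translates would already push the sum past the upper bound 3.
print(terms_needed_to_exceed(Fraction(1, 1000), 3))
```

However small a positive value is tried for $\Lambda(V)$, finitely many terms suffice to exceed $3$, while the value $0$ gives a sum of $0$, which is below $1$.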
We have seen, from Vitali sets and the above theorems, that there is no mapping \[ \Xi : \P(\R) \to [0,\infty] \] that has all three of the following properties.
In order to proceed with a theory of length, we have to give something up. We are going to insist on translation invariance and countable additivity, and abandon the requirement that $\Lambda$ be defined on all of $\P(\R)$. Our next step, therefore, is to try to identify a rich collection $\mathscr{B}$ of subsets of $\R$ so that \[ \Lambda : \mathscr{B} \to [0,\infty] \] is countably additive and translation invariant.