Week 2 Parameters. Bounds

Version 2023-10-01.

Synopsis. Basic properties of a code C can be expressed by numbers called parameters. We learn why such parameters as the rate, R, and the minimum distance, d(C), are important when C is used for channel coding. We also learn to use the notation (n,M,d)_q and [n,k,d]_q. It turns out that there is a trade-off between the rate and the minimum distance: both cannot be high (good) at the same time. This trade-off is expressed by inequalities known as bounds. We only prove the Hamming bound and the Singleton bound in this course, although other bounds have been obtained in coding theory research.

Parameters of a code

Parameters are numerical characteristics of a code. The most important parameters are:

Definition: parameters of a code

Let F be an alphabet and C ⊆ F^n be a code. Then:

  • q denotes the size of the alphabet, i.e., q=#F;

  • n is called the length of the code — each codeword consists of n symbols;

  • M denotes the number of codewords in the code, i.e., M=#C;

  • k = log_q M is the information dimension of C;

  • d(C) = min{d(v,w) : v, w ∈ C, v ≠ w} is the minimum distance of C;

  • R=k/n is the rate of C;

  • δ=d/n is the relative distance of C.

We say that C is an (n,M,d)_q-code or an [n,k,d]_q-code.
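As a quick illustration, all of these parameters can be computed by brute force for a small code. A minimal Python sketch, where the four-codeword binary code C and the helper names are made up for this example:

```python
import math
from itertools import combinations

def hamming_distance(v, w):
    """Number of positions in which the words v and w differ."""
    return sum(a != b for a, b in zip(v, w))

def parameters(code, alphabet):
    """Return (n, M, d, k, R) for a code given as a set of equal-length strings."""
    code = list(code)
    n = len(code[0])                     # length of the code
    M = len(code)                        # number of codewords
    d = min(hamming_distance(v, w) for v, w in combinations(code, 2))
    k = math.log(M, len(alphabet))       # information dimension k = log_q M
    return n, M, d, k, k / n             # the last entry is the rate R = k/n

# A made-up (5, 4, 3)_2 code; its information dimension is k = log_2 4 = 2.
C = {"00000", "01101", "10110", "11011"}
print(parameters(C, "01"))
```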

The importance of the minimum distance for error detection and correction

We will now see that the higher d(C) is, the more symbol errors per codeword the code C is guaranteed to detect and correct.

Notation: [a] denotes the integer part of a real a; e.g., [3]=[3.5]=[π]=3, [7.99]=7.

Theorem 2.1: the number of errors detected/corrected by a code

Let C be a code with d(C) = d. Throughout the course, t will denote [(d − 1)/2]. Let v ∈ C and y ∈ F^n.

  • 1. If 1 ≤ d(v,y) ≤ d − 1, then y ∉ C. Thus, if at most d − 1 errors occur in a transmitted codeword, they will be detected.

  • 2. If d(v,y) ≤ t, then y has a unique nearest neighbour in C, which is v. So if at most t errors occur in a codeword, a decoder will correct them by decoding y back to v.

  • Proof. 1. If y ∈ C, then by definition of minimum distance, either d(v,y) = 0 or d(v,y) ≥ d. So the statement follows by contrapositive.

    2. We use proof by contradiction: assume that w is a nearest neighbour of y in C such that w ≠ v. Then d(y,w) ≤ d(y,v) ≤ t, so by the triangle inequality

    0 < d(v,w) ≤ d(v,y) + d(y,w) ≤ t + t = 2t ≤ d − 1.

    Hence v,w are distinct codewords at distance less than d. This contradicts d being the minimum distance of C.

Remark

The Theorem is expressed by saying that a code of minimum distance d detects up to d − 1 errors and corrects up to [(d − 1)/2] errors in a codeword.
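Theorem 2.1(2) underlies nearest-neighbour decoding: the decoder returns a codeword closest to the received word. A minimal Python sketch (the helper names are ours; the repetition code Rep(5, {0,1}) is used because it has d = 5, hence t = [(5 − 1)/2] = 2):

```python
def hamming_distance(v, w):
    """Number of positions in which the words v and w differ."""
    return sum(a != b for a, b in zip(v, w))

def nearest_neighbour_decode(y, code):
    """Return a codeword nearest to y; by Theorem 2.1(2) it is unique
    whenever d(v, y) <= t for some codeword v."""
    return min(code, key=lambda c: hamming_distance(c, y))

rep5 = {"00000", "11111"}                         # Rep(5, {0,1}): d = 5, t = 2
print(nearest_neighbour_decode("00011", rep5))    # 2 errors: corrected to 00000
print(nearest_neighbour_decode("01111", rep5))    # 1 error: corrected to 11111
```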

Channel coding. The importance of rate

To understand why the information dimension k and hence the rate R are defined via logarithm, we recall channel coding, the most common use case for codes discussed in the previous chapter. Here is a diagram which shows channel coding with error correction:

sender → message m → codeword ENCODE(m) → channel → received word y → codeword DECODE(y) → message ENCODE^(-1)(DECODE(y)) → receiver

Messages are arbitrary words of length k, that is, elements of F^k. The cardinality of F^k is q^k, because a word (u_1, u_2, …, u_k) ∈ F^k can be chosen in q^k ways: q choices for u_1, q independent choices for u_2, and so on. Therefore,

M = #C = #(F^k) = q^k, hence k = log_q M.

For each k symbols of information, the sender will transmit a codeword of n symbols. Recall that the rate is R = k/n. One has R ≤ 1 (see the trivial bound below):

Remark: high R, close to 1, is good

Encoding increases transmission costs by a factor of R^(-1) = n/k. The higher the rate R, the more economical and efficient the code is.

Although the increase in transmission costs is a price to pay for error detection or correction, we want this increase to be as small as possible. One can try to construct codes with a higher rate, without degrading the error detection or correction performance, by using more sophisticated mathematics. This is one of the main themes in Coding Theory.

There are obstacles to increasing the rate. Mathematically, they are expressed by bounds.

Bounds

Proposition 2.2: the trivial bound

If [n,k,d]_q-codes exist, then k ≤ n and d ≤ n.

  • Proof. Let C be an [n,k,d]_q-code. Then, by definition, C is a non-empty subset of F^n with #F = q. The cardinality of a set is greater than or equal to the cardinality of its subset. In particular, M = #C ≤ #F^n = q^n. Applying the monotone function log_q to both sides of the inequality, we obtain k = log_q M ≤ n.

    Furthermore, the Hamming distance between any two words of length n is an integer between 0 and n. Therefore, 0 < d(C) ≤ n for any code of length n.

It is easy to describe the codes which attain k=n. All of them are given in the following

Example: Fn, the trivial code of length n

The trivial code of length n over the alphabet F is the code C=Fn.

Exercise: prove that a code has k=n if and only if it is a trivial code. Show that trivial codes have d=1. Show that some codes are not trivial but still have d=1.

We will now give a simple example of codes which attain d=n.

Example: Rep(n,F), the repetition code of length n

Rep(n,F) = {aa…a : a ∈ F} ⊆ F^n is the repetition code of length n over the alphabet F. All codewords are formed by repeating a symbol n times.

Exercise: prove that Rep(n,F) has d=n. Show that some codes are not repetition codes but still have d=n.
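Before attempting the proof, the first claim can be sanity-checked by brute force. A Python sketch (not a substitute for a proof; the helper `rep_code` is a name made up here):

```python
from itertools import combinations

def hamming_distance(v, w):
    """Number of positions in which the words v and w differ."""
    return sum(a != b for a, b in zip(v, w))

def rep_code(n, alphabet):
    """Rep(n, F): every codeword repeats a single symbol n times."""
    return {a * n for a in alphabet}

# For several lengths and alphabets, the minimum distance equals n:
# any two distinct codewords differ in every one of the n positions.
for n in range(2, 6):
    for alphabet in ("01", "012", "abcd"):
        C = rep_code(n, alphabet)
        assert min(hamming_distance(v, w) for v, w in combinations(C, 2)) == n
```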

The Hamming bound

To state the next bound, we recall that C(n,i) is the number of ways to choose i positions out of n. This integer is called the binomial coefficient. It is given by the formula C(n,i) = n! / ((n − i)! i!) = n(n − 1)⋯(n − i + 1) / (1 · 2 ⋯ i).

Theorem 2.3: the Hamming bound

Denote t = [(d − 1)/2]. If (n,M,d)_q-codes exist, then M ≤ q^n / Σ_{i=0}^{t} C(n,i)(q − 1)^i.

Before proving the Theorem, we introduce

Definition: Hamming sphere

If y ∈ F^n and r ≤ n, the Hamming sphere with centre y and radius r is the set

S_r(y) = {v ∈ F^n : d(v,y) ≤ r}.

The number of words in the Hamming sphere depends only on the radius r (not on y):

Lemma 2.4: the cardinality of a Hamming sphere

#S_r(y) = Σ_{i=0}^{r} C(n,i)(q − 1)^i.

  • Proof. To construct a word v at distance i from y, we need to choose i positions out of n where v will differ from y. Then we need to change the symbol in each of the i chosen positions to one of the other q − 1 symbols. The total number of choices for v which is at distance exactly i from y is thus C(n,i)(q − 1)^i.

    The Hamming sphere contains all words at distance 0 ≤ i ≤ r from y, so we sum over i from 0 up to r. The Lemma is proved.

  • Proof of Theorem 2.3. First of all, we prove that spheres of radius t centred at distinct codewords c do not overlap. Indeed, by Theorem 2.1(2), each word in S_t(c) has a unique nearest neighbour, which is c. Hence a word in S_t(c) cannot lie in another such sphere (a word cannot have two unique nearest neighbours!).

    Hence the whole set F^n contains M disjoint spheres centred at codewords. By Lemma 2.4, each of the M spheres contains Σ_{i=0}^{t} C(n,i)(q − 1)^i words. The number of elements in a disjoint union of sets is equal to the sum of the cardinalities of the sets, hence the total number of words in the M spheres is M · Σ_{i=0}^{t} C(n,i)(q − 1)^i. Since the union of the M spheres is a subset of F^n, this does not exceed #F^n = q^n. The bound follows.
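Lemma 2.4 can be verified by brute force over a small F^n. A Python sketch (the alphabet, the centre word, and the function names are chosen for illustration):

```python
from itertools import product
from math import comb

def hamming_distance(v, w):
    """Number of positions in which the words v and w differ."""
    return sum(a != b for a, b in zip(v, w))

def sphere_size_formula(n, r, q):
    """Lemma 2.4: #S_r(y) = sum over i = 0..r of C(n,i) * (q-1)^i."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

# Count the words of F^4 within distance 2 of a fixed centre, with F = {0,1,2}.
F, n, r = "012", 4, 2
y = "0120"
count = sum(1 for v in product(F, repeat=n)
            if hamming_distance("".join(v), y) <= r)
print(count, sphere_size_formula(n, r, len(F)))   # both equal 33
```

The count does not change if another centre y is chosen, in line with the remark before the Lemma.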

Given the length n and the minimum distance d, we may wish to know whether there are codes with the number of codewords equal to the Hamming bound. Such a code would be the most economical (highest possible number M of codewords). Such codes have a special name:

Definition: perfect code

A code which attains the Hamming bound is called a perfect code.
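Whether given parameters attain the Hamming bound is a short computation. A Python sketch (the function name is ours): the well-known binary Hamming code with parameters [7,4,3]_2 attains the bound, and so does the binary repetition code Rep(5, {0,1}):

```python
from fractions import Fraction
from math import comb

def hamming_bound(n, d, q):
    """Right-hand side of Theorem 2.3: q^n / sum_{i=0}^{t} C(n,i)(q-1)^i,
    where t = [(d-1)/2]. Returned as an exact Fraction."""
    t = (d - 1) // 2
    return Fraction(q ** n, sum(comb(n, i) * (q - 1) ** i for i in range(t + 1)))

# [7,4,3]_2: M = 2^4 = 16 codewords, and the bound is 2^7 / (1 + 7) = 16.
print(hamming_bound(7, 3, 2))   # 16
# Rep(5, {0,1}): M = 2, and the bound is 2^5 / (1 + 5 + 10) = 2.
print(hamming_bound(5, 5, 2))   # 2
```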

It turns out that meaningful perfect codes are quite rare. When the number of symbols in the alphabet is a prime power, a complete classification of perfect codes up to parameter equivalence is known; we will see it later in the course.

Remark: what does it mean to attain the bound?

“Attains the bound” means: the inequality in the bound becomes an equality for this code.

It is a mistake to say that perfect codes are those that “satisfy” the Hamming bound. Every code satisfies the Hamming bound — only perfect codes attain it!

The Singleton bound

Another upper bound on the number M of codewords can be conveniently stated for k = log_q M.

Theorem 2.5: the Singleton bound

If [n,k,d]_q-codes exist, then k ≤ n − d + 1.

  • Proof. Let C be an [n,k,d]_q-code. Consider the function f : C → F^(n−d+1), where f(v) is the word obtained from v by deleting the last d − 1 symbols.

    I claim that f is an injective function. Indeed, if v, w ∈ C and v ≠ w, then by definition of the minimum distance, v and w differ in at least d positions. Since f deletes only d − 1 symbols, the words f(v) and f(w) still differ in at least one position. So f(v) ≠ f(w). Injectivity of f is proved.

    Now, by the Pigeonhole Principle, injective functions f : C → F^(n−d+1) exist only if #C ≤ #F^(n−d+1). We conclude that #C ≤ q^(n−d+1), so that k = log_q #C ≤ n − d + 1, as claimed.
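The deletion map f from the proof can be tried on a toy code. A Python sketch (C is a made-up binary code with n = 5 and d = 3, so truncation to length n − d + 1 = 3 must keep all images distinct):

```python
def truncate(v, d):
    """The map f from the proof: delete the last d - 1 symbols of v."""
    return v[: len(v) - (d - 1)]

# A made-up binary code with n = 5, d = 3 (any two codewords differ in >= 3 positions).
C = {"00000", "01101", "10110", "11011"}
images = {truncate(v, 3) for v in C}
assert len(images) == len(C)      # f is injective on C
# Hence M <= q^(n-d+1): here M = 4 <= 2^3 = 8, i.e. k = 2 <= n - d + 1 = 3.
```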

Definition: maximum distance separable code, MDS code

A code which attains the Singleton bound is called a maximum distance separable (MDS) code.

Remark: the bounds do not work in reverse

It is important to remember that the converses to Theorems 2.3 and 2.5 do not hold. That is, if the numbers n, k, d, q satisfy the Hamming bound and the Singleton bound, it does not follow that an [n,k,d]_q-code exists. For example, n, k, d, q may fail further bounds, not covered in this course.