ch5-exercises-sol

Exercise 5.1¶

Here is a confusion matrix for a classifier of joke funniness.

Considering funny to be the positive outcome, calculate the (a) recall, (b) precision, (c) specificity, (d) accuracy, and (e) $F_1$ score of the classifier.

In [1]:

# Solution
TP = 648
FP = 45
TN = 1004
FN = 78

# following the table in Definition 5.3
print("accuracy:   ", (TP+TN)/(TP+FP+TN+FN))
recall = TP/(TP+FN)
print("recall:     ", recall)
print("specificity:", TN/(TN+FP))
precision = TP/(TP+FP)
print("precision:  ", TP/(TP+FP))

# Definition 5.4
F1 = 1/(0.5*(1/precision + 1/recall))
print("F1 score:   ", F1)

accuracy:    0.9307042253521127
recall:      0.8925619834710744
specificity: 0.9571020019065777
precision:   0.935064935064935
F1 score:    0.9133192389006343

Exercise 5.2¶

Here is a confusion matrix for a classifier of ice cream flavours.

(a) Calculate the recall rate for chocolate.

(b) Find the precision for vanilla.

(c) Find the accuracy for strawberry.

Solution: In the multiclass case we apply the one-vs-rest paradigm to reduce to binary. For example, the number of true positives for vanilla is $\mathrm{TP}=75$, while there are $\mathrm{FP}=8+11=19$ false positives. Therefore,

$$ \mathrm{precision\ for\ vanilla} = \dfrac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} = \dfrac{75}{94} \approx 0.798. $$

Exercise 5.3¶

Find the Gini impurity of the set

$$ S = \{ A, B, B, C, C, C \}.$$

Solution: Applying Definition 5.5, this set has $K=3$ labels and

$$ p_1 = 1/6, \quad p_2 = 2/6, \quad p_3 = 3/6. $$

The Gini impurity of $S$ is

$$ H(S) = \sum_{k=1}^K p_k (1-p_k) = \frac{1}{6}\cdot\frac{5}{6} + \frac{2}{6}\cdot\frac{4}{6} + \frac{3}{6}\cdot\frac{3}{6} \approx 0.6111. $$

Exercise 5.4¶

Explain why for the binary case $K=2$, Definition 5.5 implies that $H(S)$ is maximized when the two classes are equally represented in $S$.

Solution: In the binary case, one class appears with probability $p$ and the other with probability $1-p$. Hence,

$$ H = p (1-p) + (1-p) p = 2p - 2p^2. $$

This is a quadratic function that attains its maximum at $p=1/2$ (you can check the derivative to verify).

Exercise 5.5¶

Given $x_i=i$ for $i=0,\ldots,5$, with labels $$ y_0=y_4=y_5=A, \quad y_1=y_2=y_3=B, $$ write Python code to find an optimal partition threshold using Gini impurity.

In [2]:

# Solution
# This is similar to Example 5.13, but here solve the problem with some code.
# The smallest Q value of 1.5 is achieved for theta = 3, 
# resulting in the partition [A, B, B, B] | [A, A] 

data = { (0,'A'), (1,'B'), (2,'B'), (3,'B'), (4,'A'), (5,'A') }

def gini(S):
    """"Gini impurity of a list."""
    g = 0
    unique_el = set(S)
    for el in unique_el:
        p = S.count(el)/len(S)
        g += p*(1-p)
    return g

for theta in range(-1,6):
    SL = [ y for (x,y) in data if x<=theta ]
    SR = [ y for (x,y) in data if x>theta ]
    Q = len(SL)*gini(SL) + len(SR)*gini(SR)
    print("{:2}   {:32} {:32} {:3.1f}".format(theta, str(SL), str(SR), Q))

-1   []                               ['B', 'A', 'A', 'B', 'A', 'B']   3.0
 0   ['A']                            ['B', 'A', 'A', 'B', 'B']        2.4
 1   ['A', 'B']                       ['B', 'A', 'A', 'B']             3.0
 2   ['B', 'A', 'B']                  ['A', 'A', 'B']                  2.7
 3   ['B', 'B', 'A', 'B']             ['A', 'A']                       1.5
 4   ['B', 'A', 'B', 'A', 'B']        ['A']                            2.4
 5   ['B', 'A', 'A', 'B', 'A', 'B']   []                               3.0

Exercise 5.6¶

For the decision tree drawn in Example 5.10, make predictions for each of the following queries, showing the path taken through the tree for each case:

(a) $(x_1, x_2) = (4, 5),\quad$ (b) $(x_1, x_2) = (-3, 1),\quad$ (c) $(x_1, x_2) = (10, -1)$.

Solution:

(a) Will go to leaf A and hence predict "T" by majority vote.

(b) Will go to leaf B and hence predict "H".

(c) Will go to leaf C and hence predict "H" by majority vote.

Exercise 5.7¶

Using 1-norm, 2-norm, and $\infty$-norm, find the distance between the given vectors:

(a) $\mathbf u=[2,3,0], \ \mathbf v=[-2,2,1]$

(b) $\mathbf u=[0,1,0,1,0], \ \mathbf v=[1,1,1,1,1]$

Solution: The distance metric is defined via $\mathrm{dist}(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|$. The rest is straightforward calculation using the definitions of the respective norms.

Exercise 5.8¶

(a) Prove that for any $\mathbf u \in \mathbb{R}^d$, $\|\mathbf u\|_\infty \le \| \mathbf u\|_2$.

(b) Prove that for any $\mathbf u \in \mathbb{R}^d$, $\|\mathbf u\|_2 \le \sqrt{d}\, \|\mathbf u\|_\infty$.

Solution:

(a) Let $u_j$ be an entry of $\mathbf{u}$ such that $|u_j| = \|\mathbf{u}\|_\infty$. Then

$$ \|\mathbf{u}\|_2^2 = \sum_{i=1}^d | u_i |^2 \geq | u_j |^2 = \|\mathbf{u}\|_\infty^2. $$

(b) We have $|u_i| \leq \|\mathbf{u}\|_\infty$ for all $i=1,\ldots,d$. Hence,

$$ \| \mathbf{u}\|_2^2 = \sum_{i=1}^d |u_i |^2 \leq \sum_{i=1}^d \|\mathbf{u}\|_\infty^2 = d \|\mathbf{u}\|_\infty^2. $$

Exercise 5.9¶

Carefully sketch the set of all points in $\mathbb{R}^2$ whose 1-norm distance from the origin equals 1. This is a Manhattan unit circle. (Hint: You can consider each quadrant of the plane separately.)

In [3]:

# Solution
import numpy as np
import matplotlib.pyplot as plt

# corner points for the 1-norm unit circle
x = np.array([1, 0, -1, 0, 1])
y = np.array([0, 1, 0, -1, 0])

plt.figure(figsize=(6, 6))
plt.plot(x, y)
plt.gca().set_aspect('equal', adjustable='box')
plt.xlabel('x'); plt.ylabel('y')
plt.title('1-norm unit circle');

No description has been provided for this image

Exercise 5.10¶

Suppose you are training to fit the ground-truth one-dimensional classifier

$$ y = f(x) = \begin{cases} +1, & \text{if } |x| \leq 2, \\ -1, & \text{otherwise}. \end{cases} $$

Here is a table of training data:

$x_i$	$-5$	$-4$	$-3$	$-2$	$-1$	$0$	$1$	$2$	$3$	$4$	$5$
$y_i$	$-1$	$-1$	$+1$	$-1$	$+1$	$+1$	$-1$	$+1$	$-1$	$-1$	$-1$

(a) Here is a table of testing data:

$t_i$	$-4.75$	$-3.75$	$-2.75$	$-1.75$	$-0.75$	$0.25$	$1.25$	$2.25$	$3.25$	$4.25$
$f(t_i)$	$\,$	$\,$	$\,$	$\,$	$\,$	$\,$	$\,$	$\,$	$\,$	$\,$

Fill in the second row.

(b) Add a row to your table from part (a) showing the predictions of a kNN classifier with $k=1$ trained on the given training data.

(c) Add another row for a kNN classifier with $k=3$. Then add another row for $k=9$.

(d) Find the testing precision and recall for the rows with $k=1,3,9$, considering $+1$ to be the positive outcome.

Solution:

We provide the solution for $k=1$:

$t_i$	$-4.75$	$-3.75$	$-2.75$	$-1.75$	$-0.75$	$0.25$	$1.25$	$2.25$	$3.25$	$4.25$
$f(t_i)$	$-1$	$-1$	$-1$	$+1$	$+1$	$+1$	$+1$	$-1$	$-1$	$-1$
kNN (k=1)	$-1$	$-1$	$+1$	$-1$	$+1$	$+1$	$-1$	$+1$	$-1$	$-1$

With this, we have $\mathrm{TP} = 2$, $\mathrm{FP} = 2$, $\mathrm{TN} = 4$, $\mathrm{FN} = 2$ for the test data. Therefore,

$$ \text{precision} = \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}} = \frac{2}{4}, \quad \text{recall} = \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}} = \frac{2}{4}. $$

Exercise 5.11¶

Here are blue/orange labels on an integer lattice.

Let $\hat{f}(x_1,x_2)$ be the kNN probabilistic classifier with $k=4$, Euclidean metric, and mean averaging that returns the probability of a blue label. In each case below, a function $g(t)$ is defined from values of $\hat{f}$ along a vertical or horizontal line. Carefully sketch a plot of $g(t)$ for $-2\le t \le 2$.

(a) $g(t) = \hat{f}(1.2,t)$

(b) $g(t) = \hat{f}(t,-0.75)$

(c) $g(t) = \hat{f}(t,1.6)$

(d) $g(t) = \hat{f}(-0.25,t)$

Exercise 5.12¶

Suppose you have written a kNN classifier with $k=5$ to predict whether dad jokes are funny. Here are the votes for 6 test jokes:

Joke	Funny	Not funny	Actual
Is the refrigerator running? Better go catch it!	3	2	not funny
Why did the scarecrow win an award? Because he was outstanding in his field!	1	4	funny
What do you call a fake noodle? An impasta!	5	0	funny
Why couldn’t the bicycle stand up by itself? It was two-tired!	4	1	funny
What’s blue and not heavy? Light blue.	2	3	not funny
What do you call a pile of cats? A meowtain!	0	5	not funny

Carefully sketch the ROC curve for this classifier, considering the positive outcome to be "funny."