Exercise 5.1
Here is a confusion matrix for a classifier of joke funniness.
Considering funny to be the positive outcome, calculate the (a) recall, (b) precision, (c) specificity, (d) accuracy, and (e) $F_1$ score of the classifier.
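As a way to check hand calculations, the five metrics can be computed directly from the four cells of a binary confusion matrix. The sketch below is only an illustration: the function name `binary_metrics` and the counts passed to it are hypothetical and are not the values from the matrix in this exercise.

```python
def binary_metrics(tp, fn, fp, tn):
    """Return common binary classification metrics from confusion-matrix counts.

    tp, fn, fp, tn are the numbers of true positives, false negatives,
    false positives, and true negatives.
    """
    recall = tp / (tp + fn)            # fraction of actual positives found
    precision = tp / (tp + fp)         # fraction of positive predictions that are right
    specificity = tn / (tn + fp)       # fraction of actual negatives found
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, chosen only for illustration.
metrics = binary_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Substituting the counts read off the exercise's matrix gives a quick comparison against the hand-computed answers.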
Exercise 5.2
Here is a confusion matrix for a classifier of ice cream flavours.
(a) Calculate the recall rate for chocolate.
(b) Find the precision for vanilla.
(c) Find the accuracy for strawberry.
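For a multiclass matrix, per-class metrics are usually obtained by treating one class as positive and all the others as negative. The following sketch shows that reduction under two stated assumptions: entry `C[i][j]` counts samples whose true class is `i` and whose predicted class is `j` (the opposite convention would swap the roles of rows and columns), and the matrix `C` itself is made up rather than taken from the exercise. The helper name `one_vs_rest_counts` is hypothetical.

```python
def one_vs_rest_counts(C, k):
    """Collapse a multiclass confusion matrix to (tp, fn, fp, tn) for class index k.

    Assumes C[i][j] counts samples with true class i and predicted class j.
    """
    total = sum(sum(row) for row in C)
    tp = C[k][k]
    fn = sum(C[k]) - tp                   # true class k, predicted something else
    fp = sum(row[k] for row in C) - tp    # predicted k, true class is something else
    tn = total - tp - fn - fp
    return tp, fn, fp, tn


# Hypothetical 3-class matrix (classes 0, 1, 2), for illustration only.
C = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]

tp, fn, fp, tn = one_vs_rest_counts(C, 2)
accuracy_2 = (tp + tn) / (tp + fn + fp + tn)

tp0, fn0, fp0, tn0 = one_vs_rest_counts(C, 0)
recall_0 = tp0 / (tp0 + fn0)

tp1, fn1, fp1, tn1 = one_vs_rest_counts(C, 1)
precision_1 = tp1 / (tp1 + fp1)

print(accuracy_2, recall_0, precision_1)
```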
Exercise 5.4
Explain why, for the binary case $K=2$, Definition 5.5 implies that $H(S)$ is maximized when the two classes are equally represented in $S$.
Exercise 5.5
Given $x_i=i$ for $i=0,\ldots,5$, with labels $$ y_0=y_4=y_5=A, \quad y_1=y_2=y_3=B, $$ write Python code to find an optimal partition threshold using Gini impurity.
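One possible set of building blocks for this exercise is sketched below; the search over candidate thresholds is left to the reader. It assumes the Gini impurity $\sum_k p_k(1-p_k)$ and scores a split by the size-weighted sum of the impurities of the two parts, which may differ from the convention used in the text. The function names and the four-point data set are hypothetical and are not the data from the exercise.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: sum over classes of p_k * (1 - p_k); 0 for an empty set."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())


def split_score(x, y, theta):
    """Size-weighted impurity of the partition x <= theta versus x > theta.

    The weighting by subset size is an assumption of this sketch.
    """
    left = [yi for xi, yi in zip(x, y) if xi <= theta]
    right = [yi for xi, yi in zip(x, y) if xi > theta]
    return len(left) * gini(left) + len(right) * gini(right)


# Hypothetical data, for illustration only.
x = [0, 1, 2, 3]
y = ["A", "A", "B", "B"]

print(split_score(x, y, 1.5))  # this threshold separates the two classes
print(split_score(x, y, 0.5))  # this one leaves a mixed right-hand part
```

A lower score indicates a purer partition, so an optimal threshold is one that minimizes `split_score` over the candidate values.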
Exercise 5.6
For the decision tree drawn in Example 5.10, make predictions for each of the following queries, showing the path taken through the tree for each case:
(a) $(x_1, x_2) = (4, 5),\quad$ (b) $(x_1, x_2) = (-3, 1),\quad$ (c) $(x_1, x_2) = (10, -1)$.
Exercise 5.7
Using the 1-norm, the 2-norm, and the $\infty$-norm, find the distance between the given vectors in each case:
(a) $\mathbf u=[2,3,0], \ \mathbf v=[-2,2,1]$
(b) $\mathbf u=[0,1,0,1,0], \ \mathbf v=[1,1,1,1,1]$
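Hand computations of norm distances can be verified with a few lines of code. This minimal sketch uses plain Python lists; the function name `distance` and the two vectors are hypothetical and differ from those in the exercise.

```python
def distance(u, v, p):
    """Distance between u and v in the p-norm; use p = float('inf') for the max norm."""
    diffs = [abs(a - b) for a, b in zip(u, v)]
    if p == float("inf"):
        return max(diffs)                       # largest componentwise difference
    return sum(d ** p for d in diffs) ** (1 / p)


# Hypothetical vectors, for illustration only.
u = [1, 5]
v = [4, 1]

print(distance(u, v, 1))             # sum of absolute differences
print(distance(u, v, 2))             # Euclidean distance
print(distance(u, v, float("inf")))  # maximum absolute difference
```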
Exercise 5.8
(a) Prove that for any $\mathbf u \in \mathbb{R}^d$, $\|\mathbf u\|_\infty \le \| \mathbf u\|_2$.
(b) Prove that for any $\mathbf u \in \mathbb{R}^d$, $\|\mathbf u\|_2 \le \sqrt{d}\, \|\mathbf u\|_\infty$.
Exercise 5.9
Carefully sketch the set of all points in $\mathbb{R}^2$ whose 1-norm distance from the origin equals 1. This is a Manhattan unit circle. (Hint: You can consider each quadrant of the plane separately.)
Exercise 5.10
Suppose you are training a model to fit the ground-truth one-dimensional classifier
$$ y = f(x) = \begin{cases} +1, & \text{if } |x| \leq 2, \\ -1, & \text{otherwise}. \end{cases} $$
Here is a table of training data:
| $x_i$ | $-5$ | $-4$ | $-3$ | $-2$ | $-1$ | $0$ | $1$ | $2$ | $3$ | $4$ | $5$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $y_i$ | $-1$ | $-1$ | $+1$ | $-1$ | $+1$ | $+1$ | $-1$ | $+1$ | $-1$ | $-1$ | $-1$ |
(a) Here is a table of testing data:
| $t_i$ | $-4.75$ | $-3.75$ | $-2.75$ | $-1.75$ | $-0.75$ | $0.25$ | $1.25$ | $2.25$ | $3.25$ | $4.25$ |
|---|---|---|---|---|---|---|---|---|---|---|
| $f(t_i)$ | $\,$ | $\,$ | $\,$ | $\,$ | $\,$ | $\,$ | $\,$ | $\,$ | $\,$ | $\,$ |
Fill in the second row.
(b) Add a row to your table from part (a) showing the predictions of a kNN classifier with $k=1$ trained on the given training data.
(c) Add another row for a kNN classifier with $k=3$. Then add another row for $k=9$.
(d) Find the testing precision and recall for the rows with $k=1,3,9$, considering $+1$ to be the positive outcome.
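The mechanics of parts (b) and (c) can be sketched in code as a majority vote among the $k$ nearest training points. The sketch assumes labels of $\pm 1$, an odd $k$ so that the vote cannot tie, and no ties in distance. The function name `knn_predict` and the small data set are hypothetical; they are not the tables from this exercise.

```python
def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points nearest to query (labels are +1 or -1)."""
    # Sort training pairs by distance to the query and keep the k closest.
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes > 0 else -1


# Hypothetical training data and queries, for illustration only.
train_x = [0, 1, 2, 3, 4]
train_y = [-1, -1, 1, -1, 1]
queries = [0.4, 1.8, 3.6]

pred_k1 = [knn_predict(train_x, train_y, q, 1) for q in queries]
pred_k3 = [knn_predict(train_x, train_y, q, 3) for q in queries]
print(pred_k1)
print(pred_k3)
```

In this toy data the prediction at the middle query changes between $k=1$ and $k=3$, because the single nearest neighbour is outvoted by the next two.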
Exercise 5.11
Here are blue/orange labels on an integer lattice.
Let $\hat{f}(x_1,x_2)$ be the kNN probabilistic classifier with $k=4$, Euclidean metric, and mean averaging that returns the probability of a blue label. In each case below, a function $g(t)$ is defined from values of $\hat{f}$ along a vertical or horizontal line. Carefully sketch a plot of $g(t)$ for $-2\le t \le 2$.
(a) $g(t) = \hat{f}(1.2,t)$
(b) $g(t) = \hat{f}(t,-0.75)$
(c) $g(t) = \hat{f}(t,1.6)$
(d) $g(t) = \hat{f}(-0.25,t)$
Exercise 5.12
Suppose you have written a kNN classifier with $k=5$ to predict whether dad jokes are funny. Here are the votes for 6 test jokes:
| Joke | Funny | Not funny | Actual |
|---|---|---|---|
| Is the refrigerator running? Better go catch it! | 3 | 2 | not funny |
| Why did the scarecrow win an award? Because he was outstanding in his field! | 1 | 4 | funny |
| What do you call a fake noodle? An impasta! | 5 | 0 | funny |
| Why couldn’t the bicycle stand up by itself? It was two-tired! | 4 | 1 | funny |
| What’s blue and not heavy? Light blue. | 2 | 3 | not funny |
| What do you call a pile of cats? A meowtain! | 0 | 5 | not funny |
Carefully sketch the ROC curve for this classifier, considering the positive outcome to be "funny."
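An ROC curve is traced by sweeping a decision threshold over the predicted scores and recording the false positive rate and true positive rate at each setting. The sketch below shows one way to generate those points, assuming a sample is predicted positive when its score is at least the threshold, and that each score is the fraction of neighbours voting for the positive class. The function name `roc_points`, the scores, and the labels are hypothetical and do not come from the table above.

```python
def roc_points(scores, actual):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to most lenient.

    A sample is predicted positive when its score >= threshold.
    actual holds True for positive samples and False for negative ones.
    """
    positives = sum(actual)
    negatives = len(actual) - positives
    thresholds = sorted(set(scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for t in thresholds:
        tp = sum(1 for s, a in zip(scores, actual) if s >= t and a)
        fp = sum(1 for s, a in zip(scores, actual) if s >= t and not a)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical scores and true labels, for illustration only.
scores = [0.9, 0.7, 0.7, 0.4, 0.2]
actual = [True, True, False, False, True]

points = roc_points(scores, actual)
for fpr, tpr in points:
    print(f"FPR = {fpr:.3f}, TPR = {tpr:.3f}")
```

Plotting the points in order and joining them gives the curve, which starts at $(0,0)$ and ends at $(1,1)$.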