Exercise 4.1

Prove that the Python function gauss_sample defined in Chapter 4 indeed generates samples of a Gaussian distribution with mean $c$ and covariance $\Sigma$.

Exercise 4.2

Recall the definition of a covariance matrix. Implement a version of the NumPy function np.cov yourself (you may call it my_cov, for example). Then go to the NumPy documentation of np.cov (https://numpy.org/doc/stable/reference/generated/numpy.cov.html) and click on the [source] link of that page. After some searching, you should be able to view the Python implementation of np.cov. Can you identify the part where the actual computation happens?
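One way to test your implementation is to compare it against np.cov on random data. The sketch below is only a testing aid, not a solution; the helper name check_cov and its parameters are hypothetical and not part of the chapter. It assumes your function follows the np.cov defaults, where each row of the input is a variable, each column is an observation, and the estimate is normalised by $N-1$.

```python
import numpy as np

def check_cov(candidate, n_vars=3, n_obs=50, seed=0):
    """Return True if candidate(data) agrees with np.cov(data) on random data.

    check_cov is a hypothetical helper for this exercise, not a NumPy function.
    """
    rng = np.random.default_rng(seed)
    # Rows are variables, columns are observations (the np.cov default layout).
    data = rng.normal(size=(n_vars, n_obs))
    return bool(np.allclose(candidate(data), np.cov(data)))

# Sanity check of the helper itself: np.cov trivially agrees with np.cov.
print(check_cov(np.cov))
```

Once you have written my_cov, calling check_cov(my_cov) should then report whether the two implementations agree up to floating-point tolerance.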

Exercise 4.3

Recall Example 4.1 from Chapter 4, which produces a contour plot of gauss_density.

That code uses a nested for-loop over all entries in the matrices X0 and X1 to compute the corresponding entry in Z. This is inefficient for large matrices for at least two reasons:

  1. Every call of gauss_density recomputes the same normalisation constant, even though it does not depend on x.
  2. The numerical libraries underlying NumPy can perform vector and matrix operations much more efficiently than element-wise operations. Rewriting code to make best use of vector and matrix operations is called vectorization or array programming.

Write a function gauss_density_2d(X0, X1, c, Sigma) that accepts NumPy matrices X0, X1 of 2D point coordinates and returns Z in one go. Can you avoid using any for-loops? Perform a timing comparison of both approaches for evaluating the Gaussian density function.
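As a hint on what vectorization and a timing comparison can look like, here is a minimal sketch on a deliberately simpler function than the Gaussian density: the squared distance of each grid point to a centre c. The function names sq_dist_loop and sq_dist_vectorized, the grid size and the value of c are all illustrative assumptions, not taken from Example 4.1.

```python
import time
import numpy as np

def sq_dist_loop(X0, X1, c):
    """Squared distance of every grid point to c, computed entry by entry."""
    Z = np.zeros_like(X0, dtype=float)
    for i in range(X0.shape[0]):
        for j in range(X0.shape[1]):
            x = np.array([X0[i, j], X1[i, j]])
            Z[i, j] = (x - c) @ (x - c)
    return Z

def sq_dist_vectorized(X0, X1, c):
    """Same result using whole-array arithmetic, with no Python-level loops."""
    return (X0 - c[0]) ** 2 + (X1 - c[1]) ** 2

# Illustrative grid and centre (assumed values).
x = np.linspace(-3.0, 3.0, 200)
X0, X1 = np.meshgrid(x, x)
c = np.array([0.5, -1.0])

t0 = time.perf_counter()
Z_loop = sq_dist_loop(X0, X1, c)
t1 = time.perf_counter()
Z_vec = sq_dist_vectorized(X0, X1, c)
t2 = time.perf_counter()

print("results agree:", np.allclose(Z_loop, Z_vec))
print(f"loop: {t1 - t0:.4f} s, vectorized: {t2 - t1:.4f} s")
```

For the Gaussian density, the per-point quantity is a quadratic form involving the inverse of Sigma rather than a plain squared distance, but the same idea applies: express the computation in terms of the whole arrays X0 and X1. For more reliable timings than a single perf_counter measurement, the standard-library timeit module repeats the measurement several times.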

Exercise 4.4

What is the interpretation of the empirical error $\hat R(h)\approx 0.0016$ in Example 4.2, in terms of individual data points?

Exercise 4.5

Try increasing N in the code of Example 4.2. You'll notice that the empirical error converges to a value $\approx 0.002112$. Write down a formula for the precise limiting value.

Exercise 4.6

In Section 4.5 we implemented the Bayes hypothesis h_best_L2 and computed its generalisation error. As discussed there, h_best_L2 does not necessarily attain integer values in $\{0,1,2\}$. Can you come up with a best possible hypothesis $h$ that takes only values in $\{0,1,2\}$, and compute its generalisation error?