{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For these exercises, you will need `torch` and `torchvision`. Open a terminal and run `python -m pip install torch torchvision`. The import cell below also uses `numpy` and `matplotlib`; install them the same way if they are missing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import torch\n",
    "from torch import nn\n",
    "from torch.utils.data import DataLoader\n",
    "from torchvision import datasets\n",
    "from torchvision.transforms import ToTensor\n",
    "import matplotlib.pyplot as plt\n",
    "from time import time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 8.1\n",
    "\n",
    "Tuning the hyperparameters of deep learning models and their optimization routines can be very time-consuming, and doing it well takes experience and patience. In this exercise, we investigate the effect of the learning rate in stochastic gradient descent (SGD).\n",
    "\n",
    "Revisit the fully connected neural network we trained on the FashionMNIST dataset in the lecture notes. Modify the code so that you can easily perform multiple training runs with different learning rates. Then produce a plot of the classification accuracy against the cumulative training time at the end of each epoch, with one curve per learning rate."
   ]
  },
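  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One possible way to organise the runs is sketched below. It is only a sketch under assumptions: `make_model`, `accuracy` and `train_with_lr` are hypothetical helper names, the small network is a stand-in for the one from the lecture notes, and random tensors replace FashionMNIST so that the code runs without a download.\n",
    "\n",
    "```python\n",
    "from time import time\n",
    "\n",
    "import torch\n",
    "from torch import nn\n",
    "from torch.utils.data import DataLoader, TensorDataset\n",
    "\n",
    "\n",
    "def make_model():\n",
    "    # Hypothetical stand-in for the fully connected network from the lecture notes.\n",
    "    return nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))\n",
    "\n",
    "\n",
    "def accuracy(model, loader):\n",
    "    # Fraction of correctly classified samples in the given loader.\n",
    "    model.eval()\n",
    "    correct, total = 0, 0\n",
    "    with torch.no_grad():\n",
    "        for X, y in loader:\n",
    "            correct += (model(X).argmax(dim=1) == y).sum().item()\n",
    "            total += len(y)\n",
    "    return correct / total\n",
    "\n",
    "\n",
    "def train_with_lr(lr, train_loader, test_loader, epochs=3):\n",
    "    # Train a fresh model with plain SGD and record, after every epoch,\n",
    "    # the cumulative training time and the accuracy on test_loader.\n",
    "    model = make_model()\n",
    "    loss_fn = nn.CrossEntropyLoss()\n",
    "    optimizer = torch.optim.SGD(model.parameters(), lr=lr)\n",
    "    times, accs, elapsed = [], [], 0.0\n",
    "    for _ in range(epochs):\n",
    "        start = time()\n",
    "        model.train()\n",
    "        for X, y in train_loader:\n",
    "            optimizer.zero_grad()\n",
    "            loss_fn(model(X), y).backward()\n",
    "            optimizer.step()\n",
    "        elapsed += time() - start  # evaluation time is not counted\n",
    "        times.append(elapsed)\n",
    "        accs.append(accuracy(model, test_loader))\n",
    "    return times, accs\n",
    "\n",
    "\n",
    "# Random stand-in data (assumption); replace with the FashionMNIST DataLoaders.\n",
    "torch.manual_seed(0)\n",
    "data = TensorDataset(torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,)))\n",
    "loader = DataLoader(data, batch_size=64)\n",
    "results = {lr: train_with_lr(lr, loader, loader) for lr in (1e-3, 1e-2, 1e-1)}\n",
    "```\n",
    "\n",
    "Each entry of `results` can then be drawn with `plt.plot(times, accs, label=f'lr={lr}')`, which gives one accuracy-versus-time curve per learning rate."
   ]
  },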
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 8.2\n",
    "\n",
    "Starting again from the fully connected neural network we trained on the FashionMNIST dataset in the lecture notes, experiment with different model architectures by varying the number of layers and the number of neurons per layer. Produce a plot of the classification accuracy against the cumulative training time at the end of each epoch for a few different configurations."
   ]
  },
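  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To vary the architecture without copying the model definition, the layers can be generated from a list of hidden-layer widths. The following is a minimal sketch: `make_mlp` and the entries of `configs` are hypothetical names and values chosen for illustration, not the setup from the lecture notes.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "from torch import nn\n",
    "\n",
    "\n",
    "def make_mlp(hidden_sizes, in_features=28 * 28, n_classes=10):\n",
    "    # Build a fully connected classifier with one Linear+ReLU pair\n",
    "    # per entry of hidden_sizes, followed by a linear output layer.\n",
    "    layers = [nn.Flatten()]\n",
    "    width = in_features\n",
    "    for h in hidden_sizes:\n",
    "        layers += [nn.Linear(width, h), nn.ReLU()]\n",
    "        width = h\n",
    "    layers.append(nn.Linear(width, n_classes))\n",
    "    return nn.Sequential(*layers)\n",
    "\n",
    "\n",
    "# A few hypothetical configurations to compare.\n",
    "configs = {'1x128': [128], '2x512': [512, 512], '3x256': [256, 256, 256]}\n",
    "for name, hidden in configs.items():\n",
    "    model = make_mlp(hidden)\n",
    "    n_params = sum(p.numel() for p in model.parameters())\n",
    "    # Report the parameter count and the output shape for one dummy image.\n",
    "    print(name, n_params, tuple(model(torch.zeros(1, 1, 28, 28)).shape))\n",
    "```\n",
    "\n",
    "Printing the parameter count next to each configuration helps when relating model size to the training time per epoch."
   ]
  },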
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise 8.3\n",
    "\n",
    "Modify the LeNet CNN example from the lecture notes to work with coloured images, then train and test it on the CIFAR10 dataset. Compared to FashionMNIST, the images are now coloured (3x32x32 instead of 1x28x28), so the first convolutional layer needs to act on three input channels. The larger input also changes the dimensions of the subsequent layers. In addition, replace the Tanh activation functions with ReLU activations, which are more commonly used nowadays.\n",
    "\n",
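    "As a rough check of how the layer dimensions change, the sketch below passes a dummy 3x32x32 batch through a LeNet-5-style network and prints the shape after each layer. The layout (no padding, average pooling, 120 and 84 hidden units) is an assumption based on the classic LeNet-5 and may differ from the version in the lecture notes; `lenet_cifar` is a hypothetical name.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "from torch import nn\n",
    "\n",
    "# Assumed LeNet-5-style layout; adapt padding and pooling to the lecture version.\n",
    "lenet_cifar = nn.Sequential(\n",
    "    nn.Conv2d(3, 6, kernel_size=5), nn.ReLU(),   # 3x32x32 -> 6x28x28\n",
    "    nn.AvgPool2d(kernel_size=2),                 # -> 6x14x14\n",
    "    nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(),  # -> 16x10x10\n",
    "    nn.AvgPool2d(kernel_size=2),                 # -> 16x5x5\n",
    "    nn.Flatten(),                                # -> 400 features\n",
    "    nn.Linear(16 * 5 * 5, 120), nn.ReLU(),\n",
    "    nn.Linear(120, 84), nn.ReLU(),\n",
    "    nn.Linear(84, 10),                           # one logit per CIFAR10 class\n",
    ")\n",
    "\n",
    "# Trace the shape after each layer on a dummy batch of one image.\n",
    "x = torch.zeros(1, 3, 32, 32)\n",
    "for layer in lenet_cifar:\n",
    "    x = layer(x)\n",
    "    print(layer.__class__.__name__, tuple(x.shape))\n",
    "```\n",
    "\n",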
    "This [PyTorch tutorial](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html) might be useful."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
