@@ -0,0 +1,529 @@
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "name": "HW3-CNN.ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "D_a2USyd4giE"
      },
      "source": [
"# **Homework 3 - Convolutional Neural Network**\n", |
|
|
|
"\n", |
|
|
|
"This is the example code of homework 3 of the machine learning course by Prof. Hung-yi Lee.\n", |
|
|
|
"\n", |
|
|
|
"In this homework, you are required to build a convolutional neural network for image classification, possibly with some advanced training tips.\n", |
|
|
|
"\n", |
|
|
|
"\n", |
|
|
|
"There are three levels here:\n", |
|
|
|
"\n", |
|
|
|
"**Easy**: Build a simple convolutional neural network as the baseline. (2 pts)\n", |
|
|
|
"\n", |
|
|
|
"**Medium**: Design a better architecture or adopt different data augmentations to improve the performance. (2 pts)\n", |
|
|
|
"\n", |
|
|
|
"**Hard**: Utilize provided unlabeled data to obtain better results. (2 pts)" |
|
|
|
] |
|
|
|
}, |
|
|
|
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VHpJocsDr6iA"
      },
      "source": [
        "## **About the Dataset**\n",
        "\n",
        "The dataset used here is food-11, a collection of food images in 11 classes.\n",
        "\n",
        "To fit the requirements of this homework, the TAs have slightly modified the data.\n",
        "Please DO NOT access the original fully-labeled training data or testing labels.\n",
        "\n",
        "Also, the modified dataset is for this course only, and any further distribution or commercial use is forbidden."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zhzdomRTOKoJ"
      },
      "source": [
        "# Download the dataset.\n",
        "# You may choose where to download the data.\n",
        "\n",
        "# Google Drive\n",
        "# !gdown --id '1awF7pZ9Dz7X1jn1_QAiKN-_v56veCEKy' --output food-11.zip\n",
        "\n",
        "# Dropbox\n",
        "!wget https://www.dropbox.com/s/m9q6273jl3djall/food-11.zip -O food-11.zip\n",
        "\n",
        "# Unzip the dataset.\n",
        "# This may take some time.\n",
        "!unzip -q food-11.zip"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BBVSCWWhp6uq"
      },
      "source": [
        "## **Import Packages**\n",
        "\n",
        "First, we need to import the packages that will be used later.\n",
        "\n",
        "In this homework, we rely heavily on **torchvision**, a computer-vision library for PyTorch."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "9sVrKci4PUFW"
      },
      "source": [
        "# Import necessary packages.\n",
        "import numpy as np\n",
        "import torch\n",
        "import torch.nn as nn\n",
        "import torchvision.transforms as transforms\n",
        "from PIL import Image\n",
        "# \"ConcatDataset\" and \"Subset\" are possibly useful when doing semi-supervised learning.\n",
        "from torch.utils.data import ConcatDataset, DataLoader, Subset\n",
        "from torchvision.datasets import DatasetFolder\n",
        "\n",
        "# This is for the progress bar.\n",
        "from tqdm.auto import tqdm"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "F0i9ZCPrOVN_"
      },
      "source": [
        "## **Dataset, Data Loader, and Transforms**\n",
        "\n",
        "Torchvision provides many useful utilities for image preprocessing, data wrapping, and data augmentation.\n",
        "\n",
        "Here, since our data are stored in folders by class label, we can directly apply **torchvision.datasets.DatasetFolder** to wrap the data without much effort.\n",
        "\n",
        "Please refer to the [PyTorch official website](https://pytorch.org/vision/stable/transforms.html) for details about the different transforms."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "gKd2abixQghI"
      },
      "source": [
        "# It is important to do data augmentation in training.\n",
        "# However, not every augmentation is useful.\n",
        "# Please think about what kind of augmentation is helpful for food recognition.\n",
        "train_tfm = transforms.Compose([\n",
        "    # Resize the image into a fixed shape (height = width = 128)\n",
        "    transforms.Resize((128, 128)),\n",
        "    # You may add some transforms here.\n",
        "    # ToTensor() should be the last one of the transforms.\n",
        "    transforms.ToTensor(),\n",
        "])\n",
        "\n",
        "# We don't need augmentations in testing and validation.\n",
        "# All we need here is to resize the PIL image and transform it into a Tensor.\n",
        "test_tfm = transforms.Compose([\n",
        "    transforms.Resize((128, 128)),\n",
        "    transforms.ToTensor(),\n",
        "])\n"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "qz6jeMnkQl0_"
      },
      "source": [
        "# Batch size for training, validation, and testing.\n",
        "# A greater batch size usually gives a more stable gradient.\n",
        "# But the GPU memory is limited, so please adjust it carefully.\n",
        "batch_size = 128\n",
        "\n",
        "# Construct datasets.\n",
        "# The argument \"loader\" tells torchvision how to read the data.\n",
        "train_set = DatasetFolder(\"food-11/training/labeled\", loader=lambda x: Image.open(x), extensions=\"jpg\", transform=train_tfm)\n",
        "valid_set = DatasetFolder(\"food-11/validation\", loader=lambda x: Image.open(x), extensions=\"jpg\", transform=test_tfm)\n",
        "unlabeled_set = DatasetFolder(\"food-11/training/unlabeled\", loader=lambda x: Image.open(x), extensions=\"jpg\", transform=train_tfm)\n",
        "test_set = DatasetFolder(\"food-11/testing\", loader=lambda x: Image.open(x), extensions=\"jpg\", transform=test_tfm)\n",
        "\n",
        "# Construct data loaders.\n",
        "train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=True)\n",
        "valid_loader = DataLoader(valid_set, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=True)\n",
        "test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "j9YhZo7POPYG"
      },
      "source": [
        "## **Model**\n",
        "\n",
        "The basic model here is simply a stack of convolutional layers followed by some fully-connected layers.\n",
        "\n",
        "Since a color image has three channels (RGB), the input channels of the network must be three.\n",
        "In each convolutional layer, the number of channels typically grows, while the height and width shrink (or stay the same, depending on hyperparameters such as stride and padding).\n",
        "\n",
        "Before being fed into the fully-connected layers, the feature map must be flattened into a single one-dimensional vector (for each image).\n",
        "These features are then transformed by the fully-connected layers, and finally we obtain the \"logits\" for each class.\n",
        "\n",
        "### **WARNING -- You Must Know**\n",
        "You are free to modify the model architecture here for further improvement.\n",
        "However, if you want to use a well-known architecture such as ResNet50, please make sure **NOT** to load the pre-trained weights.\n",
        "Using such pre-trained models is considered cheating and will therefore be penalized.\n",
        "Similarly, it is your responsibility to make sure no pre-trained weights are used if you use **torch.hub** to load any modules.\n",
        "\n",
        "For example, if you use ResNet-18 as your model:\n",
        "\n",
        "`model = torchvision.models.resnet18(pretrained=False)` → This is fine.\n",
        "\n",
        "`model = torchvision.models.resnet18(pretrained=True)` → This is **NOT** allowed."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Y1c-GwrMQqMl"
      },
      "source": [
        "class Classifier(nn.Module):\n",
        "    def __init__(self):\n",
        "        super(Classifier, self).__init__()\n",
        "        # The arguments for commonly used modules:\n",
        "        # torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)\n",
        "        # torch.nn.MaxPool2d(kernel_size, stride, padding)\n",
        "\n",
        "        # input image size: [3, 128, 128]\n",
        "        self.cnn_layers = nn.Sequential(\n",
        "            nn.Conv2d(3, 64, 3, 1, 1),\n",
        "            nn.BatchNorm2d(64),\n",
        "            nn.ReLU(),\n",
        "            nn.MaxPool2d(2, 2, 0),\n",
        "\n",
        "            nn.Conv2d(64, 128, 3, 1, 1),\n",
        "            nn.BatchNorm2d(128),\n",
        "            nn.ReLU(),\n",
        "            nn.MaxPool2d(2, 2, 0),\n",
        "\n",
        "            nn.Conv2d(128, 256, 3, 1, 1),\n",
        "            nn.BatchNorm2d(256),\n",
        "            nn.ReLU(),\n",
        "            nn.MaxPool2d(4, 4, 0),\n",
        "        )\n",
        "        self.fc_layers = nn.Sequential(\n",
        "            nn.Linear(256 * 8 * 8, 256),\n",
        "            nn.ReLU(),\n",
        "            nn.Linear(256, 256),\n",
        "            nn.ReLU(),\n",
        "            nn.Linear(256, 11)\n",
        "        )\n",
        "\n",
        "    def forward(self, x):\n",
        "        # input (x): [batch_size, 3, 128, 128]\n",
        "        # output: [batch_size, 11]\n",
        "\n",
        "        # Extract features with the convolutional layers.\n",
        "        x = self.cnn_layers(x)\n",
        "\n",
        "        # The extracted feature map must be flattened before going into the fully-connected layers.\n",
        "        x = x.flatten(1)\n",
        "\n",
        "        # The features are transformed by the fully-connected layers to obtain the final logits.\n",
        "        x = self.fc_layers(x)\n",
        "        return x"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aEnGbriXORN3"
      },
      "source": [
        "## **Training**\n",
        "\n",
        "You can complete supervised learning simply by running the provided code without any modification.\n",
        "\n",
        "The function \"get_pseudo_labels\" is used for semi-supervised learning.\n",
        "You can expect better performance if you use the unlabeled data for semi-supervised learning.\n",
        "However, you have to implement the function on your own and tune several hyperparameters manually; a minimal sketch of one possible dataset wrapper is given in the next cell.\n",
        "\n",
        "For more details about semi-supervised learning, please refer to [Prof. Lee's slides](https://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/semi%20(v3).pdf).\n",
        "\n",
        "Again, please note that utilizing external data (or pre-trained models) for training is **prohibited**."
      ]
    },
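    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# A minimal sketch of a dataset wrapper that could hold pseudo-labeled samples\n",
        "# when you implement \"get_pseudo_labels\" below. This cell is an added hint, not\n",
        "# part of the original sample code; the class name and design are just one option.\n",
        "class PseudoLabeledDataset(torch.utils.data.Dataset):\n",
        "    def __init__(self, images, labels):\n",
        "        # \"images\": transformed image tensors; \"labels\": pseudo-labels predicted\n",
        "        # by the current model (e.g., the argmax of confident predictions).\n",
        "        self.images = images\n",
        "        self.labels = labels\n",
        "\n",
        "    def __len__(self):\n",
        "        return len(self.images)\n",
        "\n",
        "    def __getitem__(self, idx):\n",
        "        return self.images[idx], self.labels[idx]"
      ],
      "execution_count": null,
      "outputs": []
    },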
    {
      "cell_type": "code",
      "metadata": {
        "id": "swlf5EwA-hxA"
      },
      "source": [
        "def get_pseudo_labels(dataset, model, threshold=0.65):\n",
        "    # This function generates pseudo-labels for a dataset using the given model.\n",
        "    # It returns an instance of DatasetFolder containing images whose prediction confidence exceeds a given threshold.\n",
        "    # You are NOT allowed to use any models trained on external data for pseudo-labeling.\n",
        "    device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
        "\n",
        "    # Construct a data loader.\n",
        "    data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)\n",
        "\n",
        "    # Make sure the model is in eval mode.\n",
        "    model.eval()\n",
        "    # Define softmax function.\n",
        "    softmax = nn.Softmax(dim=-1)\n",
        "\n",
        "    # Iterate over the dataset by batches.\n",
        "    for batch in tqdm(data_loader):\n",
        "        img, _ = batch\n",
        "\n",
        "        # Forward the data.\n",
        "        # Using torch.no_grad() accelerates the forward process.\n",
        "        with torch.no_grad():\n",
        "            logits = model(img.to(device))\n",
        "\n",
        "        # Obtain the probability distributions by applying softmax on logits.\n",
        "        probs = softmax(logits)\n",
        "\n",
        "        # ---------- TODO ----------\n",
        "        # Filter the data and construct a new dataset.\n",
        "\n",
        "    # Turn off the eval mode.\n",
        "    model.train()\n",
        "    return dataset"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "PHaFE-8oQtkC"
      },
      "source": [
        "# \"cuda\" only when GPUs are available.\n",
        "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
        "\n",
        "# Initialize a model, and put it on the device specified.\n",
        "model = Classifier().to(device)\n",
        "model.device = device\n",
        "\n",
        "# For the classification task, we use cross-entropy as the measure of performance.\n",
        "criterion = nn.CrossEntropyLoss()\n",
        "\n",
        "# Initialize the optimizer; you may fine-tune hyperparameters such as the learning rate on your own.\n",
        "optimizer = torch.optim.Adam(model.parameters(), lr=0.0003, weight_decay=1e-5)\n",
        "\n",
        "# The number of training epochs.\n",
        "n_epochs = 80\n",
        "\n",
        "# Whether to do semi-supervised learning.\n",
        "do_semi = False\n",
        "\n",
        "for epoch in range(n_epochs):\n",
        "    # ---------- TODO ----------\n",
        "    # In each epoch, relabel the unlabeled dataset for semi-supervised learning.\n",
        "    # Then you can combine the labeled dataset and pseudo-labeled dataset for the training.\n",
        "    if do_semi:\n",
        "        # Obtain pseudo-labels for unlabeled data using the trained model.\n",
        "        pseudo_set = get_pseudo_labels(unlabeled_set, model)\n",
        "\n",
        "        # Construct a new dataset and a data loader for training.\n",
        "        # This is used in semi-supervised learning only.\n",
        "        concat_dataset = ConcatDataset([train_set, pseudo_set])\n",
        "        train_loader = DataLoader(concat_dataset, batch_size=batch_size, shuffle=True, num_workers=8, pin_memory=True)\n",
        "\n",
        "    # ---------- Training ----------\n",
        "    # Make sure the model is in train mode before training.\n",
        "    model.train()\n",
        "\n",
        "    # These are used to record information in training.\n",
        "    train_loss = []\n",
        "    train_accs = []\n",
        "\n",
        "    # Iterate over the training set by batches.\n",
        "    for batch in tqdm(train_loader):\n",
        "\n",
        "        # A batch consists of image data and corresponding labels.\n",
        "        imgs, labels = batch\n",
        "\n",
        "        # Forward the data. (Make sure data and model are on the same device.)\n",
        "        logits = model(imgs.to(device))\n",
        "\n",
        "        # Calculate the cross-entropy loss.\n",
        "        # We don't need to apply softmax before computing cross-entropy as it is done automatically.\n",
        "        loss = criterion(logits, labels.to(device))\n",
        "\n",
        "        # Gradients stored in the parameters in the previous step should be cleared out first.\n",
        "        optimizer.zero_grad()\n",
        "\n",
        "        # Compute the gradients for the parameters.\n",
        "        loss.backward()\n",
        "\n",
        "        # Clip the gradient norms for stable training.\n",
        "        grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)\n",
        "\n",
        "        # Update the parameters with the computed gradients.\n",
        "        optimizer.step()\n",
        "\n",
        "        # Compute the accuracy for the current batch.\n",
        "        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()\n",
        "\n",
        "        # Record the loss and accuracy.\n",
        "        train_loss.append(loss.item())\n",
        "        train_accs.append(acc)\n",
        "\n",
        "    # The average loss and accuracy of the training set are the averages of the recorded values.\n",
        "    train_loss = sum(train_loss) / len(train_loss)\n",
        "    train_acc = sum(train_accs) / len(train_accs)\n",
        "\n",
        "    # Print the information.\n",
        "    print(f\"[ Train | {epoch + 1:03d}/{n_epochs:03d} ] loss = {train_loss:.5f}, acc = {train_acc:.5f}\")\n",
        "\n",
        "    # ---------- Validation ----------\n",
        "    # Make sure the model is in eval mode so that modules like dropout and batch normalization behave correctly for inference.\n",
        "    model.eval()\n",
        "\n",
        "    # These are used to record information in validation.\n",
        "    valid_loss = []\n",
        "    valid_accs = []\n",
        "\n",
        "    # Iterate over the validation set by batches.\n",
        "    for batch in tqdm(valid_loader):\n",
        "\n",
        "        # A batch consists of image data and corresponding labels.\n",
        "        imgs, labels = batch\n",
        "\n",
        "        # We don't need gradients in validation.\n",
        "        # Using torch.no_grad() accelerates the forward process.\n",
        "        with torch.no_grad():\n",
        "            logits = model(imgs.to(device))\n",
        "\n",
        "        # We can still compute the loss (but not the gradients).\n",
        "        loss = criterion(logits, labels.to(device))\n",
        "\n",
        "        # Compute the accuracy for the current batch.\n",
        "        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()\n",
        "\n",
        "        # Record the loss and accuracy.\n",
        "        valid_loss.append(loss.item())\n",
        "        valid_accs.append(acc)\n",
        "\n",
        "    # The average loss and accuracy for the entire validation set are the averages of the recorded values.\n",
        "    valid_loss = sum(valid_loss) / len(valid_loss)\n",
        "    valid_acc = sum(valid_accs) / len(valid_accs)\n",
        "\n",
        "    # Print the information.\n",
        "    print(f\"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}\")"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2o1oCMXy61_3"
      },
      "source": [
        "## **Testing**\n",
        "\n",
        "For inference, we need to make sure the model is in eval mode, and the order of the dataset should not be shuffled (\"shuffle=False\" in test_loader).\n",
        "\n",
        "Last but not least, don't forget to save the predictions into a single CSV file.\n",
        "The format of the CSV file should follow the rules mentioned in the slides.\n",
        "\n",
        "### **WARNING -- Keep in Mind**\n",
        "\n",
        "Cheating includes, but is not limited to:\n",
        "1. using testing labels,\n",
        "2. submitting results to previous Kaggle competitions,\n",
        "3. sharing predictions with others,\n",
        "4. copying code from any creature on Earth,\n",
        "5. asking other people to do it for you.\n",
        "\n",
        "Any violation will result in penalties ranging from a deduction on the final grade to failing the course.\n",
        "\n",
        "It is your responsibility to check whether your code violates the rules.\n",
        "When using code from the Internet, you should know exactly what it does.\n",
        "Breaking the rules and then claiming you don't know what the code does will **NOT** be tolerated.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "4HznI9_-ocrq"
      },
      "source": [
        "# Make sure the model is in eval mode.\n",
        "# Some modules, such as Dropout or BatchNorm, behave differently in training mode.\n",
        "model.eval()\n",
        "\n",
        "# Initialize a list to store the predictions.\n",
        "predictions = []\n",
        "\n",
        "# Iterate over the testing set by batches.\n",
        "for batch in tqdm(test_loader):\n",
        "    # A batch consists of image data and corresponding labels.\n",
        "    # But here the variable \"labels\" is useless, since we do not have the ground truth.\n",
        "    # If you print out the labels, you will find that they are always 0.\n",
        "    # This is because the wrapper (DatasetFolder) returns images and labels for each batch,\n",
        "    # so we have to create fake labels to make it work normally.\n",
        "    imgs, labels = batch\n",
        "\n",
        "    # We don't need gradients in testing, and we don't even have labels to compute loss.\n",
        "    # Using torch.no_grad() accelerates the forward process.\n",
        "    with torch.no_grad():\n",
        "        logits = model(imgs.to(device))\n",
        "\n",
        "    # Take the class with the greatest logit as the prediction and record it.\n",
        "    predictions.extend(logits.argmax(dim=-1).cpu().numpy().tolist())"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "3t2q2Th85ZUE"
      },
      "source": [
        "# Save the predictions into a file.\n",
        "with open(\"predict.csv\", \"w\") as f:\n",
        "\n",
        "    # The first row must be \"Id,Category\".\n",
        "    f.write(\"Id,Category\\n\")\n",
        "\n",
        "    # For the rest of the rows, each image id corresponds to a predicted class.\n",
        "    for i, pred in enumerate(predictions):\n",
        "        f.write(f\"{i},{pred}\\n\")"
      ],
      "execution_count": null,
      "outputs": []
    }
  ]
}