Visualising CNNs: The Class Appearance Model
In this example we will demonstrate the ClassAppearanceModel callback included in torchbearer. This implements one of the simplest (and therefore not always the most successful) deep visualisation techniques, discussed in the paper Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps.
Note: The easiest way to use this tutorial is as a Colab notebook, which allows you to dive in with no setup. We recommend you enable a free GPU with
Runtime → Change runtime type → Hardware Accelerator: GPU
Install Torchbearer
First we install torchbearer if needed.
try:
    import torchbearer
except ImportError:
    # Install torchbearer from PyPI if it isn't already available
    !pip install -q torchbearer
    import torchbearer
print(torchbearer.__version__)
0.4.0.dev
Background
The process to obtain Figure 1 from the paper is simple: given a particular target class $c$, we use back-propagation to obtain
\begin{equation} \arg\!\max_I \; S_c(I) - \lambda\Vert I \Vert_2^2\; , \end{equation}
where $S_c(I)$ is the un-normalised score of $c$ for the image $I$ given by the network. The regularisation term $\Vert I \Vert_2^2$ is necessary to prevent the resultant image from becoming overly noisy. More recent visualisation techniques use much more advanced regularisers to obtain smoother, more realistic images.
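To make this concrete, here is a minimal sketch of the optimisation in plain PyTorch. This is not torchbearer's implementation; the function name, the initialisation and the learning rate are assumptions of ours, chosen only to illustrate the update.

import torch

def class_appearance(model, target, shape=(3, 224, 224), steps=128, lr=0.1, decay=0.002):
    # Optimise the input image directly, starting from small random noise
    image = (torch.randn(1, *shape) * 0.01).requires_grad_(True)
    optimiser = torch.optim.SGD([image], lr=lr)
    for _ in range(steps):
        optimiser.zero_grad()
        score = model(image)[0, target]              # S_c(I), the un-normalised class score
        loss = -score + decay * image.pow(2).sum()   # minimise the negated objective
        loss.backward()
        optimiser.step()
    return image.detach()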
Loading the Model
Since we are just running the callback on a pre-trained model, we don't need to load any data in this example. Instead, we use torchvision to load a VGG16 trained on ImageNet with the following:
import torch.nn as nn
import torchvision
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # VGG16 pre-trained on ImageNet
        self.net = torchvision.models.vgg16(pretrained=True)

    def forward(self, input):
        # The Trial passes None when it has no dataloader (see below)
        if input is not None:
            return self.net(input)

model = Model()
We need to include the None check because we will initialise the Trial without a dataloader, so it will pass None to the model's forward method.
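As a quick sanity check (our own addition, not part of the original example), the model simply returns None in that case:

print(model(None))  # None - nothing is forwarded through the network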
Running with the Callback
When producing gradient based visualisations, we commonly need to include a transform that converts the input into the space the network expects. For torchvision's ImageNet models we can use the following:
from torchvision import transforms
normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]
)
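Note that transforms.Normalize operates directly on a (C, H, W) tensor, subtracting the mean and dividing by the standard deviation per channel. As a quick check (our own addition):

import torch
x = torch.rand(3, 224, 224)           # a random image with values in [0, 1]
print(normalize(x).mean(dim=(1, 2)))  # per-channel means after normalisation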
Finally we can construct and run the Trial with:
%matplotlib inline
import matplotlib
matplotlib.rcParams['figure.figsize'] = (8.0, 8.0)
import torch
from torchbearer import Trial
from torchbearer.callbacks import imaging
device = 'cuda' if torch.cuda.is_available() else 'cpu'
trial = Trial(model, callbacks=[
    imaging.ClassAppearanceModel(1000, (3, 500, 500), steps=128, target=951, decay=0.002, in_transform=normalize)
        .on_val().to_pyplot().to_file('lemon.png'),
    imaging.ClassAppearanceModel(1000, (3, 500, 500), steps=128, target=968, decay=0.002, in_transform=normalize)
        .on_val().to_pyplot().to_file('cup.png'),
    imaging.ClassAppearanceModel(1000, (3, 500, 500), steps=128, target=543, decay=0.002, in_transform=normalize)
        .on_val().to_pyplot().to_file('dumbbell.png')
])
trial.for_val_steps(1).to(device)
trial.evaluate()
{}
Here we create three ClassAppearanceModel instances which target the lemon, cup and dumbbell classes respectively. Since ClassAppearanceModel is an ImagingCallback, we use the imaging API to send each result to pyplot and to a file. Finally, we evaluate the model for a single step to generate the results.
Results
There are some shapes which resemble a lemon, a cup or a dumbbell; however, not quite to the extent shown in the paper. Because of the simplistic regularisation and objective, this method is highly sensitive to hyper-parameter choices, and these results could almost certainly be improved with more careful selection.
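For example, one could sweep the decay and steps arguments and compare the saved images side by side. The values below are arbitrary guesses rather than tuned settings:

callbacks = [
    imaging.ClassAppearanceModel(1000, (3, 500, 500), steps=steps, target=951, decay=decay, in_transform=normalize)
        .on_val().to_file('lemon_decay{}_steps{}.png'.format(decay, steps))
    for decay in (0.001, 0.002, 0.005)
    for steps in (128, 256)
]
Trial(model, callbacks=callbacks).for_val_steps(1).to(device).evaluate()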