Overview
We cover three visualization techniques:

- Intermediate activations - understand how successive layers transform input
- Filter visualization - see what visual patterns each filter responds to
- Class Activation Maps (CAM) - identify which parts of an image led to classification
Visualizing Intermediate Activations
Extract feature maps from convolution and pooling layers:

- First layers act as edge detectors, retaining most input information
- Higher layers encode abstract concepts like “cat ear” or “cat eye”
- Activation sparsity increases with depth
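The core operation behind the first bullet can be sketched in plain NumPy (the `conv2d_valid` helper, the toy step-edge image, and the Sobel filter are illustrative choices, not from the original notes): a first-layer-style filter applied to an input yields a feature map that lights up exactly where its pattern occurs.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 2D 'valid' cross-correlation (the CNN convention for 'convolution'),
    producing one feature map from one filter."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy input: a vertical step edge (dark left half, bright right half).
image = np.zeros((8, 8))
image[:, 4:] = 1.0

# A first-layer-style filter: a vertical edge detector (Sobel x).
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

# Convolution followed by ReLU, as in a conv layer's forward pass.
feature_map = np.maximum(conv2d_valid(image, sobel_x), 0.0)

edge_column = feature_map[:, 2]  # strong response right at the edge
flat_column = feature_map[:, 0]  # near-zero response in flat regions
```

The feature map is high only along the edge and zero elsewhere, which is why early-layer activations still look like recognizable, lightly filtered versions of the input.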
Visualizing ConvNet Filters
Use gradient ascent to find input patterns that maximally activate each filter:

- block1_conv1: directional edges and colors
- block2_conv1: simple textures from edge combinations
- Higher layers: natural textures (feathers, eyes, leaves)
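The gradient-ascent idea can be demonstrated on a deliberately simple linear "filter" in NumPy (the diagonal pattern `w`, the step size, and the step count are illustrative assumptions): starting from noise, repeatedly step the *input* in the direction that increases the filter's mean activation, and the input converges toward the filter's preferred pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 'filter' whose preferred pattern we want to recover: a diagonal edge.
w = np.zeros((5, 5))
np.fill_diagonal(w, 1.0)

def activation(x):
    """Mean response of the (linear) filter to input x."""
    return np.mean(x * w)

def grad(x):
    """Analytic gradient of the mean activation w.r.t. the input.
    For a real ConvNet this would come from automatic differentiation."""
    return w / w.size

def cosine(a, b):
    return np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Start from a noisy gray image and run gradient ascent on the input itself.
x = rng.normal(0.0, 0.1, size=(5, 5))
start_similarity = cosine(x, w)
for _ in range(200):
    g = grad(x)
    g = g / (np.sqrt(np.mean(g ** 2)) + 1e-8)  # L2-normalize the update
    x = x + 0.1 * g

end_similarity = cosine(x, w)  # close to 1: the input now looks like the filter
```

In a real ConvNet the activation is nonlinear in the input, so the gradient must be computed by backpropagation rather than written analytically, but the loop is the same: maximize one filter's mean activation over the input pixels.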
Class Activation Maps (Grad-CAM)
Visualize which image regions led to classification decisions:

- Debug classification mistakes
- Locate objects in images
- Understand the model's decision process
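The Grad-CAM computation itself is short, and the use cases above all read off the resulting heatmap. A minimal NumPy sketch of the formula from the Selvaraju et al. paper follows (the toy feature maps and gradients are fabricated for illustration; in practice both come from a forward and backward pass through the network):

```python
import numpy as np

def grad_cam(feature_maps, grads):
    """Grad-CAM heatmap from a conv layer's activations and the gradients
    of the class score with respect to those activations.

    feature_maps: (H, W, C) activations A^k of the chosen conv layer
    grads:        (H, W, C) d(class score)/dA^k
    """
    # Channel importance weights: global average pooling of the gradients.
    alpha = grads.mean(axis=(0, 1))                              # shape (C,)
    # Weighted combination of feature maps, then ReLU to keep only
    # evidence that supports (rather than contradicts) the class.
    cam = np.maximum(np.tensordot(feature_maps, alpha, axes=([2], [0])), 0.0)
    # Normalize to [0, 1] for overlaying on the input image.
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

# Toy example: channel 0 fires on the top-left region and supports the class;
# channel 1 fires elsewhere and argues against it.
fmap = np.zeros((4, 4, 2))
fmap[:2, :2, 0] = 1.0
fmap[2:, 2:, 1] = 1.0
grads = np.zeros((4, 4, 2))
grads[..., 0] = 1.0
grads[..., 1] = -0.5

heatmap = grad_cam(fmap, grads)  # hot only in the top-left region
```

Upsampling the small heatmap to the input resolution and blending it over the image gives the familiar Grad-CAM overlay used for debugging misclassifications and localizing objects.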
References
- Dumoulin, V., Visin, F. (2016). A guide to convolution arithmetic for deep learning.
- Peng, X., Sun, B., Ali, K., Saenko, K. (2015). What Do Deep CNNs Learn About Objects?
- Selvaraju, R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., et al. (2016). Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization.
- Simonyan, K., Zisserman, A. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition.
- Zeiler, M., Fergus, R. (2013). Visualizing and Understanding Convolutional Networks.

