Backbone Networks

ResNet-50

Since we are dealing with grayscale images, we replicate the single channel across three channels. This matches the three-channel input expected by the ResNet-50v2 model and avoids redesigning the backbone.
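
The channel replication described above could be sketched as follows. This is a minimal illustration using NumPy, assuming each image arrives as a 2-D (H, W) array; the helper name `gray_to_three_channels` is hypothetical and not taken from the project code.

```python
import numpy as np


def gray_to_three_channels(image: np.ndarray) -> np.ndarray:
    """Replicate an (H, W) grayscale image into a (3, H, W) array."""
    if image.ndim != 2:
        raise ValueError(f"expected a 2-D (H, W) array, got shape {image.shape}")
    # Add a leading channel axis, then copy it three times along that axis.
    return np.repeat(image[np.newaxis, :, :], 3, axis=0)


# Synthetic stand-in for a real grayscale image.
gray = np.random.default_rng(0).random((224, 224), dtype=np.float32)
stacked = gray_to_three_channels(gray)
print(stacked.shape)  # (3, 224, 224)
```

All three output channels hold identical values, so no information is added; the only purpose is to satisfy the input shape of an ImageNet-pretrained backbone.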

The timm (PyTorch Image Models) library is used to load the pretrained ResNet-50 model. The model summary shows:

Input: $3 \times 224 \times 224$ images (grayscale replicated to 3 channels)
Output: 2048-dimensional feature vector (before classification head)
Architecture: 50 layers with residual connections
Pretrained on: ImageNet (1000 classes)

Why ResNet-50?

ResNet-50 was chosen for several reasons:

Proven effectiveness: Widely used in transfer learning applications
Appropriate depth: Deep enough to learn hierarchical features without being overly complex
Efficient inference: Reasonable computational requirements for edge deployment
Available pretrained weights: Extensive pretraining on ImageNet provides strong general visual features

Feature Extraction

For anomaly detection, we use the ResNet-50 model as a feature extractor:

Remove the final classification layer
Extract features from the global average pooling layer
Obtain a 2048-dimensional embedding for each input image

These embeddings are then used with UMAP for dimensionality reduction and kNN for anomaly scoring.
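
As a rough illustration of the kNN scoring step, the sketch below scores each query by its mean Euclidean distance to the k nearest reference embeddings. The scoring rule, the value of k, and the function name `knn_anomaly_scores` are assumptions rather than the project's confirmed method, and the UMAP reduction is omitted: the random arrays simply stand in for already reduced embeddings.

```python
import numpy as np


def knn_anomaly_scores(reference: np.ndarray, queries: np.ndarray, k: int = 5) -> np.ndarray:
    """Mean Euclidean distance from each query to its k nearest reference points."""
    if k < 1 or k > len(reference):
        raise ValueError("k must be between 1 and the number of reference points")
    # Pairwise distances, shape (n_queries, n_reference).
    diffs = queries[:, np.newaxis, :] - reference[np.newaxis, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # np.partition places the k smallest distances in the first k columns
    # (in no particular order), which is enough for taking their mean.
    nearest = np.partition(dists, k - 1, axis=1)[:, :k]
    return nearest.mean(axis=1)


rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(200, 8))  # stand-in for "normal" embeddings
inlier = rng.normal(0.0, 1.0, size=(1, 8))       # drawn from the same distribution
outlier = np.full((1, 8), 10.0)                  # far from the reference cloud
scores = knn_anomaly_scores(reference, np.vstack([inlier, outlier]), k=5)
print(scores)  # the second score (outlier) is much larger than the first
```

A higher score means the query lies farther from the known-normal samples; a threshold on this score would then separate normal from anomalous inputs.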
