ResNet-50
Because the input images are grayscale, we replicate the single channel three times to match the three-channel input expected by the ResNet-50v2 model. This avoids redesigning the backbone.

- Input: images (grayscale replicated to 3 channels)
- Output: 2048-dimensional feature vector (before classification head)
- Architecture: 50 layers with residual connections
- Pretrained on: ImageNet (1000 classes)
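
The channel replication described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's actual preprocessing code: the function name `gray_to_three_channels` is hypothetical, and a channels-last batch layout `(N, H, W)` or `(N, H, W, 1)` is assumed.

```python
import numpy as np


def gray_to_three_channels(images: np.ndarray) -> np.ndarray:
    """Replicate a grayscale batch to three identical channels.

    Assumes channels-last layout: (N, H, W) or (N, H, W, 1) -> (N, H, W, 3).
    """
    if images.ndim == 3:
        # Add the missing channel axis: (N, H, W) -> (N, H, W, 1)
        images = images[..., np.newaxis]
    if images.ndim != 4 or images.shape[-1] != 1:
        raise ValueError(f"expected a single-channel batch, got shape {images.shape}")
    # Copy the single channel three times along the last axis
    return np.repeat(images, 3, axis=-1)
```

Every output channel holds the same values, so no information is added; the only purpose is to make the tensor shape compatible with the pretrained backbone.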

Why ResNet-50?
ResNet-50 was chosen for several reasons:

- Proven effectiveness: Widely used in transfer learning applications
- Appropriate depth: Deep enough to learn hierarchical features without being overly complex
- Efficient inference: Reasonable computational requirements for edge deployment
- Available pretrained weights: Extensive pretraining on ImageNet provides strong general visual features

Feature Extraction
For anomaly detection, we use the ResNet-50 model as a feature extractor:

- Remove the final classification layer
- Extract features from the global average pooling layer
- Obtain a 2048-dimensional embedding for each input image
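
The steps above might look like the following sketch. The text does not name a framework, so TensorFlow/Keras is an assumption here, as are the 224x224 input size and the helper names `global_average_pool` and `build_feature_extractor`. The NumPy function only illustrates what the pooling step computes; the Keras builder shows one way to obtain a headless backbone.

```python
import numpy as np

EMBEDDING_DIM = 2048  # channel count of the last ResNet-50 convolutional stage


def global_average_pool(feature_maps: np.ndarray) -> np.ndarray:
    """Average each channel over the spatial axes: (N, H, W, C) -> (N, C)."""
    return feature_maps.mean(axis=(1, 2))


def build_feature_extractor(input_shape=(224, 224, 3), weights="imagenet"):
    """Return a Keras ResNet50V2 without its classification head.

    include_top=False drops the final dense layer, and pooling="avg"
    appends global average pooling, so the model outputs one
    2048-dimensional embedding per image.
    """
    # Imported lazily (an assumption of this sketch) so the NumPy helper
    # above stays usable when TensorFlow is not installed.
    import tensorflow as tf

    return tf.keras.applications.ResNet50V2(
        include_top=False,
        weights=weights,
        input_shape=input_shape,
        pooling="avg",
    )
```

With TensorFlow installed, `build_feature_extractor().predict(batch)` on a batch of shape `(N, 224, 224, 3)` returns an array of shape `(N, 2048)`. Keras also provides `tf.keras.applications.resnet_v2.preprocess_input` for scaling pixel values the way the pretrained weights expect.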

