Introduction

The purpose of this section is to provide insight into the raw data space and to show, in a quantified yet interpretable way, how similar the images are across machine settings. To do so, we plot histograms of the raw images, taken before any feature-changing transformation is applied. We then use the KL-divergence between image histograms as a metric to highlight the small probabilistic distance between nominal and anomalous densities. We also report the anomaly-detection performance obtained by thresholding the KL-divergence between nominal and anomalous histograms.

Histogram Analysis

Histograms of pixel intensities provide a simple but interpretable representation of image content. For grayscale images, the histogram shows the distribution of pixel values from 0 (black) to 255 (white). By comparing histograms between different machine settings, we can quantify how similar or different the images are at the raw pixel level.
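As a minimal sketch of this step, the following computes a normalized 256-bin histogram for a grayscale image. The random image is a stand-in for illustration; the original data and any preprocessing details are not specified here.

```python
import numpy as np

# Stand-in grayscale image (the real data source is not shown here).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# One bin per pixel intensity, 0 (black) through 255 (white).
counts, _ = np.histogram(image, bins=256, range=(0, 256))

# Normalize so the histogram is a probability distribution over intensities.
hist = counts / counts.sum()

print(hist.shape)  # (256,)
```

Normalizing the counts makes histograms from images of different sizes directly comparable as probability distributions.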

KL-Divergence

The Kullback-Leibler divergence (KL-divergence) is a measure of how one probability distribution differs from another. For two discrete probability distributions P and Q:

$$D_{KL}(P \| Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)}$$

When applied to image histograms:
  • A low KL-divergence indicates that two images have similar pixel distributions
  • A high KL-divergence indicates that the images are significantly different

Results

The analysis shows that the KL-divergence between nominal and anomalous images is relatively small, which explains why simple pixel-level features are insufficient for accurate anomaly detection. This motivates the use of more sophisticated feature extraction methods based on pretrained neural networks as described in the unsupervised learning models section.
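The threshold-based detection described in the introduction can be sketched as follows. The threshold value and the reference histograms are hypothetical placeholders; in practice the threshold would be tuned on validation data.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    # Discrete KL-divergence with epsilon smoothing (assumed) for empty bins.
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def is_anomalous(hist, nominal_hist, threshold):
    """Flag an image as anomalous when its histogram diverges from the
    nominal reference by more than the threshold (a tuned hyperparameter)."""
    return kl_divergence(hist, nominal_hist) > threshold

# Hypothetical reference and test histograms for illustration.
nominal = np.array([0.25, 0.25, 0.25, 0.25])
shifted = np.array([0.7, 0.1, 0.1, 0.1])

print(is_anomalous(nominal, nominal, threshold=0.1))  # False
print(is_anomalous(shifted, nominal, threshold=0.1))  # True
```

When the nominal and anomalous histograms are nearly identical, as the results above indicate, no threshold separates the two classes well, which is precisely why richer learned features are needed.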