The MVTec AD (MVTec Anomaly Detection) dataset is widely used in the anomaly detection community, particularly for benchmarking computer vision algorithms on unsupervised anomaly detection, where the goal is to detect anomalies without prior knowledge of what constitutes a “defect.” The dataset provides labeled data for testing, but it encourages approaches that do not require extensive labeled training data.
Main Characteristics of MVTec AD
Diverse Object Categories
The dataset contains 15 categories: 10 objects and 5 textures. Each category provides defect-free (normal) training images and a test set that contains both normal samples and samples with various defects.
Types of Defects
Defects in the dataset include structural anomalies, such as scratches, dents, or missing parts, as well as textural anomalies, such as discoloration or irregular surface patterns. This mix makes the dataset useful for evaluating models on both texture-based and structure-based anomaly detection tasks. For our evaluation, however, we focus on textural defects. The wood category is the closest match to our seam images because of its linear grain.
Dataset Size and Variations
Each category contains roughly 60 to 390 defect-free training images and a smaller test set that covers a variety of defects. The defects range from subtle imperfections to large, clearly visible anomalies, providing a wide range of difficulty levels for detection.
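To check these sizes for a given category, the images can be counted directly on disk. The following is a minimal sketch, assuming the dataset has been extracted with the folder layout `<category>/train/good` and `<category>/test/<defect_type>` and that images are stored as PNG files. The function names and the root path are illustrative and not part of any official tooling.

```python
from pathlib import Path

# Assumption: images are stored as .png files.
IMAGE_SUFFIXES = {".png"}


def count_images(folder: Path) -> int:
    """Count image files directly inside `folder` (0 if it does not exist)."""
    if not folder.is_dir():
        return 0
    return sum(1 for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def summarize_category(root: Path, category: str) -> dict:
    """Return the training count and per-defect-type test counts for one category."""
    cat_dir = root / category
    test_dir = cat_dir / "test"
    test_counts = {}
    if test_dir.is_dir():
        # Each subfolder of test/ is one defect type; "good" holds normal images.
        for sub in sorted(test_dir.iterdir()):
            if sub.is_dir():
                test_counts[sub.name] = count_images(sub)
    return {
        "train_good": count_images(cat_dir / "train" / "good"),
        "test": test_counts,
    }
```

For example, `summarize_category(Path("mvtec_ad"), "wood")` would report the wood category, where `mvtec_ad` is a hypothetical path to the extracted dataset.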
The MVTec AD dataset has been used to benchmark several state-of-the-art methods. Methods based on self-supervised learning, in particular, have shown promising results on this dataset.
Class Imbalance
Like many real-world anomaly detection problems, the dataset reflects an inherent class imbalance, with many more normal (non-defective) samples than defective ones. This mirrors real-world industrial settings, where defects are relatively rare but critical to detect.
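The degree of imbalance can be summarized with two simple numbers: the fraction of defective samples and the number of normal samples per defective one. The sketch below is illustrative only; the function name is hypothetical and the counts are made up for the example, not taken from the dataset.

```python
def imbalance_summary(n_normal: int, n_defective: int) -> dict:
    """Summarize class imbalance from normal and defective sample counts."""
    total = n_normal + n_defective
    if total == 0:
        raise ValueError("need at least one sample")
    return {
        # Share of all samples that are defective.
        "defect_fraction": n_defective / total,
        # How many normal samples exist for each defective one.
        "normal_per_defect": n_normal / n_defective if n_defective else float("inf"),
    }


# Hypothetical counts for illustration only, not taken from the dataset.
summary = imbalance_summary(n_normal=300, n_defective=60)
print(summary)
```

Plugging in real per-category counts would show how strongly a given category is skewed toward normal samples.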
Challenges
It is worth noting that the dataset is challenging for several reasons:
- High Intra-Class Variability: The normal images have a high degree of variability within each class, making it challenging for models to learn a robust representation of “normality.”
- Subtle Anomalies: Some defects are extremely subtle, making detection difficult for unsupervised models that rely on distinguishing between normal and abnormal instances.
- Multiple Types of Anomalies: The variety of both structural and textural defects requires models that can adapt to different anomaly characteristics.