Introduction

The training pipeline is built around the PyTorch framework and includes the elements shown below.

Figure: Training / Serving Overview

Although the models covered in the representation learning sections each include their own training, this section outlines what generally needs to be kept in mind when training anomaly detection (AD) models across the different approaches.

Model Repository

The trained model is stored in an S3 bucket under a unique ID tied to the experiment ID of the training job. The experiment ID in turn records the exact version of the model-development code via its git commit, so every stored model can be traced back to the code that produced it.
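As a minimal sketch of how such a traceable object key could be constructed (the key layout, function name, and `models` prefix are illustrative assumptions, not the actual repository scheme):

```python
from datetime import datetime, timezone

def model_artifact_key(experiment_id: str, git_commit: str,
                       bucket_prefix: str = "models") -> str:
    """Build a unique S3 object key for a trained model.

    The key embeds the experiment ID and the git commit of the training
    code, so any stored artifact can be traced back to the exact code
    version that produced it. Layout here is a hypothetical example.
    """
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{bucket_prefix}/{experiment_id}/{git_commit[:8]}/{timestamp}/model.pt"

key = model_artifact_key("exp-042", "a1b2c3d4e5f6")
```

Including a timestamp component keeps re-runs of the same experiment from overwriting each other's artifacts.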

Training Considerations

Data Pipeline Integration

The training pipeline consumes data from the data pipeline in the form of:
  • Parquet files for efficient batch loading
  • Transformed images with appropriate augmentations
  • Train/validation splits for model evaluation
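One common way to produce stable train/validation splits in a data pipeline is to assign each record deterministically by hashing its ID rather than sampling randomly; the function below is a sketch of that technique, not the pipeline's actual implementation:

```python
import hashlib

def split_record(record_id: str, val_fraction: float = 0.2) -> str:
    """Assign a record to 'train' or 'val' deterministically.

    Hashing the record ID (instead of random sampling) keeps the split
    stable across pipeline re-runs, so validation samples cannot drift
    into the training set when the data is re-processed.
    """
    bucket = int(hashlib.sha256(record_id.encode()).hexdigest(), 16) % 100
    return "val" if bucket < int(val_fraction * 100) else "train"
```

Because the assignment depends only on the record ID, adding new data later leaves all existing assignments unchanged.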

Experiment Tracking

All training runs are tracked using ClearML, which captures:
  • Hyperparameters and configuration
  • Training metrics (loss, accuracy)
  • Model artifacts and checkpoints
  • Code version and git commit
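The information ClearML captures per run can be thought of as a structured record; the plain-Python dataclass below sketches those fields (the class and method names are illustrative, not the ClearML API):

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class RunRecord:
    """Illustrative per-run record mirroring the tracked fields:
    hyperparameters, metrics, artifacts, and code version."""
    experiment_id: str
    git_commit: str
    hyperparams: dict
    metrics: dict = field(default_factory=dict)
    checkpoints: list = field(default_factory=list)

    def log_metric(self, name: str, value: float) -> None:
        # Append a scalar to the named metric series (e.g. per-epoch loss)
        self.metrics.setdefault(name, []).append(value)

    def to_json(self) -> str:
        # Serialize the full record for storage or inspection
        return json.dumps(asdict(self), sort_keys=True)

run = RunRecord("exp-042", "a1b2c3d4", {"lr": 1e-3, "batch_size": 32})
run.log_metric("loss", 0.91)
run.log_metric("loss", 0.47)
```

In practice ClearML records these fields automatically when a training script initializes a task, so no manual bookkeeping like this is needed.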

Hardware Requirements

Training anomaly detection models requires:
  • GPU with sufficient VRAM for batch processing
  • Fast storage for data loading
  • Experiment tracking infrastructure
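A rough lower bound on the VRAM needed just to hold a batch of input images can be estimated from the tensor shape; the helper below is a back-of-the-envelope sketch (the default 224×224×3 float32 shape is an assumption), and actual VRAM usage is far higher once activations, gradients, and optimizer state are included:

```python
def batch_input_bytes(batch_size: int, channels: int = 3,
                      height: int = 224, width: int = 224,
                      dtype_bytes: int = 4) -> int:
    """Bytes needed to hold one batch of input images in memory.

    This counts only the input tensor (batch * C * H * W * bytes-per-
    element for float32); activations, gradients, and optimizer state
    typically dominate real VRAM usage.
    """
    return batch_size * channels * height * width * dtype_bytes
```

For example, a batch of 32 float32 RGB images at 224×224 occupies roughly 18 MiB before any model state is accounted for.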
