Systems Architecture
Torch-Diode Systems Architecture
This diagram shows how the torch-diode system components interact with PyTorch Inductor to provide ML-based kernel selection.
Key Architecture Components
1. Auto-registration System (Main Entry Point)
diode.__init__.py orchestrates the entire integration process
Discovers available integrations automatically
Tests PyTorch interfaces with dummy functions
Loads actual models only when interfaces are available
Enables relevant PyTorch configurations
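The probe-then-load order described above can be sketched in plain Python. This is a minimal illustration, not torch-diode's code: SketchIntegration, its fields, and auto_register are hypothetical names invented for the example, and the interface probe is reduced to a callable that returns True or False.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SketchIntegration:
    """Hypothetical stand-in for an integration; not torch-diode's real class."""

    name: str
    # Cheap probe of the host interface (the "dummy function" test).
    interface_available: Callable[[], bool]
    log: List[str] = field(default_factory=list)

    def load_models(self) -> None:
        self.log.append("models loaded")

    def enable_configs(self) -> None:
        self.log.append("configs enabled")


def auto_register(integrations: List[SketchIntegration]) -> List[str]:
    """Return the names of the integrations that registered successfully."""
    registered = []
    for integration in integrations:
        # Step 1: test the interface before touching any model files.
        if not integration.interface_available():
            continue
        # Step 2: load the actual models only after the probe succeeds.
        integration.load_models()
        # Step 3: enable the relevant configuration switches.
        integration.enable_configs()
        registered.append(integration.name)
    return registered
```

Probing first means that a PyTorch build without the expected hooks costs only a failed check, with no model files read from disk.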
2. Integration Layer (PyTorch Interface)
BaseIntegration: Abstract framework for all integrations
MatmulIntegration: Specific implementation for matrix multiplication
DiodeInductorChoices: Extends PyTorch’s InductorChoices to inject ML predictions
Seamlessly overrides the _finalize_mm_configs method
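The override pattern can be illustrated without importing PyTorch. In the sketch below, StandInChoices is a stand-in for the host's choice handler, and the single-argument signature of _finalize_mm_configs is an assumption made for brevity; the real method in PyTorch may take different arguments. SketchDiodeChoices and its predict and top_k parameters are likewise hypothetical.

```python
from typing import Callable, List, Optional


class StandInChoices:
    """Stand-in for the host's choice handler; the real signature may differ."""

    def _finalize_mm_configs(self, configs: List[dict]) -> List[dict]:
        # Default behavior: keep every candidate configuration.
        return list(configs)


class SketchDiodeChoices(StandInChoices):
    """Hypothetical subclass that narrows the candidates with a predictor."""

    def __init__(
        self,
        predict: Optional[Callable[[dict], float]] = None,
        top_k: int = 2,
    ) -> None:
        self.predict = predict  # maps a config to a predicted runtime
        self.top_k = top_k

    def _finalize_mm_configs(self, configs: List[dict]) -> List[dict]:
        configs = super()._finalize_mm_configs(configs)
        if self.predict is None:
            # No model available: fall back to the default behavior.
            return configs
        # Lower predicted runtime is better, so sort ascending and truncate.
        return sorted(configs, key=self.predict)[: self.top_k]
```

Falling back to the parent's result when no predictor is present keeps compilation working even if no trained model could be loaded.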
3. Model Management System
ModelPointer: Metadata and path management for trained models
ModelWrapper: Loads, compiles, and runs inference on trained models
ModelRegistry: Centralized registry of all available models
Supports multiple model formats and configurations
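One way to picture the split between pointer and registry is a small metadata record plus a lookup table. The field names (model_name, relative_path, purpose) and the methods below are assumptions chosen for illustration, not the attributes of the real ModelPointer or ModelRegistry.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SketchModelPointer:
    """Hypothetical metadata record describing one trained model."""

    model_name: str
    relative_path: str
    purpose: str  # for example "matmul"

    def full_path(self, root: Path) -> Path:
        # Resolve the stored relative path against a models directory.
        return root / self.relative_path


class SketchModelRegistry:
    """Hypothetical registry keyed by model name."""

    def __init__(self) -> None:
        self._models: Dict[str, SketchModelPointer] = {}

    def register(self, pointer: SketchModelPointer) -> None:
        self._models[pointer.model_name] = pointer

    def get(self, name: str) -> Optional[SketchModelPointer]:
        return self._models.get(name)

    def by_purpose(self, purpose: str) -> List[SketchModelPointer]:
        return [p for p in self._models.values() if p.purpose == purpose]
```

Keeping the pointer free of any loaded weights means the registry can be built and queried without reading a single model file.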
4. Data Collection Pipeline
MatmulDatasetCollector: Hooks into PyTorch’s feedback system
Collects timing data during autotune processes
Extracts features from problem shapes and kernel configurations
Stores structured datasets for model training
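The collection steps above amount to a callback that turns each batch of autotune timings into flat records. The sketch below assumes a hypothetical on_autotune_feedback callback and a hand-picked feature set (M, N, K and a FLOP count); the actual hook and the features extracted by MatmulDatasetCollector are not specified here.

```python
import json
from typing import Dict, List, Tuple


class SketchCollector:
    """Hypothetical collector that stores one record per timed configuration."""

    def __init__(self) -> None:
        self.records: List[dict] = []

    def on_autotune_feedback(
        self, shape: Tuple[int, int, int], timings: Dict[str, float]
    ) -> None:
        """Record the timings for one (M, N, K) problem, keyed by config name."""
        m, n, k = shape
        for config_name, runtime_ms in timings.items():
            self.records.append(
                {
                    "m": m,
                    "n": n,
                    "k": k,
                    # A matmul performs one multiply and one add per (m, n, k) triple.
                    "flops": 2 * m * n * k,
                    "config": config_name,
                    "runtime_ms": runtime_ms,
                }
            )

    def to_json(self) -> str:
        # Serialize the dataset so a later training step can read it back.
        return json.dumps(self.records, sort_keys=True)
```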
5. Type System
MMShape: Represents matrix multiplication problem characteristics
TritonGEMMConfig: Represents kernel configuration parameters
Strong typing ensures data consistency across the system
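Frozen dataclasses with validation are one common way to get this kind of consistency. The classes below are illustrative only: the field names (M, N, K, block_m, block_n, block_k, num_stages, num_warps) and the default values are assumptions, and the real MMShape and TritonGEMMConfig may carry different or additional fields.

```python
from dataclasses import dataclass


def _require_positive_ints(obj, names) -> None:
    """Raise ValueError if any named attribute is not a positive int."""
    for name in names:
        value = getattr(obj, name)
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive int, got {value!r}")


@dataclass(frozen=True)
class SketchMMShape:
    """Hypothetical problem shape for an (M x K) @ (K x N) product."""

    M: int
    N: int
    K: int

    def __post_init__(self) -> None:
        _require_positive_ints(self, ("M", "N", "K"))


@dataclass(frozen=True)
class SketchGEMMConfig:
    """Hypothetical kernel configuration; field names are assumptions."""

    block_m: int
    block_n: int
    block_k: int
    num_stages: int = 2
    num_warps: int = 4

    def __post_init__(self) -> None:
        _require_positive_ints(
            self, ("block_m", "block_n", "block_k", "num_stages", "num_warps")
        )
```

Because the instances are frozen they are also hashable, so a (shape, config) pair can serve directly as a dictionary key for collected timings.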
6. Inductor Integration Flow
User imports diode → auto-registration begins
System tests PyTorch interfaces with dummy functions
Loads trained models from disk if interfaces are available
Registers DiodeInductorChoices as the active choice handler
During torch.compile, the system intercepts kernel selection
Extracts features from the problem and available configurations
Runs ML model inference to predict kernel performance
Returns top-k configurations based on predictions
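The last three steps of this flow (extract features, predict, keep the top k) can be sketched as a single function. Everything here is a simplified assumption: the feature vector, the dictionary-based shape and config representations, and the predict_runtime callable stand in for whatever the real system uses.

```python
import heapq
from typing import Callable, Dict, List, Sequence


def extract_features(shape: Dict[str, int], config: Dict[str, int]) -> List[float]:
    """Concatenate problem and config values into one feature vector (assumed layout)."""
    return [
        float(shape["M"]),
        float(shape["N"]),
        float(shape["K"]),
        float(config["block_m"]),
        float(config["block_n"]),
    ]


def select_top_k(
    shape: Dict[str, int],
    configs: Sequence[Dict[str, int]],
    predict_runtime: Callable[[List[float]], float],
    k: int,
) -> List[Dict[str, int]]:
    """Return the k configs with the lowest predicted runtime, best first."""
    # Pair each score with its index so ties never compare the dicts themselves.
    scored = [
        (predict_runtime(extract_features(shape, config)), index)
        for index, config in enumerate(configs)
    ]
    return [configs[index] for _, index in heapq.nsmallest(k, scored)]
```

Returning several candidates instead of a single winner leaves room for the remaining configurations to be benchmarked, so a mispredicted ranking costs less.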
7. Data Flow
Training Time: Collector gathers timing data → Workflows train models → Models stored to disk
Inference Time: Models loaded → Features extracted → Predictions made → Best kernels selected
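The two phases can be connected end to end in a toy round trip. The "model" below is deliberately not a neural network: a per-config average runtime, stored as JSON, stands in for the trained model so that the train, store, load, and select steps stay visible. All function names are hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Optional


def train(records: List[dict]) -> Dict[str, float]:
    """Training time: reduce timing records to a mean runtime per config."""
    totals = defaultdict(lambda: [0.0, 0])
    for record in records:
        totals[record["config"]][0] += record["runtime_ms"]
        totals[record["config"]][1] += 1
    return {config: total / count for config, (total, count) in totals.items()}


def save_model(model: Dict[str, float], path: Path) -> None:
    """Training time: store the model on disk."""
    path.write_text(json.dumps(model))


def load_model(path: Path) -> Dict[str, float]:
    """Inference time: load the stored model."""
    return json.loads(path.read_text())


def best_config(model: Dict[str, float], candidates: List[str]) -> Optional[str]:
    """Inference time: pick the candidate with the lowest predicted runtime."""
    known = [c for c in candidates if c in model]
    # default=None covers the case where no candidate was seen during training.
    return min(known, key=model.__getitem__, default=None)
```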
This architecture allows torch-diode to seamlessly integrate with PyTorch’s compilation pipeline while maintaining modularity and extensibility for future optimization targets beyond matrix multiplication.