5.0: ML-Enabled ADAS Systems
Key Terms: Machine Learning Primer
Before diving into the case study, here are essential ML/ADAS terms you'll encounter. If you're new to machine learning, read this section first.
Machine Learning Fundamentals
| Term | Definition | Why It Matters |
|---|---|---|
| ML (Machine Learning) | Software that learns patterns from data rather than following explicit rules | ADAS perception systems use ML to recognize lanes, objects, signs |
| CNN (Convolutional Neural Network) | A type of neural network that excels at processing images | The standard architecture for vision-based ADAS (cameras) |
| Neural Network | A computational model inspired by brain neurons—layers of mathematical operations | The foundation of modern AI; transforms inputs to predictions |
| Training | The process of teaching a model by showing it labeled examples | Your lane detection model learns from 250K+ labeled images |
| Inference | Running a trained model on new data to make predictions | What happens in the car: camera image → lane detection |
| Model | The trained neural network—a file containing learned parameters (weights) | The artifact you deploy to the ECU |
ADAS-Specific Terms
| Term | Definition | Why It Matters |
|---|---|---|
| ADAS | Advanced Driver Assistance Systems | The category of features like lane keeping, ACC, AEB |
| Perception | The system's ability to "see" and understand its environment | What ML models do: detect lanes, pedestrians, vehicles |
| ODD (Operational Design Domain) | The conditions under which the system is designed to operate | Defines where your ML model must work (highway, daylight, etc.) |
| SOTIF (Safety Of The Intended Functionality) | ISO 21448—addresses hazards from functional insufficiencies, including ML limitations | The safety standard specifically for ML/ADAS perception |
| Edge Case | A rare but important scenario the system might encounter | ML models struggle with edge cases (construction zones, unusual markings) |
ML Performance Metrics
| Term | Definition | Example |
|---|---|---|
| IoU (Intersection over Union) | Measures how well predicted area overlaps with ground truth | 95% IoU means lane prediction closely matches actual lane |
| Latency | Time from input to output | 25ms latency = 40 predictions per second |
| Accuracy | How often the model is correct | 95% accuracy = wrong 5% of the time |
| False Positive | Model detects something that isn't there | Detecting a lane line on a shadow |
| False Negative | Model misses something that is there | Failing to detect a faded lane line |
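To make these metrics concrete, here is a minimal pure-Python sketch (illustrative helper names, not project code) that computes IoU, precision, and recall from a predicted and a ground-truth binary lane mask:

```python
# Sketch: IoU, precision, and recall for a binary lane mask.
# Masks are flat lists of 0/1 pixels; names are illustrative.

def mask_metrics(pred, truth):
    """Return (iou, precision, recall) for two equal-length 0/1 pixel lists."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)  # false negatives
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return iou, precision, recall

pred  = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 1, 1, 0]
print(mask_metrics(pred, truth))  # (0.6, 0.75, 0.75)
```

Note how IoU is always the strictest of the three: it penalizes both false positives and false negatives in a single number, which is why segmentation targets are usually stated as IoU.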
Model Architecture Terms
| Term | Definition |
|---|---|
| EfficientNet | A family of CNN architectures optimized for efficiency (speed vs accuracy trade-off) |
| DeepLabV3 | A neural network architecture specifically designed for image segmentation (pixel-by-pixel classification) |
| Backbone | The main feature extraction part of a neural network |
| Segmentation | Classifying each pixel in an image (e.g., this pixel = lane, that pixel = road) |
For Software Engineers New to ML: Think of a trained neural network as a highly complex function: it takes an input (image) and produces an output (lane positions). Unlike traditional code where you write the logic, ML "learns" the logic from data. The challenge for ASPICE/ISO 26262 is that you can't review the logic line-by-line—you must validate through extensive testing.
Case Study: Camera-Based Lane Keeping Assist (LKA)
Project Overview
- Project: Lane Keeping Assist (LKA) with ML-Based Lane Detection
- Customer: AutoDrive GmbH (Tier-1 Automotive Supplier)
- Safety Standard: ISO 26262 ASIL-B + ISO 21448 (SOTIF)
- Machine Learning Model: Convolutional Neural Network (CNN) for lane line segmentation
- Target Platform: NVIDIA Jetson AGX Orin (254 TOPS AI performance)
- Duration: 24 months (including ML dataset collection, training, validation)
- Budget: €3.5M
- Team Size: 15 FTE
System Description
Lane Keeping Assist (LKA) Function
Purpose: Automatically steer vehicle to keep it centered in the lane
Operational Design Domain (ODD):
- Highway and rural roads (lane markings present)
- Speed: 60-130 km/h
- Weather: Dry, light rain (not heavy rain, snow, fog)
- Lighting: Daytime, dusk (not night with poor visibility)
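The ODD above can be expressed as a simple availability gate. A minimal sketch, with thresholds taken directly from the ODD bullets (the function name and condition labels are illustrative, not project code):

```python
# Illustrative ODD gate for LKA availability, mirroring the ODD definition above.

def within_odd(speed_kmh, weather, lighting, lanes_marked):
    """Return True if current conditions fall inside the LKA ODD."""
    if not (60 <= speed_kmh <= 130):          # highway/rural speed band
        return False
    if weather not in ("dry", "light_rain"):  # no heavy rain, snow, fog
        return False
    if lighting not in ("day", "dusk"):       # no low-visibility night driving
        return False
    return lanes_marked                       # lane markings must be present

print(within_odd(100, "dry", "day", True))         # inside ODD
print(within_odd(100, "heavy_rain", "day", True))  # outside ODD
```

In the real system this gate would sit in the deterministic control layer, since ODD enforcement is itself a safety mechanism.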
System Architecture: The following diagram shows the LKA system architecture with its ML-based perception pipeline (CNN lane detection) separated from the deterministic control layer (PID steering), illustrating the safety decomposition boundary.
Key Characteristic: ML-based perception (CNN for lane detection) + Traditional control (PID steering)
Safety Decomposition Strategy: The ML model (CNN) handles perception (rated QM, i.e., no ASIL), while traditional deterministic code handles control (rated ASIL-B). This decomposition allows safety certification of the deterministic control layer using established ISO 26262 methods, while the ML perception is validated through ISO 21448 (SOTIF) scenario-based testing.
Machine Learning Context
ML Model: Lane Detection CNN
Architecture: EfficientNet-Lite4 backbone + DeepLabV3 segmentation head
Training Data:
- Dataset Size: 250,000 images (labeled lane lines)
- Sources: Public datasets (TuSimple, CULane) + proprietary data (50,000 images)
- Annotation: Pixel-wise lane line masks (manual labeling, 20,000 hours)
- Augmentation: Brightness/contrast variation, rain/fog simulation (synthetic)
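Brightness/contrast augmentation amounts to applying a random gain and offset per image. A minimal pure-Python sketch (the gain/bias ranges are illustrative assumptions; a production pipeline would use tensor-based transforms from torchvision or Albumentations):

```python
import random

# Sketch of a brightness/contrast augmentation step on 0-255 pixel values.
# Parameter ranges are illustrative, not the project's actual configuration.

def augment_brightness_contrast(pixels, rng):
    gain = rng.uniform(0.8, 1.2)   # contrast factor
    bias = rng.uniform(-20, 20)    # brightness offset
    # Apply gain/bias, then clamp back into the valid 0-255 range
    return [min(255, max(0, round(p * gain + bias))) for p in pixels]

rng = random.Random(42)  # seeded so the augmentation stream is reproducible
out = augment_brightness_contrast([0, 64, 128, 255], rng)
print(all(0 <= p <= 255 for p in out))  # True: values stay in valid range
```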
Training Infrastructure:
- GPU Cluster: 8x NVIDIA A100 GPUs (640 GB total VRAM)
- Framework: PyTorch 2.0 + TorchVision
- Training Time: 120 hours (200 epochs, batch size 32)
- Data Split: 80% training, 10% validation, 10% test
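The 80/10/10 split can be made deterministic with a seeded shuffle. A minimal sketch (illustrative; the project versions its actual splits with DVC):

```python
import random

# Sketch: deterministic 80/10/10 train/validation/test split of image IDs.

def split_dataset(ids, seed=0):
    ids = list(ids)
    random.Random(seed).shuffle(ids)   # same seed -> same split every time
    n = len(ids)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return (ids[:n_train],
            ids[n_train:n_train + n_val],
            ids[n_train + n_val:])

train, val, test_ids = split_dataset(range(1000))
print(len(train), len(val), len(test_ids))  # 800 100 100
```

Fixing the split (not just the ratios) matters for safety assessment: the test set must stay untouched across all 150 training runs, or the reported IoU is optimistically biased.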
Performance Metrics:
- IoU (Intersection over Union): 95.2% (test set)
- Precision: 96.8% (lane pixels correctly classified)
- Recall: 93.7% (actual lane pixels detected)
- False Positive Rate: 2.1% (non-lane pixels misclassified as lane)
Model Size:
- Parameters: 12.3 million
- Quantized (INT8): ≈12 MB weights (down from ≈49 MB at FP32; deployed on Jetson Orin)
- Inference Latency: 25ms (NVIDIA TensorRT optimized)
Reproducibility Requirement: For safety assessment, ML training must be reproducible. This requires: (1) Fixed random seeds, (2) Dataset versioning (DVC), (3) Experiment tracking (MLflow), (4) Hardware specification (GPU type affects results). TÜV assessors may request reproduction of training results as part of certification.
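The core of the fixed-seed requirement is that a seeded stochastic run is exactly repeatable. A toy sketch (a real PyTorch run would also call torch.manual_seed and numpy.random.seed, and pin dataset/code versions via DVC and Git):

```python
import random

# Toy stand-in for a stochastic training run: with a fixed seed,
# two runs produce bit-identical "results".

def simulated_training_run(seed):
    rng = random.Random(seed)
    # Stand-in for the stochastic steps of training
    # (weight initialization, data shuffling, dropout).
    return [round(rng.gauss(0, 1), 6) for _ in range(3)]

run_a = simulated_training_run(1234)
run_b = simulated_training_run(1234)
print(run_a == run_b)  # True: identical seed -> identical results
```

In practice GPU nondeterminism (e.g., atomics in CUDA kernels) means "reproducible" is often demonstrated as statistically equivalent metrics rather than bit-identical weights, which is why the hardware specification is part of the evidence package.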
ASPICE Integration for ML Systems
Challenge: ML in Safety-Critical Context
Problem: ASPICE (SWE.1-6) designed for deterministic software, not ML models
ML Characteristics:
- [FAIL] Non-deterministic: Training is stochastic, so retraining on the same data can yield a different model (inference with a fixed model is repeatable, but the artifact itself is not)
- [FAIL] No explicit requirements: Model learns from data, not coded logic
- [FAIL] Black box: CNN has 12M parameters, hard to verify correctness
ASPICE Gaps for ML:
| ASPICE Process | Traditional Software | ML Model | Gap |
|---|---|---|---|
| SWE.1 Requirements | Explicit requirements (500 textual specs) | Implicit (learned from 250k images) | [WARN] How to specify "detect lanes"? |
| SWE.3 Detailed Design | Functions, algorithms (C code) | Neural network architecture (12M params) | [WARN] How to review CNN design? |
| SWE.4 Unit Testing | Test individual functions | Test individual layers? (impractical) | [WARN] What is a "unit" in ML? |
| SWE.6 Qualification | 100% requirements coverage | 95% IoU accuracy (never 100%) | [WARN] How much is "good enough"? |
Solution: ASPICE + MLE Process Extension
MLE (Machine Learning Engineering): Emerging discipline for ML lifecycle management
Standards:
- ISO/PAS 8800 (Draft, 2024): AI/ML in road vehicles
- SOTIF ISO 21448: Safety of the Intended Functionality (covers ML limitations)
- IEEE P2851: AI Assurance (currently in development)
Augmented ASPICE for ML (hybrid approach): The following diagram shows how traditional ASPICE SWE processes are extended with MLE (Machine Learning Engineering) processes to handle ML model training, validation, and deployment alongside conventional software development.
Key Principle: Partition ML from safety-critical logic
- [PASS] ML Model (ASIL QM): Lane detection CNN (not safety-rated directly)
- [PASS] Safety Monitor (ASIL-B): Validates ML output, falls back to safe state if confidence low
- [PASS] Control Logic (ASIL-B): Traditional C code (ASPICE-compliant, deterministic)
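One way to picture the ASIL-B safety monitor: it accepts the QM-rated CNN output only when confidence and plausibility checks pass, and otherwise selects a safe fallback. The thresholds and names below are illustrative assumptions, not the project's actual code:

```python
# Sketch of an ASIL-B safety monitor wrapping the QM-rated CNN output.
# Thresholds are illustrative assumptions.

CONFIDENCE_MIN = 0.7    # below this, the ML output is not trusted
LATERAL_JUMP_MAX = 0.5  # metres; implausible frame-to-frame lane jump

def monitor(confidence, lane_offset_m, prev_offset_m):
    """Return the steering command source: 'ml', or a safe fallback."""
    if confidence < CONFIDENCE_MIN:
        return "fallback_driver_alert"   # degrade: hand back to the driver
    if abs(lane_offset_m - prev_offset_m) > LATERAL_JUMP_MAX:
        return "fallback_hold_last"      # reject implausible ML output
    return "ml"                          # plausible: use the CNN lane estimate

print(monitor(0.95, 0.10, 0.08))  # ml
print(monitor(0.30, 0.10, 0.08))  # fallback_driver_alert
print(monitor(0.95, 1.00, 0.08))  # fallback_hold_last
```

The key design point is that the monitor itself contains no ML: it is small, deterministic, and reviewable, so it can carry the ASIL-B rating that the CNN cannot.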
Technology Stack
ML Development Tools
ML Training Environment:
├── GPU Cluster: 8x NVIDIA A100 (80 GB each)
├── Framework: PyTorch 2.0.1 (Python 3.10)
├── Experiment Tracking: MLflow (track 150 training runs)
├── Dataset Versioning: DVC (Data Version Control, Git-like for datasets)
├── Annotation Tool: CVAT (Computer Vision Annotation Tool, web-based)
└── Hyperparameter Tuning: Optuna (Bayesian optimization)
ML Deployment Environment:
├── Target ECU: NVIDIA Jetson AGX Orin (254 TOPS, 64 GB RAM)
├── OS: NVIDIA JetPack 5.1 (Linux-based, real-time kernel)
├── Inference Engine: TensorRT 8.6 (optimizes PyTorch model, 3× speedup)
├── Model Format: ONNX → TensorRT (quantized INT8)
├── Camera Driver: V4L2 (Video4Linux2, standard Linux camera API)
└── CAN Communication: SocketCAN (ISO 11898, 500 kbps)
Traditional Software Tools (Control Layer):
├── Compiler: NVIDIA CUDA C++ Compiler (nvcc)
├── Static Analyzer: Coverity (MISRA C++:2008 compliance)
├── Unit Testing: Google Test + Google Mock
├── Requirements: Jama Connect (traceability)
└── CI/CD: GitLab CI (automated builds, tests, model deployment)
Project Metrics
Timeline (24 Months)
| Phase | Duration | Activities | Deliverables |
|---|---|---|---|
| Requirements | Month 1-3 | ODD definition, accuracy targets, HARA | MLE.1 ML Requirements Spec |
| Dataset Collection | Month 4-9 | Drive 500,000 km, collect 250k images | Annotated dataset (DVC v1.0) |
| Model Development | Month 10-15 | Train 150 CNN variants, select best | Trained model (95.2% IoU) |
| Model Verification | Month 16-18 | Test 10,000 edge cases, SOTIF analysis | MLE.4 Verification Report |
| Integration | Month 19-21 | Integrate ML + control, HIL testing | SWE.5 Integration Test Report |
| Validation | Month 22-23 | Proving ground (5,000 km), SOTIF validation | SWE.6 Validation Report |
| Certification | Month 24 | TÜV SÜD assessment (ISO 26262 + SOTIF) | ASIL-B certificate |
Team Structure
| Role | FTE | Responsibility |
|---|---|---|
| Project Manager | 1.0 | ASPICE + MLE compliance, TÜV coordination |
| ML Engineer | 4.0 | CNN training, hyperparameter tuning, PyTorch |
| Data Engineer | 2.0 | Dataset curation, annotation quality, DVC |
| Perception SW Engineer | 2.0 | Post-processing (curve fitting), integration |
| Control Engineer | 2.0 | LKA steering control (PID), safety monitor |
| Test Engineer | 2.0 | HIL testing, proving ground validation |
| Safety Engineer | 1.0 | ISO 26262 HARA, SOTIF analysis, TÜV liaison |
| Systems Engineer | 1.0 | Hardware selection (Jetson Orin), ODD definition |
Total: 15 FTE
Budget Breakdown
| Category | Cost | Notes |
|---|---|---|
| Engineering Labor | €2,160,000 | 15 FTE × 24 months × €6,000/FTE/month |
| GPU Cluster (A100) | €400,000 | 8x NVIDIA A100 GPUs (purchase) |
| Dataset Annotation | €300,000 | 20,000 hours × €15/hour (manual labeling) |
| Jetson Orin (Prototypes) | €80,000 | 32x development kits @ €2,500 each |
| Test Vehicles | €200,000 | 3x instrumented vehicles (camera, data logging) |
| HIL Bench | €150,000 | dSPACE SCALEXIO (camera simulator, CAN) |
| TÜV Certification | €120,000 | ISO 26262 + SOTIF assessment |
| Contingency | €90,000 | 2.6% buffer |
Total: €3,500,000
Key Challenges
1. ML Model Explainability (ASIL-B Requirement)
Problem: ISO 26262 requires safety argumentation, but CNN is "black box"
Challenge: How to argue that 12M-parameter CNN is safe?
Solution: See 28.2 (SOTIF approach: define ODD, test edge cases)
2. Dataset Bias and Corner Cases
Problem: Real-world lane markings vary: faded, occluded, non-standard
Example:
- Training data: 90% well-marked highways (USA/Europe)
- Deployment: 20% poorly marked rural roads (India, Brazil)
- Result: Model accuracy drops from 95% to 78% (distribution shift: deployment conditions underrepresented in training data)
Solution: See 28.3 (data augmentation, synthetic corner case generation)
3. SOTIF Scenarios (ISO 21448)
Problem: ML can fail even when functioning as designed (performance limitation)
Example:
- Scenario: Heavy rain, lane lines barely visible
- Model: Trained on light rain, outputs low-confidence segmentation (0.3)
- Safety question: Should LKA engage? Disengage? Alert driver?
Solution: See 28.2 (SOTIF analysis, safe degradation strategy)
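One possible safe-degradation policy for this scenario (all thresholds and frame counts are illustrative assumptions, not the project's actual strategy): sustained low segmentation confidence first alerts the driver, then disengages, rather than cutting out abruptly on a single noisy frame:

```python
# Sketch of a staged safe-degradation policy driven by recent
# segmentation confidence. Thresholds are illustrative assumptions.

def degradation_state(confidences, low=0.5, alert_after=5, disengage_after=15):
    """Map a recent per-frame confidence history to an LKA state."""
    low_streak = 0
    for c in reversed(confidences):   # count trailing low-confidence frames
        if c < low:
            low_streak += 1
        else:
            break
    if low_streak >= disengage_after:
        return "disengaged"           # driver has been alerted; hand over
    if low_streak >= alert_after:
        return "alert_driver"         # warn before dropping assistance
    return "engaged"

print(degradation_state([0.9] * 20))              # engaged
print(degradation_state([0.9] * 10 + [0.3] * 6))  # alert_driver
print(degradation_state([0.3] * 20))              # disengaged
```

Debouncing over a streak of frames avoids both hazards named above: engaging on unreliable perception, and startling the driver with a single-frame dropout.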
Success Criteria
Project Outcome: [PASS] Successful (TÜV SÜD ASIL-B + SOTIF certification)
- [PASS] On Time: 24 months (as planned)
- [PASS] On Budget: €3,480,000 (€20k under budget)
- [PASS] ASIL-B Certified: ISO 26262 compliance (TÜV SÜD)
- [PASS] SOTIF Validated: ISO 21448 (10,000 scenarios tested)
- [PASS] Model Accuracy: 95.2% IoU (exceeds 92% target)
- [PASS] Latency: 25ms (within 30ms requirement)
- [PASS] Field Performance: 98.7% lane keeping success rate (5,000 km proving ground)
Chapter Structure
This chapter explores ML-enabled ADAS development in the following sections:
- Chapter 28.1: MLE Process Application: ML requirements, dataset management, model verification
- Chapter 28.2: SOTIF Considerations: ISO 21448, ODD, corner case testing, safe degradation
- Chapter 28.3: Perception Pipeline: CNN training, deployment, lessons learned
Message: Machine learning in safety-critical ADAS is cutting-edge and challenging. ASPICE provides foundation for traditional control code, but ML perception requires new MLE processes (data management, model verification). ISO 21448 (SOTIF) is essential for ML safety argumentation. AI tools help train models, but safety validation is rigorous (10,000 scenarios).