5.2: SOTIF Considerations
ISO 21448 (SOTIF) Overview
Safety of the Intended Functionality
ISO 21448: Road vehicles — Safety of the Intended Functionality (SOTIF)
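Publication: 2022 (full International Standard; supersedes the Publicly Available Specification ISO/PAS 21448:2019)
Scope: Addresses hazards caused by functional insufficiencies (specification insufficiencies and performance limitations) and by reasonably foreseeable misuse, not by random hardware faults or systematic E/E faults (those are covered by ISO 26262)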
Difference from ISO 26262:
| Aspect | ISO 26262 (Functional Safety) | ISO 21448 (SOTIF) |
|---|---|---|
| Hazard Source | Random hardware failures, systematic software bugs | Performance limitations, inadequate specification |
| Example | Sensor fails (short circuit) → No detection | Sensor works but limited (fog) → Missed detection |
| Approach | Fault tolerance, redundancy, diagnostics | ODD definition, scenario testing, safe degradation |
| Application | All safety functions (ABS, airbag, LKA) | ADAS/AD with perception (ML, sensors) |
Why SOTIF for ML?
- ML models have inherent limitations (never 100% accurate)
- Trained on finite dataset (may not cover all real-world scenarios)
- "Black box" behavior (hard to predict failures)
SOTIF Process for Lane Detection CNN
SOTIF Hazard Analysis
Goal: Identify hazards due to performance limitations of lane detection ML model
SOTIF 4-Quadrant Analysis: The following diagram presents the ISO 21448 four-quadrant framework (Known Safe, Known Unsafe, Unknown Safe, Unknown Unsafe) applied to the lane detection ML model's performance limitations.
SOTIF Goal: Minimize "Unknown Unsafe" quadrant through scenario identification and testing
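The four quadrants come from two yes/no questions: is the scenario known, and does it lead to hazardous behavior? A minimal sketch of that mapping follows; the names `Quadrant` and `classify_scenario` are illustrative and not taken from ISO 21448.

```python
from enum import Enum


class Quadrant(Enum):
    KNOWN_SAFE = "Known Safe"
    KNOWN_UNSAFE = "Known Unsafe"
    UNKNOWN_UNSAFE = "Unknown Unsafe"
    UNKNOWN_SAFE = "Unknown Safe"


def classify_scenario(is_known: bool, is_hazardous: bool) -> Quadrant:
    """Map a scenario onto one of the four SOTIF quadrants."""
    if is_known:
        return Quadrant.KNOWN_UNSAFE if is_hazardous else Quadrant.KNOWN_SAFE
    return Quadrant.UNKNOWN_UNSAFE if is_hazardous else Quadrant.UNKNOWN_SAFE


# A catalogued scenario in which the lane detector behaves hazardously
print(classify_scenario(is_known=True, is_hazardous=True).value)  # Known Unsafe
```

Scenario discovery and testing change `is_known` from False to True; the hazard itself does not change, only what the team knows about it.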
Scenario Discovery Methods: Techniques for discovering "Unknown Unsafe" scenarios include:
1. Expert brainstorming (safety engineers, test drivers)
2. Field data analysis (driver interventions, confidence drops)
3. Literature review (published ADAS failures)
4. Generative AI brainstorming (ChatGPT for scenario ideas)
5. Adversarial testing (intentional edge cases)
Document scenario sources in the SOTIF validation report.
Scenario-Based Testing
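Project Target: Test the system in 10,000+ scenarios covering the ODD plus edge cases (ISO 21448 itself does not prescribe a scenario count; the number is a project-level acceptance target)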
Scenario Catalog: 10,000 scenarios organized by category
| Category | Scenarios | Source | Purpose |
|---|---|---|---|
| ODD Nominal | 6,000 | Real-world driving (test vehicles) | Verify performance within ODD |
| ODD Boundary | 2,000 | Controlled collection (e.g., dusk, rain) | Test limits of ODD |
| Known Unsafe | 1,000 | Rare events (dataset augmentation) | Verify LKA disables correctly |
| Unknown (Discovered) | 1,000 | Field failures, simulation, expert input | Move to "Known" quadrant |
Example Scenario: Construction Zone with Temporary Lane Markings
Scenario ID: SOTIF-SCEN-0342
Category: Unknown Safe → Known Safe (after testing)
Scenario: Construction Zone - Temporary Lane Shift
─────────────────────────────────────────────────────
Description:
Highway construction zone with temporary yellow lane markings
redirecting traffic (original white lanes still visible but crossed out)
Visual Characteristics:
- Original white lanes: Faded, partially covered with black paint
- Temporary yellow lanes: Bright yellow, narrower width (2.8m vs 3.5m)
- Construction signs: Orange cones, "LANE SHIFT" warning signs
ODD Classification:
- Road type: Highway [PASS] (within ODD)
- Speed: 80 km/h [PASS] (within 60-130 km/h)
- Weather: Dry [PASS]
- Lighting: Daytime [PASS]
- BUT: Non-standard lane markings [WARN] (edge case)
Expected Behavior:
- Model should detect temporary yellow lanes (not crossed-out white)
- Confidence may be lower (yellow vs white, training data bias)
- If confidence < 0.5 → LKA disables (safe degradation)
Test Method:
1. Collect 200 images from real construction zones (3 test vehicles)
2. Annotate temporary lane markings (ground truth)
3. Run CNN inference, measure IoU accuracy
4. Log confidence scores
Test Results:
- Mean IoU: 78% (lower than nominal 95%, expected)
- Mean confidence: 0.62 (above 0.5 threshold, LKA remains active)
- False detections: 5% (model confused by old white lanes)
Verdict: [PASS] Known Safe
- LKA can operate in construction zones (with reduced confidence)
- Recommendation: Add construction zone images to training data v1.1
Outcome: Scenario moved from "Unknown Safe" → "Known Safe" (validated through testing)
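Step 3 of the test method measures IoU between the predicted lane mask and the annotated ground truth. A minimal sketch of that metric for binary masks is shown here; the function name `mask_iou` and the tiny example arrays are illustrative, and treating two empty masks as perfect agreement is a convention chosen for this sketch.

```python
import numpy as np


def mask_iou(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Intersection over Union of two binary masks of the same shape."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    intersection = np.logical_and(pred, gt).sum()
    return float(intersection / union)


pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0]])
gt = np.array([[0, 1, 0, 0],
               [0, 1, 1, 0]])
print(mask_iou(pred, gt))  # 3 shared pixels / 4 pixels in the union = 0.75
```

The scenario-level "Mean IoU" is then the average of this value over all annotated images in the scenario.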
Safe Degradation Strategy
Confidence-Based LKA Disable
Problem: ML model never 100% accurate → How to prevent unsafe operation?
Solution: Confidence scoring + Graceful degradation
Confidence Score Calculation:
import numpy as np

def calculate_confidence(pred_mask, features):
    """
    Calculate lane detection confidence score (0.0 - 1.0)
    Inputs:
    - pred_mask: Binary segmentation mask (lane pixels)
    - features: Dict of image features (brightness, contrast, lane width);
      this function reads features["brightness"] (mean pixel intensity, 0-255)
    Output:
    - confidence: Float [0.0, 1.0] (1.0 = high confidence)
    Note: estimate_lane_width() and measure_continuity() are project helpers
    defined elsewhere; they are not shown in this section.
    """
    # Heuristic 1: Predicted lane width (should be 2.5-3.7m)
    lane_width = estimate_lane_width(pred_mask)
    if 2.5 <= lane_width <= 3.7:
        width_conf = 1.0
    else:
        width_conf = max(0.0, 1.0 - abs(lane_width - 3.0) / 2.0)

    # Heuristic 2: Lane continuity (should be smooth, not fragmented)
    continuity = measure_continuity(pred_mask)  # 0.0-1.0
    continuity_conf = continuity

    # Heuristic 3: Image brightness (too dark → low confidence)
    brightness = features["brightness"]  # 0-255
    if brightness < 30:  # Very dark (night)
        brightness_conf = 0.2
    elif brightness < 60:  # Dusk
        brightness_conf = 0.6
    else:
        brightness_conf = 1.0

    # Heuristic 4: Model output entropy (low entropy = confident)
    # Simplified placeholder: the mean of a binary mask is only the fraction
    # of lane pixels; the actual implementation uses per-pixel softmax scores.
    model_conf = float(np.mean(pred_mask))

    # Combine heuristics (weighted average, weights sum to 1.0)
    confidence = (
        0.3 * width_conf +
        0.3 * continuity_conf +
        0.2 * brightness_conf +
        0.2 * model_conf
    )
    return confidence
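Heuristic 4 above uses `np.mean(pred_mask)` as a simplified stand-in for the model's own certainty. One way to derive a score from the softmax output is to average the per-pixel binary entropy and invert it. The sketch below assumes the CNN exposes a per-pixel lane probability map; that interface, the name `entropy_confidence`, and the `eps` parameter are all hypothetical.

```python
import numpy as np


def entropy_confidence(lane_prob: np.ndarray, eps: float = 1e-7) -> float:
    """Return 1 - mean binary entropy (in bits) of a per-pixel probability map.

    1.0 means every pixel is decided with certainty (p near 0 or 1);
    0.0 means every pixel is maximally uncertain (p = 0.5).
    """
    # Clip to avoid log2(0) at fully saturated pixels
    p = np.clip(lane_prob.astype(np.float64), eps, 1.0 - eps)
    entropy_bits = -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
    return float(1.0 - entropy_bits.mean())


uncertain = np.full((4, 4), 0.5)                           # model cannot decide anywhere
confident = np.array([[0.0, 1.0], [1.0, 0.0]])             # fully decided everywhere
print(entropy_confidence(uncertain) < entropy_confidence(confident))  # True
```

Unlike the mask mean, this value does not depend on how many pixels are lane pixels, only on how decisively each pixel was classified.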
LKA State Machine with Confidence Thresholds:
┌──────────────────────────────────────────────────────────────┐
│ LKA State Machine (Confidence-Based) │
├──────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ │
│ │ LKA OFF │ ←───────────────────────┐ │
│ │ (Disabled) │ │ │
│ └─────────────┘ │ │
│ │ │ │
│ │ Driver enables LKA │ │
│ │ (button press) │ │
│ ▼ │ │
│ ┌─────────────┐ │ │
│ │ LKA STANDBY │ │ │
│ │ (Monitoring)│ │ │
│ └─────────────┘ │ │
│ │ │ │
│ │ Confidence ≥ 0.7 (high) │ │
│ │ Speed ≥ 60 km/h │ │
│ ▼ │ │
│ ┌─────────────┐ │ │
│ │ LKA ACTIVE │ │ │
│ │ (Steering) │ │ │
│ └─────────────┘ │ │
│ │ │ │
│ │ Confidence < 0.5 (low) │ │
│ │ OR Speed < 60 km/h │ │
│ │ OR Driver override (hands on) │ │
│ └─────────────────────────────────┘ │
│ │
└───────────────────────────────────────────────────────────────┘
Thresholds:
- Confidence ≥ 0.7: LKA can engage (high confidence, nominal operation)
- 0.5 ≤ Confidence < 0.7: LKA remains active but alerts driver ("Lane detection reduced")
- Confidence < 0.5: LKA disables (too uncertain, unsafe to steer)
Example: Heavy Rain Scenario
t=0s: Daytime, dry → Confidence = 0.92 → LKA ACTIVE
t=60s: Light rain starts → Confidence = 0.78 → LKA ACTIVE (still above 0.7)
t=120s: Heavy rain, lane lines barely visible → Confidence = 0.45
→ LKA disables (below 0.5 threshold)
→ Display: "LKA unavailable - Lane markings not detected"
→ Audible beep (driver takeover)
Safety Benefit: Prevents LKA from steering based on unreliable lane detection (avoids unintended lane departures)
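The state machine and thresholds above can be sketched in a few lines. This is an illustration under stated assumptions, not the production controller: the class and method names are hypothetical, only the lower speed bound shown in the diagram is modeled, and the 100 km/h used in the replay is an assumed value because the rain example does not state a speed.

```python
from enum import Enum


class LkaState(Enum):
    OFF = "OFF"
    STANDBY = "STANDBY"
    ACTIVE = "ACTIVE"


ENGAGE_CONF = 0.7      # confidence needed to engage
DISABLE_CONF = 0.5     # below this, LKA disables
MIN_SPEED_KMH = 60.0   # lower ODD speed bound from the diagram


class LkaStateMachine:
    def __init__(self):
        self.state = LkaState.OFF
        self.driver_alert = False  # True in the 0.5-0.7 "reduced" band

    def enable(self):
        """Driver presses the LKA button."""
        if self.state is LkaState.OFF:
            self.state = LkaState.STANDBY

    def step(self, confidence, speed_kmh, driver_override=False):
        self.driver_alert = False
        if self.state is LkaState.STANDBY:
            if confidence >= ENGAGE_CONF and speed_kmh >= MIN_SPEED_KMH:
                self.state = LkaState.ACTIVE
        elif self.state is LkaState.ACTIVE:
            if (confidence < DISABLE_CONF or speed_kmh < MIN_SPEED_KMH
                    or driver_override):
                self.state = LkaState.OFF
            elif confidence < ENGAGE_CONF:
                self.driver_alert = True  # stay active, warn the driver
        return self.state


# Replay of the heavy rain example (confidence 0.92 → 0.78 → 0.45)
lka = LkaStateMachine()
lka.enable()
trace = [lka.step(c, speed_kmh=100.0).value for c in (0.92, 0.78, 0.45)]
print(trace)  # ['ACTIVE', 'ACTIVE', 'OFF']
```

The gap between the engage threshold (0.7) and the disable threshold (0.5) acts as hysteresis, so a confidence value hovering around a single threshold does not toggle the function on and off.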
ODD Monitoring and Exit Strategy
Real-Time ODD Validation
Challenge: Ensure vehicle operates only within ODD (dynamic conditions)
ODD Monitors (runtime checks):
import time
import numpy as np

class ODD_Monitor:
    """
    Real-time ODD validation (runs every 100ms)
    """
    def __init__(self):
        self.speed_limit = (60, 130)  # km/h
        self.confidence_threshold = 0.5
        self.brightness_min = 30  # Minimum image brightness (daytime)
        self.rain_max = 50  # Rain sensor reading above this = heavy rain

    def check_ODD_compliance(self, vehicle_state, camera_image, lane_confidence):
        """
        Returns: (in_ODD: bool, exit_reason: str)
        """
        # Check 1: Speed within ODD range
        low, high = self.speed_limit
        if not (low <= vehicle_state.speed <= high):
            return False, f"Speed {vehicle_state.speed} km/h out of ODD ({low}-{high} km/h)"

        # Check 2: Lane confidence sufficient
        if lane_confidence < self.confidence_threshold:
            return False, f"Lane confidence {lane_confidence:.2f} < {self.confidence_threshold} (low)"

        # Check 3: Lighting conditions (daytime only)
        brightness = np.mean(camera_image)
        if brightness < self.brightness_min:
            return False, f"Low light (brightness {brightness:.0f} < {self.brightness_min}, night)"

        # Check 4: Rain sensor (optional; skipped if the vehicle has none)
        rain = getattr(vehicle_state, "rain_intensity", None)
        if rain is not None and rain > self.rain_max:
            return False, "Heavy rain detected (ODD excludes heavy rain)"

        return True, "OK"  # Within ODD

    def handle_ODD_exit(self, exit_reason, vehicle_state):
        """
        Safe degradation when ODD exit detected.
        lka_controller, hmi_display, hmi_audio and event_logger are vehicle
        interfaces provided by the integration layer (not defined here).
        """
        print(f"ODD EXIT: {exit_reason}")
        # Disable LKA gradually (not abrupt)
        lka_controller.ramp_down_torque(duration=2.0)  # 2-second ramp to 0 Nm
        # Alert driver
        hmi_display.show_message("LKA disabled: " + exit_reason)
        hmi_audio.play_beep(priority="medium")
        # Log event (for post-market analysis)
        event_logger.log({
            "timestamp": time.time(),
            "event": "ODD_EXIT",
            "reason": exit_reason,
            "location": vehicle_state.gps_coords
        })
ODD Exit Example: Tunnel Entrance (Lighting Transition)
t=0s: Daytime highway, confidence = 0.88 → LKA ACTIVE
t=10s: Approaching tunnel, brightness drops from 180 → 120 (still OK)
t=12s: Tunnel entrance, brightness = 25 → ODD EXIT (below 30)
→ LKA disables over 2 seconds (ramp down steering torque)
→ Display: "LKA disabled: Low light"
t=15s: Inside tunnel, driver has full manual control
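The 2-second torque ramp in the tunnel example can be illustrated with a short sketch. It assumes a linear ramp evaluated once per 100 ms monitor cycle; the section does not specify the ramp shape, and the function name `torque_ramp` and the 3 Nm starting torque are hypothetical.

```python
def torque_ramp(initial_torque_nm, duration_s=2.0, cycle_s=0.1):
    """Return the commanded torque for each cycle of a linear ramp to 0 Nm."""
    steps = max(1, round(duration_s / cycle_s))
    return [initial_torque_nm * (1.0 - k / steps) for k in range(1, steps + 1)]


profile = torque_ramp(3.0)   # LKA was applying 3 Nm when the ODD exit fired
print(len(profile))          # 20 cycles of 100 ms = 2 s
print(profile[-1])           # 0.0 → torque fully handed back to the driver
```

A gradual ramp avoids a step change in steering torque at the moment the driver has to take over.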
Unknown Unsafe Scenario Discovery
Field Data Analysis (Post-Market)
Goal: Identify "Unknown Unsafe" scenarios from real-world operation
Data Collection (Shadow Mode):
- Duration: 12 months post-launch (100 vehicles in field)
- Data: Camera images + LKA state + driver interventions
- Trigger: Log data when:
- LKA confidence drops <0.3 (very low, unexpected)
- Driver overrides LKA (hands on wheel, steering torque >5 Nm)
- Lane departure warning triggers (unintended lane crossing)
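The three logging triggers above amount to a simple predicate evaluated on every frame. A minimal sketch follows, using the thresholds from the list; the `ShadowFrame` record and its field names are hypothetical and stand in for whatever signals the logger actually receives.

```python
from dataclasses import dataclass
from typing import List

LOW_CONFIDENCE = 0.3        # "very low, unexpected" confidence
OVERRIDE_TORQUE_NM = 5.0    # driver steering torque that counts as an override


@dataclass
class ShadowFrame:
    lka_confidence: float
    driver_torque_nm: float
    ldw_triggered: bool


def logging_triggers(frame: ShadowFrame) -> List[str]:
    """Return the names of all triggers that fire for this frame."""
    reasons = []
    if frame.lka_confidence < LOW_CONFIDENCE:
        reasons.append("low_confidence")
    if abs(frame.driver_torque_nm) > OVERRIDE_TORQUE_NM:
        reasons.append("driver_override")
    if frame.ldw_triggered:
        reasons.append("lane_departure_warning")
    return reasons


# Confidence looks fine, but the driver countersteers with 8 Nm
frame = ShadowFrame(lka_confidence=0.68, driver_torque_nm=8.0, ldw_triggered=False)
print(logging_triggers(frame))  # ['driver_override']
```

Returning the list of reasons rather than a single boolean keeps the trigger cause in the log, which helps when grouping events during root cause analysis.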
Example Discovered Scenario: Road Art Confusion
Scenario ID: SOTIF-FIELD-0012
Date Discovered: Month 8 post-launch
Location: Urban street in Berlin, Germany
Discovered Scenario: Road Art Mistaken for Lane Line
────────────────────────────────────────────────────────
Description:
Crosswalk painted with decorative patterns (zebra stripes + colorful art)
CNN misinterpreted art as lane lines → False lane detection
Visual Evidence:
[Camera Image: Crosswalk with geometric patterns resembling lane markings]
LKA Behavior:
- Confidence: 0.68 (above 0.5 threshold, LKA remained active)
- Steering command: 3 Nm left (attempting to "follow" fake lane)
- Driver intervention: YES (driver countersteered, 8 Nm right)
Root Cause:
- Training data: 99.9% standard white/yellow lane lines
- Crosswalk art: Not in training data (dataset gap)
- CNN: Generalized incorrectly (colorful patterns → lane-like features)
Risk Assessment:
- Severity: S=2 (Minor injury, driver corrects easily)
- Probability: P=1 (Rare, <1 in 10,000 km)
- Risk: S × P = 2 (Low, acceptable)
Mitigation:
1. Add 500 crosswalk images to dataset v1.1 (retrain model)
2. Update confidence scoring (penalize high-frequency patterns, crosswalks)
3. ODD refinement: Exclude urban streets with crosswalks (update ODD to highways only)
Status: Mitigated in software v2.1 (released Month 10)
SOTIF Benefit: Continuous learning from field failures → Safer system over time
SOTIF Validation Report
Test Coverage Summary
Total Scenarios Tested: 10,542
| Category | Scenarios | Pass | Fail | Pass Rate |
|---|---|---|---|---|
| ODD Nominal | 6,000 | 5,987 | 13 | 99.8% [PASS] |
| ODD Boundary | 2,000 | 1,856 | 144 | 92.8% [PASS] |
| Known Unsafe | 1,000 | 982 | 18 | 98.2% [PASS] (LKA disabled correctly) |
| Unknown (Discovered) | 542 | 498 | 44 | 91.9% [PASS] |
Overall: 10,323 / 10,542 = 97.9% pass rate
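The per-category pass rates can be cross-checked by recomputing them from the table rows. The sketch below uses only the counts printed in the table. Summed as printed, the four rows give 9,542 scenarios, 9,323 passes and 219 failures (97.7%), so the stated total of 10,542 should be reconciled against the category breakdown before the report is released.

```python
# (scenarios, passed) per category, copied from the table above
rows = {
    "ODD Nominal": (6000, 5987),
    "ODD Boundary": (2000, 1856),
    "Known Unsafe": (1000, 982),
    "Unknown (Discovered)": (542, 498),
}

for name, (scenarios, passed) in rows.items():
    print(f"{name}: {100 * passed / scenarios:.1f}%")

total_scenarios = sum(scenarios for scenarios, _ in rows.values())
total_passed = sum(passed for _, passed in rows.values())
total_failed = total_scenarios - total_passed
overall_rate = 100 * total_passed / total_scenarios

print(total_scenarios, total_passed, total_failed)  # 9542 9323 219
print(f"{overall_rate:.1f}%")                       # 97.7%
```

Generating the summary line from the row data, rather than typing it by hand, keeps the totals and the table from drifting apart.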
Failure Analysis (219 failures):
- 144 ODD Boundary: Confidence 0.45-0.55 (near threshold, borderline cases)
- 44 Unknown scenarios: Novel situations (e.g., road art, temporary markings)
- 18 Known Unsafe: LKA failed to disable (confidence scoring bug, fixed)
- 13 ODD Nominal: Unexplained (investigated, no pattern found)
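Verdict: [PASS] The overall pass rate exceeds the 95% acceptance threshold used for this project. ISO 21448 does not prescribe a numeric pass rate or coverage figure; it requires the project to define and justify its own acceptance criteria, and a scenario pass rate is not the same thing as scenario coverage.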
TÜV Assessment for SOTIF
Assessor Questions (Common)
Q1: "How do you ensure the ML model doesn't fail on scenarios not in the training data?"
Answer:
- ODD Definition: Clear boundaries (highway, dry, daytime) → Model validated only within ODD
- Confidence Scoring: Low-confidence predictions (<0.5) → LKA disables (safe degradation)
- 10,000+ Scenario Testing: Edge cases, ODD boundary, discovered scenarios → 97.9% pass rate
- Field Monitoring: Shadow mode (100 vehicles, 12 months) → Continuous scenario discovery
Q2: "What happens if the model encounters an adversarial attack (e.g., projected fake lane lines)?"
Answer:
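- Threat Model: Deliberate adversarial attacks are malicious acts rather than reasonably foreseeable misuse, so they fall outside SOTIF scope and are handled under cybersecurity engineering (ISO/SAE 21434)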
- Risk: Low probability (requires physical proximity, projection equipment)
- Mitigation:
- Confidence scoring detects anomalies (fake lanes → unusual patterns → low confidence)
- Driver monitoring (driver can override LKA anytime)
- Security measures: Encrypted camera feed (future hardening)
Q3: "How do you validate that the 10,000 scenarios are representative of real-world driving?"
Answer:
- Data Collection: 500,000 km driven across Europe (Germany, France, Italy)
- ODD Coverage: 95% of scenarios within ODD (highway, dry, daytime)
- Expert Review: Safety engineer + domain experts reviewed scenario catalog
- Field Validation: 100 vehicles, 12 months → 0 SOTIF-related incidents
Outcome: [PASS] TÜV SÜD approved SOTIF compliance (ISO 21448 certificate issued)
Summary
SOTIF for ML-Based Lane Detection:
| SOTIF Activity | Deliverable | Tool/Method | ISO 21448 Clause |
|---|---|---|---|
| Hazard Analysis | SOTIF 4-quadrant analysis (25 hazards) | Brainstorming, FMEA | Clause 6.3 |
| Scenario Testing | 10,542 test scenarios (97.9% pass rate) | Simulation, real-world | Clauses 10 and 11 |
| Safe Degradation | Confidence-based LKA disable (<0.5) | Runtime monitoring | Clause 8.3 |
| ODD Definition | ODD specification (highway, 60-130 km/h, dry) | Requirements | Clause 5 |
| Field Monitoring | Shadow mode (100 vehicles, 12 months) | Data logging, analytics | Clause 13 |
AI Contribution to SOTIF:
- Scenario generation: Generative AI (ChatGPT with GPT-4) suggested 200 edge case scenarios (experts validated)
- Confidence scoring: Heuristic-based (classical CV features + ML output), not AI
- Field data analysis: ML clustering (group similar failure modes, 50% faster root cause analysis)
Key Insight: SOTIF is essential for ML safety. ISO 26262 alone is insufficient because it does not address ML performance limitations.
ISO/PAS 8800 Update: ISO/PAS 8800:2024 (Road vehicles: Safety and artificial intelligence), published in December 2024, provides more specific guidance for ML safety beyond SOTIF. Review it for its impact on MLE process requirements and ML-specific safety argumentation patterns.
Next: Perception pipeline implementation and lessons learned (28.03).