5.2: SOTIF Considerations

ISO 21448 (SOTIF) Overview

Safety of the Intended Functionality

ISO 21448: Road vehicles — Safety of the Intended Functionality (SOTIF)

Publication: 2022 (full International Standard, replaces the Publicly Available Specification ISO/PAS 21448:2019)

Scope: Addresses hazards caused by performance limitations (functional insufficiencies) of the intended function and by reasonably foreseeable misuse, not by hardware or software faults

Difference from ISO 26262:

Aspect	ISO 26262 (Functional Safety)	ISO 21448 (SOTIF)
Hazard Source	Random hardware failures, systematic software bugs	Performance limitations, inadequate specification
Example	Sensor fails (short circuit) → No detection	Sensor works but limited (fog) → Missed detection
Approach	Fault tolerance, redundancy, diagnostics	ODD definition, scenario testing, safe degradation
Application	All safety functions (ABS, airbag, LKA)	ADAS/AD with perception (ML, sensors)

Why SOTIF for ML?

ML models have inherent limitations (never 100% accurate)
Trained on finite dataset (may not cover all real-world scenarios)
"Black box" behavior (hard to predict failures)

SOTIF Process for Lane Detection CNN

SOTIF Hazard Analysis

Goal: Identify hazards due to performance limitations of lane detection ML model

SOTIF 4-Quadrant Analysis: The following diagram presents the ISO 21448 four-quadrant framework (Known Safe, Known Unsafe, Unknown Safe, Unknown Unsafe) applied to the lane detection ML model's performance limitations.

[Figure: SOTIF Hazard Analysis]

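SOTIF Goal: Minimize the "Known Unsafe" and "Unknown Unsafe" quadrants, the former through functional improvements and ODD restrictions, the latter through scenario identification and testing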
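The four quadrants follow from two yes/no questions per scenario: is it known, and is it hazardous? A minimal sketch of that bookkeeping is shown here; the names `Quadrant` and `classify` are illustrative and not taken from ISO 21448 or from the project code.

```python
from enum import Enum


class Quadrant(Enum):
    KNOWN_SAFE = "Known Safe"
    KNOWN_UNSAFE = "Known Unsafe"
    UNKNOWN_SAFE = "Unknown Safe"
    UNKNOWN_UNSAFE = "Unknown Unsafe"


def classify(known: bool, hazardous: bool) -> Quadrant:
    """Map the two scenario attributes to a SOTIF quadrant."""
    if known:
        return Quadrant.KNOWN_UNSAFE if hazardous else Quadrant.KNOWN_SAFE
    return Quadrant.UNKNOWN_UNSAFE if hazardous else Quadrant.UNKNOWN_SAFE


# A scenario that was discovered and then validated as harmless:
# before discovery it sits in UNKNOWN_SAFE, afterwards in KNOWN_SAFE.
before = classify(known=False, hazardous=False)
after = classify(known=True, hazardous=False)
```

Testing does not change whether a scenario is hazardous; it only changes the `known` flag, which is why discovery moves scenarios horizontally between quadrants.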

Scenario Discovery Methods: Techniques for discovering "Unknown Unsafe" scenarios include:

1. Expert brainstorming (safety engineers, test drivers)
2. Field data analysis (driver interventions, confidence drops)
3. Literature review (published ADAS failures)
4. Generative AI brainstorming (ChatGPT for scenario ideas)
5. Adversarial testing (intentional edge cases)

Document scenario sources in the SOTIF validation report.

Scenario-Based Testing

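Project Target: Test the system in 10,000+ scenarios covering the ODD and edge cases (ISO 21448 does not prescribe a scenario count; the number comes from the project's validation strategy)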

Scenario Catalog: 10,000 scenarios organized by category

Category	Scenarios	Source	Purpose
ODD Nominal	6,000	Real-world driving (test vehicles)	Verify performance within ODD
ODD Boundary	2,000	Controlled collection (e.g., dusk, rain)	Test limits of ODD
Known Unsafe	1,000	Rare events (dataset augmentation)	Verify LKA disables correctly
Unknown (Discovered)	1,000	Field failures, simulation, expert input	Move to "Known" quadrant

Example Scenario: Construction Zone with Temporary Lane Markings

Scenario ID: SOTIF-SCEN-0342
Category: Unknown Safe → Known Safe (after testing)

Scenario: Construction Zone - Temporary Lane Shift
─────────────────────────────────────────────────────

Description:
  Highway construction zone with temporary yellow lane markings
  redirecting traffic (original white lanes still visible but crossed out)

Visual Characteristics:
  - Original white lanes: Faded, partially covered with black paint
  - Temporary yellow lanes: Bright yellow, narrower width (2.8m vs 3.5m)
  - Construction signs: Orange cones, "LANE SHIFT" warning signs

ODD Classification:
  - Road type: Highway [PASS] (within ODD)
  - Speed: 80 km/h [PASS] (within 60-130 km/h)
  - Weather: Dry [PASS]
  - Lighting: Daytime [PASS]
  - BUT: Non-standard lane markings [WARN] (edge case)

Expected Behavior:
  - Model should detect temporary yellow lanes (not crossed-out white)
  - Confidence may be lower (yellow vs white, training data bias)
  - If confidence < 0.5 → LKA disables (safe degradation)

Test Method:
  1. Collect 200 images from real construction zones (3 test vehicles)
  2. Annotate temporary lane markings (ground truth)
  3. Run CNN inference, measure IoU accuracy
  4. Log confidence scores

Test Results:
  - Mean IoU: 78% (lower than nominal 95%, expected)
  - Mean confidence: 0.62 (above the 0.5 disable threshold, so LKA remains active; below 0.7, so the driver alert is shown and LKA cannot newly engage)
  - False detections: 5% (model confused by old white lanes)

Verdict: [PASS] Known Safe
  - LKA can operate in construction zones (with reduced confidence)
  - Recommendation: Add construction zone images to training data v1.1

Outcome: Scenario moved from "Unknown Safe" → "Known Safe" (validated through testing)
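Step 3 of the test method measures IoU between the predicted mask and the annotated ground truth. A minimal sketch of that metric for binary masks follows, assuming NumPy arrays of equal shape; the function names and the toy 4×4 masks are illustrative, and treating two empty masks as perfect agreement is an assumption.

```python
import numpy as np


def mask_iou(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Intersection over union of two binary masks of equal shape."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Neither mask contains lane pixels: counted as agreement (assumption)
        return 1.0
    intersection = np.logical_and(pred, gt).sum()
    return float(intersection / union)


def mean_iou(pairs) -> float:
    """Average IoU over an iterable of (pred_mask, gt_mask) pairs."""
    return float(np.mean([mask_iou(pred, gt) for pred, gt in pairs]))


# Toy example: ground truth is one 4-pixel lane column; the prediction
# hits 3 of those pixels and adds 1 false pixel -> IoU = 3 / 5
gt = np.zeros((4, 4), dtype=np.uint8)
gt[:, 1] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[0:3, 1] = 1
pred[3, 2] = 1
toy_iou = mask_iou(pred, gt)
```

The reported mean IoU for a scenario would then be `mean_iou` over all annotated images in that scenario.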

Safe Degradation Strategy

Confidence-Based LKA Disable

Problem: ML model never 100% accurate → How to prevent unsafe operation?

Solution: Confidence scoring + Graceful degradation

Confidence Score Calculation:

def calculate_confidence(pred_mask, features):
    """
    Calculate lane detection confidence score (0.0 - 1.0)

    Inputs:
      - pred_mask: Binary segmentation mask (lane pixels)
      - features: Dict of image features. Keys assumed in this sketch:
          "brightness": mean gray level of the camera image (0-255)
          "mean_lane_prob": mean softmax score over predicted lane pixels

    Output:
      - confidence: Float [0.0, 1.0] (1.0 = high confidence)

    estimate_lane_width() and measure_continuity() are helpers defined elsewhere.
    """
    # Heuristic 1: Predicted lane width (should be 2.5-3.7m)
    lane_width = estimate_lane_width(pred_mask)
    if 2.5 <= lane_width <= 3.7:
        width_conf = 1.0
    else:
        # Decay linearly with the distance to the nearest bound so the score
        # stays continuous at 2.5 m and 3.7 m (reaches 0.0 at 2 m outside)
        deviation = (2.5 - lane_width) if lane_width < 2.5 else (lane_width - 3.7)
        width_conf = max(0.0, 1.0 - deviation / 2.0)

    # Heuristic 2: Lane continuity (should be smooth, not fragmented)
    continuity_conf = measure_continuity(pred_mask)  # 0.0-1.0

    # Heuristic 3: Image brightness (too dark → low confidence)
    brightness = features["brightness"]  # 0-255
    if brightness < 30:  # Very dark (night)
        brightness_conf = 0.2
    elif brightness < 60:  # Dusk
        brightness_conf = 0.6
    else:
        brightness_conf = 1.0

    # Heuristic 4: Model output certainty
    # (The mean of the binary mask would only give the lane-pixel fraction,
    # so the average softmax probability of the lane pixels is used instead)
    model_conf = features["mean_lane_prob"]  # 0.0-1.0

    # Combine heuristics (weighted average, weights sum to 1.0)
    confidence = (
        0.3 * width_conf +
        0.3 * continuity_conf +
        0.2 * brightness_conf +
        0.2 * model_conf
    )

    return confidence
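Heuristic 4 refers to output entropy. One possible way to turn the network output into such a score is sketched here, assuming the model exposes a per-pixel lane probability map (called `lane_prob` here, a hypothetical name). The score is one minus the mean binary entropy in bits, so it is 0.0 when every pixel is at 0.5 and approaches 1.0 when every pixel is close to 0 or 1.

```python
import numpy as np


def entropy_confidence(lane_prob: np.ndarray, eps: float = 1e-7) -> float:
    """Return 1 - mean binary entropy (bits) of per-pixel lane probabilities."""
    # Clip to avoid log2(0) for fully saturated pixels
    p = np.clip(lane_prob.astype(np.float64), eps, 1.0 - eps)
    # Binary entropy per pixel, in [0, 1] bits
    entropy = -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
    return float(1.0 - entropy.mean())


# Undecided output: every pixel at 0.5 gives the lowest possible score
undecided = entropy_confidence(np.full((4, 4), 0.5))
# Decided output: pixels saturated at 0 or 1 give a score close to 1
decided = entropy_confidence(np.array([[0.0, 1.0], [1.0, 0.0]]))
```

A score like this measures how decided the network is, not whether it is right; a confidently wrong prediction still scores high, which is why the other heuristics remain in the weighted average.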

LKA State Machine with Confidence Thresholds:

┌──────────────────────────────────────────────────────────────┐
│            LKA State Machine (Confidence-Based)              │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌─────────────┐                                             │
│  │   LKA OFF   │ ←───────────────────────┐                   │
│  │ (Disabled)  │                         │                   │
│  └─────────────┘                         │                   │
│        │                                 │                   │
│        │ Driver enables LKA              │                   │
│        │ (button press)                  │                   │
│        ▼                                 │                   │
│  ┌─────────────┐                         │                   │
│  │ LKA STANDBY │                         │                   │
│  │ (Monitoring)│                         │                   │
│  └─────────────┘                         │                   │
│        │                                 │                   │
│        │ Confidence ≥ 0.7 (high)         │                   │
│        │ Speed ≥ 60 km/h                 │                   │
│        ▼                                 │                   │
│  ┌─────────────┐                         │                   │
│  │ LKA ACTIVE  │                         │                   │
│  │ (Steering)  │                         │                   │
│  └─────────────┘                         │                   │
│        │                                 │                   │
│        │ Confidence < 0.5 (low)          │                   │
│        │ OR Speed < 60 km/h              │                   │
│        │ OR Driver override (hands on)   │                   │
│        └─────────────────────────────────┘                   │
│                                                              │
└──────────────────────────────────────────────────────────────┘

Thresholds:

Confidence ≥ 0.7: LKA can engage (high confidence, nominal operation)
0.5 ≤ Confidence < 0.7: LKA remains active but alerts driver ("Lane detection reduced")
Confidence < 0.5: LKA disables (too uncertain, unsafe to steer)
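The transitions and thresholds above can be written as a small transition function. This is a sketch only: `LkaState`, `next_state`, and the parameter names are illustrative, and the driver-enable and driver-override inputs are reduced to booleans. The gap between the 0.7 engage threshold and the 0.5 disable threshold acts as hysteresis, so small confidence fluctuations do not toggle LKA on and off.

```python
from enum import Enum


class LkaState(Enum):
    OFF = "OFF"
    STANDBY = "STANDBY"
    ACTIVE = "ACTIVE"


ENGAGE_CONF = 0.7     # minimum confidence to engage
DISABLE_CONF = 0.5    # below this, LKA disables
MIN_SPEED_KMH = 60.0  # lower speed bound for engagement


def next_state(state, confidence, speed_kmh,
               driver_enabled=False, driver_override=False):
    """Return (new_state, alert_driver) for one evaluation cycle."""
    if state is LkaState.OFF:
        return (LkaState.STANDBY if driver_enabled else LkaState.OFF), False
    if state is LkaState.STANDBY:
        if confidence >= ENGAGE_CONF and speed_kmh >= MIN_SPEED_KMH:
            return LkaState.ACTIVE, False
        return LkaState.STANDBY, False
    # ACTIVE: disable on low confidence, low speed, or driver override
    if confidence < DISABLE_CONF or speed_kmh < MIN_SPEED_KMH or driver_override:
        return LkaState.OFF, False
    # Between the two thresholds: stay active but alert the driver
    return LkaState.ACTIVE, confidence < ENGAGE_CONF


# Replay a falling confidence trace at an assumed constant 100 km/h
state = LkaState.ACTIVE
trace = []
for conf in (0.92, 0.78, 0.62, 0.45):
    state, alert = next_state(state, conf, speed_kmh=100.0)
    trace.append((state, alert))
```

In the replay, LKA stays active at 0.92 and 0.78, stays active with the driver alert at 0.62, and switches off at 0.45.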

Example: Heavy Rain Scenario

t=0s:   Daytime, dry → Confidence = 0.92 → LKA ACTIVE
t=60s:  Light rain starts → Confidence = 0.78 → LKA ACTIVE (still above 0.7)
t=120s: Heavy rain, lane lines barely visible → Confidence = 0.45
        → LKA disables (below 0.5 threshold)
        → Display: "LKA unavailable - Lane markings not detected"
        → Audible beep (driver takeover)

Safety Benefit: Prevents LKA from steering based on unreliable lane detection (avoids unintended lane departures)

ODD Monitoring and Exit Strategy

Real-Time ODD Validation

Challenge: Ensure vehicle operates only within ODD (dynamic conditions)

ODD Monitors (runtime checks):

import time

import numpy as np


class ODD_Monitor:
    """
    Real-time ODD validation (runs every 100ms)

    lka_controller, hmi_display, hmi_audio and event_logger are vehicle
    interfaces assumed to be provided elsewhere in the system.
    """

    def __init__(self):
        self.speed_limit = (60, 130)  # km/h
        self.confidence_threshold = 0.5
        self.brightness_min = 30  # Minimum mean image brightness, 0-255 (daytime)
        self.rain_intensity_max = 50  # Above this, rain counts as heavy

    def check_ODD_compliance(self, vehicle_state, camera_image, lane_confidence):
        """
        Returns: (in_ODD: bool, exit_reason: str)
        """
        # Check 1: Speed within ODD range
        min_speed, max_speed = self.speed_limit
        if not (min_speed <= vehicle_state.speed <= max_speed):
            return False, (f"Speed {vehicle_state.speed} km/h out of ODD "
                           f"({min_speed}-{max_speed} km/h)")

        # Check 2: Lane confidence sufficient
        if lane_confidence < self.confidence_threshold:
            return False, (f"Lane confidence {lane_confidence:.2f} < "
                           f"{self.confidence_threshold} (low)")

        # Check 3: Lighting conditions (daytime only)
        brightness = float(np.mean(camera_image))
        if brightness < self.brightness_min:
            return False, (f"Low light (brightness {brightness:.0f} < "
                           f"{self.brightness_min}, night)")

        # Check 4: Rain sensor (optional; skipped if the vehicle has none)
        rain_intensity = getattr(vehicle_state, "rain_intensity", None)
        if rain_intensity is not None and rain_intensity > self.rain_intensity_max:
            return False, "Heavy rain detected (ODD excludes heavy rain)"

        return True, "OK"  # Within ODD

    def handle_ODD_exit(self, exit_reason, vehicle_state):
        """
        Safe degradation when ODD exit detected
        """
        print(f"ODD EXIT: {exit_reason}")

        # Disable LKA gradually (not abrupt)
        lka_controller.ramp_down_torque(duration=2.0)  # 2-second ramp to 0 Nm

        # Alert driver
        hmi_display.show_message("LKA disabled: " + exit_reason)
        hmi_audio.play_beep(priority="medium")

        # Log event (for post-market analysis)
        event_logger.log({
            "timestamp": time.time(),
            "event": "ODD_EXIT",
            "reason": exit_reason,
            "location": vehicle_state.gps_coords
        })

ODD Exit Example: Tunnel Entrance (Lighting Transition)

t=0s:   Daytime highway, confidence = 0.88 → LKA ACTIVE
t=10s:  Approaching tunnel, brightness drops from 180 → 120 (still OK)
t=12s:  Tunnel entrance, brightness = 25 → ODD EXIT (below 30)
        → LKA disables over 2 seconds (ramp down steering torque)
        → Display: "LKA disabled: Low light"
t=15s:  Inside tunnel, driver has full manual control
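The 2-second torque ramp in this example can be pictured as a sequence of setpoints, one per 100 ms monitor cycle. The sketch below assumes a linear ramp, which the text does not specify; `ramp_down_profile` and its parameters are hypothetical names and not part of the LKA controller interface.

```python
def ramp_down_profile(initial_torque_nm, duration_s=2.0, step_s=0.1):
    """Return the torque setpoints (Nm) of a linear ramp from the current value to 0."""
    if duration_s <= 0 or step_s <= 0:
        raise ValueError("duration_s and step_s must be positive")
    steps = max(1, round(duration_s / step_s))
    # Setpoint after each step; the last entry is exactly 0.0
    return [initial_torque_nm * (1 - i / steps) for i in range(1, steps + 1)]


# A 3 Nm steering torque released over 2 s in 100 ms steps: 20 setpoints
profile = ramp_down_profile(3.0)
```

A gradual release gives the driver time to take over the steering load instead of facing a sudden torque step at the tunnel entrance.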

Unknown Unsafe Scenario Discovery

Field Data Analysis (Post-Market)

Goal: Identify "Unknown Unsafe" scenarios from real-world operation

Data Collection (Shadow Mode):

Duration: 12 months post-launch (100 vehicles in field)
Data: Camera images + LKA state + driver interventions
Trigger: Log data when:
1. LKA confidence drops <0.3 (very low, unexpected)
2. Driver overrides LKA (hands on wheel, steering torque >5 Nm)
3. Lane departure warning triggers (unintended lane crossing)
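The three logging triggers can be checked per frame with a simple predicate. This is a minimal sketch; the `Frame` record, its field names, and the reason strings are assumptions made for illustration, and only the 0.3 and 5 Nm thresholds come from the trigger list above.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    confidence: float        # lane detection confidence, 0.0-1.0
    driver_torque_nm: float  # signed steering torque applied by the driver
    ldw_active: bool         # lane departure warning currently triggered


def logging_triggers(frame: Frame) -> list:
    """Return the reasons (possibly none) for logging this frame."""
    reasons = []
    if frame.confidence < 0.3:
        reasons.append("LOW_CONFIDENCE")
    if abs(frame.driver_torque_nm) > 5.0:
        reasons.append("DRIVER_OVERRIDE")
    if frame.ldw_active:
        reasons.append("LANE_DEPARTURE_WARNING")
    return reasons


# Confidence looks acceptable, but the driver countersteers with 8 Nm
reasons = logging_triggers(Frame(confidence=0.68, driver_torque_nm=-8.0, ldw_active=False))
```

The override trigger matters because it captures events where the confidence score itself stays high, which the confidence trigger alone would miss.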

Example Discovered Scenario: Road Art Confusion

Scenario ID: SOTIF-FIELD-0012
Date Discovered: Month 8 post-launch
Location: Urban street in Berlin, Germany

Discovered Scenario: Road Art Mistaken for Lane Line
────────────────────────────────────────────────────────

Description:
  Crosswalk painted with decorative patterns (zebra stripes + colorful art)
  CNN misinterpreted art as lane lines → False lane detection

Visual Evidence:
  [Camera Image: Crosswalk with geometric patterns resembling lane markings]

LKA Behavior:
  - Confidence: 0.68 (above 0.5 threshold, LKA remained active)
  - Steering command: 3 Nm left (attempting to "follow" fake lane)
  - Driver intervention: YES (driver countersteered, 8 Nm right)

Root Cause:
  - Training data: 99.9% standard white/yellow lane lines
  - Crosswalk art: Not in training data (dataset gap)
  - CNN: Generalized incorrectly (colorful patterns → lane-like features)

Risk Assessment:
  - Severity: S=2 (Minor injury, driver corrects easily; project risk-matrix scale, not the ISO 26262 class S2, which denotes severe injuries)
  - Probability: P=1 (Rare, <1 in 10,000 km)
  - Risk: S × P = 2 (Low, acceptable)

Mitigation:
  1. Add 500 crosswalk images to dataset v1.1 (retrain model)
  2. Update confidence scoring (penalize high-frequency patterns, crosswalks)
  3. ODD refinement: Exclude urban streets with crosswalks by enforcing the highway-only ODD at runtime (the event shows LKA could engage on an urban street)

Status: Mitigated in software v2.1 (released Month 10)

SOTIF Benefit: Continuous learning from field failures → Safer system over time

SOTIF Validation Report

Test Coverage Summary

Total Scenarios Tested: 9,542 (sum of the category counts below)

Category	Scenarios	Pass	Fail	Pass Rate
ODD Nominal	6,000	5,987	13	99.8% [PASS]
ODD Boundary	2,000	1,856	144	92.8% [PASS]
Known Unsafe	1,000	982	18	98.2% [PASS] (LKA disabled correctly)
Unknown (Discovered)	542	498	44	91.9% [PASS]

Overall: 9,323 / 9,542 = 97.7% pass rate

Failure Analysis (219 failures):

144 ODD Boundary: Confidence 0.45-0.55 (near threshold, borderline cases)
44 Unknown scenarios: Novel situations (e.g., road art, temporary markings)
18 Known Unsafe: LKA failed to disable (confidence scoring bug, fixed)
13 ODD Nominal: Unexplained (investigated, no pattern found)
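The per-category pass rates and the failure total can be recomputed from the pass and fail counts in the coverage table. A minimal sketch follows; the dictionary layout and function names are illustrative.

```python
# (passed, failed) per category, taken from the test coverage table
results = {
    "ODD Nominal": (5987, 13),
    "ODD Boundary": (1856, 144),
    "Known Unsafe": (982, 18),
    "Unknown (Discovered)": (498, 44),
}


def pass_rate(passed: int, failed: int) -> float:
    """Fraction of passed scenarios in one category."""
    total = passed + failed
    if total == 0:
        raise ValueError("category has no scenarios")
    return passed / total


def summarize(results: dict):
    """Return ({category: pass rate in %}, total number of failures)."""
    rates = {name: round(100 * pass_rate(p, f), 1) for name, (p, f) in results.items()}
    total_failed = sum(f for _, f in results.values())
    return rates, total_failed


rates, total_failed = summarize(results)
# Share of all failures that come from the ODD boundary category
boundary_share = results["ODD Boundary"][1] / total_failed
```

Roughly two thirds of the failures come from the ODD boundary category, which is consistent with the borderline confidence values noted in the failure analysis.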

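Verdict: [PASS] The overall pass rate exceeds the project's 95% acceptance target. ISO 21448 does not define a numeric pass-rate threshold; the target has to be justified in the validation strategy, and a pass rate is not the same as scenario coverage.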

TÜV Assessment for SOTIF

Assessor Questions (Common)

Q1: "How do you ensure the ML model doesn't fail on scenarios not in the training data?"

Answer:

ODD Definition: Clear boundaries (highway, dry, daytime) → Model validated only within ODD
Confidence Scoring: Low-confidence predictions (<0.5) → LKA disables (safe degradation)
Scenario Testing: 9,542 scenarios (edge cases, ODD boundary, discovered scenarios) → 97.7% pass rate (9,323 passed)
Field Monitoring: Shadow mode (100 vehicles, 12 months) → Continuous scenario discovery

Q2: "What happens if the model encounters an adversarial attack (e.g., projected fake lane lines)?"

Answer:

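Threat Model: Adversarial attacks are treated as outside SOTIF scope (malicious intent, not a performance limitation in normal driving) and are handled under cybersecurity engineering (ISO/SAE 21434)
Risk: Low probability (requires physical proximity, projection equipment)
Mitigation:
- Confidence scoring may flag anomalies (fake lanes → unusual patterns → low confidence), although this is not guaranteed
- Driver supervision (driver can override LKA anytime)
- Security measures: Encrypted camera feed (future hardening; protects the data path, not against physically projected markings)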

Q3: "How do you validate that the 10,000 scenarios are representative of real-world driving?"

Answer:

Data Collection: 500,000 km driven across Europe (Germany, France, Italy)
ODD Coverage: 95% of scenarios within ODD (highway, dry, daytime)
Expert Review: Safety engineer + domain experts reviewed scenario catalog
Field Validation: 100 vehicles, 12 months → 0 SOTIF-related accidents (driver interventions such as SOTIF-FIELD-0012 were logged and mitigated)

Outcome: [PASS] TÜV SÜD approved SOTIF compliance (ISO 21448 certificate issued)

Summary

SOTIF for ML-Based Lane Detection:

SOTIF Activity	Deliverable	Tool/Method	ISO 21448 Clause
Hazard Analysis	SOTIF 4-quadrant analysis (25 hazards)	Brainstorming, FMEA	Clause 6.3
Scenario Testing	9,542 test scenarios (97.7% pass rate)	Simulation, real-world	Clauses 10-11
Safe Degradation	Confidence-based LKA disable (<0.5)	Runtime monitoring	Clause 8.3
ODD Definition	ODD specification (highway, 60-130 km/h, dry)	Requirements	Clause 5
Field Monitoring	Shadow mode (100 vehicles, 12 months)	Data logging, analytics	Clause 13

AI Contribution to SOTIF:

Scenario generation: Generative AI (ChatGPT with GPT-4) suggested 200 edge case scenarios (experts validated)
Confidence scoring: Heuristic-based (classical CV features + ML output), not AI
Field data analysis: ML clustering (group similar failure modes, 50% faster root cause analysis)

Key Insight: SOTIF is essential for ML safety. ISO 26262 alone is insufficient because it does not address ML performance limitations.

ISO/PAS 8800 Update: ISO/PAS 8800:2024 (Road vehicles: Safety and artificial intelligence), published in December 2024, provides more specific guidance for ML safety beyond SOTIF. Consult it for MLE process requirements and ML-specific safety argumentation patterns.

Next: Perception pipeline implementation and lessons learned (28.03).