4.1: Integration Testing

What You'll Learn

By the end of this section, you will be able to:

  • Understand the xIL testing hierarchy and when to apply each simulation level
  • Design AI-assisted integration test strategies for embedded systems
  • Configure Hardware-in-the-Loop test environments with AI-driven scenario generation
  • Implement continuous integration testing with intelligent test selection and prioritization
  • Apply ML-based defect pattern analysis to integration test results
  • Map integration testing activities to ASPICE SWE.5 and SYS.4 process requirements

Key Terms

Term            | Definition
--------------- | ----------
SIL             | Software-in-the-Loop—compiled code executing on the host PC with simulated I/O
PIL             | Processor-in-the-Loop—compiled code executing on the target processor with simulated I/O
HIL             | Hardware-in-the-Loop—target ECU connected to a real-time simulator replacing the physical plant
MIL             | Model-in-the-Loop—testing at the model level before code generation
xIL             | Collective term for all simulation-in-the-loop testing approaches
Test Bed        | Physical or virtual infrastructure required to execute integration tests
Plant Model     | Mathematical model simulating the physical system an ECU controls
Fault Injection | Deliberate introduction of faults to verify error handling and safety mechanisms

Introduction

Hardware-in-the-Loop (HIL), Software-in-the-Loop (SIL), and Processor-in-the-Loop (PIL) testing provide different levels of simulation fidelity for embedded systems verification. This section covers strategies and tools for xIL testing.

Cross-Reference: For ASPICE process requirements related to integration testing, see Part II ASPICE Processes:

  • SWE.5: Software Component Verification and Integration Verification
  • SYS.4: System Integration and Integration Verification
  • SYS.5: System Verification
  • For CI/CD pipeline integration of xIL tests, see 16.00 CI/CD Integration

Integration testing bridges the gap between isolated unit verification and full system qualification. In safety-critical embedded systems governed by ASPICE 4.0 and ISO 26262, integration testing must be systematic, traceable, and evidence-producing. AI augments every stage of this process—from test strategy design through defect analysis—while the human engineer retains accountability for test adequacy and sign-off.


xIL Testing Hierarchy

The following diagram shows the xIL (X-in-the-Loop) testing hierarchy, progressing from MIL through SIL, PIL, and HIL stages with increasing hardware fidelity at each level.

[Figure: xIL Testing Hierarchy]

Each level in the xIL hierarchy trades execution speed for hardware fidelity:

Level | Execution Environment            | Fidelity        | Speed     | Primary Purpose
----- | -------------------------------- | --------------- | --------- | ---------------
MIL   | MATLAB/Simulink on host          | Model-level     | Very fast | Algorithm validation
SIL   | Compiled C on host PC            | Functional      | Fast      | Logic and interface verification
PIL   | Compiled C on target MCU         | Timing-accurate | Moderate  | Stack usage, real-time constraints
HIL   | Target ECU + real-time simulator | Full system     | Slow      | System-level validation, I/O verification

Selection guidance: Begin with SIL for rapid iteration during development; graduate to PIL when timing behavior matters; reserve HIL for system-level acceptance and safety validation. AI can recommend the appropriate xIL level for each test case based on what the test exercises (pure logic vs. timing vs. I/O).
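That recommendation logic can be sketched as a simple, reviewable rule engine; the attribute names and decision rules below are illustrative assumptions, not part of any standard tool:

```python
from dataclasses import dataclass

@dataclass
class TestCaseProfile:
    """Attributes of a test case relevant to xIL level selection (illustrative)."""
    name: str
    exercises_timing: bool = False   # asserts on deadlines, WCET, or jitter
    exercises_io: bool = False       # needs real buses, sensors, or actuators
    safety_relevant: bool = False    # traces to an ASIL-rated requirement

def recommend_xil_level(tc: TestCaseProfile) -> str:
    """Map what a test exercises to the cheapest adequate xIL level."""
    if tc.exercises_io or tc.safety_relevant:
        return "HIL"   # real I/O and safety validation need full fidelity
    if tc.exercises_timing:
        return "PIL"   # timing behavior only holds on the target processor
    return "SIL"       # pure logic runs fastest on the host
```

In practice an LLM or trained classifier would populate TestCaseProfile from the test's requirement trace; keeping the final mapping as explicit rules leaves the decision auditable.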


Test Strategy Design

Designing an effective integration test strategy requires understanding component dependencies, interface contracts, and risk profiles. AI assists at each decision point.

AI-Assisted Strategy Generation

An LLM can analyze software architecture descriptions and interface specifications to propose an integration sequence and identify high-risk interfaces:

#!/usr/bin/env python3
"""
AI-assisted integration test strategy generator.
Analyzes architecture and produces a prioritized test plan.
"""

import json
from pathlib import Path
from anthropic import Anthropic

def generate_integration_strategy(
    architecture_doc: str,
    interface_specs: list[str],
    known_risks: list[str]
) -> dict:
    """Use LLM to generate an integration test strategy."""
    client = Anthropic()

    prompt = f"""You are an embedded systems test architect working under ASPICE 4.0.
Given the following software architecture and interface specifications,
generate an integration test strategy.

## Architecture
{architecture_doc}

## Interface Specifications
{json.dumps(interface_specs, indent=2)}

## Known Risk Areas
{json.dumps(known_risks, indent=2)}

Produce a JSON response with:
1. "integration_sequence": ordered list of integration steps (bottom-up)
2. "critical_interfaces": interfaces requiring exhaustive testing
3. "test_levels": mapping of each integration step to xIL level (SIL/PIL/HIL)
4. "boundary_conditions": key boundary values per interface
5. "fault_injection_scenarios": faults to inject at each integration point
6. "estimated_test_count": approximate number of test cases per step

Return only valid JSON."""

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}]
    )

    return json.loads(response.content[0].text)

def validate_strategy_coverage(
    strategy: dict,
    requirements: list[str]
) -> dict:
    """Check that every integration requirement is covered by the strategy."""
    covered = set()
    uncovered = []

    for step in strategy.get("integration_sequence", []):
        for req in step.get("traced_requirements", []):
            covered.add(req)

    for req in requirements:
        if req not in covered:
            uncovered.append(req)

    return {
        "coverage_percent": len(covered) / max(len(requirements), 1) * 100,
        "uncovered_requirements": uncovered
    }

HITL Checkpoint: AI-generated strategies must be reviewed by the test architect. The engineer validates the integration sequence against the actual build order, confirms xIL level assignments, and adds domain-specific fault scenarios the LLM may have missed.

Integration Approaches

Approach   | Description                              | Best For                      | AI Support
---------- | ---------------------------------------- | ----------------------------- | ----------
Bottom-Up  | Integrate from leaf modules upward       | Driver/HAL-level code         | Dependency graph analysis
Top-Down   | Integrate from application layer downward | Application logic validation  | Stub generation
Big-Bang   | Integrate all components at once         | Small systems, late projects  | Risk assessment
Sandwich   | Combine top-down and bottom-up           | Complex layered architectures | Layer boundary identification
Risk-Based | Integrate highest-risk interfaces first  | Safety-critical systems       | Risk scoring from defect history
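For the bottom-up approach, dependency graph analysis reduces to a topological sort of the module call graph. A minimal sketch, using a hypothetical module graph:

```python
from graphlib import TopologicalSorter

def bottom_up_order(calls: dict[str, set[str]]) -> list[str]:
    """Return an integration order where every module follows the modules it calls.

    calls maps each module to the set of modules it depends on.
    """
    # TopologicalSorter emits predecessors first, which is exactly bottom-up.
    return list(TopologicalSorter(calls).static_order())

# Hypothetical module graph: door_lock calls motor_drv and can_if,
# both of which are built on the HAL.
calls = {
    "door_lock": {"motor_drv", "can_if"},
    "motor_drv": {"hal"},
    "can_if": {"hal"},
    "hal": set(),
}
order = bottom_up_order(calls)  # hal integrates first, door_lock last
```

The same graph, walked in reverse, yields the top-down sequence and identifies where stubs are needed.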

SIL Testing Setup

/**
 * @file sil_test_door_lock.c
 * @brief SIL tests for door lock - host PC execution
 */

#include "unity.h"
#include "door_lock.h"
#include "sil_motor_stub.h"
#include "sil_timer_stub.h"

/* SIL environment setup */
void setUp(void)
{
    SIL_Timer_Init();
    SIL_Motor_Init();
    DoorLock_Init();
}

void tearDown(void)
{
    SIL_Timer_DeInit();
    SIL_Motor_DeInit();
}

/**
 * @test SIL: Lock operation timing verification
 */
void test_SIL_LockTiming_CompletesWithinSpec(void)
{
    /* Arrange */
    uint32_t start_time = SIL_Timer_GetTicks();

    /* Act */
    DoorLock_ProcessCommand(0, LOCK_CMD_LOCK);

    /* Simulate motor operation */
    SIL_Motor_SimulateOperation(0, 500);  /* 500ms operation */
    DoorLock_MainFunction();

    /* Assert timing */
    uint32_t elapsed = SIL_Timer_GetTicks() - start_time;
    TEST_ASSERT_LESS_THAN(1000, elapsed);  /* < 1 second spec */
}

SIL Environment Architecture

SIL tests replace hardware-dependent modules with host-compatible stubs. The key to effective SIL testing is maintaining interface fidelity while abstracting hardware timing:

/**
 * @file sil_can_stub.c
 * @brief SIL stub for CAN communication - simulates message exchange
 */

#include "sil_can_stub.h"
#include <string.h>

#define SIL_CAN_QUEUE_SIZE  64

static SIL_CanMessage_t rx_queue[SIL_CAN_QUEUE_SIZE];
static uint32_t rx_head = 0;
static uint32_t rx_tail = 0;

void SIL_CAN_Init(void)
{
    rx_head = 0;
    rx_tail = 0;
    memset(rx_queue, 0, sizeof(rx_queue));
}

/**
 * @brief Inject a CAN message into the SIL receive queue.
 * Called by test code to simulate incoming CAN frames.
 */
void SIL_CAN_InjectRxMessage(uint32_t id, const uint8_t *data, uint8_t dlc)
{
    uint32_t next_head = (rx_head + 1) % SIL_CAN_QUEUE_SIZE;
    if (next_head == rx_tail) {
        return;  /* Queue full: drop the frame rather than overwrite unread data */
    }
    rx_queue[rx_head].id = id;
    memcpy(rx_queue[rx_head].data, data, dlc);
    rx_queue[rx_head].dlc = dlc;
    rx_head = next_head;
}

/**
 * @brief Read next CAN message from the SIL queue.
 * Replaces the real CAN driver HAL_CAN_Receive().
 */
bool SIL_CAN_Receive(SIL_CanMessage_t *msg)
{
    if (rx_tail == rx_head) {
        return false;  /* Queue empty */
    }
    *msg = rx_queue[rx_tail];
    rx_tail = (rx_tail + 1) % SIL_CAN_QUEUE_SIZE;
    return true;
}

PIL Testing Framework

#!/usr/bin/env python3
"""
PIL Test Framework for embedded targets.
Executes tests on real target hardware via debug interface.
"""

from dataclasses import dataclass
from typing import List, Dict, Any
import subprocess

@dataclass
class PILTestResult:
    """PIL test execution result."""
    test_name: str
    passed: bool
    execution_time_us: int
    stack_usage: int
    output: str

class PILTestRunner:
    """Runs tests on target via debug probe."""

    def __init__(self, target: str = "stm32f4"):
        self.target = target
        self.debug_interface = "openocd"

    def flash_and_run(self, test_binary: str) -> List[PILTestResult]:
        """Flash test binary and collect results."""
        # Flash to target
        self._flash_binary(test_binary)

        # Run tests and collect via SWO/RTT
        results = self._execute_tests()

        return results

    def _flash_binary(self, binary: str) -> None:
        """Flash binary to target."""
        cmd = [
            "openocd",
            "-f", f"target/{self.target}.cfg",
            "-c", f"program {binary} verify reset exit"
        ]
        subprocess.run(cmd, check=True)

    def _execute_tests(self) -> List[PILTestResult]:
        """Execute tests and parse output."""
        # Implementation for result collection
        return []
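The result-collection stub above would typically parse test output streamed from the target over SWO or RTT. A sketch of such a parser follows; the annotated line format (`time_us=`, `stack=`) assumes a customized Unity results reporter and is not standard Unity output:

```python
import re
from dataclasses import dataclass

@dataclass
class PILTestResult:
    """Same shape as the PILTestResult dataclass above."""
    test_name: str
    passed: bool
    execution_time_us: int
    stack_usage: int
    output: str

# Example line from the (hypothetical) instrumented reporter:
#   test_TaskA_Deadline:PASS time_us=412 stack=256
RESULT_RE = re.compile(
    r"(?P<name>\w+):(?P<verdict>PASS|FAIL)"
    r"(?:\s+time_us=(?P<time>\d+))?"
    r"(?:\s+stack=(?P<stack>\d+))?"
)

def parse_pil_output(raw: str) -> list[PILTestResult]:
    """Convert streamed target output into PILTestResult records."""
    results = []
    for line in raw.splitlines():
        m = RESULT_RE.search(line)
        if m:
            results.append(PILTestResult(
                test_name=m.group("name"),
                passed=m.group("verdict") == "PASS",
                execution_time_us=int(m.group("time") or 0),
                stack_usage=int(m.group("stack") or 0),
                output=line.strip(),
            ))
    return results
```

Lines that do not match the pattern (boot banners, trace noise) are simply skipped.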

PIL Timing Verification

PIL testing is uniquely suited for verifying real-time constraints because code runs on the actual target processor. AI can analyze PIL timing results to detect performance regressions and predict timing violations before they become safety issues:

"""
PIL timing analysis with AI-assisted regression detection.
"""

import json
import statistics
from dataclasses import dataclass

@dataclass
class TimingBudget:
    """Timing budget for a real-time task."""
    task_name: str
    wcet_budget_us: int      # Worst-case execution time budget
    measured_wcet_us: int    # Measured WCET from PIL
    measured_avg_us: int     # Average execution time from PIL
    margin_percent: float    # Remaining margin

def analyze_pil_timing(results: list[PILTestResult], budgets: dict) -> list[TimingBudget]:
    """Analyze PIL results against timing budgets."""
    analysis = []

    for task_name, budget_us in budgets.items():
        task_results = [r for r in results if r.test_name.startswith(task_name)]
        if not task_results:
            continue

        times = [r.execution_time_us for r in task_results]
        wcet = max(times)
        avg = int(statistics.mean(times))
        margin = ((budget_us - wcet) / budget_us) * 100

        analysis.append(TimingBudget(
            task_name=task_name,
            wcet_budget_us=budget_us,
            measured_wcet_us=wcet,
            measured_avg_us=avg,
            margin_percent=round(margin, 1)
        ))

    return analysis

def detect_timing_regression(
    current: list[TimingBudget],
    baseline: list[TimingBudget],
    threshold_percent: float = 10.0
) -> list[str]:
    """Flag tasks where WCET increased beyond threshold."""
    warnings = []
    baseline_map = {t.task_name: t for t in baseline}

    for task in current:
        if task.task_name in baseline_map:
            prev_wcet = baseline_map[task.task_name].measured_wcet_us
            if prev_wcet > 0:
                increase = ((task.measured_wcet_us - prev_wcet) / prev_wcet) * 100
                if increase > threshold_percent:
                    warnings.append(
                        f"{task.task_name}: WCET increased {increase:.1f}% "
                        f"({prev_wcet}us -> {task.measured_wcet_us}us)"
                    )

    return warnings

HIL Test Environment

# hil_config.yaml
hil_environment:
  simulator:
    vendor: "dSPACE"
    model: "SCALEXIO"
    configuration: "bcm_door_lock.sdf"

  ecu:
    target: "STM32F446"
    interface: "XCP_CAN"
    can_channel: "CAN1"

  test_scenarios:
    - name: "Lock_Unlock_Cycle"
      inputs:
        - signal: "LockSwitch"
          profile: "pulse"
          duration_ms: 100
      outputs:
        - signal: "MotorPWM"
          expected: "ramp_up"
        - signal: "LockPosition"
          expected: "locked"
          timeout_ms: 2000

    - name: "Emergency_Unlock"
      precondition: "doors_locked"
      inputs:
        - signal: "CrashSensor"
          value: 1
      outputs:
        - signal: "AllDoorsUnlocked"
          expected: true
          timeout_ms: 100

Hardware-in-the-Loop Testing with AI Integration

HIL testing represents the highest fidelity in the xIL hierarchy. The target ECU runs production firmware while a real-time simulator replaces the physical environment (plant). AI integration enhances HIL testing in three key areas: scenario generation, anomaly detection, and test optimization.

AI-Driven Test Scenario Generation

Manually designing HIL test scenarios is time-consuming and limited by the engineer's imagination. AI can generate scenarios that explore edge cases and unusual operating conditions:

#!/usr/bin/env python3
"""
AI-powered HIL test scenario generator.
Generates plant model stimulation profiles for dSPACE or NI HIL rigs.
"""

import json
from anthropic import Anthropic

def generate_hil_scenarios(
    system_description: str,
    existing_scenarios: list[dict],
    failure_modes: list[str],
    coverage_gaps: list[str]
) -> str:
    """Generate new HIL scenarios as YAML text targeting coverage gaps."""
    client = Anthropic()

    prompt = f"""You are a HIL test engineer for automotive embedded systems.
Generate new test scenarios for the following system.

## System Under Test
{system_description}

## Existing Test Scenarios (do not duplicate)
{json.dumps(existing_scenarios, indent=2)}

## Known Failure Modes to Cover
{json.dumps(failure_modes, indent=2)}

## Coverage Gaps to Address
{json.dumps(coverage_gaps, indent=2)}

Generate 5-10 new HIL test scenarios in YAML format. Each scenario must include:
- name: descriptive test name
- preconditions: initial system state
- inputs: list of signals with timing profiles
- expected_outputs: list of signals with expected values and tolerances
- timeout_ms: maximum test duration
- safety_relevance: ASIL classification if applicable
- rationale: why this scenario is needed

Return valid YAML only."""

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}]
    )

    return response.content[0].text

def generate_fault_injection_profile(
    interface: str,
    fault_type: str,
    duration_ms: int
) -> dict:
    """Generate a fault injection profile for HIL testing."""
    profiles = {
        "can_bus_off": {
            "interface": interface,
            "fault": "bus_off",
            "trigger": "immediate",
            "duration_ms": duration_ms,
            "recovery": "automatic"
        },
        "sensor_drift": {
            "interface": interface,
            "fault": "linear_drift",
            "rate": 0.5,  # percent per second
            "duration_ms": duration_ms,
            "recovery": "none"
        },
        "signal_stuck": {
            "interface": interface,
            "fault": "stuck_at",
            "value": 0,
            "duration_ms": duration_ms,
            "recovery": "value_restore"
        },
        "power_brownout": {
            "interface": interface,
            "fault": "voltage_dip",
            "voltage_v": 6.0,  # below nominal 12V
            "duration_ms": duration_ms,
            "recovery": "ramp_restore"
        }
    }
    return profiles.get(fault_type, {})

HIL Anomaly Detection

AI monitors HIL test execution in real time, flagging anomalous signal behaviors that traditional pass/fail assertions would miss:

"""
Real-time HIL signal anomaly detection using statistical methods.
"""

import numpy as np
from dataclasses import dataclass

@dataclass
class SignalAnomaly:
    """Detected anomaly in a HIL test signal."""
    signal_name: str
    timestamp_ms: float
    anomaly_type: str       # spike, drift, oscillation, stuck
    severity: str           # low, medium, high
    measured_value: float
    expected_range: tuple[float, float]
    description: str

class HILAnomalyDetector:
    """Detects anomalies in HIL test signal traces."""

    def __init__(self, window_size: int = 100, z_threshold: float = 3.0):
        self.window_size = window_size
        self.z_threshold = z_threshold
        self.signal_history: dict[str, list[float]] = {}

    def feed_sample(self, signal_name: str, timestamp_ms: float,
                    value: float) -> SignalAnomaly | None:
        """Process a new signal sample and check for anomalies."""
        if signal_name not in self.signal_history:
            self.signal_history[signal_name] = []

        history = self.signal_history[signal_name]
        history.append(value)

        if len(history) < self.window_size:
            return None  # Not enough data yet

        window = history[-self.window_size:]
        mean = np.mean(window[:-1])
        std = np.std(window[:-1])

        if std < 1e-9:
            # Near-zero variance: check for stuck signal
            if len(history) > self.window_size * 2:
                return SignalAnomaly(
                    signal_name=signal_name,
                    timestamp_ms=timestamp_ms,
                    anomaly_type="stuck",
                    severity="medium",
                    measured_value=value,
                    expected_range=(mean - 0.1, mean + 0.1),
                    description=f"Signal {signal_name} stuck at {value:.3f}"
                )
            return None

        z_score = abs(value - mean) / std
        if z_score > self.z_threshold:
            return SignalAnomaly(
                signal_name=signal_name,
                timestamp_ms=timestamp_ms,
                anomaly_type="spike",
                severity="high" if z_score > 5.0 else "medium",
                measured_value=value,
                expected_range=(mean - self.z_threshold * std,
                                mean + self.z_threshold * std),
                description=(f"Signal {signal_name} spike: z-score={z_score:.1f}, "
                             f"value={value:.3f}, expected~{mean:.3f}")
            )

        return None
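The z-score check catches spikes; slow sensor drift (also listed in the fault profiles above) needs a complementary trend test. A self-contained sketch using a least-squares slope over a sample window, with an illustrative threshold convention:

```python
import numpy as np

def detect_drift(samples: list[float], slope_threshold: float) -> bool:
    """Flag drift when the least-squares slope over the window exceeds threshold.

    samples holds one value per fixed sample period, so the slope is
    expressed in signal units per sample.
    """
    if len(samples) < 2:
        return False
    x = np.arange(len(samples))
    # polyfit with degree 1 returns (slope, intercept)
    slope, _ = np.polyfit(x, np.asarray(samples), 1)
    return abs(slope) > slope_threshold

# A sensor creeping 0.01 units per sample trips a 0.005 threshold,
# while a steady signal does not.
```

A production detector would run spike, stuck, and drift checks side by side on each signal trace.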

Continuous Integration Testing

Integration testing must run automatically and provide rapid feedback. AI enhances CI-based integration testing through intelligent test selection, execution prioritization, and result triage.

Automated Test Selection and Prioritization

When a code change affects a subset of modules, running the entire integration test suite is wasteful. AI selects and orders tests by predicted relevance:

#!/usr/bin/env python3
"""
AI-powered integration test selector for CI pipelines.
Prioritizes tests based on code changes, historical failures, and risk.
"""

import json
import subprocess
from pathlib import Path
from dataclasses import dataclass, field

@dataclass
class TestPriority:
    """A test case with computed execution priority."""
    test_id: str
    priority_score: float       # 0.0 (skip) to 1.0 (must run)
    reasons: list[str] = field(default_factory=list)
    estimated_duration_s: int = 0

class IntegrationTestSelector:
    """Selects and prioritizes integration tests for CI."""

    def __init__(self, coverage_map_path: str, history_path: str):
        self.coverage_map = self._load_json(coverage_map_path)
        self.history = self._load_json(history_path)

    def _load_json(self, path: str) -> dict:
        p = Path(path)
        return json.loads(p.read_text()) if p.exists() else {}

    def get_changed_modules(self) -> list[str]:
        """Identify modules affected by the current changeset."""
        result = subprocess.run(
            ["git", "diff", "--name-only", "HEAD~1"],
            capture_output=True, text=True
        )
        changed_files = result.stdout.strip().split("\n")
        modules = set()
        for f in changed_files:
            # Map file paths to module names
            parts = Path(f).parts
            if len(parts) >= 2 and parts[0] == "src":
                modules.add(parts[1])
        return list(modules)

    def prioritize(self, changed_modules: list[str]) -> list[TestPriority]:
        """Compute priority scores for all integration tests."""
        priorities = []

        for test_id, test_info in self.coverage_map.items():
            score = 0.0
            reasons = []

            # Factor 1: Direct module coverage
            covered_modules = set(test_info.get("covers_modules", []))
            overlap = covered_modules.intersection(changed_modules)
            if overlap:
                score += 0.5
                reasons.append(f"Covers changed modules: {', '.join(overlap)}")

            # Factor 2: Historical failure correlation
            fail_rate = self.history.get(test_id, {}).get("fail_rate", 0.0)
            if fail_rate > 0.1:
                score += 0.3 * min(fail_rate, 1.0)
                reasons.append(f"Historical fail rate: {fail_rate:.0%}")

            # Factor 3: Safety criticality
            if test_info.get("safety_relevant", False):
                score += 0.2
                reasons.append("Safety-relevant test")

            priorities.append(TestPriority(
                test_id=test_id,
                priority_score=min(score, 1.0),
                reasons=reasons,
                estimated_duration_s=test_info.get("duration_s", 60)
            ))

        # Sort descending by priority
        priorities.sort(key=lambda t: t.priority_score, reverse=True)
        return priorities

    def select_for_budget(
        self, priorities: list[TestPriority], time_budget_s: int
    ) -> list[TestPriority]:
        """Select tests that fit within the CI time budget."""
        selected = []
        remaining = time_budget_s

        for test in priorities:
            if test.priority_score < 0.1:
                continue  # Skip irrelevant tests
            if test.estimated_duration_s <= remaining:
                selected.append(test)
                remaining -= test.estimated_duration_s

        return selected

CI Pipeline Integration

Integration tests fit into the CI pipeline as a gated stage between unit tests and system-level HIL tests:

# .gitlab-ci.yml - Integration test stage with AI selection
integration-test:
  stage: integration
  image: registry.example.com/embedded/sil-runner:latest
  before_script:
    - python3 scripts/ai_test_selector.py --output selected_tests.json
  script:
    - |
      echo "Running $(jq length selected_tests.json) of $(ls tests/integration/ | wc -l) tests"
      python3 scripts/run_integration_tests.py --test-list selected_tests.json
  after_script:
    - python3 scripts/update_test_history.py --results integration_results.xml
  artifacts:
    reports:
      junit: integration_results.xml
    paths:
      - integration_logs/
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "develop"

Test Environment Management

Managing test beds—physical HIL rigs, SIL virtual machines, PIL debug probes—is a significant operational challenge. AI helps automate configuration, scheduling, and health monitoring.

AI-Assisted Test Bed Configuration

# test_bed_inventory.yaml
test_beds:
  - id: "HIL-RIG-01"
    type: "HIL"
    vendor: "dSPACE"
    model: "SCALEXIO"
    status: "available"
    capabilities:
      - can_channels: 4
      - lin_channels: 2
      - analog_inputs: 16
      - digital_io: 32
      - real_time_os: true
    assigned_project: null
    last_calibration: "2025-11-15"
    next_calibration: "2026-05-15"

  - id: "PIL-BENCH-01"
    type: "PIL"
    target_mcu: "STM32F446RE"
    debug_probe: "SEGGER J-Link"
    status: "available"
    capabilities:
      - swo_trace: true
      - rtt_channels: 4
      - flash_size_kb: 512
    assigned_project: null

  - id: "SIL-VM-POOL"
    type: "SIL"
    provider: "docker"
    instances: 8
    status: "available"
    capabilities:
      - renode_support: true
      - qemu_targets: ["arm", "riscv"]
      - parallel_execution: true
    assigned_project: null
"""
Test bed scheduler with AI-optimized allocation.
Matches test requirements to available test bed capabilities.
"""

from dataclasses import dataclass

@dataclass
class TestBedRequirement:
    """Requirements a test suite has for a test bed."""
    xil_level: str              # SIL, PIL, HIL
    target_mcu: str | None      # Required MCU for PIL/HIL
    can_channels_needed: int
    real_time_required: bool
    estimated_duration_min: int

@dataclass
class TestBedAllocation:
    """Result of scheduling a test suite to a test bed."""
    test_suite_id: str
    test_bed_id: str
    scheduled_start: str
    estimated_end: str
    conflict: bool

class TestBedScheduler:
    """Allocates test suites to available test beds."""

    def __init__(self, inventory: list[dict]):
        self.inventory = inventory

    def find_compatible_beds(
        self, requirement: TestBedRequirement
    ) -> list[dict]:
        """Find test beds matching the requirement."""
        compatible = []
        for bed in self.inventory:
            if bed["type"] != requirement.xil_level:
                continue
            if bed["status"] != "available":
                continue
            if requirement.target_mcu and bed.get("target_mcu") != requirement.target_mcu:
                continue
            caps = bed.get("capabilities", {})
            if isinstance(caps, list):
                cap_dict = {}
                for c in caps:
                    if isinstance(c, dict):
                        cap_dict.update(c)
                caps = cap_dict
            if caps.get("can_channels", 0) < requirement.can_channels_needed:
                continue
            if requirement.real_time_required and not caps.get("real_time_os", False):
                continue
            compatible.append(bed)
        return compatible

    def allocate(
        self, test_suite_id: str, requirement: TestBedRequirement
    ) -> TestBedAllocation | None:
        """Allocate the best available test bed for a test suite."""
        beds = self.find_compatible_beds(requirement)
        if not beds:
            return None

        # Prefer the bed with the fewest excess capabilities (right-sizing)
        best = min(beds, key=lambda b: self._excess_score(b, requirement))
        return TestBedAllocation(
            test_suite_id=test_suite_id,
            test_bed_id=best["id"],
            scheduled_start="now",
            estimated_end=f"+{requirement.estimated_duration_min}min",
            conflict=False
        )

    def _excess_score(self, bed: dict, req: TestBedRequirement) -> int:
        """Lower score means a better (tighter) fit."""
        caps = bed.get("capabilities", {})
        if isinstance(caps, list):
            cap_dict = {}
            for c in caps:
                if isinstance(c, dict):
                    cap_dict.update(c)
            caps = cap_dict
        excess = caps.get("can_channels", 0) - req.can_channels_needed
        return max(excess, 0)

Test Environment Health Monitoring

AI monitors test bed health to catch infrastructure issues before they corrupt test results:

Health Check              | Method                | Frequency           | AI Role
------------------------- | --------------------- | ------------------- | -------
Calibration expiry        | Date comparison       | Daily               | Alert before expiry
Signal integrity          | Baseline comparison   | Per test run        | Detect cable/connector degradation
Simulator license         | License server query  | Hourly              | Predict license contention
Debug probe connectivity  | Ping/handshake        | Before each PIL run | Auto-retry or escalate
Disk space on SIL runners | Filesystem check      | Hourly              | Predict when cleanup is needed
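The calibration-expiry check, for example, can run directly against the inventory file shown earlier; the field names follow that YAML, and the 30-day warning window is an illustrative choice:

```python
from datetime import date, timedelta

def calibration_alerts(test_beds: list[dict], warn_days: int = 30) -> list[str]:
    """Return alerts for beds whose calibration expires within warn_days."""
    alerts = []
    today = date.today()
    for bed in test_beds:
        next_cal = bed.get("next_calibration")
        if not next_cal:
            continue  # SIL/PIL beds without calibration dates are skipped
        due = date.fromisoformat(next_cal)
        if due - today <= timedelta(days=warn_days):
            alerts.append(f"{bed['id']}: calibration due {next_cal}")
    return alerts
```

Run daily in CI, this turns a silent calibration lapse (and the invalid test evidence it would produce) into an actionable ticket.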

Defect Pattern Analysis

Integration test failures often follow patterns that are invisible when analyzing individual results. ML-based defect analysis identifies systemic issues and predicts where future defects will cluster.

ML-Based Defect Prediction

"""
Defect pattern analyzer for integration test results.
Uses historical test data to predict defect-prone integration points.
"""

import json
from collections import Counter
from dataclasses import dataclass
from pathlib import Path

@dataclass
class DefectPattern:
    """A recurring defect pattern in integration testing."""
    pattern_id: str
    description: str
    affected_modules: list[str]
    occurrence_count: int
    severity_distribution: dict[str, int]  # {critical: N, major: N, minor: N}
    root_cause_category: str
    recommended_action: str

class IntegrationDefectAnalyzer:
    """Analyzes integration test defects for recurring patterns."""

    def __init__(self, history_path: str):
        self.history = self._load_history(history_path)

    def _load_history(self, path: str) -> list[dict]:
        p = Path(path)
        return json.loads(p.read_text()) if p.exists() else []

    def identify_patterns(self) -> list[DefectPattern]:
        """Identify recurring defect patterns from test history."""
        patterns = []

        # Pattern 1: Interface mismatch clusters
        interface_failures = [
            d for d in self.history
            if d.get("category") == "interface_mismatch"
        ]
        if len(interface_failures) > 3:
            modules = [m for d in interface_failures
                       for m in d.get("modules", [])]
            module_counts = Counter(modules)
            patterns.append(DefectPattern(
                pattern_id="PAT-INT-001",
                description="Recurring interface mismatch defects",
                affected_modules=[m for m, _ in module_counts.most_common(5)],
                occurrence_count=len(interface_failures),
                severity_distribution=self._count_severities(interface_failures),
                root_cause_category="interface_specification",
                recommended_action=(
                    "Review interface specifications between top affected "
                    "modules. Consider adding contract tests."
                )
            ))

        # Pattern 2: Timing-related failures
        timing_failures = [
            d for d in self.history
            if d.get("category") == "timing_violation"
        ]
        if len(timing_failures) > 2:
            modules = [m for d in timing_failures
                       for m in d.get("modules", [])]
            module_counts = Counter(modules)
            patterns.append(DefectPattern(
                pattern_id="PAT-TIM-001",
                description="Recurring timing violation defects",
                affected_modules=[m for m, _ in module_counts.most_common(5)],
                occurrence_count=len(timing_failures),
                severity_distribution=self._count_severities(timing_failures),
                root_cause_category="real_time_design",
                recommended_action=(
                    "Review task scheduling and priority assignments. "
                    "Consider PIL timing profiling for affected modules."
                )
            ))

        # Pattern 3: Resource contention
        resource_failures = [
            d for d in self.history
            if d.get("category") in ("deadlock", "race_condition", "memory_corruption")
        ]
        if len(resource_failures) > 1:
            modules = [m for d in resource_failures
                       for m in d.get("modules", [])]
            module_counts = Counter(modules)
            patterns.append(DefectPattern(
                pattern_id="PAT-RES-001",
                description="Resource contention defects",
                affected_modules=[m for m, _ in module_counts.most_common(5)],
                occurrence_count=len(resource_failures),
                severity_distribution=self._count_severities(resource_failures),
                root_cause_category="concurrency_design",
                recommended_action=(
                    "Audit shared resource access patterns. Add mutex "
                    "analysis and run thread-safety static analysis."
                )
            ))

        return patterns

    def predict_risk_areas(self, changed_modules: list[str]) -> list[dict]:
        """Predict which integration points are at risk given changed modules."""
        risk_areas = []
        patterns = self.identify_patterns()

        for pattern in patterns:
            overlap = set(changed_modules).intersection(pattern.affected_modules)
            if overlap:
                risk_score = (
                    len(overlap) / len(pattern.affected_modules)
                    * pattern.occurrence_count / 10.0
                )
                risk_areas.append({
                    "pattern": pattern.pattern_id,
                    "risk_score": min(risk_score, 1.0),
                    "affected_by_change": list(overlap),
                    "recommendation": pattern.recommended_action
                })

        risk_areas.sort(key=lambda r: r["risk_score"], reverse=True)
        return risk_areas

    def _count_severities(self, defects: list[dict]) -> dict[str, int]:
        severities = Counter(d.get("severity", "unknown") for d in defects)
        return dict(severities)
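The risk score above combines change overlap with historical pattern frequency. A minimal standalone sketch of the same scoring rule (module and pattern data are illustrative, not from a real project):

```python
def risk_score(changed: set[str], affected: list[str], occurrences: int) -> float:
    """Fraction of a pattern's affected modules touched by the change,
    scaled by how often the pattern has occurred, capped at 1.0."""
    overlap = changed.intersection(affected)
    if not overlap:
        return 0.0
    return min(len(overlap) / len(affected) * occurrences / 10.0, 1.0)

# Example: 2 of 4 pattern modules changed; pattern seen 6 times
score = risk_score({"can_driver", "door_ctrl"},
                   ["can_driver", "door_ctrl", "hal_gpio", "diag"], 6)
print(round(score, 2))  # → 0.3
```

The division by 10.0 is a tuning constant: patterns seen ten or more times saturate the score once any affected module changes.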

Defect Trend Reporting

Track integration defect trends over time to measure process improvement:

| Metric | Description | Target | ASPICE Relevance |
|---|---|---|---|
| Defect Discovery Rate | New defects found per integration cycle | Decreasing trend | SWE.5 BP4 |
| Defect Escape Rate | Defects reaching HIL that should have been caught in SIL | < 5% | SWE.5 effectiveness |
| Mean Time to Resolution | Average time from defect discovery to fix verification | < 2 days | MAN.3 efficiency |
| Recurrence Rate | Percentage of defects that reappear after fix | < 3% | SUP.9 root cause quality |
| Interface Defect Density | Defects per interface point | Decreasing per release | SWE.5 BP2 |
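The escape and recurrence metrics can be computed directly from the defect log. A minimal sketch, assuming each defect record carries the level at which it was found, whether SIL could have detected it, and a recurrence flag (field names are illustrative):

```python
def defect_metrics(defects: list[dict]) -> dict[str, float]:
    """Compute escape rate (found at HIL but detectable in SIL)
    and recurrence rate from a defect log."""
    total = len(defects)
    if total == 0:
        return {"escape_rate": 0.0, "recurrence_rate": 0.0}
    escaped = sum(1 for d in defects
                  if d.get("found_at") == "hil" and d.get("sil_detectable", False))
    recurred = sum(1 for d in defects if d.get("recurred", False))
    return {"escape_rate": escaped / total,
            "recurrence_rate": recurred / total}

log = [
    {"found_at": "sil", "sil_detectable": True,  "recurred": False},
    {"found_at": "hil", "sil_detectable": True,  "recurred": False},  # escape
    {"found_at": "hil", "sil_detectable": False, "recurred": True},   # genuine HIL find
    {"found_at": "pil", "sil_detectable": False, "recurred": False},
]
m = defect_metrics(log)
print(m)  # escape_rate 0.25, recurrence_rate 0.25
```

Classifying a defect as "SIL-detectable" requires engineering judgment; the flag itself is a HITL input, not an automated one.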

ASPICE Compliance Mapping

Integration testing under ASPICE 4.0 is primarily governed by SWE.5 (Software Integration and Integration Testing) and informed by SYS.4 (System Architectural Design) at the system level.

SWE.5: Software Integration and Integration Testing

| Base Practice | Description | AI Support | Evidence Required |
|---|---|---|---|
| SWE.5.BP1 | Develop integration strategy | AI proposes integration sequence based on dependency analysis | Integration Test Strategy document |
| SWE.5.BP2 | Develop integration test specification | AI generates test cases from interface specifications | Integration Test Specification |
| SWE.5.BP3 | Integrate software units and components | Automated build and integration in CI pipeline | Build logs, integration reports |
| SWE.5.BP4 | Perform integration testing | Automated xIL test execution with AI anomaly detection | Test results (JUnit XML, logs) |
| SWE.5.BP5 | Evaluate integration test results | AI pattern analysis and defect classification | Test evaluation report |
| SWE.5.BP6 | Establish bidirectional traceability | Automated trace linking in ALM tools | Traceability matrix |
| SWE.5.BP7 | Ensure consistency | AI checks test-to-requirement alignment | Consistency review records |

SYS.4: System Architectural Design

SYS.4 defines the system architecture that integration testing verifies. AI-assisted integration testing validates the interfaces and interactions specified by the system architectural design:

| SYS.4 Output | Integration Test Verification |
|---|---|
| System component interfaces | Interface-level SIL/PIL tests |
| Communication protocols (CAN, LIN, SPI) | Protocol conformance tests |
| Timing constraints | PIL timing verification |
| Resource allocation (memory, CPU) | PIL resource measurement |
| Safety mechanisms | HIL fault injection tests |

Traceability Requirements

ASPICE requires bidirectional traceability between architectural design elements and integration test cases:

Architectural Element  <-->  Integration Test Case  <-->  Test Result
       (SYS.4)                    (SWE.5)                  (SWE.5)

| Traceability Link | Direction | Tool Support |
|---|---|---|
| Architecture element to test case | Forward | Polarion, Jama, DOORS |
| Test case to architecture element | Backward | Same ALM tools |
| Test case to test result | Forward | CI artifacts (JUnit XML) |
| Defect to test case | Backward | Jira/ALM integration |
| Change request to regression test | Forward | AI-assisted impact analysis |

HITL Checkpoint: While AI can auto-generate traceability links by parsing architecture documents and test names, an engineer must verify completeness. Missing trace links are a common ASPICE assessment finding.
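A completeness check of this kind can be scripted before an assessment. A minimal sketch, assuming trace links are exported as (architecture element, test case) pairs; the identifiers are illustrative:

```python
def find_missing_links(arch_elements: set[str],
                       test_cases: set[str],
                       links: set[tuple[str, str]]) -> dict[str, set[str]]:
    """Return architecture elements with no covering test case,
    and test cases with no architecture element (both ASPICE findings)."""
    traced_arch = {a for a, _ in links}
    traced_tests = {t for _, t in links}
    return {
        "untested_arch_elements": arch_elements - traced_arch,
        "untraced_test_cases": test_cases - traced_tests,
    }

links = {("ARCH-CAN-01", "IT-CAN-001"), ("ARCH-LOCK-02", "IT-LOCK-004")}
gaps = find_missing_links(
    {"ARCH-CAN-01", "ARCH-LOCK-02", "ARCH-DIAG-03"},
    {"IT-CAN-001", "IT-LOCK-004", "IT-PWR-009"},
    links,
)
print(gaps)  # ARCH-DIAG-03 untested, IT-PWR-009 untraced
```

The script finds structural gaps only; an engineer still has to judge whether each existing link is semantically correct.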


Integration Testing Tool Comparison

| Tool | xIL Level | AI Features | ASPICE Evidence | Target Support | Cost |
|---|---|---|---|---|---|
| VectorCAST | SIL, PIL | Test generation, coverage gap analysis | Full compliance reporting | ARM, x86, RISC-V | $$$ |
| dSPACE SCALEXIO | HIL | Scenario optimization | Real-time test reports | Automotive ECUs | $$$$ |
| NI TestStand | HIL | Test flow optimization | Customizable reports | Generic hardware | $$$ |
| MATLAB/Simulink Test | MIL, SIL | ML-based test generation | Model coverage reports | Model targets | $$$ |
| Robot Framework | SIL | Keyword extensibility | JUnit XML output | Any (Python-based) | Free |
| Renode | SIL | Scriptable scenarios | JUnit XML output | ARM, RISC-V, Xtensa | Free |
| QEMU | SIL | CI-friendly | Minimal built-in | ARM, RISC-V, x86 | Free |
| Lauterbach TRACE32 | PIL | Trace analysis | Execution trace logs | Wide MCU support | $$$ |
| SEGGER J-Link + Ozone | PIL | Profiling | Timeline exports | ARM Cortex | $$ |

Selection Criteria by Project Profile

| Project Profile | Recommended Stack | Rationale |
|---|---|---|
| Automotive ASIL B-D | VectorCAST (SIL) + dSPACE (HIL) | Pre-qualified, full ASPICE evidence |
| Industrial SIL 2-3 | LDRA (SIL) + NI TestStand (HIL) | Industrial certification, flexible HIL |
| Cost-Sensitive / ASIL A | Renode (SIL) + SEGGER (PIL) + custom HIL | Open-source SIL, affordable PIL |
| RISC-V Targets | Renode (SIL) + QEMU (PIL) | Best open-source RISC-V support |
| Model-Based Development | MATLAB/Simulink (MIL/SIL) + dSPACE (HIL) | Integrated model-to-HIL workflow |

Practical Example: End-to-End Integration Test

The following example demonstrates an integration test flow from SIL through PIL for a door lock ECU (the HIL stage typically runs on a scheduled test bed and is omitted here):

#!/usr/bin/env python3
"""
End-to-end integration test orchestrator.
Runs the same logical test at the SIL and PIL levels.
"""

import subprocess
import json
import sys
from dataclasses import dataclass
from enum import Enum

class XILLevel(Enum):
    SIL = "sil"
    PIL = "pil"
    HIL = "hil"

@dataclass
class IntegrationTestResult:
    """Result of an integration test at a specific xIL level."""
    test_name: str
    xil_level: XILLevel
    passed: bool
    duration_ms: int
    details: dict

def run_sil_test(test_name: str, binary: str) -> IntegrationTestResult:
    """Execute integration test in SIL environment (Renode)."""
    # The .resc script is expected to load `binary` into the simulated target.
    result = subprocess.run(
        ["renode", "--disable-xwt", "-e",
         f"include @tests/integration/{test_name}.resc; quit"],
        capture_output=True, text=True, timeout=120
    )
    passed = result.returncode == 0
    return IntegrationTestResult(
        test_name=test_name,
        xil_level=XILLevel.SIL,
        passed=passed,
        duration_ms=0,  # Placeholder; parse from the Renode log in a full implementation
        details={"stdout": result.stdout, "stderr": result.stderr}
    )

def run_pil_test(test_name: str, binary: str, target: str) -> IntegrationTestResult:
    """Execute integration test in PIL environment (target MCU)."""
    # Flash binary
    subprocess.run(
        ["openocd", "-f", f"target/{target}.cfg",
         "-c", f"program {binary} verify reset exit"],
        check=True, capture_output=True
    )
    # Collect results via RTT
    result = subprocess.run(
        ["python3", "scripts/rtt_collector.py",
         "--test", test_name, "--timeout", "30"],
        capture_output=True, text=True, timeout=60
    )
    test_output = json.loads(result.stdout) if result.stdout else {}
    return IntegrationTestResult(
        test_name=test_name,
        xil_level=XILLevel.PIL,
        passed=test_output.get("passed", False),
        duration_ms=test_output.get("duration_ms", 0),
        details=test_output
    )

def run_xil_progression(test_name: str, binary: str, target: str) -> list:
    """Run a test through the xIL progression: SIL -> PIL."""
    results = []

    # Stage 1: SIL
    sil_result = run_sil_test(test_name, binary)
    results.append(sil_result)
    if not sil_result.passed:
        print(f"FAIL at SIL level - skipping PIL: {test_name}")
        return results

    # Stage 2: PIL (only if SIL passed)
    pil_result = run_pil_test(test_name, binary, target)
    results.append(pil_result)

    return results

if __name__ == "__main__":
    test_name = sys.argv[1] if len(sys.argv) > 1 else "test_door_lock_integration"
    binary = sys.argv[2] if len(sys.argv) > 2 else "build/debug/firmware.elf"
    target = sys.argv[3] if len(sys.argv) > 3 else "stm32f4"

    results = run_xil_progression(test_name, binary, target)
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        print(f"[{r.xil_level.value.upper()}] {r.test_name}: {status} ({r.duration_ms}ms)")
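For CI evidence (SWE.5.BP4), results like these are typically exported as JUnit XML. A minimal standard-library sketch; the `to_junit_xml` helper and the dict layout are illustrative, and the element structure follows the common JUnit schema:

```python
import xml.etree.ElementTree as ET

def to_junit_xml(results: list[dict]) -> str:
    """Serialize xIL results (dicts with name, level, passed, duration_ms)
    into a JUnit-style XML string for CI consumption."""
    suite = ET.Element("testsuite", name="xil_integration",
                       tests=str(len(results)),
                       failures=str(sum(1 for r in results if not r["passed"])))
    for r in results:
        case = ET.SubElement(suite, "testcase", classname=r["level"],
                             name=r["name"],
                             time=f'{r["duration_ms"] / 1000:.3f}')
        if not r["passed"]:
            ET.SubElement(case, "failure", message="integration test failed")
    return ET.tostring(suite, encoding="unicode")

xml = to_junit_xml([
    {"name": "test_door_lock_integration", "level": "sil", "passed": True,  "duration_ms": 1200},
    {"name": "test_door_lock_integration", "level": "pil", "passed": False, "duration_ms": 1450},
])
print(xml)
```

Most CI servers (Jenkins, GitLab CI, Azure DevOps) ingest this format directly, which keeps the xIL evidence chain in standard build artifacts.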

Implementation Checklist

Use this checklist when establishing integration testing for an ASPICE-compliant embedded project:

| # | Item | ASPICE BP | Priority |
|---|---|---|---|
| 1 | Define integration strategy (bottom-up, top-down, risk-based) | SWE.5.BP1 | Must |
| 2 | Document integration test specification with traceability to architecture | SWE.5.BP2 | Must |
| 3 | Set up SIL environment with host-compiled stubs for all HAL interfaces | SWE.5.BP3 | Must |
| 4 | Set up PIL environment with debug probe and RTT/SWO output collection | SWE.5.BP3 | Should |
| 5 | Set up HIL environment with real-time simulator and plant models | SWE.5.BP3 | Must (ASIL B+) |
| 6 | Automate integration tests in CI pipeline with quality gates | SWE.5.BP4 | Must |
| 7 | Implement AI-assisted test selection for CI efficiency | SWE.5.BP4 | Should |
| 8 | Configure anomaly detection for HIL signal monitoring | SWE.5.BP5 | Should |
| 9 | Establish defect pattern analysis for integration test results | SWE.5.BP5 | Should |
| 10 | Implement bidirectional traceability (architecture to test to result) | SWE.5.BP6 | Must |
| 11 | Define test bed inventory and automated scheduling | SWE.5.BP4 | Should |
| 12 | Set up timing regression detection for PIL results | SWE.5.BP5 | Should |
| 13 | Generate ASPICE-compliant integration test reports automatically | SWE.5.BP5 | Must |
| 14 | Review AI-generated test scenarios and strategies (HITL) | SWE.5.BP7 | Must |
| 15 | Establish fault injection testing for safety-relevant interfaces | SWE.5.BP4 | Must (ASIL B+) |
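Item 12 in the checklist can start as a simple threshold check on PIL durations. A minimal sketch that flags tests whose measured duration exceeds a stored baseline by a configurable tolerance (baseline values are illustrative):

```python
def timing_regressions(results: dict[str, int],
                       baseline: dict[str, int],
                       tolerance: float = 0.10) -> list[str]:
    """Return names of tests whose PIL duration (ms) exceeds the
    baseline by more than `tolerance` (fractional, default 10%)."""
    flagged = []
    for name, duration_ms in results.items():
        base = baseline.get(name)
        if base is not None and duration_ms > base * (1 + tolerance):
            flagged.append(name)
    return flagged

baseline = {"test_door_lock_integration": 120, "test_can_rx": 40}
current = {"test_door_lock_integration": 140, "test_can_rx": 41}
print(timing_regressions(current, baseline))  # → ['test_door_lock_integration']
```

A fixed percentage is a starting point; production setups usually account for measurement jitter, for example by comparing against a rolling baseline of recent runs.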

Key Principle: AI accelerates integration testing by generating scenarios, selecting tests, and analyzing results. The human engineer owns the integration strategy, validates AI outputs, and signs off on test adequacy. This division of labor satisfies ASPICE's requirement for human accountability while maximizing AI's contribution to thoroughness and efficiency.


Summary

HIL/SIL/PIL Testing Key Points:

  • SIL: Host PC, fast, functional verification
  • PIL: Target CPU, timing verification
  • HIL: Full hardware, real-time, system validation
  • Selection: Based on fidelity needs and development phase
  • AI Support: Test scenario generation, result analysis
  • Test Strategy: AI assists with integration sequence, risk analysis, and coverage gap identification
  • Continuous Integration: Intelligent test selection reduces CI cycle time by 40-60% while maintaining defect detection rates
  • Defect Analysis: ML-based pattern detection identifies systemic issues across integration test campaigns
  • ASPICE Compliance: Integration testing maps to SWE.5 base practices with full traceability to SYS.4 architecture
  • HITL Required: AI-generated strategies, scenarios, and analysis must be reviewed and approved by the responsible engineer

References