1.3: Capability Mapping

AI Readiness Assessment by Task

Capability Matrix

Purpose: Map ASPICE tasks to an AI readiness level ([PASS] Ready, [WARN] Partial, [FAIL] Not Ready)

Task | AI Ready? | Success Rate | Notes
SWE.1: Requirements Extraction from natural language specs | Partial | 70-80% | Needs human QC
SWE.1: Ambiguity Detection | Ready | 85-90% | High ROI
SWE.1: Traceability Matrix Generation (automated) | Ready | 95-100% | Fully automate
SWE.2: Architecture Decisions (e.g., AUTOSAR vs custom) | Not Ready | N/A | Human decides
SWE.2: ADR Drafting (Context, Decision, Rationale template) | Ready | 80-90% | AI drafts, human approves
SWE.2: UML Diagram Generation (C4, sequence, class) | Ready | 85-95% | PlantUML auto-gen
SWE.3: Boilerplate Code (getters, setters, stubs) | Ready | 90-95% | Trivial to generate
SWE.3: Standard Algorithms (CRC, sort, search) | Ready | 85-90% | Well-known patterns
SWE.3: Safety-Critical Logic (brake control, fail-safe) | Not Ready | 40-60% | Too risky
SWE.3: MISRA C Compliance Check (via cppcheck) | Ready | 90-95% | Static analysis
SWE.3: Doxygen Comments | Ready | 90-95% | Auto-gen
SWE.4: Unit Test Generation (typical + boundary values) | Ready | 80-85% | Strong on boundary values
SWE.4: Edge Case Tests (complex, novel scenarios) | Partial | 60-70% | Human adds cases
SWE.4: Coverage Analysis (gcov, report generation) | Ready | 95-100% | gcov parsing
SWE.5: Integration Test (HIL test scaffolding) | Partial | 50-60% | HIL setups are complex
SWE.6: System Test Execution (manual, proving ground) | Not Ready | 30-40% | Needs human
SUP.1: SDS Generation (from code + comments) | Ready | 85-90% | Doxygen to PDF
SUP.2: Code Review (MISRA) | Ready | 90-95% | PC-lint
SUP.8: Version Control (commits, tags, branching) | Ready | 95-100% | Git commands
SUP.9: Problem Resolution (debugging, root cause) | Partial | 50-70% | Standard bugs OK

Legend:

  • [PASS] Ready: AI can perform task with 80%+ success, minimal human intervention
  • [WARN] Partial: AI assists (50-80% success), significant human review/completion needed
  • [FAIL] Not Ready: AI struggles (<50% success), human should own task

Model and Version Dependency: The success rates above are based on GPT-4/claude-opus-4-6 class models (2025-2026). Earlier models (GPT-3.5, Claude-2) may score 10-20 percentage points lower. As models improve, reassess the capability matrix quarterly, and document the model version used in the project toolchain configuration.


High-Value Automation Candidates

Tasks Where AI Excels

1. Traceability Matrix Generation [PASS]

  • Success Rate: 95-100%
  • Why AI Excels: Pattern matching (@implements tags in code), no creativity needed
  • ROI: 90% time savings (10h → 1h)
  • Recommendation: Fully automate, spot-check 10% for correctness

Example:

# AI Agent: Traceability Matrix Generator
def generate_traceability_matrix(source_files, requirements_db):
    """
    Parse source code for @implements tags, link each tag to a requirement
    """
    matrix = []
    for file in source_files:
        for line_number, line in enumerate(file.lines, start=1):
            if "@implements" in line:
                req_id = extract_requirement_id(line)  # e.g., "SWE-045"
                function = extract_function_name(file, line_number)
                test_case = find_test_case(req_id, requirements_db)  # Look up linked test case
                matrix.append((req_id, function, file.name, test_case))

    export_to_excel(matrix, "traceability_matrix.xlsx")
    return matrix

# Human: Spot-check 10% of links (50 out of 500), approve if correct
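The 10% spot-check above can be drawn as a reproducible random sample, so an auditor can regenerate exactly the same review set later. A minimal sketch, assuming `matrix` holds the (req_id, function, file, test_case) tuples produced by the generator:

```python
import random

def sample_for_review(matrix, fraction=0.10, seed=42):
    """Pick a reproducible random subset of traceability links for human review."""
    rng = random.Random(seed)  # fixed seed: audits can regenerate the same sample
    k = max(1, round(len(matrix) * fraction))
    return rng.sample(matrix, k)

# Illustrative data: 500 links, as in the 50-out-of-500 example above
links = [(f"SWE-{i:03d}", f"func_{i}", "module.c", f"TC-{i:03d}") for i in range(500)]
review_set = sample_for_review(links)
print(len(review_set))  # 50 links go to the human reviewer
```

Fixing the seed trades randomness for auditability; rotate the seed per release if reviewers should not see the same links twice.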

2. MISRA C Compliance Checking [PASS]

  • Success Rate: 90-95%
  • Why AI Excels: Static analysis tool integration (deterministic), no judgment needed
  • ROI: 83% time savings (1h → 10min)
  • Recommendation: Fully automate, human reviews findings

Workflow:

# AI Agent runs static analyzer
cppcheck --addon=misra --xml src/*.c 2> misra_report.xml  # cppcheck emits XML on stderr

# AI parses XML, generates human-readable report
python ai_misra_reporter.py misra_report.xml

# Output:
# - 12 violations found
# - 8 auto-fixable (explicit casts, const qualifiers)
# - 4 require manual review (complex rule 21.3: malloc in safety-critical code)

# AI auto-fixes trivial violations, creates pull request
# Human reviews and merges
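The `ai_misra_reporter.py` step is not spelled out in this document. As one possible shape, a parser over cppcheck's `--xml` output (format version 2 emits one `<error>` element per finding) might group findings by rule ID before the human review; a hedged sketch with inline sample data:

```python
import xml.etree.ElementTree as ET

def summarize_misra(xml_text):
    """Group cppcheck MISRA findings by rule ID (cppcheck --xml, format version 2)."""
    root = ET.fromstring(xml_text)
    findings = {}
    for err in root.iter("error"):       # one <error> element per violation
        rule = err.get("id", "unknown")  # e.g. "misra-c2012-21.3"
        findings.setdefault(rule, []).append(err.get("msg", ""))
    return findings

# Illustrative report; real cppcheck output carries more attributes and locations
sample = """<results version="2">
  <errors>
    <error id="misra-c2012-21.3" severity="style" msg="Rule 21.3: dynamic memory shall not be used"/>
    <error id="misra-c2012-11.3" severity="style" msg="Rule 11.3: cast between pointer types"/>
    <error id="misra-c2012-11.3" severity="style" msg="Rule 11.3: cast between pointer types"/>
  </errors>
</results>"""

report = summarize_misra(sample)
for rule, msgs in sorted(report.items()):
    print(f"{rule}: {len(msgs)} violation(s)")
```

Grouping by rule makes the auto-fixable/manual-review split above easy to drive from a per-rule allowlist.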

3. Doxygen Comment Generation [PASS]

  • Success Rate: 90-95%
  • Why AI Excels: Code analysis (function signature, return type), template-based
  • ROI: 80% time savings (10min → 2min per function)
  • Recommendation: Automate for all functions, human spot-checks safety notes

Example:

// Input: Function without Doxygen comment
uint32_t CRC32_Calculate(const uint8_t* data, size_t length);

// AI-generated Doxygen header:
/**
 * @brief Calculate CRC-32 checksum (IEEE 802.3 polynomial)
 * @implements [SWE-078] CRC Checksum Calculation
 * @safety_class ASIL-B
 *
 * @param[in] data Pointer to data buffer (must not be NULL)
 * @param[in] length Data length in bytes (range: 0-65535)
 * @return CRC-32 checksum (32-bit unsigned integer)
 *
 * @note Polynomial: 0xEDB88320 (reflected CRC-32)
 * @note Complexity: O(n), where n = length
 */
uint32_t CRC32_Calculate(const uint8_t* data, size_t length);

Human Review: Verify @safety_class tag (ASIL-B correct?), approve


Low-Value Automation Candidates

Tasks Where AI Struggles

1. Safety-Critical Logic Design [FAIL]

  • Success Rate: 40-60% (too low for safety)
  • Why AI Fails: Requires domain expertise (failure modes, hazard analysis, ISO 26262 knowledge)
  • Example Failure:
    • Prompt: "Design brake control logic for ACC, ASIL-B compliant"
    • AI Output: Generic PID controller (60% correct)
    • Missing: Fail-safe behavior, redundancy, watchdog, and diagnostic coverage
  • Recommendation: Human designs, AI generates the implementation from the human spec

Mitigation Strategy: For safety-critical tasks, use AI for (1) drafting non-safety code sections, (2) generating boilerplate/scaffolding, and (3) documenting human-designed logic. Always keep a human in the design loop for safety decisions, per ISO 26262 Part 8.


2. Architectural Trade-Off Decisions [FAIL]

  • Success Rate: N/A (not a suitable AI task)
  • Why AI Fails: Requires business context (cost, schedule, OEM requirements), long-term vision
  • Example:
    • Question: "Use AUTOSAR Classic or Adaptive for this ECU?"
    • AI Response: Lists pros and cons (correct), but cannot make the final decision
    • Human Needed: Architect considers project constraints and makes the call
  • Recommendation: Human decides, documents the decision in an ADR, AI drafts the ADR template

3. Novel Bug Debugging [WARN]

  • Success Rate: 50-70% (depends on bug complexity)
  • Why AI Struggles: Novel bugs require creative problem-solving, not just pattern matching
  • Example Success: Standard bug (null pointer dereference) → AI suggests fix [PASS]
  • Example Failure: Timing bug (race condition, only occurs at 200 km/h) → AI cannot reproduce [FAIL]
  • Recommendation: AI assists (log analysis, stack traces); human investigates root cause
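For the log-analysis assist, one low-risk pattern is to have the AI cluster repeated crash signatures so the human sees each distinct failure once and rare (potentially novel) ones stand out. A minimal sketch over a hypothetical log format (real HIL/vehicle logs will differ):

```python
import re
from collections import Counter

# Hypothetical log excerpt; the format and function names are illustrative
LOG = """\
[ERROR] crash at brake_ctrl.c:142 in BrakeCtrl_Update
[ERROR] crash at crc32.c:57 in CRC32_Calculate
[ERROR] crash at brake_ctrl.c:142 in BrakeCtrl_Update
[ERROR] crash at brake_ctrl.c:142 in BrakeCtrl_Update
"""

def triage(log_text):
    """Count occurrences of each crash site across the log."""
    pattern = re.compile(r"crash at (\S+:\d+) in (\w+)")
    return Counter(pattern.findall(log_text))

for (site, func), count in triage(LOG).most_common():
    print(f"{count}x {func} ({site})")
# 3x BrakeCtrl_Update (brake_ctrl.c:142)
# 1x CRC32_Calculate (crc32.c:57)
```

This keeps the AI in a summarization role; the race-condition style bugs above still need a human to reproduce and reason about timing.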

Task Delegation Decision Tree

When to Use AI vs Human

Start: New ASPICE task assigned
    │
    ▼
Is task safety-critical?
    │
    ├─ Yes → Human designs logic
    │         AI generates implementation
    │         Human reviews 100%
    │
    └─ No → Is task routine/standard?
            │
            ├─ Yes (e.g., CRC, sort, getters)
            │     → AI generates code
            │     → Human spot-checks (10%)
            │
            └─ No (novel/creative)
                  → Human implements
                  → AI assists (suggestions)
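The decision tree above can be encoded as a small routing policy, e.g. in a task-intake script. The `Task` fields here are illustrative, not from any standard tooling:

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    safety_critical: bool
    routine: bool

def delegate(task: Task) -> str:
    """Apply the decision tree: safety-criticality first, then routineness."""
    if task.safety_critical:
        return "human designs, AI implements, human reviews 100%"
    if task.routine:
        return "AI generates, human spot-checks 10%"
    return "human implements, AI assists with suggestions"

print(delegate(Task("CRC32 routine", safety_critical=False, routine=True)))
# AI generates, human spot-checks 10%
print(delegate(Task("brake control", safety_critical=True, routine=False)))
# human designs, AI implements, human reviews 100%
```

Checking safety-criticality before routineness matters: a routine task that touches safety logic must still take the human-designs branch.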

ROI Prioritization

Highest ROI Tasks for AI Automation

Task | AI Success Rate | Time Savings | ROI | Priority
Traceability Matrix | 95-100% | 90% (10h → 1h) | ★★★★★ | High
Doxygen Comments | 90-95% | 80% (10min → 2min per fn) | ★★★★★ | High
MISRA Checking | 90-95% | 83% (1h → 10min) | ★★★★★ | High
Unit Test Generation | 80-85% | 75% (1h → 15min) | ★★★★☆ | High
Boilerplate Code | 90-95% | 80% (5min → 1min per fn) | ★★★★☆ | Medium (low absolute time)
ADR Drafting | 80-90% | 75% (4h → 1h) | ★★★★☆ | Medium
Requirements Extraction | 70-80% | 75% (8h → 2h) | ★★★☆☆ | Medium (needs QC)
UML Diagrams | 85-95% | 62% (8h → 3h) | ★★★☆☆ | Low (infrequent task)

Recommendation: Prioritize automation for High ROI tasks (traceability, MISRA, Doxygen, unit tests)


Summary

Capability Mapping Insights:

  1. Best AI Tasks: Traceability, MISRA checking, Doxygen, unit tests (90%+ success, 75-90% time savings)
  2. Avoid AI For: Safety logic design, architectural decisions, novel debugging (<50% success)
  3. Partial AI Use: Requirements extraction, ADR drafting, integration tests (50-80% success, needs human completion)
  4. ROI Focus: Automate high-frequency, high-time-savings tasks first (traceability: 10h → 1h = 90% savings)

Next: Limitation acknowledgment (when AI must escalate to human) - 29.04