1.3: Capability Mapping
AI Readiness Assessment by Task
Capability Matrix
Purpose: Map ASPICE tasks to an AI readiness level ([PASS] Ready, [WARN] Partial, [FAIL] Not Ready)
| Task | AI Ready? | Success Rate | Notes |
|---|---|---|---|
| SWE.1: Requirements Extraction from natural language specs | Partial | 70-80% | Needs human QC |
| SWE.1: Ambiguity Detection | Ready | 85-90% | High ROI |
| SWE.1: Traceability Matrix Generation (automated) | Ready | 95-100% | Fully automate |
| SWE.2: Architecture Decisions (e.g., AUTOSAR vs custom) | Not Ready | N/A | Human decides |
| SWE.2: ADR Drafting (Context, Decision, Rationale template) | Ready | 80-90% | AI drafts, human approves |
| SWE.2: UML Diagram Generation (C4, sequence, class) | Ready | 85-95% | PlantUML auto-gen |
| SWE.3: Boilerplate Code (getters, setters, stubs) | Ready | 90-95% | Trivial patterns; fully automate |
| SWE.3: Standard Algorithms (CRC, sort, search) | Ready | 85-90% | Well-known algorithms |
| SWE.3: Safety-Critical Logic (brake control, fail-safe) | Not Ready | 40-60% | Too risky |
| SWE.3: MISRA C Compliance Check (via cppcheck) | Ready | 90-95% | Static analysis |
| SWE.3: Doxygen Comments | Ready | 90-95% | Auto-gen |
| SWE.4: Unit Test Generation (typical + boundary values) | Ready | 80-85% | Boundary values |
| SWE.4: Edge Case Tests (complex, novel scenarios) | Partial | 60-70% | Human adds |
| SWE.4: Coverage Analysis (gcov, report generation) | Ready | 95-100% | gcov parsing |
| SWE.5: Integration Test (HIL test scaffolding) | Partial | 50-60% | HIL setup is complex |
| SWE.6: System Test Execution (manual, proving ground) | Not Ready | 30-40% | Needs human |
| SUP.1: SDS Generation (from code + comments) | Ready | 85-90% | Doxygen to PDF |
| SUP.2: Code Review (MISRA) | Ready | 90-95% | PC-lint |
| SUP.8: Version Control (commits, tags, branching) | Ready | 95-100% | Git commands |
| SUP.9: Problem Resolution (debugging, root cause) | Partial | 50-70% | Standard bugs OK |
Legend:
- [PASS] Ready: AI can perform task with 80%+ success, minimal human intervention
- [WARN] Partial: AI assists (50-80% success), significant human review/completion needed
- [FAIL] Not Ready: AI struggles (<50% success), human should own task
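The legend's thresholds can be expressed as a small triage helper. This is only a sketch; the band boundaries come directly from the legend above, and the function name is illustrative:

```python
def readiness(success_rate_pct):
    """Map a measured task success rate (%) to the readiness legend."""
    if success_rate_pct >= 80:
        return "[PASS] Ready"      # minimal human intervention
    if success_rate_pct >= 50:
        return "[WARN] Partial"    # significant human review/completion needed
    return "[FAIL] Not Ready"      # human should own the task
```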
Model and Version Dependency: The success rates above are based on GPT-4/claude-opus-4-6-class models (2025-2026); earlier models (GPT-3.5, Claude-2) may score 10-20 percentage points lower. Reassess the capability matrix quarterly as models improve, and document the model version used in the project toolchain configuration.
High-Value Automation Candidates
Tasks Where AI Excels
1. Traceability Matrix Generation [PASS]
- Success Rate: 95-100%
- Why AI Excels: Pattern matching (@implements tags in code), no creativity needed
- ROI: 90% time savings (10h → 1h)
- Recommendation: Fully automate, spot-check 10% for correctness
Example:
# AI Agent: Traceability Matrix Generator
def generate_traceability_matrix(source_files, requirements_db):
    """
    Parse source code for @implements tags, link to requirements
    """
    matrix = []
    for file in source_files:
        for line_number, line in enumerate(file.lines, start=1):
            if "@implements" in line:
                req_id = extract_requirement_id(line)  # e.g., "SWE-045"
                function = extract_function_name(file, line_number)
                test_case = find_test_case(req_id)  # Search test files
                matrix.append((req_id, function, file.name, test_case))
    export_to_excel(matrix, "traceability_matrix.xlsx")
    return matrix
# Human: Spot-check 10% of links (50 out of 500), approve if correct
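The generator relies on helpers such as extract_requirement_id. A minimal sketch of that one, assuming tags follow an @implements [SWE-045] bracket format (the regex is an illustrative guess, not a project spec):

```python
import re

# Assumed tag format: "@implements [SWE-045] Description..." (brackets optional).
REQ_ID_PATTERN = re.compile(r"@implements\s+\[?([A-Z]+-\d+)\]?")

def extract_requirement_id(line):
    """Return the requirement ID from a tagged source line, or None."""
    match = REQ_ID_PATTERN.search(line)
    return match.group(1) if match else None
```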
2. MISRA C Compliance Checking [PASS]
- Success Rate: 90-95%
- Why AI Excels: Static analysis tool integration (deterministic), no judgment needed
- ROI: 83% time savings (1h → 10min)
- Recommendation: Fully automate, human reviews findings
Workflow:
# AI Agent runs static analyzer
cppcheck --addon=misra --xml src/*.c 2> misra_report.xml  # cppcheck writes XML results to stderr
# AI parses XML, generates human-readable report
python ai_misra_reporter.py misra_report.xml
# Output:
# - 12 violations found
# - 8 auto-fixable (explicit casts, const qualifiers)
# - 4 require manual review (complex rule 21.3: malloc in safety-critical code)
# AI auto-fixes trivial violations, creates pull request
# Human reviews and merges
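ai_misra_reporter.py is a project-specific script, not a published tool. A minimal sketch of its parsing step, assuming cppcheck's XML (version 2) report format and MISRA addon IDs of the form misra-c2012-X.Y:

```python
import xml.etree.ElementTree as ET

def summarize_misra_report(xml_path):
    """Count and list MISRA violations from a cppcheck XML report."""
    root = ET.parse(xml_path).getroot()
    violations = [
        (err.get("id"), err.get("msg"))
        for err in root.iter("error")
        if err.get("id", "").startswith("misra")  # skip non-MISRA findings
    ]
    print(f"{len(violations)} violations found")
    for rule_id, msg in violations:
        print(f"- {rule_id}: {msg}")
    return violations
```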
3. Doxygen Comment Generation [PASS]
- Success Rate: 90-95%
- Why AI Excels: Code analysis (function signature, return type), template-based
- ROI: 80% time savings (10min → 2min per function)
- Recommendation: Automate for all functions, human spot-checks safety notes
Example:
// Input: Function without Doxygen comment
uint32_t CRC32_Calculate(const uint8_t* data, size_t length);
// AI-generated Doxygen header:
/**
* @brief Calculate CRC-32 checksum (IEEE 802.3 polynomial)
* @implements [SWE-078] CRC Checksum Calculation
* @safety_class ASIL-B
*
* @param[in] data Pointer to data buffer (must not be NULL)
* @param[in] length Data length in bytes (range: 0-65535)
* @return CRC-32 checksum (32-bit unsigned integer)
*
* @note Polynomial: 0xEDB88320 (reflected CRC-32)
* @note Complexity: O(n), where n = length
*/
uint32_t CRC32_Calculate(const uint8_t* data, size_t length);
Human Review: Verify the @safety_class tag (is ASIL-B correct?), then approve
Low-Value Automation Candidates
Tasks Where AI Struggles
1. Safety-Critical Logic Design [FAIL]
- Success Rate: 40-60% (too low for safety)
- Why AI Fails: Requires domain expertise (failure modes, hazard analysis, ISO 26262 knowledge)
- Example Failure:
- Prompt: "Design brake control logic for ACC, ASIL-B compliant"
- AI Output: Generic PID controller (60% correct)
- Missing: Fail-safe behavior, redundancy, watchdog, and diagnostic coverage
- Recommendation: Human designs, AI generates the implementation from the human spec
Mitigation Strategy: For safety-critical tasks, use AI for: (1) drafting non-safety code sections, (2) generating boilerplate/scaffolding, (3) documenting human-designed logic. Always keep a human in the design loop for safety decisions, per ISO 26262 Part 8.
2. Architectural Trade-Off Decisions [FAIL]
- Success Rate: N/A (not a suitable AI task)
- Why AI Fails: Requires business context (cost, schedule, OEM requirements), long-term vision
- Example:
- Question: "Use AUTOSAR Classic or Adaptive for this ECU?"
- AI Response: Lists pros and cons (correct), but cannot make the final decision
- Human Needed: Architect considers project constraints and makes the call
- Recommendation: Human decides, documents the decision in an ADR, AI drafts the ADR template
3. Novel Bug Debugging [WARN]
- Success Rate: 50-70% (depends on bug complexity)
- Why AI Struggles: Novel bugs require creative problem-solving, not just pattern matching
- Example Success: Standard bug (null pointer dereference) → AI suggests fix [PASS]
- Example Failure: Timing bug (race condition, only occurs at 200 km/h) → AI cannot reproduce [FAIL]
- Recommendation: AI assists (log analysis, stack-trace triage); human investigates root cause
Task Delegation Decision Tree
When to Use AI vs Human
Start: New ASPICE task assigned
│
▼
Is task safety-critical?
│
├─ Yes → Human designs logic
│ AI generates implementation
│ Human reviews 100%
│
└─ No → Is task routine/standard?
│
├─ Yes (e.g., CRC, sort, getters)
│ → AI generates code
│ → Human spot-checks (10%)
│
└─ No (novel/creative)
→ Human implements
→ AI assists (suggestions)
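The decision tree above can be sketched as a delegation helper. The flag names and return strings are illustrative, chosen to mirror the tree's branches:

```python
def delegate(safety_critical, routine):
    """Decide task ownership per the delegation decision tree."""
    if safety_critical:
        # Safety branch: human owns the design regardless of routineness.
        return "Human designs; AI implements; human reviews 100%"
    if routine:
        # Routine, non-safety work (e.g., CRC, sort, getters).
        return "AI generates code; human spot-checks 10%"
    # Novel/creative, non-safety work.
    return "Human implements; AI assists with suggestions"
```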
ROI Prioritization
Highest ROI Tasks for AI Automation
| Task | AI Success Rate | Time Savings | ROI | Priority |
|---|---|---|---|---|
| Traceability Matrix | 95-100% | 90% (10h → 1h) | ★★★★★ | High |
| Doxygen Comments | 90-95% | 80% (10min → 2min per fn) | ★★★★★ | High |
| MISRA Checking | 90-95% | 83% (1h → 10min) | ★★★★★ | High |
| Unit Test Generation | 80-85% | 75% (1h → 15min) | ★★★★☆ | High |
| Boilerplate Code | 90-95% | 80% (5min → 1min per fn) | ★★★★☆ | Medium (low absolute time) |
| ADR Drafting | 80-90% | 75% (4h → 1h) | ★★★★☆ | Medium |
| Requirements Extraction | 70-80% | 75% (8h → 2h) | ★★★☆☆ | Medium (needs QC) |
| UML Diagrams | 85-95% | 62% (8h → 3h) | ★★★☆☆ | Low (infrequent task) |
Recommendation: Prioritize automation for High ROI tasks (traceability, MISRA, Doxygen, unit tests)
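The time-savings percentages in the table reduce to savings = (before − after) / before. A quick sanity check (rounding to whole percent; 62.5 rounds to 62 under Python's banker's rounding, matching the UML row):

```python
def time_savings_pct(before_hours, after_hours):
    """Percentage of time saved when a task drops from before to after."""
    return round(100 * (before_hours - after_hours) / before_hours)

# Rows from the ROI table above:
# Traceability Matrix: 10h -> 1h, MISRA: 1h -> 10min, UML: 8h -> 3h
```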
Summary
Capability Mapping Insights:
- Best AI Tasks: Traceability, MISRA checking, Doxygen, unit tests (90%+ success, 75-90% time savings)
- Avoid AI For: Safety logic design, architectural decisions, novel debugging (<50% success)
- Partial AI Use: Requirements extraction, ADR drafting, integration tests (70-80% success, needs human completion)
- ROI Focus: Automate high-frequency, high-time-savings tasks first (traceability: 10h → 1h = 90% savings)
Next: Limitation acknowledgment (when AI must escalate to human) - 29.04