1.3: Capability Mapping
AI Readiness Assessment by Task
Capability Matrix
Purpose: Map ASPICE tasks to an AI readiness level ([PASS] Ready, [WARN] Partial, [FAIL] Not Ready)
| Task | AI Ready? | Success Rate | Notes |
|---|---|---|---|
| SWE.1: Requirements Extraction from natural language specs | Partial | 70-80% | Needs human QC |
| SWE.1: Ambiguity Detection | Ready | 85-90% | High ROI |
| SWE.1: Traceability Matrix Generation (automated) | Ready | 95-100% | Fully automate |
| SWE.2: Architecture Decisions (e.g., AUTOSAR vs custom) | Not Ready | N/A | Human decides |
| SWE.2: ADR Drafting (Context, Decision, Rationale template) | Ready | 80-90% | AI drafts, human approves |
| SWE.2: UML Diagram Generation (C4, sequence, class) | Ready | 85-95% | PlantUML auto-gen |
| SWE.3: Boilerplate Code (getters, setters, stubs) | Ready | 90-95% | Trivial patterns; fully automate |
| SWE.3: Standard Algorithms (CRC, sort, search) | Ready | 85-90% | Well-known algorithms |
| SWE.3: Safety-Critical Logic (brake control, fail-safe) | Not Ready | 40-60% | Too risky |
| SWE.3: MISRA C Compliance Check (via cppcheck) | Ready | 90-95% | Static analysis |
| SWE.3: Doxygen Comments | Ready | 90-95% | Auto-gen |
| SWE.4: Unit Test Generation (typical + boundary values) | Ready | 80-85% | Boundary values |
| SWE.4: Edge Case Tests (complex, novel scenarios) | Partial | 60-70% | Human adds |
| SWE.4: Coverage Analysis (gcov, report generation) | Ready | 95-100% | gcov parsing |
| SWE.5: Integration Test (HIL test scaffolding) | Partial | 50-60% | HIL setup is complex |
| SWE.6: System Test Execution (manual, proving ground) | Not Ready | 30-40% | Needs human |
| SUP.1: SDS Generation (from code + comments) | Ready | 85-90% | Doxygen to PDF |
| SUP.2: Code Review (MISRA) | Ready | 90-95% | PC-lint |
| SUP.8: Version Control (commits, tags, branching) | Ready | 95-100% | Git commands |
| SUP.9: Problem Resolution (debugging, root cause) | Partial | 50-70% | Standard bugs OK |
Legend:
- [PASS] Ready: AI can perform task with 80%+ success, minimal human intervention
- [WARN] Partial: AI assists (50-80% success), significant human review/completion needed
- [FAIL] Not Ready: AI struggles (<50% success), human should own task
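The legend's thresholds can be expressed as a small triage helper. This is only a sketch; the band boundaries come directly from the legend above, and the function name is illustrative:

```python
def readiness(success_rate_pct):
    """Map a measured task success rate (%) to the readiness legend."""
    if success_rate_pct >= 80:
        return "[PASS] Ready"      # minimal human intervention
    if success_rate_pct >= 50:
        return "[WARN] Partial"    # significant human review/completion needed
    return "[FAIL] Not Ready"      # human should own the task
```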
Model and Version Dependency: The success rates above are based on GPT-4/claude-opus-4-6-class models (2025-2026); earlier models (GPT-3.5, Claude-2) may score 10-20 percentage points lower. Reassess the capability matrix quarterly as models improve, and document the model version used in the project toolchain configuration.
High-Value Automation Candidates
Tasks Where AI Excels
1. Traceability Matrix Generation [PASS]
- Success Rate: 95-100%
- Why AI Excels: Pattern matching (@implements tags in code), no creativity needed
- ROI: 90% time savings (10h → 1h)
- Recommendation: Fully automate, spot-check 10% for correctness
Example:
# AI Agent: Traceability Matrix Generator
def generate_traceability_matrix(source_files, requirements_db):
    """
    Parse source code for @implements tags, link to requirements
    """
    matrix = []
    for file in source_files:
        for line_number, line in enumerate(file.lines, start=1):
            if "@implements" in line:
                req_id = extract_requirement_id(line)  # e.g., "SWE-045"
                function = extract_function_name(file, line_number)
                test_case = find_test_case(req_id)  # Search test files
                matrix.append((req_id, function, file.name, test_case))
    export_to_excel(matrix, "traceability_matrix.xlsx")
    return matrix
# Human: Spot-check 10% of links (50 out of 500), approve if correct
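The generator relies on helpers such as extract_requirement_id. A minimal sketch of that one, assuming tags follow an @implements [SWE-045] bracket format (the regex is an illustrative guess, not a project spec):

```python
import re

# Assumed tag format: "@implements [SWE-045] Description..." (brackets optional).
REQ_ID_PATTERN = re.compile(r"@implements\s+\[?([A-Z]+-\d+)\]?")

def extract_requirement_id(line):
    """Return the requirement ID from a tagged source line, or None."""
    match = REQ_ID_PATTERN.search(line)
    return match.group(1) if match else None
```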
2. MISRA C Compliance Checking [PASS]
- Success Rate: 90-95%
- Why AI Excels: Static analysis tool integration (deterministic), no judgment needed
- ROI: 83% time savings (1h → 10min)
- Recommendation: Fully automate, human reviews findings
Workflow:
# AI Agent runs static analyzer
cppcheck --addon=misra --xml src/*.c 2> misra_report.xml  # cppcheck writes XML results to stderr
# AI parses XML, generates human-readable report
python ai_misra_reporter.py misra_report.xml
# Output:
# - 12 violations found
# - 8 auto-fixable (explicit casts, const qualifiers)
# - 4 require manual review (complex rule 21.3: malloc in safety-critical code)
# AI auto-fixes trivial violations, creates pull request
# Human reviews and merges
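ai_misra_reporter.py is a project-specific script, not a published tool. A minimal sketch of its parsing step, assuming cppcheck's XML (version 2) report format and MISRA addon IDs of the form misra-c2012-X.Y:

```python
import xml.etree.ElementTree as ET

def summarize_misra_report(xml_path):
    """Count and list MISRA violations from a cppcheck XML report."""
    root = ET.parse(xml_path).getroot()
    violations = [
        (err.get("id"), err.get("msg"))
        for err in root.iter("error")
        if err.get("id", "").startswith("misra")  # skip non-MISRA findings
    ]
    print(f"{len(violations)} violations found")
    for rule_id, msg in violations:
        print(f"- {rule_id}: {msg}")
    return violations
```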
3. Doxygen Comment Generation [PASS]
- Success Rate: 90-95%
- Why AI Excels: Code analysis (function signature, return type), template-based
- ROI: 80% time savings (10min → 2min per function)
- Recommendation: Automate for all functions, human spot-checks safety notes
Example:
// Input: Function without Doxygen comment
uint32_t CRC32_Calculate(const uint8_t* data, size_t length);
// AI-generated Doxygen header:
/**
* @brief Calculate CRC-32 checksum (IEEE 802.3 polynomial)
* @implements [SWE-078] CRC Checksum Calculation
* @safety_class ASIL-B
*
* @param[in] data Pointer to data buffer (must not be NULL)
* @param[in] length Data length in bytes (range: 0-65535)
* @return CRC-32 checksum (32-bit unsigned integer)
*
* @note Polynomial: 0xEDB88320 (reflected CRC-32)
* @note Complexity: O(n), where n = length
*/
uint32_t CRC32_Calculate(const uint8_t* data, size_t length);
Human Review: Verify the @safety_class tag (is ASIL-B correct?), then approve
Low-Value Automation Candidates
Tasks Where AI Struggles
1. Safety-Critical Logic Design [FAIL]
- Success Rate: 40-60% (too low for safety)
- Why AI Fails: Requires domain expertise (failure modes, hazard analysis, ISO 26262 knowledge)
- Example Failure:
- Prompt: "Design brake control logic for ACC, ASIL-B compliant"
- AI Output: Generic PID controller (60% correct)
- Missing: Fail-safe behavior, redundancy, watchdog, and diagnostic coverage
- Recommendation: Human designs, AI generates the implementation from the human spec
Mitigation Strategy: For safety-critical tasks, use AI for: (1) drafting non-safety code sections, (2) generating boilerplate/scaffolding, (3) documenting human-designed logic. Always keep a human in the design loop for safety decisions, per ISO 26262 Part 8.
2. Architectural Trade-Off Decisions [FAIL]
- Success Rate: N/A (not a suitable AI task)
- Why AI Fails: Requires business context (cost, schedule, OEM requirements), long-term vision
- Example:
- Question: "Use AUTOSAR Classic or Adaptive for this ECU?"
- AI Response: Lists pros and cons (correct), but cannot make the final decision
- Human Needed: Architect considers project constraints and makes the call
- Recommendation: Human decides, documents the decision in an ADR, AI drafts the ADR template
3. Novel Bug Debugging [WARN]
- Success Rate: 50-70% (depends on bug complexity)
- Why AI Struggles: Novel bugs require creative problem-solving, not just pattern matching
- Example Success: Standard bug (null pointer dereference) → AI suggests fix [PASS]
- Example Failure: Timing bug (race condition, only occurs at 200 km/h) → AI cannot reproduce [FAIL]
- Recommendation: AI assists (log analysis, stack-trace triage); human investigates root cause
Task Delegation Decision Tree
When to Use AI vs Human
Start: New ASPICE task assigned
│
▼
Is task safety-critical?
│
├─ Yes → Human designs logic
│ AI generates implementation
│ Human reviews 100%
│
└─ No → Is task routine/standard?
│
├─ Yes (e.g., CRC, sort, getters)
│ → AI generates code
│ → Human spot-checks (10%)
│
└─ No (novel/creative)
→ Human implements
→ AI assists (suggestions)
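The decision tree above can be sketched as a delegation helper. The flag names and return strings are illustrative, chosen to mirror the tree's branches:

```python
def delegate(safety_critical, routine):
    """Decide task ownership per the delegation decision tree."""
    if safety_critical:
        # Safety branch: human owns the design regardless of routineness.
        return "Human designs; AI implements; human reviews 100%"
    if routine:
        # Routine, non-safety work (e.g., CRC, sort, getters).
        return "AI generates code; human spot-checks 10%"
    # Novel/creative, non-safety work.
    return "Human implements; AI assists with suggestions"
```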
ROI Prioritization
Highest ROI Tasks for AI Automation
| Task | AI Success Rate | Time Savings | ROI | Priority |
|---|---|---|---|---|
| Traceability Matrix | 95-100% | 90% (10h → 1h) | ★★★★★ | High |
| Doxygen Comments | 90-95% | 80% (10min → 2min per fn) | ★★★★★ | High |
| MISRA Checking | 90-95% | 83% (1h → 10min) | ★★★★★ | High |
| Unit Test Generation | 80-85% | 75% (1h → 15min) | ★★★★☆ | High |
| Boilerplate Code | 90-95% | 80% (5min → 1min per fn) | ★★★★☆ | Medium (low absolute time) |
| ADR Drafting | 80-90% | 75% (4h → 1h) | ★★★★☆ | Medium |
| Requirements Extraction | 70-80% | 75% (8h → 2h) | ★★★☆☆ | Medium (needs QC) |
| UML Diagrams | 85-95% | 62% (8h → 3h) | ★★★☆☆ | Low (infrequent task) |
Recommendation: Prioritize automation for High ROI tasks (traceability, MISRA, Doxygen, unit tests)
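The time-savings percentages in the table reduce to savings = (before − after) / before. A quick sanity check (rounding to whole percent; 62.5 rounds to 62 under Python's banker's rounding, matching the UML row):

```python
def time_savings_pct(before_hours, after_hours):
    """Percentage of time saved when a task drops from before to after."""
    return round(100 * (before_hours - after_hours) / before_hours)

# Rows from the ROI table above:
# Traceability Matrix: 10h -> 1h, MISRA: 1h -> 10min, UML: 8h -> 3h
```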
Summary
Capability Mapping Insights:
- Best AI Tasks: Traceability, MISRA checking, Doxygen, unit tests (90%+ success, 75-90% time savings)
- Avoid AI For: Safety logic design, architectural decisions, novel debugging (<50% success)
- Partial AI Use: Requirements extraction, ADR drafting, integration tests (70-80% success, needs human completion)
- ROI Focus: Automate high-frequency, high-time-savings tasks first (traceability: 10h → 1h = 90% savings)
Next: Limitation acknowledgment (when AI must escalate to human) - 29.04