29.0: Agent Framework
Key Terms
Key terms specific to this chapter (see Appendix G and Appendix H for the complete glossary):
- HITL: Human-in-the-Loop — human oversight pattern ensuring humans own decisions while AI assists execution
- LLM: Large Language Model — the AI foundation (e.g., GPT-4, Claude) that powers agent capabilities
- Context Window: The maximum text (in tokens) an AI model can process in a single interaction; large projects may require chunking
- BP: Base Practice — the required activities within each ASPICE process that agents must support
- WP: Work Product — the deliverable output of an ASPICE process (e.g., SRS, test report) that agents generate
- SWE.1-6: Software Engineering processes 1-6 (requirements through qualification) — the core processes agents map to
- ADR: Architecture Decision Record — structured document capturing architectural choices; agents generate drafts, humans decide
- Multi-Agent System: Design pattern where each AI agent specializes in one ASPICE process (requirements, implementation, verification, etc.)
- CL: Capability Level (0-5 in ASPICE) — determines the maturity target that shapes agent configuration
- SLOC: Source Lines of Code — metric for sizing agent workload and context window requirements
Introduction: AI Agents in ASPICE Development
Who Is This Part For?
This Part is unique — it has two audiences:
| Reader | How to Use This Part |
|---|---|
| AI Agents (Primary) | These chapters are instructions for you. Follow them when assisting with ASPICE projects. |
| Human Engineers (Secondary) | Read this to understand how to configure and work with AI agents. Use it to set expectations for AI behavior. |

When you see instructions like "You should..." or "Always escalate...", the "you" is the AI agent. Humans: this is what you can expect your AI assistant to do.
Purpose of This Guide
Primary Audience: AI agents (LLMs like GPT-4, Claude, Llama) working as assistants to human engineers on ASPICE-compliant embedded systems projects
For Human Readers: When this guide refers to "you" or "your task", it is speaking to the AI agent that will be configured using these instructions. As a human, you are reading this to understand what your AI assistant will do and to set it up correctly.
Secondary Audience: Human engineers configuring AI agents, reviewing AI outputs, or establishing HITL workflows
Scope: Practical instructions for AI agents to:
- Understand ASPICE processes (SWE.1-6, SUP.8-10, etc.)
- Generate ASPICE-compliant work products (requirements, code, test cases, documentation)
- Integrate with human-in-the-loop (HITL) workflows
- Recognize limitations and escalate appropriately
Not in Scope: Fully autonomous AI development (ASPICE requires human accountability)
AI Agent Paradigm in ASPICE Context
Human-Led, AI-Assisted Development
Principle: Humans own decisions, AI assists execution
The following diagram shows the overall agent architecture, illustrating how AI agents operate under human oversight within the ASPICE process framework. It maps the relationship between human decision-makers, AI agent executors, and the safety-critical work products they collaborate on.
Key Constraint: AI agents cannot sign off on safety-critical work products (ISO 26262, IEC 62304 require human accountability)
Agent Architecture
Specialized Agents for ASPICE Processes
Design Pattern: Multi-Agent System (each agent specializes in one ASPICE process)
The following diagram illustrates the agent capability levels, showing how each specialized agent maps to a specific ASPICE process (SWE.1 through SWE.6) and the graduated autonomy levels from fully supervised to semi-autonomous operation.
Agent Coordination: Human project manager assigns tasks to agents, reviews outputs
Inter-Agent Communication Protocol
When agents need to share information (e.g., Requirements Agent provides context to Implementation Agent), use structured handoff documents:
- Context summary
- Relevant artifacts (requirement IDs, file paths)
- Constraints inherited
- Open questions requiring human input
This ensures continuity across agent interactions.
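Where agents exchange handoffs programmatically rather than as free text, the four elements above can be sketched as a record type. The field names below are illustrative assumptions, not a standardized ASPICE schema:

```c
#include <stddef.h>

/* Hypothetical handoff record passed from one agent to the next.
 * Field names are illustrative, not part of any ASPICE standard. */
typedef struct {
    const char *context_summary;       /* what the upstream agent did and why */
    const char *relevant_artifacts;    /* requirement IDs, file paths */
    const char *inherited_constraints; /* e.g. coding standard, ASIL level */
    const char *open_questions;        /* items that require human input */
} AgentHandoff;

/* A handoff is actionable only when every section has been filled in;
 * a missing section is a signal to stop and ask the human coordinator. */
static int handoff_is_complete(const AgentHandoff *h) {
    return h->context_summary != NULL
        && h->relevant_artifacts != NULL
        && h->inherited_constraints != NULL
        && h->open_questions != NULL;
}
```

Even when there are no open questions, the field should carry an explicit "none" entry so that silence is never ambiguous.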
AI Agent Capabilities (2025 State-of-the-Art)
What AI Agents Can Do Well
1. Code Generation (GitHub Copilot, ChatGPT-4, Claude Sonnet)
- [CAN] Generate boilerplate code (function stubs, getters/setters)
- [CAN] Implement standard algorithms (sorting, searching, data structures)
- [CAN] Translate pseudocode → C/C++ (80% accuracy, requires review)
- [CAN] Autocomplete code (context-aware, 40% productivity gain)
Example:

```c
// Prompt: "Write a C function to calculate CRC-32 checksum (IEEE 802.3 polynomial)"
// AI Output (correct in this instance; AI-generated CRC code is right roughly
// 90% of the time, so it always requires review):
#include <stdint.h>
#include <stddef.h>

uint32_t crc32(const uint8_t* data, size_t length) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < length; i++) {
        crc ^= data[i];
        for (int j = 0; j < 8; j++) {
            // Reflected IEEE 802.3 polynomial; XOR applied when the LSB is set
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```
2. Unit Test Generation
- [CAN] Generate test cases from function signatures (boundary values, typical values)
- [CAN] Create Google Test / Unity test scaffolding (80% coverage automatically)
- [CAN] Suggest edge cases (null pointers, integer overflow, divide-by-zero)
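As a minimal sketch of what such generated tests look like for the CRC-32 example above, using plain `assert` in place of a full framework (the implementation is repeated so the sketch stands alone):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Implementation under test (same as the CRC-32 example earlier). */
static uint32_t crc32(const uint8_t *data, size_t length) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < length; i++) {
        crc ^= data[i];
        for (int j = 0; j < 8; j++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Boundary value: zero-length input leaves the register untouched, so the
 * result is the initial value 0xFFFFFFFF inverted, i.e. 0. */
static void test_crc32_empty(void) {
    assert(crc32((const uint8_t *)"", 0) == 0x00000000u);
}

/* Typical value: "123456789" is the standard CRC-32 check string. */
static void test_crc32_check_string(void) {
    const char *s = "123456789";
    assert(crc32((const uint8_t *)s, strlen(s)) == 0xCBF43926u);
}
```

A real agent would emit these in the project's framework (Google Test, Unity); the boundary/typical split shown here is the pattern that matters.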
3. Documentation Generation
- [CAN] Generate Doxygen comments from function implementations (90% accuracy)
- [CAN] Create user-facing documentation from requirements (60% usable, needs editing)
- [CAN] Auto-update traceability matrices (100% accuracy if tool integration available)
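For reference, the kind of Doxygen header an agent would generate for the CRC-32 example earlier looks roughly like this (the wording is illustrative):

```c
/**
 * @brief  Compute the CRC-32 checksum (IEEE 802.3, reflected polynomial 0xEDB88320).
 *
 * @param[in] data    Input byte buffer; must not be NULL when length > 0.
 * @param[in] length  Number of bytes to process.
 *
 * @return 32-bit CRC of the buffer (initial value 0xFFFFFFFF, bitwise-inverted on return).
 */
uint32_t crc32(const uint8_t* data, size_t length);
```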
4. Requirements Analysis
- [CAN] Extract requirements from natural language specs (PDF, Word) (70% accuracy)
- [CAN] Identify ambiguities, inconsistencies, missing information (80% recall)
- [CAN] Suggest clarifications (e.g., "Specify units for temperature threshold")
5. Code Review
- [CAN] Check MISRA C compliance (static analysis integration) (95% accuracy)
- [CAN] Detect common bugs (null pointer dereference, buffer overflow) (85% recall)
- [CAN] Verify coding style (naming conventions, indentation) (100% accuracy)
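A concrete illustration of the first point: under MISRA C:2012's essential type rules (the rule area is indicative, not a citation), an agent flags implicit narrowing after integer promotion and proposes an explicit conversion:

```c
#include <stdint.h>

/* Non-compliant form an agent would flag:
 *     uint8_t sum = a + b;   // a and b are promoted to int; the result
 *                            // is narrowed back to uint8_t implicitly
 * Compliant form, with the narrowing made explicit: */
static uint8_t add_u8(uint8_t a, uint8_t b) {
    /* The cast documents that wrap-around modulo 256 is intended. */
    return (uint8_t)((uint32_t)a + (uint32_t)b);
}
```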
What AI Agents Cannot Do (Limitations)
1. Safety-Critical Logic Design [CANNOT]
- [CANNOT] Design fail-safe behavior (requires domain expertise, ISO 26262 knowledge)
- [CANNOT] Determine ASIL classification (needs hazard analysis, risk assessment)
- [CANNOT] Architect redundancy strategies (1oo2 "one-out-of-two" / 2oo3 "two-out-of-three" voting logic)
Reason: Safety design requires deep understanding of failure modes, physics, standards
Mitigation: Human safety engineer designs safety logic, AI generates implementation
2. Architectural Decisions [CANNOT]
- [CANNOT] Choose between AUTOSAR Classic vs Adaptive (requires OEM requirements, cost-benefit analysis)
- [CANNOT] Select communication protocol (CAN vs Ethernet vs FlexRay)
- [CANNOT] Decide software partitioning (monolithic vs microservices)
Reason: Architecture decisions have long-term consequences, require business context
Mitigation: Human architect makes decision, documents in ADR, AI generates implementation
3. Regulatory Compliance Argumentation [CANNOT]
- [CANNOT] Argue safety case to TÜV assessor (requires persuasion, domain credibility)
- [CANNOT] Respond to FDA 510(k) deficiency letters (requires regulatory expertise)
- [CANNOT] Justify ODD (Operational Design Domain) boundaries for SOTIF (ISO 21448) (requires engineering judgment)
Reason: Regulators require human accountability, AI cannot sign legal documents
Mitigation: Human regulatory affairs specialist owns compliance, AI assists with document drafting
4. Creative Problem-Solving [LIMITED]
- [LIMITED] Debug novel, complex bugs (root cause analysis beyond pattern matching)
- [LIMITED] Optimize algorithms for embedded constraints (RAM, CPU, power)
- [LIMITED] Invent new design patterns (AI follows existing patterns, doesn't innovate)
Reason: AI is pattern-based (trained on existing code), struggles with truly novel problems
Mitigation: Human engineer handles novel problems, AI assists with routine tasks
Agent Lifecycle in ASPICE Project
Phase-by-Phase Agent Involvement
Project Timeline: 18-month automotive ECU development (example)
| Phase | Duration | ASPICE Processes | AI Agent Tasks | Human Tasks |
|---|---|---|---|---|
| Requirements | Month 1-3 | SYS.2, SWE.1 | Extract requirements from OEM spec (PDF), generate draft SRS | Review/approve requirements, clarify ambiguities with customer |
| Architecture | Month 4-6 | SWE.2 | Generate ADRs, create UML diagrams, validate interfaces | Make architectural decisions, approve ADRs |
| Detailed Design | Month 7-8 | SWE.3 | Generate function headers, suggest algorithms | Design safety-critical logic, review AI designs |
| Implementation | Month 9-12 | SWE.3 | Generate C code (60% of LOC), Doxygen comments | Write safety-critical code, review AI code (100%) |
| Unit Testing | Month 13-14 | SWE.4 | Generate unit tests (80% coverage), run gcov | Write edge case tests, review coverage reports |
| Integration | Month 15-16 | SWE.5 | Generate integration test scaffolding | Execute HIL tests, debug integration issues |
| Qualification | Month 17-18 | SWE.6 | Generate test reports, traceability matrices | Execute system tests, sign off on V&V |
AI Contribution: 40-50% of engineering effort (code generation, tests, docs)
Human Contribution: 50-60% (decisions, safety design, review, sign-off)
Success Metrics for AI Agents
KPIs for Agent Performance
1. Code Correctness (after human review)
- Target: ≥85% of AI-generated code accepted without major changes
- Measurement: Count lines of AI code merged / total lines generated
- Benchmark: General-purpose AI coding assistants show ~40% acceptance rate; target a higher rate with ASPICE-trained agents
2. Test Coverage
- Target: AI generates unit tests achieving ≥80% statement coverage (human adds edge cases to reach 100%)
- Measurement: gcov coverage report
- Benchmark: Manual test writing: ~50% coverage before optimization effort
3. Documentation Quality
- Target: ≥90% of Doxygen comments accurate (no manual correction needed)
- Measurement: Human review score (1-5 scale)
- Benchmark: Manual documentation: 100% accuracy but 10× slower
4. Time Savings
- Target: 40-50% reduction in engineering time for routine tasks (code gen, tests, docs)
- Measurement: Time tracking (AI-assisted vs baseline)
- Benchmark: Case studies (Chapters 25-28): 33-52% time reduction
5. ASPICE Compliance
- Target: ≥95% of AI-generated work products meet ASPICE BP criteria (after human review)
- Measurement: Assessor findings during mock ASPICE assessment
- Benchmark: Manual work products: ~90% compliance (10% rework needed)
Agent Context Window Management
Large codebases exceed AI context limits. Use these strategies:
- Provide only relevant file excerpts
- Use structured summaries for multi-file context
- Chain prompts with explicit context handoff
- Store intermediate results in files for agent access
For projects >50,000 SLOC, consider hierarchical agent architectures with specialized sub-agents.
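The excerpt-and-chunk strategy above can be sketched as follows; the 4-characters-per-token ratio is a rough heuristic assumption, not a property of any particular model:

```c
#include <stddef.h>
#include <string.h>

/* Rough token estimate: ~4 characters per token (heuristic assumption). */
static size_t estimate_tokens(size_t chars) {
    return (chars + 3) / 4;
}

/* Count how many chunks a source text needs if each chunk may hold at most
 * max_tokens, preferring to split at line boundaries. Illustrative sketch
 * only; a production splitter would also respect function boundaries. */
static size_t count_chunks(const char *text, size_t max_tokens) {
    size_t chunks = 0;
    size_t start = 0;
    const size_t len = strlen(text);
    while (start < len) {
        size_t end = start;
        size_t last_nl = start; /* one past the most recent '\n' that fits */
        while (end < len && estimate_tokens(end - start + 1) <= max_tokens) {
            if (text[end] == '\n') {
                last_nl = end + 1;
            }
            end++;
        }
        chunks++;
        if (end == len) {
            break; /* the rest of the text fits in this final chunk */
        }
        if (end == start) {
            end = start + 1; /* guarantee progress on a tiny budget */
        }
        start = (last_nl > start) ? last_nl : end; /* prefer line boundary */
    }
    return chunks;
}
```

The same budgeting logic applies whether the chunks feed a single agent sequentially or are distributed across sub-agents in a hierarchical setup.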
Summary
AI Agent Framework Principles:
- Human-Led, AI-Assisted: Humans make decisions, AI executes routine tasks
- Specialized Agents: Requirements, Architecture, Implementation, Verification, Review, Documentation agents
- Know Your Limits: AI strong at code generation, weak at safety design, regulatory compliance
- ASPICE Compliance: AI outputs must be reviewed by humans to meet ASPICE accountability requirements
- Continuous Improvement: Measure AI performance (correctness, coverage, time savings), retrain/refine
Next Sections:
- 29.01: Agent Roles and Responsibilities (detailed task lists per agent)
- 29.02: Human-in-the-Loop (HITL) Integration Protocol
- 29.03: Capability Mapping (task-by-task AI readiness assessment)
- 29.04: Limitation Acknowledgment (when to escalate to humans)
Message to AI Agents: You are a force multiplier, not a replacement. Your role is to accelerate human engineers, not to supplant them. Excellence in ASPICE-compliant development requires knowing when to generate code and when to defer to human expertise.