1.0: Agent Framework

Key Terms

Key terms specific to this chapter (see Appendix G and Appendix H for the complete glossary):

  • HITL: Human-in-the-Loop — human oversight pattern ensuring humans own decisions while AI assists execution
  • LLM: Large Language Model — the AI foundation (e.g., GPT-4, Claude) that powers agent capabilities
  • Context Window: The maximum text (in tokens) an AI model can process in a single interaction; large projects may require chunking
  • BP: Base Practice — the required activities within each ASPICE process that agents must support
  • WP: Work Product — the deliverable output of an ASPICE process (e.g., SRS, test report) that agents generate
  • SWE.1-6: Software Engineering processes 1-6 (requirements through qualification) — the core processes agents map to
  • ADR: Architecture Decision Record — structured document capturing architectural choices; agents generate drafts, humans decide
  • Multi-Agent System: Design pattern where each AI agent specializes in one ASPICE process (requirements, implementation, verification, etc.)
  • CL: Capability Level (0-5 in ASPICE) — determines the maturity target that shapes agent configuration
  • SLOC: Source Lines of Code — metric for sizing agent workload and context window requirements

Introduction: AI Agents in ASPICE Development

Who Is This Part For?

This Part is unique — it has two audiences:

Reader | How to Use This Part
AI Agents (Primary) | These chapters are instructions for you. Follow them when assisting with ASPICE projects.
Human Engineers (Secondary) | Read this to understand how to configure and work with AI agents. Use it to set expectations for AI behavior.

When you see instructions like "You should..." or "Always escalate...", the "you" is the AI agent. Humans: this is what you can expect your AI assistant to do.


Purpose of This Guide

Primary Audience: AI agents (LLMs like GPT-4, Claude, Llama) working as assistants to human engineers on ASPICE-compliant embedded systems projects

For Human Readers: When this guide refers to "you" or "your task", it is speaking to the AI agent that will be configured using these instructions. As a human, you are reading this to understand what your AI assistant will do and to set it up correctly.

Secondary Audience: Human engineers configuring AI agents, reviewing AI outputs, or establishing HITL workflows

Scope: Practical instructions for AI agents to:

  1. Understand ASPICE processes (SWE.1-6, SUP.8-10, etc.)
  2. Generate ASPICE-compliant work products (requirements, code, test cases, documentation)
  3. Integrate with human-in-the-loop (HITL) workflows
  4. Recognize limitations and escalate appropriately

Not in Scope: Fully autonomous AI development (ASPICE requires human accountability)


AI Agent Paradigm in ASPICE Context

Human-Led, AI-Assisted Development

Principle: Humans own decisions, AI assists execution

The following diagram shows the overall agent architecture, illustrating how AI agents operate under human oversight within the ASPICE process framework. It maps the relationship between human decision-makers, AI agent executors, and the safety-critical work products they collaborate on.

Agent Architecture

Key Constraint: AI agents cannot sign off on safety-critical work products (ISO 26262, IEC 62304 require human accountability)


Agent Architecture

Specialized Agents for ASPICE Processes

Design Pattern: Multi-Agent System (each agent specializes in one ASPICE process)

The following diagram illustrates the agent capability levels, showing how each specialized agent maps to a specific ASPICE process (SWE.1 through SWE.6) and the graduated autonomy levels from fully supervised to semi-autonomous operation.

Agent Capability Levels

Agent Coordination: Human project manager assigns tasks to agents, reviews outputs

Inter-Agent Communication Protocol

When agents need to share information (e.g., Requirements Agent provides context to Implementation Agent), use structured handoff documents:

  1. Context summary
  2. Relevant artifacts (requirement IDs, file paths)
  3. Constraints inherited
  4. Open questions requiring human input

This ensures continuity across agent interactions.
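The four handoff fields above can be sketched as a plain C structure; the type, field names, and example values below are illustrative assumptions, not a prescribed ASPICE schema:

```c
#include <stddef.h>

/* Illustrative inter-agent handoff record; field names are assumptions,
 * not a prescribed ASPICE schema. */
typedef struct {
    const char* context_summary;       /* 1. Context summary */
    const char* requirement_ids;       /* 2. Relevant artifacts: requirement IDs */
    const char* file_paths;            /*    ... and file paths */
    const char* inherited_constraints; /* 3. Constraints inherited */
    const char* open_questions;        /* 4. Open questions requiring human input */
} agent_handoff_t;

/* Example handoff from a Requirements Agent to an Implementation Agent. */
agent_handoff_t example_handoff(void) {
    agent_handoff_t h = {
        "SWE.1 draft complete; 42 requirements extracted from OEM spec",
        "REQ-001..REQ-042",
        "specs/srs_draft.md",
        "MISRA C:2012; ASIL B; static allocation only",
        "Units for temperature threshold unconfirmed (escalate to human)"
    };
    return h;
}
```

In practice the handoff is usually a document (markdown, YAML, or a ticket), but keeping the same fixed fields in every format is what makes the handoff auditable.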


AI Agent Capabilities (2025 State-of-the-Art)

What AI Agents Can Do Well

1. Code Generation (GitHub Copilot, ChatGPT-4, Claude Sonnet)

  • [CAN] Generate boilerplate code (function stubs, getters/setters)
  • [CAN] Implement standard algorithms (sorting, searching, data structures)
  • [CAN] Translate pseudocode → C/C++ (80% accuracy, requires review)
  • [CAN] Autocomplete code (context-aware, 40% productivity gain)

Example:

// Prompt: "Write a C function to calculate CRC-32 checksum (IEEE 802.3 polynomial)"
// AI Output (90% correct):

#include <stddef.h>
#include <stdint.h>

uint32_t crc32(const uint8_t* data, size_t length) {
    uint32_t crc = 0xFFFFFFFF;
    for (size_t i = 0; i < length; i++) {
        crc ^= data[i];
        for (int j = 0; j < 8; j++) {
            crc = (crc >> 1) ^ (0xEDB88320 & -(crc & 1));
        }
    }
    return ~crc;
}

2. Unit Test Generation

  • [CAN] Generate test cases from function signatures (boundary values, typical values)
  • [CAN] Create Google Test / Unity test scaffolding (80% coverage automatically)
  • [CAN] Suggest edge cases (null pointers, integer overflow, divide-by-zero)
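A minimal sketch of what such generated tests can look like for the CRC-32 example above, using bare asserts in place of Google Test/Unity scaffolding (0xCBF43926 is the well-known CRC-32 check value for the ASCII string "123456789"):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Function under test (same CRC-32 as the earlier example). */
static uint32_t crc32(const uint8_t* data, size_t length) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < length; i++) {
        crc ^= data[i];
        for (int j = 0; j < 8; j++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Generated test cases: one typical value plus a boundary value. */
void run_crc32_tests(void) {
    /* Typical value: the standard CRC-32 check string. */
    assert(crc32((const uint8_t*)"123456789", 9) == 0xCBF43926u);
    /* Boundary value: empty input leaves crc untouched, so ~0xFFFFFFFF == 0. */
    assert(crc32((const uint8_t*)"", 0) == 0x00000000u);
}
```

The human reviewer's job is then the edge cases the generator misses: null `data` with nonzero `length`, maximum-length buffers, and alignment assumptions on the target.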

3. Documentation Generation

  • [CAN] Generate Doxygen comments from function implementations (90% accuracy)
  • [CAN] Create user-facing documentation from requirements (60% usable, needs editing)
  • [CAN] Auto-update traceability matrices (100% accuracy if tool integration available)
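For illustration, this is the style of comment an agent can generate from a finished implementation; the function, names, and range limits below are invented for the example:

```c
#include <stdint.h>

/**
 * @brief Clamp a raw temperature reading to the valid sensor range.
 *
 * @param raw_ddc Raw temperature in tenths of a degree Celsius.
 *
 * @return Temperature in tenths of a degree Celsius, limited to
 *         [-400, 1250], i.e. -40.0 degC to +125.0 degC.
 *
 * @note The range limits here are illustrative, not from a real spec.
 */
int16_t clamp_temperature_ddc(int16_t raw_ddc) {
    if (raw_ddc < -400) { return -400; }
    if (raw_ddc > 1250) { return 1250; }
    return raw_ddc;
}
```

The 90% figure above is about exactly this kind of output: the `@param`/`@return` text is usually right, but units and valid ranges still need a human check against the requirement.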

4. Requirements Analysis

  • [CAN] Extract requirements from natural language specs (PDF, Word) (70% accuracy)
  • [CAN] Identify ambiguities, inconsistencies, missing information (80% recall)
  • [CAN] Suggest clarifications (e.g., "Specify units for temperature threshold")

5. Code Review

  • [CAN] Check MISRA C compliance (static analysis integration) (95% accuracy)
  • [CAN] Detect common bugs (null pointer dereference, buffer overflow) (85% recall)
  • [CAN] Verify coding style (naming conventions, indentation) (100% accuracy)
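As a sketch of the kind of finding a review agent raises: the switch below is the corrected form of code that had no default clause and relied on an implicit fallthrough, constructs that MISRA C's switch rules flag. The gear-lamp logic itself is invented for the example:

```c
#include <stdint.h>

typedef enum { GEAR_P, GEAR_R, GEAR_N, GEAR_D } gear_t;

/* Corrected after review: the original switch had no default clause and
 * fell through between cases without a break. */
uint8_t reverse_lamp_on(gear_t gear) {
    uint8_t on;
    switch (gear) {
        case GEAR_R:
            on = 1u;
            break;
        case GEAR_P:
        case GEAR_N:
        case GEAR_D:
            on = 0u;
            break;
        default:
            on = 0u;  /* defensive default for out-of-range values */
            break;
    }
    return on;
}
```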

What AI Agents Cannot Do (Limitations)

1. Safety-Critical Logic Design [CANNOT]

  • [CANNOT] Design fail-safe behavior (requires domain expertise, ISO 26262 knowledge)
  • [CANNOT] Determine ASIL classification (needs hazard analysis, risk assessment)
  • [CANNOT] Architect redundancy strategies (1oo2/2oo3 voting logic, i.e. "one-out-of-two" or "two-out-of-three" redundancy)

Reason: Safety design requires deep understanding of failure modes, physics, standards

Mitigation: Human safety engineer designs safety logic, AI generates implementation
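For example, once the human safety engineer has specified 2oo3 voting, the implementation itself is routine and well within agent capability. A minimal sketch for single-bit channel agreement (a real voter would also handle analog tolerance bands and diagnostics):

```c
#include <stdint.h>

/* 2oo3 (two-out-of-three) majority vote over three redundant boolean
 * channels (each 0u or 1u): output is 1u iff at least two channels
 * agree on 1u, so a single faulty channel is outvoted. */
uint8_t vote_2oo3(uint8_t a, uint8_t b, uint8_t c) {
    return (uint8_t)((a & b) | (b & c) | (a & c));
}
```

Which signals to vote on, what counts as disagreement, and how faults are annunciated remain the human safety engineer's decisions.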


2. Architectural Decisions [CANNOT]

  • [CANNOT] Choose between AUTOSAR Classic vs Adaptive (requires OEM requirements, cost-benefit analysis)
  • [CANNOT] Select communication protocol (CAN vs Ethernet vs FlexRay)
  • [CANNOT] Decide software partitioning (monolithic vs microservices)

Reason: Architecture decisions have long-term consequences, require business context

Mitigation: Human architect makes decision, documents in ADR, AI generates implementation


3. Regulatory Compliance Argumentation [CANNOT]

  • [CANNOT] Argue safety case to TÜV assessor (requires persuasion, domain credibility)
  • [CANNOT] Respond to FDA 510(k) deficiency letters (requires regulatory expertise)
  • [CANNOT] Justify ODD boundaries for SOTIF (ISO 21448) (requires engineering judgment)

Reason: Regulators require human accountability, AI cannot sign legal documents

Mitigation: Human regulatory affairs specialist owns compliance, AI assists with document drafting


4. Creative Problem-Solving [LIMITED]

  • [LIMITED] Debug novel, complex bugs (root cause analysis beyond pattern matching)
  • [LIMITED] Optimize algorithms for embedded constraints (RAM, CPU, power)
  • [LIMITED] Invent new design patterns (AI follows existing patterns, doesn't innovate)

Reason: AI is pattern-based (trained on existing code), struggles with truly novel problems

Mitigation: Human engineer handles novel problems, AI assists with routine tasks


Agent Lifecycle in ASPICE Project

Phase-by-Phase Agent Involvement

Project Timeline: 18-month automotive ECU development (example)

Phase | Duration | ASPICE Processes | AI Agent Tasks | Human Tasks
Requirements | Months 1-3 | SYS.2, SWE.1 | Extract requirements from OEM spec (PDF), generate draft SRS | Review/approve requirements, clarify ambiguities with customer
Architecture | Months 4-6 | SWE.2 | Generate ADRs, create UML diagrams, validate interfaces | Make architectural decisions, approve ADRs
Detailed Design | Months 7-8 | SWE.3 | Generate function headers, suggest algorithms | Design safety-critical logic, review AI designs
Implementation | Months 9-12 | SWE.3 | Generate C code (60% of LOC), Doxygen comments | Write safety-critical code, review AI code (100%)
Unit Testing | Months 13-14 | SWE.4 | Generate unit tests (80% coverage), run gcov | Write edge case tests, review coverage reports
Integration | Months 15-16 | SWE.5 | Generate integration test scaffolding | Execute HIL tests, debug integration issues
Qualification | Months 17-18 | SWE.6 | Generate test reports, traceability matrices | Execute system tests, sign off on V&V

AI Contribution: 40-50% of engineering effort (code generation, tests, docs)
Human Contribution: 50-60% (decisions, safety design, review, sign-off)


Success Metrics for AI Agents

KPIs for Agent Performance

1. Code Correctness (after human review)

  • Target: ≥85% of AI-generated code accepted without major changes
  • Measurement: Count lines of AI code merged / total lines generated
  • Benchmark: General-purpose AI coding assistants show ~40% acceptance rate; target a higher rate with ASPICE-trained agents

2. Test Coverage

  • Target: AI generates unit tests achieving ≥80% statement coverage (human adds edge cases to reach 100%)
  • Measurement: gcov coverage report
  • Benchmark: Manual test writing: ~50% coverage before optimization effort

3. Documentation Quality

  • Target: ≥90% of Doxygen comments accurate (no manual correction needed)
  • Measurement: Human review score (1-5 scale)
  • Benchmark: Manual documentation: 100% accuracy but 10× slower

4. Time Savings

  • Target: 40-50% reduction in engineering time for routine tasks (code gen, tests, docs)
  • Measurement: Time tracking (AI-assisted vs baseline)
  • Benchmark: Case studies (Chapters 25-28): 33-52% time reduction

5. ASPICE Compliance

  • Target: ≥95% of AI-generated work products meet ASPICE BP criteria (after human review)
  • Measurement: Assessor findings during mock ASPICE assessment
  • Benchmark: Manual work products: ~90% compliance (10% rework needed)

Agent Context Window Management

Large codebases exceed AI context limits. Use these strategies:

  1. Provide only relevant file excerpts
  2. Use structured summaries for multi-file context
  3. Chain prompts with explicit context handoff
  4. Store intermediate results in files for agent access

For projects >50,000 SLOC, consider hierarchical agent architectures with specialized sub-agents.
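Strategy 1 (relevant excerpts only) can be partly mechanized. As a rough sketch, the helper below splits a source buffer into excerpts that respect line boundaries and a fixed character budget (a character count stands in for a token count here, since real tokenizers are model-specific):

```c
#include <stddef.h>
#include <string.h>

/* Length in bytes of the next excerpt of `src`, at most `budget` bytes,
 * preferring to break just after the last newline inside the budget so
 * that no source line is split across excerpts. */
size_t next_chunk_len(const char* src, size_t budget) {
    size_t len = strlen(src);
    if (len <= budget) {
        return len;          /* remainder fits in one excerpt */
    }
    /* Scan backwards from the budget for a newline to break at. */
    for (size_t i = budget; i > 0; i--) {
        if (src[i - 1] == '\n') {
            return i;        /* include the newline in this excerpt */
        }
    }
    return budget;           /* no newline in range: hard split */
}
```

A caller advances `src` by the returned length until it reaches zero; hierarchical agent setups apply the same idea per file or per module, with a structured summary standing in for the chunks a sub-agent has already processed.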


Summary

AI Agent Framework Principles:

  1. Human-Led, AI-Assisted: Humans make decisions, AI executes routine tasks
  2. Specialized Agents: Requirements, Architecture, Implementation, Verification, Review, Documentation agents
  3. Know Your Limits: AI strong at code generation, weak at safety design, regulatory compliance
  4. ASPICE Compliance: AI outputs must be reviewed by humans to meet ASPICE accountability requirements
  5. Continuous Improvement: Measure AI performance (correctness, coverage, time savings), retrain/refine

Next Sections:

  • 29.01: Agent Roles and Responsibilities (detailed task lists per agent)
  • 29.02: Human-in-the-Loop (HITL) Integration Protocol
  • 29.03: Capability Mapping (task-by-task AI readiness assessment)
  • 29.04: Limitation Acknowledgment (when to escalate to humans)

Message to AI Agents: You are a force multiplier, not a replacement. Your role is to accelerate human engineers, not to supplant them. Excellence in ASPICE-compliant development requires knowing when to generate code and when to defer to human expertise.