3.0: Working with AI Assistants

AI as a Collaborative Tool

Purpose of This Chapter

Audience: Engineers learning to work effectively with AI code assistants (GitHub Copilot, ChatGPT, Claude)

Purpose: Master Human-AI collaboration for ASPICE-compliant development

What You'll Learn:

  1. Effective Prompting: How to ask AI for exactly what you need
  2. Reviewing AI Output: Critically evaluate AI-generated code, requirements, tests
  3. HITL Decision-Making: When to trust AI, when to override, when to escalate
  4. AI Tool Selection: Choose the right AI tool for the task

Why This Matters:

  • AI can increase productivity by 35–55% (GitHub Copilot study: 55% faster task completion — Peng et al., 2023, "The Impact of AI on Developer Productivity: Evidence from GitHub Copilot")
  • AI makes mistakes (hallucinations, incorrect code, missed requirements)
  • Human oversight is mandatory for safety-critical systems (ISO 26262, ASPICE)

AI Capability Evolution: AI capabilities improve quarterly. Reassess this chapter's recommendations every 6 months. Tasks marked "AI struggles" today may become "AI capable" in future versions. Follow industry publications (arXiv, OpenAI blog, Anthropic research) for updates.


AI in ASPICE Development

AI Role by Process

| ASPICE Process | AI Can Help With | Human Must Do |
|---|---|---|
| SWE.1 (Requirements) | Extract requirements from docs, detect ambiguities | Final review, approval, stakeholder sign-off |
| SWE.2 (Architecture) | Generate architecture diagrams, ADR templates | Architecture decisions, trade-off analysis |
| SWE.3 (Implementation) | Generate code from requirements, write Doxygen | Review for correctness, MISRA compliance |
| SWE.4 (Unit Testing) | Generate unit tests, test data | Review test coverage, add edge cases |
| SWE.5 (Integration) | Generate integration test scripts | Define integration strategy |
| SUP.2 (Review) | Find MISRA violations, complexity issues | Final approval, safety review |

Key Principle: AI assists, human decides (Human-in-the-Loop mandatory for ASIL-B and above)

AI Confidence Estimation: Add confidence level to AI outputs: High (repetitive/boilerplate tasks), Medium (standard algorithms with some domain specifics), Low (safety-critical logic, novel requirements). Use confidence level to determine review depth.


AI Capabilities vs Limitations

What AI Does Well

1. Code Generation (Boilerplate, Repetitive Code)

/* Prompt: "Generate CAN message parser for radar data (Message ID 0x200, 8 bytes)" */

/* AI Output: [PASS] Good (straightforward implementation) */
typedef struct {
    uint16_t distance_mm;
    int16_t relative_speed_cmps;
    uint8_t valid;
} RadarData_t;

int Parse_RadarCANMessage(const uint8_t* buffer, RadarData_t* output) {
    if ((buffer == NULL) || (output == NULL)) {
        return -1;
    }

    /* Big-endian payload; bytes 5-7 of the 8-byte frame are unused here */
    output->distance_mm = (uint16_t)(((uint16_t)buffer[0] << 8) | buffer[1]);
    output->relative_speed_cmps =
        (int16_t)(((uint16_t)buffer[2] << 8) | buffer[3]);
    output->valid = buffer[4];

    return 0;
}

2. Test Generation (Unit Tests from Requirements)

/* Prompt: "Generate Google Test cases for Parse_RadarCANMessage" */

/* AI Output: [PASS] Good (covers nominal, boundary, error cases) */
TEST(RadarParser, ValidMessage_ParsesCorrectly) { /* ... */ }
TEST(RadarParser, NullBuffer_ReturnsError) { /* ... */ }
TEST(RadarParser, NullOutput_ReturnsError) { /* ... */ }

3. Documentation (Doxygen Headers)

/* Prompt: "Add Doxygen header to this function" */

/* AI Output: [PASS] Good */
/**
 * @brief Parse CAN message from radar sensor
 * @param[in] buffer CAN message buffer (8 bytes)
 * @param[out] output Parsed radar data structure
 * @return 0 = success, -1 = invalid input
 */

4. Refactoring (Extract Function, Simplify Logic)

/* Prompt: "Refactor this 50-line function into smaller functions" */

/* AI Output: [PASS] Good (breaks into 3 small functions) */

5. Bug Finding (Static Analysis Issues)

/* Prompt: "Find bugs in this code" */

/* AI Output: [PASS] Finds: Buffer overflow, null pointer dereference, MISRA violations */

What AI Does Poorly

1. Architecture Decisions (Trade-off Analysis)

/* Prompt: "Should I use AUTOSAR Classic or Adaptive for this ACC ECU?" */

/* AI Output: [FAIL] Generic answer (doesn't know your budget, schedule, constraints) */
/* Human Needed: Trade-off analysis (cost, schedule, OEM requirements) → ADR */

2. Safety-Critical Logic (Fail-Safe Behavior)

/* Prompt: "Implement emergency braking logic for ASIL-B" */

/* AI Output: [WARN] May miss edge cases (sensor failure, redundancy, fail-safe state) */
/* Human Needed: Safety engineer reviews, adds fault handling, FMEA analysis */

3. Requirements Elicitation (Stakeholder Needs)

/* Prompt: "What are the requirements for ACC system?" */

/* AI Output: [FAIL] Generic requirements (not specific to your customer, vehicle, market) */
/* Human Needed: Interview customer, OEM, safety engineer → Write specific requirements */

4. Context-Specific Decisions (Project Constraints)

/* Prompt: "How should I implement sensor fusion?" */

/* AI Output: [WARN] Suggests Kalman filter, ML, etc. (doesn't know your constraints) */
/* Human Needed: Consider budget (€2.5M), schedule (18 months), ASIL-B → Choose EKF */

5. Creativity and Innovation (Novel Solutions)

/* Prompt: "Design a novel obstacle detection algorithm" */

/* AI Output: [FAIL] Rehashes existing algorithms (doesn't invent new ones) */
/* Human Needed: Research, prototyping, experimentation */

Human-AI Collaboration Model

The 70-20-10 Rule

For Typical Feature Development:

  • 70% AI-Generated: Boilerplate code, tests, documentation
  • 20% Human-Modified: Review AI output, fix errors, add edge cases
  • 10% Human-Created: Architecture decisions, safety logic, trade-offs

Example: ACC Speed Control Feature

Task: Implement [SWE-045-11] Calculate Safe Following Distance

1. AI Generates (70%):
   - Function skeleton (ACC_CalculateSafeDistance)
   - Doxygen header
   - 4 unit tests (nominal, boundary, error)
   - Basic implementation (v × 2 seconds)

2. Human Reviews and Modifies (20%):
   - Adds input validation (negative speed → 0)
   - Adds @implements tag for traceability
   - Adds 2 more edge case tests
   - Fixes MISRA violation (explicit cast)

3. Human Creates (10%):
   - Decides on 2-second following time (safety engineering decision)
   - Writes ADR-012 documenting rationale (industry standard, ISO 26262)
   - Reviews with safety engineer (ASIL-B approval)

AI Tool Landscape

Popular AI Code Assistants

| Tool | Best For | Pros | Cons | Cost |
|---|---|---|---|---|
| GitHub Copilot | Code completion in IDE | Fast, context-aware, integrated | Limited to code generation | €10/month |
| ChatGPT-4 | Requirements analysis, architecture | General-purpose, conversational | No IDE integration | €20/month |
| Claude Sonnet | Code review, refactoring | Large context (200k tokens), accurate | No IDE integration | €20/month |
| Tabnine | Embedded systems (trained on C code) | Privacy (on-premise option) | Less accurate than Copilot | €12–39/month |
| Amazon CodeWhisperer | AWS-integrated projects | Free tier, security scans | Limited language support | Free–€19/month |

Recommendation for ASPICE:

  • IDE: GitHub Copilot (code generation while typing)
  • Requirements: ChatGPT-4 or Claude (extract requirements from Word docs)
  • Review: Claude Sonnet (large context, can review entire file)

Workflow with AI

Typical Development Cycle

The following diagram shows a typical AI-assisted development cycle, from task assignment through AI-generated drafts, human review, iteration, and final approval.

[Figure: AI-assisted development cycle]


Best Practices

1. Always Review AI Output

Never blindly trust AI (even if it "looks right")

Checklist:

  • Does code compile?
  • Does it meet requirements?
  • Are edge cases handled?
  • Is error handling defensive?
  • MISRA C:2012 compliant?
  • Traceability tags present?
  • Tests cover all branches?

2. Iterative Prompting

Don't expect perfect output on the first try

Example:

Prompt 1: "Generate PID controller"
→ AI outputs basic PID (no anti-windup, no saturation)

Prompt 2: "Add anti-windup for integral term"
→ AI adds integral clamping

Prompt 3: "Add output saturation [-100, +100]"
→ AI adds output limits

Prompt 4: "Make it MISRA C:2012 compliant"
→ AI fixes violations

3. Specify Context

Bad Prompt (vague):

"Generate speed control function"

Good Prompt (specific context):

"Generate C function for ASPICE SWE.3:
- Function: ACC_CalculateTargetSpeed
- Requirement: [SWE-045-3] Calculate target speed
- Inputs: obstacle_distance_m (float), vehicle_speed_kmh (float)
- Output: target_speed_kmh (float)
- Logic: If distance < 2-second following time, decelerate by 5 km/h
- Standards: MISRA C:2012, ASIL-B
- Include: Doxygen header with @implements tag"

Summary

AI Role in ASPICE: AI assists with code generation, testing, documentation (not architecture, safety decisions, requirements elicitation)

70-20-10 Rule: 70% AI-generated, 20% human-modified, 10% human-created

AI Capabilities: Good at boilerplate, tests, documentation; Poor at architecture, safety logic, context-specific decisions

Best Practices: Always review AI output, iterative prompting, specify context

Human-in-the-Loop Mandatory: For ASIL-B and above safety-critical systems, human approval required for all AI-generated code

Next: Effective Prompting (35.01) — How to write prompts that get exactly what you need