1.2: HITL Integration Protocol

Human-in-the-Loop (HITL) Principles

Why HITL is Mandatory for ASPICE

Regulatory Requirement: ASPICE, ISO 26262, and IEC 62304 all require human accountability

  • [FAIL] AI cannot sign off on safety-critical work products
  • [FAIL] AI cannot be held legally liable for product failures
  • [PASS] Human engineer must review and approve all AI outputs

HITL Definition: A workflow in which AI generates outputs and a human reviews and approves them before integration


HITL Workflow Pattern

Standard Review-Approve Cycle

The following diagram illustrates the standard HITL review-approve cycle, showing how AI-generated outputs flow through human review, feedback, revision, and final approval before becoming accepted work products.

Human-in-the-Loop Protocol

Key Point: Human review (Step 4) is non-negotiable for safety-critical code

Review SLA and Timeout Handling: Set review SLAs (e.g., 24 hours for standard code, 4 hours for blocking issues). If review timeout occurs: (1) Escalate to backup reviewer, (2) Consider breaking PR into smaller pieces, (3) Log review delays for process improvement. Avoid bottlenecks by maintaining reviewer pool capacity.
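The timeout ladder above can be encoded directly in review tooling. A minimal Python sketch; the SLA values come from the text, while the function name `review_action` and the doubling-of-SLA threshold for the second escalation step are illustrative assumptions:

```python
from datetime import datetime, timedelta

# SLA values from the text: 24 h for standard code, 4 h for blocking issues.
REVIEW_SLA = {"standard": timedelta(hours=24), "blocking": timedelta(hours=4)}

def review_action(submitted_at: datetime, now: datetime, priority: str = "standard") -> str:
    """Return the next action for a pending review based on its SLA."""
    elapsed = now - submitted_at
    sla = REVIEW_SLA[priority]
    if elapsed <= sla:
        return "wait"                      # within SLA, no action needed
    if elapsed <= 2 * sla:                 # assumption: escalate for up to one extra SLA window
        return "escalate_to_backup"        # step (1): backup reviewer
    return "split_pr_and_log_delay"        # steps (2)-(3): smaller PRs, log the delay

# Example: a blocking review submitted 5 hours ago has breached its 4 h SLA.
t0 = datetime(2024, 1, 1, 9, 0)
print(review_action(t0, t0 + timedelta(hours=5), "blocking"))  # escalate_to_backup
```

A real implementation would also feed the logged delays into reviewer-pool capacity planning, as the text recommends.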


Review Criteria by Safety Class

ASIL-B Code Review Checklist

Context: Automotive ISO 26262 ASIL-B (e.g., ACC ECU from Chapter 25)

ASIL-B Code Review Checklist (Human Engineer):
─────────────────────────────────────────────────────────

1. Functional Correctness (30 min review time)
    Algorithm matches specification (e.g., CRC-32 IEEE 802.3 polynomial)
    Edge cases handled (null pointer, array bounds, integer overflow)
    Return values checked (error codes propagated correctly)

2. Safety Requirements (20 min)
    Fail-safe behavior defined (what happens on error?)
    Defensive programming (input validation, range checks)
    No unsafe operations (division by zero, buffer overflow)

3. MISRA C:2012 Compliance (10 min)
    Verified by qualified tool (PC-lint Plus, not just cppcheck)
    All "Required" rules satisfied (0 violations)
    "Advisory" violations justified (documented in code comment)

4. Performance (15 min)
    Latency requirement met (e.g., CRC-32 in <10 µs on TriCore 300 MHz)
    Memory usage acceptable (stack: <1 KB, heap: none for ASIL-B)
    Worst-case execution time (WCET) analyzed (if real-time critical)

5. Traceability (5 min)
    @implements tag present (links to requirement SWE-XXX)
    Requirement exists in DOORS (not orphaned code)
    Test case exists (TC-SWE-XXX-1 verifies this function)

6. Code Quality (10 min)
    Doxygen comment complete (@brief, @param, @return, @safety_class)
    Naming conventions followed (snake_case for functions)
    No magic numbers (use named constants #define MAX_SIZE 100)

Total Review Time: ~90 min per 100 LOC (AI-generated code)
Baseline (manual code): ~30 min per 100 LOC (less review needed, more trust)

Verdict:
  [PASS] APPROVE: All criteria met → Merge to main
  [WARN] APPROVE WITH COMMENTS: Minor issues, fix in next iteration
  [FAIL] REQUEST CHANGES: Critical issues, must fix before merge

Approval Rate: 85% of AI-generated code is approved on first review (15% requires changes)
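Parts of checklist item 5 (Traceability) can be spot-checked mechanically before the human review even starts. A minimal sketch, assuming Doxygen-style `@implements SWE-XXX` tags and a requirement list exported from DOORS; the requirement IDs and helper name here are hypothetical:

```python
import re

# Hypothetical requirement IDs, as exported from DOORS.
KNOWN_REQUIREMENTS = {"SWE-101", "SWE-102"}

IMPLEMENTS_RE = re.compile(r"@implements\s+(SWE-\d+)")

def check_traceability(source: str) -> list[str]:
    """Checklist step 5: @implements tag present, requirement not orphaned."""
    findings = []
    tags = IMPLEMENTS_RE.findall(source)
    if not tags:
        findings.append("no @implements tag found")
    for req in tags:
        if req not in KNOWN_REQUIREMENTS:
            findings.append(f"{req} not found in DOORS export (orphaned code)")
    return findings

code = """
/** @brief Compute CRC-32.
 *  @implements SWE-101
 */
uint32_t crc32(const uint8_t *buf, size_t len);
"""
print(check_traceability(code))  # [] -> step 5 passes
```

Checking that a matching test case (TC-SWE-XXX-1) exists would follow the same pattern against the test repository.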

IEC 62304 Class C Variant: For medical device software (Class C, the highest risk class under IEC 62304), add additional checklist items: (1) ISO 14971 risk control verification, (2) IEC 62304 unit decomposition compliance, (3) Software anomaly documentation, (4) SOUP (Software Of Unknown Provenance — third-party libraries whose development history cannot be fully verified) impact assessment. Review time increases approximately 30% for Class C compared to ASIL-B.

Iteration Limit Guidance: If AI-generated code requires >3 review iterations, escalate to senior engineer for manual implementation. Excessive iterations indicate task complexity exceeds current AI capability or unclear requirements.


Escalation Protocol

When AI Must Escalate to Human

Rule: The AI agent should proactively escalate when encountering:

  1. Safety-Critical Decisions [WARN]

    • Example: "Should we use 1oo2 or 2oo3 voting for this sensor?"
    • Action: Flag for human safety engineer, provide options + pros/cons
  2. Ambiguous Requirements [WARN]

    • Example: Requirement says "respond quickly" (not quantified)
    • Action: Generate clarification question, send to human requirements engineer
  3. Conflicting Constraints [WARN]

    • Example: Requirement demands 5 µs latency, but algorithm needs 50 µs
    • Action: Alert human architect, suggest trade-offs
  4. Novel Problems [WARN]

    • Example: Bug that AI cannot debug after 3 attempts
    • Action: Provide debugging log, request human investigation

Escalation Template:

## Escalation Request: ASPICE-1234

**Agent**: Implementation Agent
**Issue**: Safety-critical decision required
**Context**: Function `ACC_EmergencyBrake()` needs fail-safe behavior

**Question**: If brake actuator fails, should we:
  A) Continue trying to brake (retry 3 times)
  B) Alert driver + disable ACC immediately
  C) Activate redundant brake system (if available)

**AI Recommendation**: Option B (alert + disable) based on ISO 26262 guidance
  - Rationale: Simple, deterministic, no complex retry logic
  - Consequence: ACC disabled, driver takes over (acceptable for ASIL-B)

**Required**: Human safety engineer approval before implementation

**Urgency**: Medium (blocks pull request #142)
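An agent can emit this template programmatically rather than free-form, which keeps escalations machine-searchable. A minimal sketch; the function and parameter names are illustrative, and only a subset of the template's fields is rendered:

```python
def render_escalation(ticket: str, agent: str, issue: str, context: str,
                      options: dict[str, str], recommendation: str,
                      urgency: str) -> str:
    """Render an escalation request in the markdown template shown above."""
    lines = [f"## Escalation Request: {ticket}", "",
             f"**Agent**: {agent}",
             f"**Issue**: {issue}",
             f"**Context**: {context}", "",
             "**Question**:"]
    for key, option in options.items():          # e.g. "A", "B", "C" choices
        lines.append(f"  {key}) {option}")
    lines += ["", f"**AI Recommendation**: {recommendation}", "",
              f"**Urgency**: {urgency}"]
    return "\n".join(lines)

msg = render_escalation(
    "ASPICE-1234", "Implementation Agent", "Safety-critical decision required",
    "Fail-safe behavior for brake actuator failure",
    {"A": "Retry braking 3 times", "B": "Alert driver + disable ACC"},
    "Option B", "Medium")
print(msg.splitlines()[0])  # ## Escalation Request: ASPICE-1234
```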

Approval Authority Matrix

Who Approves What?

| Work Product | AI Generates | Human Reviews | Human Approves | Signature Required? |
|---|---|---|---|---|
| Requirements (SWE.1) | Draft SRS (80% complete) | Requirements Engineer | Requirements Engineer | [PASS] Yes (ASPICE SWE.1 BP6) |
| Architecture (SWE.2) | ADRs, UML diagrams | Software Architect | Software Architect | [PASS] Yes (design review) |
| Code (SWE.3) | C code (60-80% of LOC) | Senior Engineer | Senior Engineer | [PASS] Yes (code review) |
| Unit Tests (SWE.4) | Test code (80% coverage) | Test Engineer | Test Engineer | [PASS] Yes (test report) |
| Documentation (SUP.1) | SDS, user manual | Technical Writer | Project Manager | [PASS] Yes (release approval) |
| MISRA Report | Auto-generated (cppcheck) | QA Engineer | - | [FAIL] No (tool output) |
| Traceability Matrix | Auto-generated (parsed from code) | Requirements Engineer | - | [WARN] Spot-check only |

Key Insight: AI can generate, but cannot approve ASPICE work products


HITL Integration with CI/CD

Automated Review Gates

Goal: Enforce HITL review before code reaches production

CI/CD Pipeline with HITL Gates:

CI/CD Pipeline (GitLab CI example):
─────────────────────────────────────────────────────────

stages:
  - build
  - test
  - review_prep
  - human_review  # ← HITL gate (blocks pipeline)
  - integration

# Stage 1: Build (automated)
build_job:
  stage: build
  script:
    - make clean && make all  # Compile C code
  artifacts:
    paths:
      - build/firmware.elf

# Stage 2: Test (automated)
test_job:
  stage: test
  script:
    - ./run_unit_tests.sh  # Google Test
    - gcov src/*.c  # Coverage analysis
  coverage: '/Lines executed:\s*\d+\.\d+%/'  # matches gcov's summary line
  artifacts:
    paths:
      - coverage_report.html

# Stage 3: Review Preparation (AI-assisted)
review_prep_job:
  stage: review_prep
  script:
    - cppcheck --addon=misra src/  # MISRA check
    - ai_agent_review.py  # AI Review Agent generates report (implement this script per Chapter 30.05 Review Agent instructions)
  artifacts:
    paths:
      - misra_report.txt
      - ai_review_comments.md

# Stage 4: Human Review (HITL gate, manual trigger)
human_review_job:
  stage: human_review
  when: manual  # Pipeline stops here, awaits human approval
  script:
    - echo "Awaiting human review of pull request..."
    # Human reviews AI-generated code via the GitLab UI,
    # then approves or requests changes.
  only:
    - merge_requests

# Stage 5: Integration (automated, only after human approval)
integration_job:
  stage: integration
  script:
    # The merge itself is performed by the approved merge request;
    # this job runs on the updated main branch.
    - ./deploy_to_target.sh  # Flash firmware to ECU
  only:
    - main  # Only runs after merge

Pipeline Flow:

  1. AI generates code → Automatic build + test (Stages 1-3)
  2. Pipeline pauses at Stage 4 (human review required)
  3. Human reviews code via GitLab UI, approves or rejects
  4. If approved → Pipeline continues to Stage 5 (merge)
  5. If rejected → AI refines code, restarts pipeline

Benefit: HITL review is enforced — it cannot be bypassed, and code will not merge without human approval
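The `ai_agent_review.py` script invoked in Stage 3 is specified in Chapter 30.05; a minimal stub is sketched below, assuming only that it reads `misra_report.txt` and writes `ai_review_comments.md` (those file names come from the pipeline above; everything else is illustrative):

```python
#!/usr/bin/env python3
"""Minimal stub of ai_agent_review.py: turn a MISRA report into review comments."""
import sys
from pathlib import Path

def summarize(report_text: str) -> str:
    """Produce a markdown summary with one bullet per non-empty report line."""
    findings = [line.strip() for line in report_text.splitlines() if line.strip()]
    header = f"# AI Review Comments\n\n{len(findings)} MISRA finding(s):\n"
    bullets = "\n".join(f"- {f}" for f in findings)
    return header + bullets + "\n"

def main(report: str = "misra_report.txt", out: str = "ai_review_comments.md") -> None:
    text = Path(report).read_text() if Path(report).exists() else ""
    Path(out).write_text(summarize(text))

if __name__ == "__main__":
    main(*sys.argv[1:])
```

In the full Review Agent, `summarize` would be replaced by an LLM-backed analysis pass; the input/output contract with the pipeline stays the same.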


Review Turnaround Time

Expected Review Times

| Code Type | LOC | Review Time | Throughput |
|---|---|---|---|
| Boilerplate (getters, setters) | 10-20 | 5 min | Fast [PASS] (low risk) |
| Standard algorithm (CRC, sort) | 50-100 | 30 min | Medium [PASS] |
| Safety-critical logic (brake control) | 100-200 | 2 hours | Slow [WARN] (high risk, detailed review) |
| ML model integration (TensorRT) | 200-500 | 4 hours | Very slow [WARN] (novel, complex) |

Recommendation: Batch AI-generated code into small pull requests (fewer than 100 LOC) for faster review


Summary

HITL Integration Protocol:

  1. Non-Negotiable: Human review required for all safety-critical code (ASPICE, ISO 26262, IEC 62304)
  2. Review Criteria: Functional correctness, safety, MISRA, performance, traceability (90 min per 100 LOC)
  3. Escalation: AI must escalate safety decisions, ambiguities, conflicts, novel problems to humans
  4. Approval Authority: Humans sign off on work products (AI cannot approve)
  5. CI/CD Integration: HITL gates enforce review before merge (manual approval stage in pipeline)

Next: Capability mapping (task-by-task AI readiness assessment) - 29.03