1.2: HITL Integration Protocol
Human-in-the-Loop (HITL) Principles
Why HITL is Mandatory for ASPICE
Regulatory Requirement: ASPICE, ISO 26262, and IEC 62304 all require human accountability:
- [FAIL] AI cannot sign off on safety-critical work products
- [FAIL] AI cannot be held legally liable for product failures
- [PASS] Human engineer must review and approve all AI outputs
HITL Definition: A workflow in which AI generates outputs and a human reviews and approves them before integration
HITL Workflow Pattern
Standard Review-Approve Cycle
The following diagram illustrates the standard HITL review-approve cycle, showing how AI-generated outputs flow through human review, feedback, revision, and final approval before becoming accepted work products.
Key Point: Human review (Step 4) is non-negotiable for safety-critical code
Review SLA and Timeout Handling: Set review SLAs (e.g., 24 hours for standard code, 4 hours for blocking issues). If review timeout occurs: (1) Escalate to backup reviewer, (2) Consider breaking PR into smaller pieces, (3) Log review delays for process improvement. Avoid bottlenecks by maintaining reviewer pool capacity.
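The SLA handling above can be automated in review tooling. A minimal sketch, assuming the example SLA values from this section (the function and table names are illustrative, not part of any standard tool):

```python
from datetime import datetime, timedelta

# Illustrative SLA table (hours), mirroring the examples above:
# 24 h for standard code, 4 h for blocking issues.
REVIEW_SLA_HOURS = {"standard": 24, "blocking": 4}

def review_action(submitted_at: datetime, priority: str, now: datetime) -> str:
    """Return the HITL action for a pending review: keep waiting, or
    escalate to the backup reviewer once the SLA window has elapsed."""
    deadline = submitted_at + timedelta(hours=REVIEW_SLA_HOURS[priority])
    if now <= deadline:
        return "wait"
    # SLA breached: escalate, and log the delay for process improvement
    return "escalate_to_backup_reviewer"
```

In practice the delay would also be logged so reviewer-pool capacity can be tracked over time.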
Review Criteria by Safety Class
ASIL-B Code Review Checklist
Context: Automotive ISO 26262 ASIL-B (e.g., ACC ECU from Chapter 25)
ASIL-B Code Review Checklist (Human Engineer):
─────────────────────────────────────────────────────────
1. Functional Correctness (30 min review time)
☐ Algorithm matches specification (e.g., CRC-32 IEEE 802.3 polynomial)
☐ Edge cases handled (null pointer, array bounds, integer overflow)
☐ Return values checked (error codes propagated correctly)
2. Safety Requirements (20 min)
☐ Fail-safe behavior defined (what happens on error?)
☐ Defensive programming (input validation, range checks)
☐ No unsafe operations (division by zero, buffer overflow)
3. MISRA C:2012 Compliance (10 min)
☐ Verified by qualified tool (PC-lint Plus, not just cppcheck)
☐ All "Required" rules satisfied (0 violations)
☐ "Advisory" violations justified (documented in code comment)
4. Performance (15 min)
☐ Latency requirement met (e.g., CRC-32 in <10 µs on TriCore 300 MHz)
☐ Memory usage acceptable (stack: <1 KB, heap: none for ASIL-B)
☐ Worst-case execution time (WCET) analyzed (if real-time critical)
5. Traceability (5 min)
☐ @implements tag present (links to requirement SWE-XXX)
☐ Requirement exists in DOORS (not orphaned code)
☐ Test case exists (TC-SWE-XXX-1 verifies this function)
6. Code Quality (10 min)
☐ Doxygen comment complete (@brief, @param, @return, @safety_class)
☐ Naming conventions followed (snake_case for functions)
☐ No magic numbers (use named constants #define MAX_SIZE 100)
Total Review Time: ~90 min per 100 LOC (AI-generated code)
Baseline (manual code): ~30 min per 100 LOC (less review needed, more trust)
Verdict:
[PASS] APPROVE: All criteria met → Merge to main
[WARN] APPROVE WITH COMMENTS: Minor issues, fix in next iteration
[FAIL] REQUEST CHANGES: Critical issues, must fix before merge
Approval Rate: 85% of AI-generated code is approved on first review (15% requires changes)
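Checklist item 1 cites the CRC-32 IEEE 802.3 polynomial as an example of verifying that an algorithm matches its specification. One way a reviewer can sanity-check an AI-generated embedded implementation is to compare it against a known-good reference; Python's built-in `binascii.crc32` implements the same reflected polynomial (0xEDB88320), and the standard check value for the ASCII string "123456789" is 0xCBF43926:

```python
import binascii

def crc32_reference(data: bytes) -> int:
    """Reference CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320),
    usable as a golden model when reviewing an embedded implementation."""
    return binascii.crc32(data) & 0xFFFFFFFF

# Standard check value: CRC-32("123456789") == 0xCBF43926
assert crc32_reference(b"123456789") == 0xCBF43926
```

Feeding the same test vectors to the target's CRC routine and to this reference quickly exposes polynomial, reflection, or initialization mistakes.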
IEC 62304 Class C Variant: For medical device software (Class C, the highest risk class under IEC 62304), add additional checklist items: (1) ISO 14971 risk control verification, (2) IEC 62304 unit decomposition compliance, (3) Software anomaly documentation, (4) SOUP (Software Of Unknown Provenance — third-party libraries whose development history cannot be fully verified) impact assessment. Review time increases approximately 30% for Class C compared to ASIL-B.
Iteration Limit Guidance: If AI-generated code requires >3 review iterations, escalate to senior engineer for manual implementation. Excessive iterations indicate task complexity exceeds current AI capability or unclear requirements.
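The iteration limit is easy to enforce mechanically in the review workflow. A minimal sketch, assuming the 3-iteration threshold above (names are illustrative):

```python
MAX_AI_ITERATIONS = 3  # beyond this, hand the task to a senior engineer

def next_step(iterations_so_far: int, approved: bool) -> str:
    """Decide the workflow's next step after a review round: merge on
    approval, escalate past the iteration limit, otherwise let the AI
    revise and resubmit."""
    if approved:
        return "merge"
    if iterations_so_far >= MAX_AI_ITERATIONS:
        return "escalate_to_senior_engineer"
    return "ai_revises_and_resubmits"
```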
Escalation Protocol
When AI Must Escalate to Human
Rule: The AI agent should proactively escalate when encountering:

1. Safety-Critical Decisions [WARN]
   - Example: "Should we use 1oo2 or 2oo3 voting for this sensor?"
   - Action: Flag for human safety engineer, provide options + pros/cons
2. Ambiguous Requirements [WARN]
   - Example: Requirement says "respond quickly" (not quantified)
   - Action: Generate clarification question, send to human requirements engineer
3. Conflicting Constraints [WARN]
   - Example: Requirement demands 5 µs latency, but algorithm needs 50 µs
   - Action: Alert human architect, suggest trade-offs
4. Novel Problems [WARN]
   - Example: Bug that AI cannot debug after 3 attempts
   - Action: Provide debugging log, request human investigation
Escalation Template:
## Escalation Request: ASPICE-1234
**Agent**: Implementation Agent
**Issue**: Safety-critical decision required
**Context**: Function `ACC_EmergencyBrake()` needs fail-safe behavior
**Question**: If brake actuator fails, should we:
A) Continue trying to brake (retry 3 times)
B) Alert driver + disable ACC immediately
C) Activate redundant brake system (if available)
**AI Recommendation**: Option B (alert + disable) based on ISO 26262 guidance
- Rationale: Simple, deterministic, no complex retry logic
- Consequence: ACC disabled, driver takes over (acceptable for ASIL-B)
**Required**: Human safety engineer approval before implementation
**Urgency**: Medium (blocks pull request #142)
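Escalation requests like the one above can be generated programmatically so every agent emits the same fields. A minimal sketch that renders the Markdown template (function name and parameter list are illustrative, mirroring the template's fields):

```python
def render_escalation(ticket: str, agent: str, issue: str, context: str,
                      question: str, options: list[str],
                      recommendation: str, urgency: str) -> str:
    """Render an escalation request in the Markdown template shown above.
    Options are lettered A), B), C), ... in the order given."""
    lines = [
        f"## Escalation Request: {ticket}",
        f"**Agent**: {agent}",
        f"**Issue**: {issue}",
        f"**Context**: {context}",
        f"**Question**: {question}",
    ]
    lines += [f"{chr(ord('A') + i)}) {opt}" for i, opt in enumerate(options)]
    lines += [
        f"**AI Recommendation**: {recommendation}",
        "**Required**: Human safety engineer approval before implementation",
        f"**Urgency**: {urgency}",
    ]
    return "\n".join(lines)
```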
Approval Authority Matrix
Who Approves What?
| Work Product | AI Generates | Human Reviews | Human Approves | Signature Required? |
|---|---|---|---|---|
| Requirements (SWE.1) | Draft SRS (80% complete) | Requirements Engineer | Requirements Engineer | [PASS] Yes (ASPICE SWE.1 BP6) |
| Architecture (SWE.2) | ADRs, UML diagrams | Software Architect | Software Architect | [PASS] Yes (design review) |
| Code (SWE.3) | C code (60-80% of LOC) | Senior Engineer | Senior Engineer | [PASS] Yes (code review) |
| Unit Tests (SWE.4) | Test code (80% coverage) | Test Engineer | Test Engineer | [PASS] Yes (test report) |
| Documentation (SUP.1) | SDS, user manual | Technical Writer | Project Manager | [PASS] Yes (release approval) |
| MISRA Report | Auto-generated (cppcheck) | QA Engineer | - | [FAIL] No (tool output) |
| Traceability Matrix | Auto-generated (parsed from code) | Requirements Engineer | - | [WARN] Spot-check only |
Key Insight: AI can generate ASPICE work products, but it cannot approve them
HITL Integration with CI/CD
Automated Review Gates
Goal: Enforce HITL review before code reaches production
CI/CD Pipeline with HITL Gates (GitLab CI example):
─────────────────────────────────────────────────────────
stages:
  - build
  - test
  - review_prep
  - human_review   # ← HITL gate (blocks pipeline)
  - integration

# Stage 1: Build (automated)
build_job:
  stage: build
  script:
    - make clean && make all          # Compile C code
  artifacts:
    paths:
      - build/firmware.elf

# Stage 2: Test (automated)
test_job:
  stage: test
  script:
    - ./run_unit_tests.sh             # Google Test
    - gcov src/*.c                    # Coverage analysis
  coverage: '/\d+\.\d+% statements/'
  artifacts:
    paths:
      - coverage_report.html

# Stage 3: Review Preparation (AI-assisted)
review_prep_job:
  stage: review_prep
  script:
    - cppcheck --addon=misra src/     # MISRA check
    - python3 ai_agent_review.py      # AI Review Agent report (implement per Chapter 30.05)
  artifacts:
    paths:
      - misra_report.txt
      - ai_review_comments.md

# Stage 4: Human Review (HITL gate, manual trigger)
human_review_job:
  stage: human_review
  when: manual                        # Pipeline stops here, awaits human approval
  script:
    - echo "Awaiting human review of pull request..."
    # Human reviews AI-generated code via GitLab UI,
    # then approves or requests changes
  only:
    - merge_requests

# Stage 5: Integration (automated, only after human approval)
integration_job:
  stage: integration
  script:
    - git merge --ff-only             # Merge to main branch
    - ./deploy_to_target.sh           # Flash firmware to ECU
  only:
    - main                            # Only runs after merge
Pipeline Flow:
- AI generates code → Automatic build + test (Stages 1-3)
- Pipeline pauses at Stage 4 (human review required)
- Human reviews code via GitLab UI, approves or rejects
- If approved → Pipeline continues to Stage 5 (merge)
- If rejected → AI refines code, restarts pipeline
Benefit: HITL review is enforced — it cannot be bypassed, and code will not merge without human approval
Review Turnaround Time
Expected Review Times
| Code Type | LOC | Review Time | Throughput |
|---|---|---|---|
| Boilerplate (getters, setters) | 10-20 | 5 min | Fast [PASS] (low risk) |
| Standard Algorithm (CRC, sort) | 50-100 | 30 min | Medium [PASS] |
| Safety-Critical Logic (brake control) | 100-200 | 2 hours | Slow [WARN] (high risk, detailed review) |
| ML Model Integration (TensorRT) | 200-500 | 4 hours | Very Slow [WARN] (novel, complex) |
Recommendation: Batch AI-generated code into small pull requests (fewer than 100 LOC) for faster review
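Assuming the <100 LOC batching rule, the number of pull requests needed for a change set is a simple ceiling division. A minimal sketch (the constant and function name are illustrative):

```python
MAX_PR_LOC = 100  # recommended upper bound for an AI-generated PR

def plan_pull_requests(total_loc: int, max_loc: int = MAX_PR_LOC) -> int:
    """How many PRs to open when batching AI-generated changes into
    chunks of at most max_loc lines (ceiling division, minimum 1)."""
    return max(1, -(-total_loc // max_loc))
```

For example, a 250 LOC change set would be split into three PRs, each reviewable within the turnaround times in the table above.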
Summary
HITL Integration Protocol:
- Non-Negotiable: Human review required for all safety-critical code (ASPICE, ISO 26262, IEC 62304)
- Review Criteria: Functional correctness, safety, MISRA, performance, traceability (90 min per 100 LOC)
- Escalation: AI must escalate safety decisions, ambiguities, conflicts, novel problems to humans
- Approval Authority: Humans sign off on work products (AI cannot approve)
- CI/CD Integration: HITL gates enforce review before merge (manual approval stage in pipeline)
Next: Capability mapping (task-by-task AI readiness assessment) - 29.03