1.3: Architecture Decision Making
Making and Documenting Architectural Decisions
What is an Architectural Decision?
Definition: A significant design choice that affects:
- System structure (components, layers, modules)
- Technology selection (AUTOSAR Classic vs Adaptive, CAN vs Ethernet)
- Quality attributes (performance, safety, cost)
Examples of Architectural Decisions:
- [PASS] "Use AUTOSAR Classic R4.4" (affects entire system structure)
- [PASS] "Implement sensor fusion with Kalman filter" (affects accuracy, complexity)
- [FAIL] "Name variable `distance_m`" (coding detail, not architectural)
Architecture Decision Record (ADR) Template
ADR Format (based on the Michael Nygard template, extended)
# ADR-{NUMBER}: {TITLE}
## Status
{PROPOSED | ACCEPTED | REJECTED | DEPRECATED | SUPERSEDED by ADR-XXX}
Status Meanings:
- PROPOSED: Under review, not yet decided
- ACCEPTED: Approved and being implemented
- REJECTED: Reviewed and declined (keep the ADR to record why)
- DEPRECATED: No longer recommended but still in use
- SUPERSEDED by ADR-XXX: Replaced by a newer decision
## Context
{What is the issue we're facing? What constraints exist?}
## Decision
{What is the change we're making? (One sentence)}
## Rationale
{Why this decision? What alternatives were considered?}
### Option 1: {NAME} (SELECTED)
**Pros**:
- {Advantage 1}
- {Advantage 2}
**Cons**:
- {Disadvantage 1}
- {Disadvantage 2}
### Option 2: {NAME} (REJECTED)
**Pros**:
- {Advantage 1}
**Cons**:
- {Disadvantage 1} (critical)
## Consequences
**Positive**: {What we gain}
**Negative**: {What we lose}
**Mitigation**: {How to address negative consequences}
## Alternatives Considered
- {Other options explored but not detailed above}
## Decision Makers
- {Names/roles of people who made decision}
## Date
{YYYY-MM-DD}
## Implemented By
- {Link to pull request, commit, or "Not yet implemented"}
Example ADR: Sensor Fusion Algorithm
# ADR-007: Sensor Fusion Algorithm Selection
## Status
ACCEPTED
## Context
The ACC system must fuse radar and camera data to achieve ≥95% obstacle detection accuracy (requirement [SYS-089]). We need to select an algorithm that meets accuracy, latency, and cost constraints.
Project constraints:
- Target accuracy: ≥95% detection rate
- Latency requirement: ≤50ms
- Budget: €2.5M (ML infrastructure adds €50k)
- Schedule: 18 months to SOP
- Safety class: ASIL-B (requires deterministic, verifiable algorithm)
## Decision
We will use an Extended Kalman Filter (EKF) for sensor fusion.
## Rationale
### Option 1: Simple Averaging (REJECTED)
**Pros**:
- Simple to implement (1 week)
- Fast (5ms latency)
- No additional cost
**Cons**:
- Low accuracy: 85% detection rate (does not meet ≥95% requirement) [FAIL]
**Verdict**: Does not meet requirement [SYS-089] → Rejected
---
### Option 2: Extended Kalman Filter (SELECTED)
**Pros**:
- Meets accuracy: 95% detection rate [PASS]
- Proven technology (used in 100M vehicles)
- Deterministic (predictable behavior, easier ASIL-B verification)
- Fast: 20ms latency (within ≤50ms requirement) [PASS]
- No additional infrastructure cost [PASS]
- Well-understood by team (3 engineers have EKF experience)
**Cons**:
- Moderate complexity (state estimation, covariance matrices)
- Requires tuning (process noise Q, measurement noise R)
- Sensitive to initialization (needs careful startup sequence)
**Verdict**: Meets all requirements at lowest risk/cost → **SELECTED**
---
### Option 3: Machine Learning (CNN-based fusion) (REJECTED)
**Pros**:
- Highest accuracy: 98% detection rate (exceeds requirement)
- Adaptive (learns from data, may improve over time)
- State-of-the-art (competitive advantage)
**Cons**:
- High cost: +€50k for ML infrastructure (GPU server, MLOps tools) [FAIL]
- Schedule risk: ML development unpredictable (data collection, training, tuning)
- ASIL-B verification challenging: Non-deterministic, "black box" (ISO 21448 SOTIF required)
- Team expertise gap: 0 engineers with production ML experience (requires hiring or training)
**Verdict**: Exceeds requirement unnecessarily, high cost/risk → Rejected
---
## Consequences
### Positive
- Meets the accuracy requirement (95% detection rate)
- Proven technology reduces risk (no "bleeding edge" unknowns)
- Deterministic behavior simplifies ASIL-B verification (testable, predictable)
- No additional infrastructure cost (fits within budget)
### Negative
- Misses opportunity for 98% accuracy (ML option)
- EKF requires manual tuning (Q, R matrices) - adds 2 weeks to schedule
- Not adaptive (fixed algorithm, no learning from field data)
### Mitigation
- If the customer requests 98% accuracy in the future, we can upgrade to ML (ADR-007 superseded by a new ADR)
- Document EKF tuning process (capture institutional knowledge)
- Consider ML for next-generation product (2–3 years out)
## Alternatives Considered
- Particle filter: Overkill for this problem (high computational cost, no accuracy benefit over EKF)
- Unscented Kalman Filter (UKF): Similar to EKF, but adds complexity with no clear benefit
## Decision Makers
- @system_architect (Alice Johnson) - Lead architect
- @safety_engineer (Bob Smith) - Safety approval
- @project_manager (Carol Lee) - Budget approval
- @oem_customer (Dave Martinez) - Confirmed 95% accuracy sufficient
## Date
2025-12-17
## Implemented By
- Implementation Agent (AI) - Generated EKF code
- Pull Request: #142 (merged 2025-12-18)
- Code: `src/sensor_fusion.c` (lines 45-320)
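The selected EKF itself lives in `src/sensor_fusion.c`; as a rough illustration of the idea only, here is a minimal one-dimensional Kalman-style fusion step in C. All names, noise values, and the scalar constant-position model are assumptions for this sketch, not the project's actual implementation (a real EKF tracks a multi-dimensional state with covariance matrices and a nonlinear measurement model).

```c
/* Minimal 1-D Kalman-style fusion of radar and camera range readings.
 * State: estimated distance x (m) with variance p.
 * q = process noise; r = measurement noise variance per sensor.
 * All tuning values here are illustrative, not from the real ACC project. */
typedef struct {
    double x;  /* estimated distance [m] */
    double p;  /* estimate variance      */
    double q;  /* process noise variance */
} FusionState;

static void fusion_init(FusionState *s, double x0, double p0, double q)
{
    s->x = x0;
    s->p = p0;
    s->q = q;
}

/* One predict + update cycle with a single measurement z of variance r. */
static void fusion_update(FusionState *s, double z, double r)
{
    /* Predict: constant-position model; variance grows by process noise. */
    s->p += s->q;

    /* Update: standard Kalman gain for a scalar measurement. */
    double k = s->p / (s->p + r);
    s->x += k * (z - s->x);
    s->p *= (1.0 - k);
}

/* Fuse one radar and one camera sample per cycle (sequential update). */
static double fuse_cycle(FusionState *s, double z_radar, double z_camera)
{
    fusion_update(s, z_radar, 0.5);   /* radar: lower range noise (assumed) */
    fusion_update(s, z_camera, 2.0);  /* camera: higher range noise (assumed) */
    return s->x;
}
```

Applying the two measurements sequentially is equivalent to a joint update for independent sensors, and it keeps each step a cheap scalar operation, which is one reason the deterministic EKF family verifies well under ASIL-B.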
Decision-Making Process
Step-by-Step Guide
1. Define Decision Scope
What decision needs to be made?
Example: "Which sensor fusion algorithm should we use?"
What are the constraints?
- Functional: Accuracy ≥95%
- Non-functional: Latency ≤50ms, cost ≤€2.5M budget
- Safety: ASIL-B deterministic verification
2. Research Options
Option A: Simple Averaging
Option B: Extended Kalman Filter (EKF)
Option C: Machine Learning (CNN)
Research sources:
- Academic papers (Google Scholar, IEEE Xplore)
- Industry benchmarks (automotive white papers)
- Competitor analysis (reverse engineering, patents)
- Team expertise (who has done this before?)
3. Evaluate Trade-Offs
| Criterion | Weight | Option A | Option B | Option C |
|-----------|--------|----------|----------|----------|
| Accuracy (≥95%) | 40% | [FAIL] 85% | [PASS] 95% | [PASS] 98% |
| Latency (≤50ms) | 20% | [PASS] 5ms | [PASS] 20ms | [WARN] 40ms |
| Cost | 20% | [PASS] €0 | [PASS] €0 | [FAIL] +€50k |
| Verifiability (ASIL-B) | 20% | [PASS] Easy | [PASS] Easy | [FAIL] Hard |
| **Weighted Score** | | 60% | **95%** | 76% |
Recommendation: Option B (EKF) - Highest score, meets all requirements
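The weighted totals in the table follow from a simple sum of score x weight. The sketch below reproduces that arithmetic in C; the per-criterion scores are illustrative assumptions, since the ADR does not define an exact scoring rubric (the table's values likely come from a graded rubric rather than pure pass/fail).

```c
#include <stddef.h>

/* Weighted trade-off scoring: scores[i] in [0, 100], weights[i] sum to 1.0.
 * The criteria and weights mirror the table above; the per-option scores
 * passed in are illustrative, since the ADR defines no exact rubric. */
static double weighted_score(const double *scores, const double *weights,
                             size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += scores[i] * weights[i];
    return total;
}
```

With binary pass/fail scoring (pass = 100, fail = 0) and the table's weights (40/20/20/20), Option A's failed accuracy criterion alone caps it at 60, which matches the table; graded partial scores would be needed to reproduce the other two columns.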
4. Document Decision (ADR)
Write ADR-007 (see template above)
- Status: PROPOSED
- Review with stakeholders (architect, safety, PM, customer)
- If approved → Status: ACCEPTED
- If rejected → Status: REJECTED (document why)
5. Implement and Monitor
- Implementation: Pull Request #142
- Verification: HIL tests, proving ground validation
- Monitor: Track actual accuracy (does it meet 95%?)
- If fails to meet requirement → Revisit decision (ADR-007 superseded by ADR-XXX)
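The monitoring in step 5 can be automated with a running check of the field detection rate against the requirement. The sketch below is hypothetical (the names, the minimum-sample guard, and the thresholds are assumptions), but it shows the shape of such a monitor:

```c
#include <stdbool.h>
#include <stdint.h>

/* Running detection-rate monitor for a requirement like [SYS-089] (>= 95%).
 * Names and thresholds are illustrative, not the project's actual code. */
typedef struct {
    uint32_t detections;     /* obstacles correctly detected */
    uint32_t opportunities;  /* ground-truth obstacles encountered */
} DetectionStats;

static void stats_record(DetectionStats *s, bool detected)
{
    s->opportunities++;
    if (detected)
        s->detections++;
}

/* Returns true once enough samples exist AND the rate is below target,
 * signalling that the decision (e.g. ADR-007) should be revisited. */
static bool stats_requirement_violated(const DetectionStats *s,
                                       uint32_t min_samples, double target)
{
    if (s->opportunities < min_samples)
        return false;  /* not enough evidence yet */
    double rate = (double)s->detections / (double)s->opportunities;
    return rate < target;
}
```

A minimum-sample guard matters here: with only a handful of events, a single miss would falsely flag the requirement as violated.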
Common Architecture Decisions in Embedded Systems
Decision 1: RTOS Selection
Options: FreeRTOS (free, simple) vs SafeRTOS (certified, expensive) vs AUTOSAR OS (standard, complex)
Factors:
- Safety: Does project need certified RTOS? (ASIL-C/D → SafeRTOS, ASIL-B → FreeRTOS acceptable)
- Cost: SafeRTOS €50k, FreeRTOS €0
- Standards: OEM mandates AUTOSAR? (many do)
Typical Decision: AUTOSAR OS if OEM requires, FreeRTOS otherwise (cost-effective)
Decision 2: Communication Protocol
Options: CAN 2.0B (legacy, 1 Mbps) vs CAN FD (2016+, 8 Mbps) vs Ethernet (100 Mbps+)
Factors:
- Bandwidth: How much data? (Camera: Ethernet, Radar: CAN sufficient)
- Latency: Real-time requirements? (CAN: typically 1–10 ms; standard Ethernet is less predictable, so hard real-time Ethernet traffic generally requires TSN)
- Compatibility: Existing vehicle architecture? (Legacy vehicles: CAN only)
Typical Decision: CAN for sensor data (low bandwidth), Ethernet for camera/diagnostics (high bandwidth)
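The bandwidth factor can be sanity-checked with a quick bus-load estimate before committing to a protocol. The frame sizes and message rates below are illustrative assumptions (a classic CAN frame with an 8-byte payload is roughly 130 bits worst case, including overhead and bit stuffing):

```c
#include <stdint.h>

/* Rough bus-load estimate: does the aggregate traffic fit on the bus?
 * bits_per_msg should include protocol overhead, not just payload.
 * All message sizes and rates used here are illustrative. */
static double bus_load_percent(const uint32_t *bits_per_msg,
                               const uint32_t *msgs_per_sec,
                               int n, double bus_bps)
{
    double total_bps = 0.0;
    for (int i = 0; i < n; ++i)
        total_bps += (double)bits_per_msg[i] * (double)msgs_per_sec[i];
    return 100.0 * total_bps / bus_bps;
}
```

If the estimate for sensor traffic lands well under the commonly used ~50% CAN bus-load guideline, CAN suffices; camera streams (tens of Mbit/s) blow past any CAN budget immediately, which is why they go on Ethernet.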
Decision 3: Software Partitioning
Options: Monolithic (one big binary) vs Modular (separate binaries per ECU) vs Microservices (AUTOSAR Adaptive)
Factors:
- Maintainability: Monolithic easier for small projects, modular better for large teams
- Testability: Modular easier to test (mock interfaces)
- Deployment: Microservices allow independent updates (OTA)
Typical Decision: Modular for ASIL-B (easier verification), Microservices only if OTA required
Anti-Patterns to Avoid
Anti-Pattern 1: Analysis Paralysis
Problem: Spending six weeks evaluating ten options, delaying the schedule
Solution:
- Set time limit: 1–2 weeks for major decisions
- Limit options: 3 options maximum (fewer is better)
- Make reversible decisions: If wrong, document lesson learned in ADR
Anti-Pattern 2: Resume-Driven Development
Problem: Choosing trendy technology (ML, blockchain) without clear benefit
Example:
Engineer: "Let's use blockchain for secure firmware updates!"
Architect: "Why? We have code signing already."
Engineer: "Because blockchain is cool and I want to learn it."
Architect: "That's not a valid technical rationale. Rejected."
Solution: Every technology choice must solve a real problem (not just "cool")
Anti-Pattern 3: Not-Invented-Here Syndrome
Problem: Rejecting proven solutions, insisting on custom development
Example:
Engineer: "Let's write our own RTOS instead of using FreeRTOS."
Architect: "Why? FreeRTOS is proven in 100M devices."
Engineer: "Because I think I can do better."
Architect: "That's 6 months of development plus testing. FreeRTOS is free and proven in the field. Use FreeRTOS."
Solution: Prefer proven solutions (open source, COTS) over custom development
Summary
Architecture Decision Making Process: Define scope → Research options → Evaluate trade-offs → Document (ADR) → Implement → Monitor
ADR Template: Status, Context, Decision, Rationale (Pros/Cons per option), Consequences, Alternatives, Decision Makers, Date, Implemented By
Common Decisions: RTOS selection, communication protocol, software partitioning
Anti-Patterns: Analysis paralysis, resume-driven development, not-invented-here syndrome
Next: Traceability in Practice (33.04) — Maintaining end-to-end traceability throughout development