7.2: MAN.5 Risk Management

Process Definition

Purpose

MAN.5 Purpose: To identify, analyze, treat, and monitor risks continuously throughout the project lifecycle.

Outcomes

Outcome	Description
O1	The sources of risks are identified and regularly updated
O2	Potential undesirable events are identified as they develop
O3	Risks are analyzed and priority for treatment is determined
O4	Risk measures are defined, applied, and assessed
O5	Appropriate treatment is taken to correct or avoid impact

Base Practices with AI Integration

BP	Base Practice	AI Level	AI Application
BP1	Identify sources of risks	L2	Pattern-based identification from project context
BP2	Identify potential undesirable events	L2	Event pattern matching, historical analysis
BP3	Determine risks (probability and severity)	L2	AI-assisted scoring and impact analysis
BP4	Define risk treatment options	L1	Treatment strategy recommendations
BP5	Define and perform risk treatment activities	L1	Mitigation action generation
BP6	Monitor risks	L2-L3	Automated monitoring and trend analysis
BP7	Take corrective action	L0-L1	Action recommendations

Risk Management Framework

The following diagram illustrates the MAN.5 risk management process, showing the cycle from risk identification and analysis through treatment planning, monitoring, and corrective action.

Risk Management Process

Risk Treatment Strategies

MAN.5 defines four fundamental risk treatment options (BP4):

Treatment	Definition	When to Use	Example
Accept	Tolerate the risk without action	Cost of mitigation exceeds potential impact; residual risk within acceptable tolerance	Accept risk of minor UI cosmetic defect in non-safety-critical display
Mitigate	Reduce probability or impact through preventive/protective actions	Risk level unacceptable but elimination not feasible; cost-effective mitigation exists	Implement software watchdog to reduce probability of system hang
Avoid	Eliminate the risk source entirely	Risk too severe; alternative approach available	Remove dependency on unqualified third-party library; develop in-house
Share/Transfer	Outsource risk to supplier, partner, or insurance	Risk outside direct control; external party better positioned to manage	Transfer hardware reliability risk to qualified automotive Tier-1 supplier

ASPICE Requirement: Treatment selection must be documented with justification (Work Product 08-27: Risk Mitigation Plan). Residual risk after treatment must be evaluated and accepted by appropriate authority.
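
The documentation requirement above can be checked mechanically. The following is a minimal sketch, not a prescribed ASPICE data model: the `TreatmentDecision` record, its field names, and the `validate` rules are illustrative assumptions. It flags a decision that lacks a justification, whose residual risk exceeds the initial level, or whose residual risk has not yet been accepted by a named authority.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Treatment(Enum):
    """The four treatment options from the table above."""
    ACCEPT = "accept"
    MITIGATE = "mitigate"
    AVOID = "avoid"
    SHARE_TRANSFER = "share_transfer"


@dataclass
class TreatmentDecision:
    """Hypothetical record of one treatment decision (names are assumptions)."""
    risk_id: str
    treatment: Treatment
    justification: str
    initial_level: int                 # probability * impact before treatment
    residual_level: int                # expected level after treatment
    accepted_by: Optional[str] = None  # authority that accepted the residual risk

    def validate(self) -> List[str]:
        """Return findings; an empty list means the record is complete."""
        findings = []
        if not self.justification.strip():
            findings.append("treatment selection lacks a documented justification")
        if self.residual_level > self.initial_level:
            findings.append("residual risk exceeds the initial risk level")
        if self.accepted_by is None:
            findings.append("residual risk has not been accepted by an authority")
        return findings


# Illustrative values only; the justification text is invented for the example.
decision = TreatmentDecision(
    risk_id="RSK-001",
    treatment=Treatment.MITIGATE,
    justification="Early HIL booking and a SIL backup cost less than a test delay",
    initial_level=12,
    residual_level=6,
)
print(decision.validate())
```

Keeping the acceptance as an explicit field means the human sign-off stays visible in the record rather than being implied by the risk status.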

Process vs. Product Risks

MAN.5 addresses both process and product risks throughout the development lifecycle:

Process Risks

Risks to project execution capability:

Schedule: Milestone delays, resource availability, dependency delays
Resources: Key personnel departure, skill gaps, infrastructure constraints
Methodology: Tool failures, process maturity, organizational change
External: Supplier dependencies, regulatory changes, customer requirement volatility

Example: HIL test equipment shared across multiple projects → integration testing delay → schedule risk.

Product Risks

Risks to system/software quality, safety, and performance:

Technical Feasibility: Algorithm complexity, real-time constraints, hardware limitations
Safety: Functional safety failures, hazard realization (ISO 26262 ASIL-rated risks)
Security: Cybersecurity vulnerabilities, attack surface (ISO/SAE 21434 CAL-rated risks)
Quality: Defect introduction, requirements gaps, integration incompatibilities

Example: Real-time constraints not achievable on target MCU → system requirements violation → product risk.

ASPICE Alignment: Both risk types must be managed continuously. Product risks often manifest as process risks (e.g., safety certification failure → schedule delay). Risk register (WP 08-26) captures both categories with clear classification.
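
One way to give the register a "clear classification" is to derive the process/product type from each risk's category. The sketch below assumes a fixed mapping based on the two lists above. The mapping itself is an assumption: "requirements" is treated as a process risk (requirement volatility) even though requirements gaps also appear under product quality, so a project may prefer to classify such risks individually.

```python
from enum import Enum
from typing import Dict, List, Tuple


class RiskType(Enum):
    PROCESS = "process"
    PRODUCT = "product"


# Assumed mapping from register categories to the two MAN.5 risk types.
CATEGORY_TO_TYPE: Dict[str, RiskType] = {
    "schedule": RiskType.PROCESS,
    "resource": RiskType.PROCESS,
    "supplier": RiskType.PROCESS,
    "regulatory": RiskType.PROCESS,
    "requirements": RiskType.PROCESS,
    "technical": RiskType.PRODUCT,
    "safety": RiskType.PRODUCT,
    "security": RiskType.PRODUCT,
    "quality": RiskType.PRODUCT,
    "integration": RiskType.PRODUCT,
}


def classify(category: str) -> RiskType:
    """Map a category name to a risk type; unknown categories are rejected."""
    try:
        return CATEGORY_TO_TYPE[category.lower()]
    except KeyError:
        raise ValueError(f"unclassified risk category: {category!r}") from None


def split_register(risks: List[Tuple[str, str]]) -> Dict[RiskType, List[str]]:
    """Group (risk_id, category) pairs by risk type."""
    grouped: Dict[RiskType, List[str]] = {RiskType.PROCESS: [], RiskType.PRODUCT: []}
    for risk_id, category in risks:
        grouped[classify(category)].append(risk_id)
    return grouped


grouped = split_register([
    ("RSK-001", "resource"),
    ("RSK-002", "requirements"),
    ("RSK-003", "technical"),
    ("RSK-004", "resource"),
])
print({risk_type.value: ids for risk_type, ids in grouped.items()})
```

Rejecting unknown categories, rather than defaulting them, keeps unclassified risks from slipping into the register unnoticed.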

AI-Powered Risk Identification

Risk Pattern Analyzer

"""
AI-assisted risk identification for embedded software projects.
"""

from dataclasses import dataclass, field
from typing import List, Dict, Set
from enum import Enum
from datetime import datetime

class RiskCategory(Enum):
    TECHNICAL = "technical"
    SCHEDULE = "schedule"
    RESOURCE = "resource"
    REQUIREMENTS = "requirements"
    INTEGRATION = "integration"
    QUALITY = "quality"
    SAFETY = "safety"
    SECURITY = "security"
    SUPPLIER = "supplier"
    REGULATORY = "regulatory"

class Probability(Enum):
    VERY_LOW = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4
    VERY_HIGH = 5

class Impact(Enum):
    NEGLIGIBLE = 1
    MINOR = 2
    MODERATE = 3
    MAJOR = 4
    SEVERE = 5

@dataclass
class Risk:
    """Project risk definition."""
    id: str
    title: str
    category: RiskCategory
    description: str
    cause: str
    consequence: str
    probability: Probability
    impact: Impact
    risk_level: int  # probability * impact
    mitigation: str
    owner: str
    status: str
    identified_date: datetime
    ai_confidence: float = 0.0


class RiskIdentifier:
    """AI-assisted risk identification.

    Note: Risk patterns should be customized based on organizational
    experience and project types (automotive, industrial, medical, etc.).
    """

    def __init__(self):
        self.risk_patterns = self._load_risk_patterns()
        self.historical_risks = self._load_historical_risks()

    def _load_risk_patterns(self) -> Dict[str, List[Dict]]:
        """Load risk patterns from knowledge base."""
        return {
            'embedded_software': [
                {
                    'pattern': 'new_target_platform',
                    'title': "New target platform learning curve",
                    'category': RiskCategory.TECHNICAL,
                    'triggers': ['new platform', 'first project', 'unfamiliar MCU'],
                    'probability': Probability.MEDIUM,
                    'impact': Impact.MODERATE
                },
                {
                    'pattern': 'real_time_constraints',
                    'title': "Real-time performance not achievable",
                    'category': RiskCategory.TECHNICAL,
                    'triggers': ['hard real-time', 'timing critical', 'deadline'],
                    'probability': Probability.MEDIUM,
                    'impact': Impact.MAJOR
                },
                {
                    'pattern': 'safety_certification',
                    'title': "Safety certification delays",
                    'category': RiskCategory.REGULATORY,
                    'triggers': ['ASIL', 'ISO 26262', 'functional safety'],
                    'probability': Probability.MEDIUM,
                    'impact': Impact.MAJOR
                },
                {
                    'pattern': 'integration_complexity',
                    'title': "Integration issues with external components",
                    'category': RiskCategory.INTEGRATION,
                    'triggers': ['third party', 'COTS', 'supplier'],
                    'probability': Probability.HIGH,
                    'impact': Impact.MODERATE
                },
                {
                    'pattern': 'requirements_volatility',
                    'title': "Requirements changes during development",
                    'category': RiskCategory.REQUIREMENTS,
                    'triggers': ['unclear requirements', 'customer changes', 'scope creep'],
                    'probability': Probability.HIGH,
                    'impact': Impact.MAJOR
                },
                {
                    'pattern': 'resource_availability',
                    'title': "Key resource unavailability",
                    'category': RiskCategory.RESOURCE,
                    'triggers': ['single point', 'expert', 'specialized'],
                    'probability': Probability.MEDIUM,
                    'impact': Impact.MAJOR
                },
                {
                    'pattern': 'test_equipment',
                    'title': "Test equipment availability",
                    'category': RiskCategory.RESOURCE,
                    'triggers': ['HIL', 'test bench', 'hardware'],
                    'probability': Probability.MEDIUM,
                    'impact': Impact.MODERATE
                }
            ]
        }

    def _load_historical_risks(self) -> List[Risk]:
        """Load risks from historical projects."""
        # In production, load from database
        return []

    def identify_risks(self, project_context: Dict) -> List[Risk]:
        """Identify risks based on project context."""

        identified_risks = []

        # Pattern-based identification
        for pattern in self.risk_patterns['embedded_software']:
            if self._pattern_matches(pattern, project_context):
                risk = self._create_risk_from_pattern(pattern, project_context)
                identified_risks.append(risk)

        # Historical pattern matching
        similar_risks = self._find_similar_historical_risks(project_context)
        identified_risks.extend(similar_risks)

        # De-duplicate and rank
        unique_risks = self._deduplicate_risks(identified_risks)
        ranked_risks = self._rank_risks(unique_risks)

        return ranked_risks

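    def _pattern_matches(self, pattern: Dict, context: Dict) -> bool:
        """Check if any trigger of a risk pattern occurs in the project context.

        Matching is a case-insensitive substring test, so a trigger such as
        'expert' also matches 'expertise'. Review matches for false positives.
        """
        context_text = ' '.join(str(v).lower() for v in context.values())
        return any(trigger.lower() in context_text
                   for trigger in pattern['triggers'])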

    def _create_risk_from_pattern(self, pattern: Dict,
                                  context: Dict) -> Risk:
        """Create risk instance from matched pattern."""

        # Take one timestamp so the ID date and identified_date always agree
        now = datetime.now()

        return Risk(
            id=f"RSK-{pattern['pattern'][:8].upper()}-{now.strftime('%Y%m%d')}",
            title=pattern['title'],
            category=pattern['category'],
            description=self._generate_description(pattern, context),
            cause=self._infer_cause(pattern, context),
            consequence=self._infer_consequence(pattern),
            probability=pattern['probability'],
            impact=pattern['impact'],
            risk_level=pattern['probability'].value * pattern['impact'].value,
            mitigation="TBD - Define mitigation strategy",
            owner="TBD",
            status="identified",
            identified_date=now,
            ai_confidence=0.75  # Placeholder; calibrate based on pattern match quality
        )

    def _generate_description(self, pattern: Dict, context: Dict) -> str:
        """Generate risk description from pattern and context."""
        return f"Risk identified based on pattern '{pattern['pattern']}'. " \
               f"Project characteristics suggest this risk is applicable."

    def _infer_cause(self, pattern: Dict, context: Dict) -> str:
        """Infer risk cause."""
        cause_map = {
            'new_target_platform': "Team lacks experience with target platform",
            'real_time_constraints': "Complex timing requirements may not be achievable",
            'safety_certification': "Safety assessment and documentation requirements",
            'integration_complexity': "Dependencies on external components",
            'requirements_volatility': "Customer requirements not fully defined",
            'resource_availability': "Critical skill dependency on limited resources",
            'test_equipment': "Shared or limited test infrastructure"
        }
        return cause_map.get(pattern['pattern'], "To be determined")

    def _infer_consequence(self, pattern: Dict) -> str:
        """Infer risk consequence."""
        consequence_map = {
            'new_target_platform': "Schedule delay, quality issues",
            'real_time_constraints': "System requirements not met",
            'safety_certification': "Delayed product release",
            'integration_complexity': "Integration issues, rework",
            'requirements_volatility': "Scope changes, rework",
            'resource_availability': "Schedule impact, knowledge loss",
            'test_equipment': "Testing delays"
        }
        return consequence_map.get(pattern['pattern'], "To be determined")

    def _find_similar_historical_risks(self, context: Dict) -> List[Risk]:
        """Find similar risks from historical projects."""
        # Placeholder for similarity matching
        return []

    def _deduplicate_risks(self, risks: List[Risk]) -> List[Risk]:
        """Remove duplicate risks."""
        seen_titles: Set[str] = set()
        unique = []
        for risk in risks:
            if risk.title not in seen_titles:
                seen_titles.add(risk.title)
                unique.append(risk)
        return unique

    def _rank_risks(self, risks: List[Risk]) -> List[Risk]:
        """Rank risks by risk level."""
        return sorted(risks, key=lambda r: r.risk_level, reverse=True)


def generate_risk_register(risks: List[Risk]) -> str:
    """Generate risk register markdown."""

    report = ["# Risk Register\n"]
    report.append(f"**Generated**: {datetime.now().strftime('%Y-%m-%d %H:%M')}\n")

    # Summary
    report.append("## Summary\n")
    report.append("| Total Risks | High | Medium | Low |")
    report.append("|-------------|------|--------|-----|")

    high = len([r for r in risks if r.risk_level >= 15])
    medium = len([r for r in risks if 8 <= r.risk_level < 15])
    low = len([r for r in risks if r.risk_level < 8])
    report.append(f"| {len(risks)} | {high} | {medium} | {low} |\n")

    # Risk table
    report.append("## Risk Details\n")
    report.append("| ID | Title | Category | P | I | Level | Status |")
    report.append("|-----|-------|----------|---|---|-------|--------|")

    for risk in risks:
        report.append(
            f"| {risk.id} | {risk.title} | {risk.category.value} | "
            f"{risk.probability.name} | {risk.impact.name} | "
            f"{risk.risk_level} | {risk.status} |"
        )

    return "\n".join(report)
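
The substring test in `_pattern_matches` fires on partial words, and `ai_confidence` is a fixed placeholder. The sketch below shows one possible refinement under stated assumptions: triggers are matched as whole words or phrases using regular-expression word boundaries, and confidence is taken as the fraction of a pattern's triggers that matched. The function name `match_triggers` and the confidence formula are illustrative, not part of the analyzer above, and the sample project context is invented.

```python
import re
from typing import Dict, List, Tuple


def match_triggers(triggers: List[str],
                   context: Dict[str, object]) -> Tuple[List[str], float]:
    """Return triggers found as whole words/phrases plus a naive confidence."""
    context_text = " ".join(str(v) for v in context.values())
    matched = [
        trigger for trigger in triggers
        # \b anchors the trigger at word boundaries; re.escape keeps
        # characters such as '-' from being read as regex syntax.
        if re.search(rf"\b{re.escape(trigger)}\b", context_text, flags=re.IGNORECASE)
    ]
    # Assumed heuristic: share of triggers that matched, in the range 0.0-1.0.
    confidence = len(matched) / len(triggers) if triggers else 0.0
    return matched, confidence


# Hypothetical project context for illustration.
context = {
    "description": "Team needs CAN expertise for a hard real-time door lock controller",
    "platform": "Unfamiliar MCU family",
}

timing = match_triggers(["hard real-time", "timing critical", "deadline"], context)
resource = match_triggers(["single point", "expert", "specialized"], context)
platform = match_triggers(["new platform", "first project", "unfamiliar MCU"], context)
print(timing, resource, platform)
```

With word boundaries, "expert" no longer matches "expertise", which trades some recall for fewer false positives. Whether that trade is acceptable depends on how the project context is written.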

Risk Assessment Matrix

The diagram below presents the risk assessment matrix, mapping probability against impact to classify risks into severity zones that drive treatment priority.

Risk Assessment Matrix
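
A text version of such a matrix can be derived from the 1-5 probability and impact scales. This sketch reuses the thresholds from `generate_risk_register` (15 and above is high, 8 to 14 is medium, below 8 is low); the zone boundaries in the original diagram may differ, so treat these values as an assumption.

```python
from typing import List

SCALE = [1, 2, 3, 4, 5]  # VERY_LOW..VERY_HIGH and NEGLIGIBLE..SEVERE


def zone(level: int) -> str:
    """Classify a risk level using the thresholds from generate_risk_register."""
    if level >= 15:
        return "HIGH"
    if level >= 8:
        return "MEDIUM"
    return "LOW"


def build_matrix() -> List[List[str]]:
    """Rows run from probability 5 down to 1; columns from impact 1 up to 5."""
    return [
        [f"{p * i:>2} {zone(p * i)[0]}" for i in SCALE]
        for p in reversed(SCALE)
    ]


def render_matrix() -> str:
    """Render the matrix as plain text; H/M/L marks the zone of each cell."""
    lines = ["P\\I |" + "|".join(f"{i:^6}" for i in SCALE)]
    for p, row in zip(reversed(SCALE), build_matrix()):
        lines.append(f" {p}  |" + "|".join(f"{cell:^6}" for cell in row))
    return "\n".join(lines)


print(render_matrix())
```

Under these thresholds, 6 of the 25 cells fall in the high zone, 7 in the medium zone, and 12 in the low zone.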

Risk Register Template

Note: Dates and owners are illustrative; actual registers use project-specific information.

# Risk Register (template)
risk_register:
  project: (Project Name)
  version: 1.0
  last_updated: (date)

  risks:
    - id: RSK-001
      title: "HIL test equipment availability"
      category: resource
      description: |
        HIL test bench is shared across multiple projects.
        Availability conflicts may delay integration testing.
      cause: "Limited HIL infrastructure, high utilization"
      consequence: "Integration testing delayed by 2-4 weeks"
      probability: medium
      impact: major
      risk_level: 12
      treatment: mitigate
      mitigation_actions:
        - action: "Reserve HIL time slots 4 weeks in advance"
          owner: Test Lead
          due_date: 2025-01-20
          status: complete
        - action: "Develop SIL simulation as backup"
          owner: SW Architect
          due_date: 2025-02-01
          status: in_progress
      residual_risk: 6
      owner: Project Manager
      status: mitigating
      monitoring_frequency: weekly

    - id: RSK-002
      title: "Requirements changes from customer"
      category: requirements
      description: |
        Customer has indicated potential changes to door lock timing
        requirements based on field feedback from other vehicles.
      cause: "Customer field experience driving specification changes"
      consequence: "Rework in implementation and testing phases"
      probability: high
      impact: moderate
      risk_level: 12
      treatment: mitigate
      mitigation_actions:
        - action: "Weekly sync with customer requirements team"
          owner: Project Manager
          due_date: ongoing
          status: active
        - action: "Implement configurable timing parameters"
          owner: SW Architect
          due_date: 2025-01-25
          status: in_progress
      residual_risk: 6
      owner: Project Manager
      status: mitigating
      monitoring_frequency: weekly

    - id: RSK-003
      title: "Cold temperature qualification failure"
      category: technical
      description: |
        Door lock timing may not meet requirements at extreme
        cold temperatures (-40°C) due to transistor characteristics.
      cause: "Temperature-dependent hardware behavior"
      consequence: "Design change required, schedule impact"
      probability: medium
      impact: major
      risk_level: 12
      treatment: mitigate
      mitigation_actions:
        - action: "Early cold temperature characterization"
          owner: HW/SW Integration Lead
          due_date: 2025-01-15
          status: complete
        - action: "Implement temperature compensation"
          owner: SW Developer
          due_date: 2025-01-30
          status: planned
      residual_risk: 4
      owner: Technical Lead
      status: mitigating
      monitoring_frequency: bi-weekly

    - id: RSK-004
      title: "Key resource departure"
      category: resource
      description: |
        Senior developer with unique CAN stack expertise
        may leave during project execution.
      cause: "Market demand for embedded expertise"
      consequence: "Knowledge loss, schedule delay"
      probability: low
      impact: major
      risk_level: 8
      treatment: mitigate
      mitigation_actions:
        - action: "Document CAN stack architecture and design decisions"
          owner: SW Architect
          due_date: 2025-01-31
          status: in_progress
        - action: "Cross-train second developer"
          owner: Team Lead
          due_date: 2025-02-15
          status: planned
      residual_risk: 4
      owner: Project Manager
      status: mitigating
      monitoring_frequency: monthly
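
Register entries like these can be checked for internal consistency before review. The sketch below assumes each entry has already been parsed into a dictionary with the field names used in the template (for example by a YAML loader). The `check_entry` function and its three rules are illustrative assumptions, and `RSK-999` is a deliberately broken, hypothetical entry.

```python
from typing import Dict, List

# Scale values follow the Probability and Impact enums (1-5).
PROBABILITY = {"very_low": 1, "low": 2, "medium": 3, "high": 4, "very_high": 5}
IMPACT = {"negligible": 1, "minor": 2, "moderate": 3, "major": 4, "severe": 5}


def check_entry(entry: Dict) -> List[str]:
    """Return consistency findings for one register entry (empty list = OK)."""
    findings = []
    expected = PROBABILITY[entry["probability"]] * IMPACT[entry["impact"]]
    if entry["risk_level"] != expected:
        findings.append(
            f"{entry['id']}: risk_level {entry['risk_level']} != {expected}"
        )
    if entry["residual_risk"] > entry["risk_level"]:
        findings.append(f"{entry['id']}: residual_risk exceeds risk_level")
    if entry["treatment"] == "mitigate" and not entry.get("mitigation_actions"):
        findings.append(f"{entry['id']}: treatment is 'mitigate' but no actions listed")
    return findings


rsk_001 = {
    "id": "RSK-001", "probability": "medium", "impact": "major",
    "risk_level": 12, "residual_risk": 6, "treatment": "mitigate",
    "mitigation_actions": [
        "Reserve HIL time slots 4 weeks in advance",
        "Develop SIL simulation as backup",
    ],
}
# Hypothetical inconsistent entry, used only to show the findings.
rsk_999 = {
    "id": "RSK-999", "probability": "high", "impact": "major",
    "risk_level": 12, "residual_risk": 14, "treatment": "mitigate",
    "mitigation_actions": [],
}

for entry in (rsk_001, rsk_999):
    print(entry["id"], check_entry(entry))
```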

Risk Monitoring Dashboard

The following diagram shows the risk monitoring dashboard, providing real-time visibility into open risks, mitigation progress, trend indicators, and risk exposure over time.

Risk Monitoring Dashboard
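
The figures behind such a dashboard can be computed directly from the register. The sketch below uses the levels, residual risks, and action statuses of the four template risks. It defines "exposure" as the sum of risk levels, which is an assumption rather than an ASPICE definition, and the `dashboard_metrics` function and its key names are illustrative.

```python
from typing import Dict, List


def dashboard_metrics(risks: List[Dict]) -> Dict[str, float]:
    """Aggregate open-risk count, exposure, and mitigation progress."""
    current = sum(r["risk_level"] for r in risks)
    target = sum(r["residual_risk"] for r in risks)
    statuses = [s for r in risks for s in r["action_statuses"]]
    done = statuses.count("complete")
    return {
        "open_risks": len(risks),
        "current_exposure": current,   # assumed: sum of risk levels
        "target_exposure": target,     # assumed: sum of residual risks
        "planned_reduction_pct":
            round(100 * (current - target) / current, 1) if current else 0.0,
        "actions_complete_pct":
            round(100 * done / len(statuses), 1) if statuses else 0.0,
    }


# Values taken from the register template above.
register = [
    {"id": "RSK-001", "risk_level": 12, "residual_risk": 6,
     "action_statuses": ["complete", "in_progress"]},
    {"id": "RSK-002", "risk_level": 12, "residual_risk": 6,
     "action_statuses": ["active", "in_progress"]},
    {"id": "RSK-003", "risk_level": 12, "residual_risk": 4,
     "action_statuses": ["complete", "planned"]},
    {"id": "RSK-004", "risk_level": 8, "residual_risk": 4,
     "action_statuses": ["in_progress", "planned"]},
]

metrics = dashboard_metrics(register)
print(metrics)
```

Recomputing these metrics at each monitoring interval and storing the results gives the exposure-over-time trend that the dashboard displays.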

Work Products

WP ID	Work Product	AI Role
08-25	Risk management plan	Template generation
08-26	Risk register	Pattern-based identification
13-22	Risk status report	Automated generation
08-27	Risk mitigation plan	Suggestion generation

Summary

MAN.5 Risk Management:

AI Level: L1-L2 for most practices (L0-L1 for corrective action, up to L3 for monitoring); AI identifies and scores, human decides
Primary AI Value: Risk identification, scoring, monitoring
Human Essential: Risk acceptance, mitigation decisions
Key Outputs: Risk register, mitigation plans
Continuous: Risks are managed throughout the project lifecycle