All Volumes/

AI Incident Response

Comprehensive incident response capabilities covering prompt injection, model poisoning, hallucination management, data leakage, and regulatory notification.

6.1 AI Incident Response Plan

The AI Incident Response Plan establishes the organisational structure, procedures, and resources for detecting, responding to, and recovering from AI-related security incidents. It integrates with the enterprise Incident Response Plan and addresses AI-specific incident types.

Incident Response Phases

  1. Preparation: Maintain incident response capabilities, training, tools, and documentation. Conduct regular tabletop exercises and red-team simulations.
  2. Detection & Analysis: Monitor AI systems for anomalous behaviour, security alerts, and user reports. Validate and classify incidents by severity.
  3. Containment: Implement short-term containment to stop immediate damage. Develop long-term containment strategy for sustained operations.
  4. Eradication: Remove the root cause of the incident, including compromised models, poisoned data, or injected prompts.
  5. Recovery: Restore affected systems to normal operations with enhanced monitoring. Validate integrity before full restoration.
  6. Post-Incident: Conduct lessons-learned review, update controls, and communicate findings to stakeholders.
PhaseTime ObjectiveKey ActionsOwner
Detection< 15 minutesAlert triage, initial validation, severity assignmentSOC Analyst
Containment< 1 hourIsolate affected systems, preserve evidence, notify IR leadIncident Manager
Eradication< 24 hoursRemove root cause, patch vulnerabilities, clean compromised artefactsTechnical Lead
Recovery< 72 hoursRestore services, validate integrity, enhanced monitoringOperations Lead
Post-Incident< 14 daysRoot cause analysis, lessons learned, control updatesIncident Manager

6.2 Severity Classification

All AI incidents shall be classified by severity to ensure appropriate resource allocation, escalation, and communication.

Severity Levels

SeverityDefinitionExamplesResponse TimeEscalation
Critical (SEV-1)Active exploitation causing severe business impact, data breach, or safety riskActive prompt injection on production LLM handling PII; model producing harmful safety-critical outputs< 15 minutesCEO, Board, Regulators
High (SEV-2)Confirmed compromise with significant impact potentialConfirmed data poisoning in training pipeline; unauthorised model access< 1 hourC-Suite, Steering Committee
Medium (SEV-3)Suspicious activity or limited impact incidentAnomalous inference patterns; isolated hallucination in customer-facing system< 4 hoursDirector, Security Lead
Low (SEV-4)Minor anomaly or policy violation with minimal impactFailed login attempts; minor policy deviation detected in audit< 24 hoursManager, Team Lead
Informational (SEV-5)No impact; observational or preventiveThreat intelligence alert with no confirmed exposure; control improvement opportunity< 72 hoursAnalyst

6.3 Prompt Injection Response

Prompt injection is a critical attack vector against large language models. This playbook provides step-by-step response procedures for confirmed or suspected prompt injection incidents.

Response Playbook

  1. Detect: Monitor for anomalous outputs, user reports of unexpected behaviour, or automated alerts from input validation systems.
  2. Confirm: Reproduce the suspicious input in a sandboxed environment. Document the exact prompt, model version, and system configuration.
  3. Contain: Immediately disable the affected model endpoint or route traffic to a fallback model. Preserve logs and model state for investigation.
  4. Assess Impact: Determine whether the injection resulted in data exfiltration, unauthorised actions, or harmful output generation.
  5. Notify: If personal data or regulated information was accessed, initiate regulatory notification timelines (72 hours for GDPR, as applicable).
  6. Remediate: Patch prompt templates, strengthen input validation, update system prompts, and retrain or fine-tune if model weights are compromised.
  7. Validate: Test fixes in staging with red-team prompt injection techniques before restoring production traffic.
  8. Document: Complete the Incident Report (Template 9.13) and update the Prompt Register with the attack vector and mitigations.

Immediate Action

Upon confirmation of prompt injection on a production system handling sensitive data or critical functions: isolate the endpoint within 15 minutes, preserve all logs, and escalate to the Incident Manager and CISO immediately.

6.4 Hallucination Management

AI hallucination — the generation of plausible but factually incorrect or unsupported content — poses significant risks for organisations using AI in decision-support, customer-facing, or content-generation roles.

Hallucination Response Procedures

  • Detection: Implement automated fact-checking, confidence scoring, and retrieval-augmented generation (RAG) grounding checks.
  • User Reporting: Establish clear channels for users to report suspected hallucinations with structured feedback forms.
  • Impact Assessment: Classify hallucinations by severity — factual error, harmful misinformation, or safety-critical incorrect advice.
  • Correction: For high-impact hallucinations, issue corrections to affected users, update knowledge bases, and retrain or fine-tune models.
  • Prevention: Enhance RAG retrieval quality, implement citation requirements, constrain output to verified sources, and use ensemble models for critical queries.
All customer-facing AI outputs include appropriate confidence indicators and disclaimers.
Hallucination rates are measured and reported monthly by model and use case.
High-confidence AI outputs for regulated domains require human verification.
Retrieval sources are authoritative, current, and validated.
Fallback to human review is enforced when confidence scores fall below thresholds.

6.5 Data Leakage & Insider Threat Response

Data leakage from AI systems can occur through model memorisation, prompt echoing, insecure API responses, or insider actions. This section establishes detection and response procedures.

Data Leakage Response

  1. Detect: Monitor API responses for unexpected inclusion of training data, PII, or confidential content. Use DLP tools on AI outputs.
  2. Confirm: Reproduce the leakage in a controlled environment. Identify the source data, model version, and triggering input.
  3. Contain: Restrict access to the affected model, disable the vulnerable endpoint, and block the triggering input patterns.
  4. Assess Scope: Determine which data elements were exposed, to whom, and for how long. Review access logs comprehensively.
  5. Notify: If personal data was leaked, notify the Privacy Officer immediately. Comply with mandatory breach notification timelines.
  6. Remediate: Retrain or fine-tune the model with differential privacy, implement output filtering, and strengthen data sanitisation.
  7. Review: Assess whether the leakage indicates a broader control failure requiring systemic improvement.

Insider Threat Considerations

Insider threats to AI systems include malicious model extraction, data theft, sabotage of training data, and unauthorised model deployment. Implement least-privilege access, segregation of duties, anomaly detection on developer actions, and regular insider risk assessments.

6.6 Regulatory Notification & Communications

Timely, accurate, and coordinated communication is essential during AI incidents. This section establishes notification requirements, communication templates, and approval workflows.

Notification Matrix

StakeholderTriggerTimelineChannelOwner
C-Suite / BoardAny SEV-1 or SEV-2 incidentWithin 1 hourSecure call + written summaryIncident Manager
Regulators (OAIC / ICO)Personal data breach affecting > 1 individual72 hours (GDPR) / as soon as practicable (APPs)Formal notificationPrivacy Officer / Legal
Affected IndividualsHigh risk to rights and freedomsWithout undue delayDirect communicationCustomer Success / Legal
Cyber InsuranceAny incident likely to trigger coverageWithin 24 hoursFormal claim notificationRisk Manager
Law EnforcementCriminal activity suspectedAs appropriateFormal referralGeneral Counsel
Media / PublicIncident of public interestAfter legal review and Board approvalApproved statementCommunications Director

Communication Discipline

All external communications regarding AI incidents require pre-approval from Legal and the Communications Director. Unauthorised statements by employees are prohibited and may result in disciplinary action.