AI for Incident Investigation: Using It Without Losing ICAM Rigour

An ICAM lead investigator on where AI genuinely speeds up incident investigation, and where it quietly wrecks your causal analysis. 90% of safety pros already have concerns.


Yes, AI can produce an ICAM report in minutes. I would still not let it run an investigation. AI genuinely speeds up the clerical scaffolding of an incident investigation: transcribing interviews, drafting the timeline, organising evidence, formatting the report. It quietly wrecks the part that actually matters, the multi-causal reasoning that separates a real systemic cause from a fluent-sounding correlation. The vendor promise of an "ICAM report in minutes" sells you the easy 80% of the typing and silently automates the hard 20% of the thinking.

I am an ICAM lead investigator and have investigated critical incidents at enterprise scale. The stakes here are not abstract. Safe Work Australia recorded 188 worker deaths in 2024 and 146,700 serious workers' compensation claims in 2023-24, more than 400 a day (Safe Work Australia, 2025). An investigation that lands on the wrong cause does not just waste time. It leaves the next person exposed. So here is the honest field guide to where AI helps an ICAM investigation, and where it does damage.

What ICAM actually is (the one-minute version)

ICAM is an Australian-built investigation method that treats every serious incident as the product of multiple aligned failures, never one root cause. That single assumption is exactly the one AI tooling is worst at honouring.

It was developed by Gerry Gibb of Safety Wise in the late 1990s, drawing on the work of human-error researcher Professor James Reason and the Australian Transport Safety Bureau, and it became standard across Australian mining and resources. It sits directly on Reason's "Swiss cheese" model: defences are layers with holes, and an incident happens when latent organisational weaknesses line up with active failures so a hazard passes through every barrier.

ICAM structures causation across four levels, worked backwards from the event: absent or failed defences, individual or team actions, the task and environmental conditions that shaped those actions, and the organisational factors underneath. The systemic, organisational level is where durable fixes live, which is why "re-train the worker" is considered a weak recommendation. Evidence is gathered with PEEPO (People, Environment, Equipment, Procedures, Organisation), a discipline whose entire purpose is to mitigate bias and stop the team fixating on the obvious cause. A trained lead investigator holds the analysis. That role is a real credential in Australia and New Zealand, and the method is not a fill-in-the-template exercise.
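The four levels described above can be sketched as a small data structure. This is an illustrative sketch only, not part of the ICAM method or any tool: the names `Level`, `Finding` and `uncovered_levels`, and the example findings, are hypothetical. It shows one way to flag an analysis that has stopped before it reaches the organisational level.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """The four ICAM levels, in the order the article lists them."""
    DEFENCES = "absent or failed defences"
    ACTIONS = "individual or team actions"
    CONDITIONS = "task and environmental conditions"
    ORGANISATIONAL = "organisational factors"


@dataclass(frozen=True)
class Finding:
    level: Level
    description: str


def uncovered_levels(findings):
    """Return the ICAM levels that have no finding recorded against them."""
    covered = {f.level for f in findings}
    return [level for level in Level if level not in covered]


# Hypothetical example: an analysis that stops at the worker's action.
findings = [
    Finding(Level.DEFENCES, "Interlock bypassed"),
    Finding(Level.ACTIONS, "Operator entered the guarded area"),
]
gaps = uncovered_levels(findings)
print([level.name for level in gaps])  # ['CONDITIONS', 'ORGANISATIONAL']
```

A check like this can only say that a level is empty. Whether the findings at each level are the right ones remains a judgement for the investigator.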

Where AI genuinely helps an ICAM investigation

AI earns its place in the clerical and collation layers of an investigation: the transcription, the timeline draft, the first-pass evidence sort, the report formatting. Anywhere speed helps and a human still owns the judgement, it is welcome.

Map the genuine wins onto the ICAM sequence. AI can transcribe and summarise interviews, draft the event timeline as you gather evidence, do a first-pass sort of material into PEEPO categories for a human to confirm, and format and consistency-check the final report. Surfacing patterns across a back catalogue of past incidents is a legitimate strength too: this is prevention and trend work, distinct from reasoning about a single event.

Adoption reflects this. In a 2026 survey of 1,053 EHS, operations and risk professionals, 20% reported extensive AI integration and 62% moderate or limited use (NSC & Wolters Kluwer Enablon, 2026). That is a US sample, so read it as direction rather than a local number, but the trajectory is the same here. Safety professionals are already using general assistants to structure a 5 Whys or a fishbone diagram and to draft summaries for leadership.

The rule that separates this section from the next is simple. AI may draft and organise. It may not decide. The contributing-versus-non-contributing call, and the causal judgement behind it, belong to the investigator.
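One way to make "AI may draft, it may not decide" concrete is to keep the AI's first-pass PEEPO sort and the human's confirmation in separate fields. The sketch below is a hypothetical record layout, not a real product's schema: `EvidenceItem`, `suggest`, `confirm`, `unconfirmed` and the sample evidence are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

PEEPO = ("People", "Environment", "Equipment", "Procedures", "Organisation")


@dataclass
class EvidenceItem:
    description: str
    suggested_category: Optional[str] = None  # AI first-pass sort (a draft only)
    confirmed_category: Optional[str] = None  # set only by a named human
    confirmed_by: Optional[str] = None

    def suggest(self, category: str) -> None:
        """Record an AI suggestion. It never counts as a decision."""
        if category not in PEEPO:
            raise ValueError(f"not a PEEPO category: {category}")
        self.suggested_category = category

    def confirm(self, category: str, investigator: str) -> None:
        """Record the human decision, which may differ from the suggestion."""
        if category not in PEEPO:
            raise ValueError(f"not a PEEPO category: {category}")
        if not investigator.strip():
            raise ValueError("a named investigator must confirm")
        self.confirmed_category = category
        self.confirmed_by = investigator


def unconfirmed(items):
    """Items that still rely on an AI suggestion alone (or on nothing)."""
    return [item for item in items if item.confirmed_category is None]


# Hypothetical evidence items and investigator name.
items = [
    EvidenceItem("Interview transcript, shift supervisor"),
    EvidenceItem("Pre-start checklist for the loader"),
]
items[0].suggest("People")
items[1].suggest("Equipment")
items[1].confirm("Procedures", investigator="J. Citizen")  # human overrides the draft
print([item.description for item in unconfirmed(items)])
# ['Interview transcript, shift supervisor']
```

Keeping the two fields apart means a report can list exactly which classifications a person has actually reviewed.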

Where AI quietly wrecks the causal analysis

AI wrecks an ICAM investigation at the exact point its value lives: the multi-causal, latent-condition reasoning. It pattern-matches a fluent linear story, the team anchors to it, and the confidence makes it hard to challenge.

The clearest evidence is from an adjacent safety task. When four large language models generated HAZOP worksheets from the same diagram, only 19 to 37% of the accident scenarios they produced were semantically valid, despite high textual similarity to the expert worksheets. The authors concluded that these models are assistive tools, not replacements for expert-led analysis (Park et al., Safety Science, 2026). Output that looks right but is wrong: that is the whole problem in one finding. A scoping review of safety-critical AI benchmarks reinforces it, noting "limited evaluation of causal reasoning in technical contexts" and output that varies from run to run (Dokas, Safety Science, 2026).

Then there is fabrication. General-purpose models hallucinate between 69% and 88% of the time on specific legal queries (Stanford Law School, 2024). The models marketed for "reasoning" are not exempt: OpenAI's own system card reported that its o3 model hallucinated on 33% of questions in one benchmark and o4-mini on 48%, up from earlier versions (OpenAI o3 and o4-mini System Card, 2025). An AI can invent a plausible causal chain, or claim to have verified evidence it never saw, and write it with total confidence.

The most insidious risk is automation bias: the AI's wrong answer dragging an expert off their own correct judgement.

Radiology offers a measured example: readers' accuracy fell sharply when the AI suggestion they were shown was wrong.

Automation bias: radiologist accuracy with a correct vs incorrect AI suggestion
Category | Accuracy (%)
Experienced, AI correct | 82
Experienced, AI wrong | 45.5
Less experienced, AI correct | 80
Less experienced, AI wrong | 20
Source: Dratsch et al., Radiology (RSNA)

Beyond anchoring, AI defaults pull against the method itself. It tends to collapse a multi-causal event into a single tidy root cause, the precise Safety-I error that ICAM, Safety-II and the Human and Organisational Performance movement exist to dismantle. It reaches for the individual's "unsafe act" rather than the organisational level, reintroducing blame. As Australian consultant John Ninness puts it, language models "pattern-match against surface features of incident narratives. They do not reason causally about latent conditions," and "plausibility is not verification" (Safetysure, 2026). AI also cannot read a witness: it misses tone, demeanour and credibility, which undermines the People element of PEEPO. And the more reliable a tool seems, the less anyone checks it, so investigators slowly lose the analytical muscle the AI was meant to support.

A safe AI-assisted ICAM workflow, step by step

The safe pattern is simple to state and harder to hold to: let AI do the collation and drafting, force yourself to form an independent causal view before you read any AI suggestion, and keep every causal call and recommendation in trained hands.

Here is how the AI role maps across the seven phases of an ICAM investigation.

ICAM phase | AI can help with | Human owns | Watch for
Incident response | Logging, notifications | Make safe, secure the scene | Privacy of the scene and people
Investigation planning | Admin, scheduling drafts | Team, scope, lead | Confidentiality and privilege
PEEPO planning (Mark 1) | Suggest an evidence checklist | Decide what to collect | Over-narrowing the evidence
PEEPO collection | Transcribe, OCR, draft the timeline | Validate evidence, run interviews | Hallucinated timeline entries, witness nuance
ICAM analysis | Nothing until you have drafted the four levels yourself | Right-to-left causal reasoning | Anchoring, single-cause flattening
Recommendations | Format, cross-check | Target the organisational level | Low-order procedural fixes only
Reporting | Draft prose, consistency-check | Sign-off, verify every citation | Fabricated references

A few governance rules make the difference:

  • Independent view before AI. Draft your own four-level analysis before you let any model suggest causes. This defeats anchoring.
  • Verify every factual claim and citation. Plausibility is not verification. If the AI cites a standard or a clause, open it.
  • Never paste identifiable witness statements or incident facts into a public or training model. This is a confidentiality and privilege risk, not a convenience question.
  • Treat run-to-run inconsistency as a defect. A causal analysis that changes when you re-run the prompt cannot anchor a defensible corrective-action plan.
  • Name a human owner of the causal analysis. Someone trained signs off, and is accountable.
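Three of the rules above (independent view before AI, run-to-run consistency, a named human owner) can be sketched in code. This is a minimal sketch under stated assumptions, not a real tool's API: `CausalAnalysis`, `IndependentViewRequired`, `runs_agree` and the sample causes are hypothetical. The consistency check is deliberately crude, because it compares cause labels as exact strings after trimming and lower-casing. In practice a person would still have to judge whether two differently worded causes mean the same thing.

```python
LEVELS = ("defences", "actions", "conditions", "organisational")


class IndependentViewRequired(Exception):
    """Raised when AI input is offered before the human draft exists."""


class CausalAnalysis:
    def __init__(self, owner: str):
        if not owner.strip():
            raise ValueError("a named human owner is required")
        self.owner = owner
        self.human_draft = {level: [] for level in LEVELS}
        self.ai_suggestions = []

    def add_human_finding(self, level: str, text: str) -> None:
        if level not in LEVELS:
            raise ValueError(f"unknown ICAM level: {level}")
        self.human_draft[level].append(text)

    def human_draft_complete(self) -> bool:
        """True once every level has at least one human-written finding."""
        return all(self.human_draft[level] for level in LEVELS)

    def add_ai_suggestion(self, text: str) -> None:
        """Accept AI input only after the independent human draft is complete."""
        if not self.human_draft_complete():
            raise IndependentViewRequired("draft all four levels first")
        self.ai_suggestions.append(text)


def runs_agree(runs) -> bool:
    """True only if every run produced the same set of cause labels."""
    cause_sets = {frozenset(c.strip().lower() for c in run) for run in runs}
    return len(cause_sets) <= 1


analysis = CausalAnalysis(owner="Lead investigator (hypothetical)")
try:
    analysis.add_ai_suggestion("Inadequate supervision")
except IndependentViewRequired:
    print("blocked: draft the four levels first")

for level in LEVELS:
    analysis.add_human_finding(level, f"human finding for {level}")
analysis.add_ai_suggestion("Inadequate supervision")  # now accepted

print(runs_agree([["Fatigue", "Poor lighting"], ["poor lighting", "fatigue"]]))  # True
print(runs_agree([["Fatigue"], ["Inadequate supervision"]]))  # False
```

A gate like this cannot make anyone think independently. It only makes the order of work visible and auditable.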

Two realities sharpen the privilege point. First, privilege is not automatic: in Crafti v Cohealth the Fair Work Commission ordered an external investigation report produced despite a privilege claim, because the dominant-purpose and confidentiality tests were not met (Ashurst, 2025). Feeding the facts into a third-party tool does not help that case. Second, the data foundation is messier than the demos suggest: only 11% of surveyed organisations run fully digital EHS systems, with 71% hybrid and 18% still on paper (NSC & Wolters Kluwer Enablon, 2026). The promise that AI instantly assembles a clean, complete dataset assumes records that most workplaces do not have.

The "ICAM in minutes" tools, honestly

The "ICAM report in minutes" category is selling the typing, not the thinking. The honest read is that the speed claims are real for collation and unsubstantiated for causation.

Treat every vendor figure as a marketing claim. AI4HSE markets "85% time saved", turning "6-8 hours" into "minutes", and a tool that "requires no prior ICAM training to operate" (AI4HSE). Others advertise "50-80% efficiency gains" or "3x deeper root causes." None publish a methodology. Tellingly, the vendor that quantifies hardest also disclaims hardest: AI4HSE's own page says the output is "decision-support only" that "must be reviewed and validated by a qualified safety professional." That disclaimer is correct. It also undercuts the "no training needed" pitch, because validating a causal analysis is precisely what training is for.

Some claims point at the genuine use. A tool that analyses "decades of safety records" to surface trends is doing pattern work, which AI is good at, not single-incident causation. The line to hold is the one the research draws: AI is faster at collation, and causation is the contested part. And "purpose-built" is not a safety guarantee. Even dedicated, retrieval-augmented legal-AI tools hallucinated between 17 and 34% of the time (Stanford HAI, 2024). A tool built for ICAM will be wrong less often than a general chatbot, but it will still be wrong, and confidently.

Be sceptical of borrowed numbers, too. Impressive "mean time to resolution" figures from IT and DevOps incident response are a different field entirely. They are not evidence about workplace-safety investigation time or quality, and they should never stand in for it.

The bottom line

Used well, AI gives you back the hours that incident investigation wastes on typing, and hands them to the work that actually prevents the next incident: the interviews, the right-to-left causal analysis, the systemic recommendations. Used as a shortcut, it manufactures a confident, tidy, wrong story and anchors everyone to it. The dividing line is always the same: whoever owns the causal judgement is the investigator, and that cannot be the model.

This is the same lesson as my field guide to AI in workplace safety, applied to the one task where getting the cause wrong has the highest cost. If you want the deeper pattern behind trustworthy AI in this domain, I wrote about grounding AI in the actual law. And if you are still deciding which assistant to reach for at all, start with how I pick an AI tool. For the same test applied before the incident, when you are still drafting the controls, see AI-assisted SWMS and risk assessments.

Frequently asked questions

Can AI write my ICAM report?
It can draft the prose and format it, but it cannot own the causal analysis or the sign-off. Verify every factual claim and citation it produces. Plausibility is not verification, no matter how confident the writing sounds.
Is it safe to paste witness statements into ChatGPT?
No. Treat incident details as a confidentiality and legal-privilege risk. In Crafti v Cohealth (2025) the Fair Work Commission ordered an external investigation report produced despite a privilege claim. Use governed, non-training tooling and de-identify.
Will AI make my investigation faster?
Yes for collation and drafting: transcription, the timeline, sorting evidence, formatting the report. The causal reasoning should not be compressed, and trying to compress it is exactly where quality fails.
Does an AI ICAM tool replace lead-investigator training?
No. ICAM is a credentialed thinking discipline in Australia and New Zealand. Vendor claims that a tool requires 'no prior ICAM training to operate' contradict that standard and should be treated as a warning sign, not a feature.
Won't a purpose-built ICAM AI avoid hallucinations?
Not reliably. Purpose-built, retrieval-augmented legal-AI tools still hallucinated between 17 and 34% of the time (Stanford HAI, 2024). 'Built for this domain' reduces the rate; it does not remove it. Verify the output.
What is the single biggest risk?
Automation bias and anchoring. A confident AI first draft pulls even expert investigators toward a wrong, single-cause conclusion, and time pressure makes the effect worse. Form your own view before you read the AI's.
