Claude Fable 5 for Safety Work: What Changes, What Doesn't

Anthropic launched Claude Fable 5 on 9 June 2026, a new tier above Opus. A WHS practitioner on what it changes for SWMS, investigations and analytics.

[Image: the Claude Fable wordmark and orange starburst logo on a cream background, with the tagline "next generation of intelligence for the hardest knowledge work and coding problems"]

Anthropic released Claude Fable 5 on 9 June 2026: its most capable model to date, and the first of a tier above Opus to reach general availability (Anthropic, Claude Fable 5 and Claude Mythos 5, 9 June 2026). For safety work it moves three ceilings at once. It can hold something close to an entire WHS management system in one read, it can run long multi-step jobs without losing the thread, and it writes prose people will actually finish. What it does not move is the floor: the duty of care, the consultation, and the competent person who signs.

I've spent a decade in WHS and I build with these models daily, including a WHS professional skill grounded in the legislation itself. I drafted this post with Fable 5 running underneath, then reviewed and reworked every word myself. That division of labour is the whole argument, so consider it disclosed up front.

What is Claude Fable 5?

Fable 5 is the public face of Anthropic's Mythos class, described at launch as "a Mythos-class model that we've made safe for general use" (Anthropic, 9 June 2026). The first Mythos-class model went to a small group of partners in April 2026 as Claude Mythos Preview, under a US government collaboration called Project Glasswing. Fable 5 is the second generation, hardened and released broadly. Its sibling, Claude Mythos 5, is the same model with fewer restrictions and stays limited to vetted partners and researchers.

The numbers that matter for document-heavy safety work: a context window of around one million tokens, and output up to 128,000 tokens (Anthropic model documentation, retrieved 10 June 2026). Anthropic claims "state-of-the-art" results on nearly all tested benchmarks, with the lead growing as tasks get longer and more complex. It reached GitHub Copilot, AWS Bedrock and Microsoft Foundry on launch day, so it will surface inside tools your organisation already runs, whether or not you went looking for it.

What actually changes for safety work?

Three things, and they compound.

Whole-of-system reading. A million tokens is roughly a full WHS management plan, its procedures, a project's SWMS library, the risk register and a couple of years of incident summaries, in one pass. That makes a new kind of question practical. Not "summarise this document" but "find where these documents disagree". Which SWMS contradicts the procedure it sits under? Which rescue plan names equipment the plant register stopped listing a year ago? Earlier models could summarise. They couldn't hold the whole system in view while they did it.

The state of most safety document systems is what makes this valuable. In early 2026 only 11% of EHS teams ran fully digital systems, and 71% worked across a hybrid of digital and manual (NSC and Wolters Kluwer Enablon, The Safety Shift: EHS Readiness in 2026, retrieved 10 June 2026). That survey is US-based, but I'd recognise those numbers on most Australian sites. Scattered documents are where contradictions hide.

Long-horizon work. Anthropic says Fable 5 is built to "run for days at a time: planning across stages, delegating to sub-agents, and checking its own work" (Anthropic, Claude Fable product page, retrieved 10 June 2026), and that the longer the task, the larger its lead. Early third-party signals agree. TechCrunch reported that analytics platform Hex found it the first model to score 90% on its core analytics benchmark (TechCrunch, 9 June 2026). Simon Willison's first-day verdict was "The challenge is finding tasks that it can't do" (Simon Willison, 9 June 2026).

For a safety function, that profile fits the multi-stage job you keep deferring: five years of inconsistent incident exports into a cleaned dataset, a leading-indicator view and a written analysis, in one supervised run. I've run that pattern at smaller scale, turning a LinkedIn data export into a working dashboard in an afternoon. The new tier exists for the bigger version of that job.

Prose that gets read. The quieter gain is writing quality. A board paper that argues a position instead of listing activities. A safety alert in plain English rather than compliance prose. A toolbox talk a supervisor can deliver without translating it first. Most safety documents fail at the reading step, not the writing step, and the distance between a control that changes behaviour and one that sits in a folder is usually the prose.

The caveats are real. Willison also called Fable 5 "something of a beast" with "a big model smell: slow, expensive and capable", and his first day of testing cost US$110.42, about A$157. Accuracy stays a caveat too. Anthropic has published no hallucination figures for Fable 5, and a May 2026 study of frontier models, not yet including Fable 5, still measured hallucinated software package rates of 4.62% to 6.10% in coding tasks (Churilov, arXiv, May 2026). Slow, expensive and still fallible: you choose your moments, and you verify what comes back.

What doesn't change?

Everything that makes safety work safety work. Under the model WHS Act the primary duty of care sits with the PCBU and cannot be transferred to a contractor, a consultant or a context window. Consultation with the workers who do the work is a legal requirement, not a step a model can simulate. In any defensible system a competent person still reviews and signs, and the reasoning behind "so far as is reasonably practicable" still has to be a person's. I drew those lines for AI-assisted SWMS and for incident investigation under ICAM, and nothing released on 9 June moves any of them.

If anything, the law is moving the other way. In February 2026 NSW passed the Work Health and Safety Amendment (Digital Work Systems) Act 2026, a national first that adds an explicit duty on PCBUs to ensure, so far as is reasonably practicable, that workers are not put at risk by the allocation of work through digital work systems (NSW Parliament, February 2026). The definition reaches "an algorithm, artificial intelligence, automation or online platform". It commences by proclamation and is not yet in force, but the direction is unambiguous: the more capable the system, the sharper the question of who owns its risks.

The human pattern worries me more than the law does. In KPMG and the University of Melbourne's 2025 global study, 57% of Australian employees said they had relied on AI output at work without evaluating its accuracy (KPMG and University of Melbourne, Trust, attitudes and use of AI: A global study 2025, retrieved 10 June 2026). Another 59% said they had made mistakes in their work because of AI, and only 24% had any AI training. A model this polished makes unchecked trust more tempting, not safer. The analysts watching EHS software have reached the same conclusion: credible vendors keep a human in the loop so AI "augments, rather than replaces, professional judgement" (Verdantix, Future Of AI-Enabled EHS Software Solutions, October 2025).

One constant arrives sharpened by the launch terms: confidentiality. Investigation files carry personal and health information, and sometimes legal professional privilege, and Mythos-class traffic now carries a mandatory 30-day data retention period. Uploading an incident file is a disclosure decision. Make it deliberately, against your enterprise controls, your privacy obligations and your own policy, before the convenience makes it for you.

Is it worth US$10 per million tokens?

Sometimes, and knowing when is the discipline. On API rates Fable 5 costs US$10 per million input tokens and US$50 per million output, around A$14 and A$71 at June 2026 exchange rates, double Opus 4.8 and more than triple Sonnet 4.6 (Anthropic pricing, retrieved 10 June 2026).
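
To make those rates concrete, here is a minimal sketch of the arithmetic in Python. It assumes flat per-token pricing at the two Fable 5 figures quoted above and nothing else; the function name and the example job sizes are hypothetical, and plan-based usage credits will not map onto this neatly.

```python
# Fable 5 API prices in USD per million tokens, as quoted in this post.
FABLE_5_INPUT_USD = 10.0
FABLE_5_OUTPUT_USD = 50.0


def run_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost of one Fable 5 run in US dollars.

    Assumes flat per-token pricing with no discounts or surcharges.
    """
    input_cost = input_tokens / 1_000_000 * FABLE_5_INPUT_USD
    output_cost = output_tokens / 1_000_000 * FABLE_5_OUTPUT_USD
    return round(input_cost + output_cost, 2)


# Hypothetical job: a 600,000-token document pack and a 20,000-token report.
print(run_cost_usd(600_000, 20_000))
```

A long conversation re-sends the pack on each turn, so a multi-turn review can cost several times the single-run figure. Treat the estimate as a floor, not a quote.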

Claude API input price per million tokens, June 2026

| Model | Input price (USD per million tokens) |
|---|---|
| Fable 5 | 10 |
| Opus 4.8 | 5 |
| Sonnet 4.6 | 3 |
| Haiku 4.5 | 1 |

Source: Anthropic

Most practitioners will meet it inside a paid Claude plan first: it's included at no extra cost until 22 June 2026, then draws usage credits, so the cost shows up as how quickly the big model consumes your allowance. Either way the same logic holds. Match the tier to the task, and don't pay Mythos prices for Haiku work.

| Task | Tier I'd run it on | Why |
|---|---|---|
| Whole-of-system consistency reviews, long investigation files | Fable 5 | Holding every document in one context is the point of the tier |
| Board papers, complex analyses, regulatory submissions | Fable 5 or Opus | Reasoning depth and prose quality earn the spend |
| Routine SWMS and toolbox talk first drafts | Sonnet tier | Drafting was already solved at a third of the price |
| Bulk triage, classifying hazard reports, tagging records | Haiku tier | High-volume work where speed and cost beat depth |
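
If your team routes work through a script or an internal tool, the table above reduces to a lookup. This is a sketch only: the task keys are hypothetical labels I've made up for illustration, the tier names are the ones from the table, and anything unlisted goes back to a person rather than defaulting to the most expensive model.

```python
# Task keys are hypothetical labels; tiers mirror the table in this post.
TIER_BY_TASK = {
    "consistency_review": "Fable 5",
    "long_investigation_file": "Fable 5",
    "board_paper": "Fable 5 or Opus",
    "regulatory_submission": "Fable 5 or Opus",
    "routine_swms_draft": "Sonnet tier",
    "toolbox_talk_draft": "Sonnet tier",
    "bulk_triage": "Haiku tier",
    "hazard_report_classification": "Haiku tier",
}


def pick_tier(task: str) -> str:
    """Return the suggested tier for a known task, or hand the choice to a person."""
    return TIER_BY_TASK.get(task, "unlisted: decide manually")


print(pick_tier("routine_swms_draft"))
print(pick_tier("contract_review"))
```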

If your question is which vendor rather than which tier, my comparison of the major AI tools covers the selection logic, which outlives any individual release.

How I'd put it to work this week

Two worked examples in full, and a third in brief. Useful AI workflows and dangerous ones are built from the same ingredients; the detail is what tells them apart, so here's the detail. Both worked examples keep the person exactly where the law already puts them.

Worked example 1: the whole-of-system consistency audit

The scenario. A mid-life infrastructure project. The WHS management plan was written at tender. The procedures came down from corporate. The SWMS library has grown to forty-odd documents drafted by different supervisors over two years, and the risk register has been reviewed twice since mobilisation. Every document is individually fine. Nobody has read them together, because no one person can hold them all in their head at once.

Why this is a Fable 5 job. Reading the whole set in one pass is the capability that did not exist before a million-token context window. Most SWMS run to a few thousand words, so forty of them plus the plan, the procedures and the register still fit with room to spare. Earlier models forced you to chunk and summarise, and the contradictions live in what summaries throw away: the cross-references between documents.
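
A rough budget check shows why the pack fits. The sketch below assumes about 1.3 tokens per English word, which is a common rule of thumb and not an Anthropic figure, and every word count in it is hypothetical. Real tokenisers vary, and PDF extraction often inflates the count, so leave headroom.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # approximate figure quoted in this post
TOKENS_PER_WORD = 1.3              # assumption: rough rule of thumb for English prose

# Hypothetical word counts for the pack; replace them with your own.
pack_words = {
    "WHS management plan": 25_000,
    "Corporate procedures": 60_000,
    "SWMS library (40 x ~3,000 words)": 40 * 3_000,
    "Risk register": 15_000,
}


def estimated_tokens(words: int) -> int:
    """Convert a word count to an approximate token count."""
    return round(words * TOKENS_PER_WORD)


total_tokens = sum(estimated_tokens(words) for words in pack_words.values())
fits = total_tokens < CONTEXT_WINDOW_TOKENS
print(f"{total_tokens:,} estimated tokens; fits in one pass: {fits}")
```

On these assumed numbers the pack uses under a third of the window, which leaves room for the prompt, the model's output and a few rounds of follow-up questions.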

Preparing the pack. Export everything to text or PDF and strip what doesn't belong: worker names, signatures, medical details. None of it helps the audit, and all of it raises the privacy stakes. Then stage the conversation by telling the model what each document is and where it sits in your organisation's document hierarchy, so it can tell you where each contradiction should be escalated. One thing the hierarchy must never imply is that the higher document simply wins. Where a SWMS conflicts with the procedure above it, the SWMS still governs the task until a competent person reviews and revises it in consultation with the crew. The audit tells you which documents need that review.
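
For the stripping step, a first pass over exported text can be scripted. The sketch below is a minimal example using Python's standard `re` module. It assumes you already hold a roster of names to remove, and the function name, roster and sample text are all hypothetical. It will miss nicknames, misspellings, signatures and anything inside an image, so it supports a manual check rather than replacing one.

```python
import re


def redact_names(text: str, names: list[str], placeholder: str = "[REDACTED]") -> str:
    """Replace each listed name with a placeholder, ignoring case."""
    # Longest names first, so "Sam Nguyen" is replaced before "Sam".
    for name in sorted(names, key=len, reverse=True):
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, placeholder, text, flags=re.IGNORECASE)
    return text


# Hypothetical roster and SWMS extract.
roster = ["Sam Nguyen", "Sam"]
sample = "Spotter: Sam Nguyen. Sam to confirm exclusion zone before slew."
print(redact_names(sample, roster))
```

Note what survives the scrub: the word "Spotter". On a small crew a role can identify a person as surely as a name, which is a judgement the script cannot make for you.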

What to ask for. Contradictions between documents, ranked by what happens if a crew relies on the wrong one. References to plant, equipment, roles or rescue arrangements that appear in one document and nowhere else. Controls that have drifted to the weak end of the hierarchy of controls, with PPE and signage standing in where isolation or engineering looks feasible. And high-risk construction work categories the scope appears to trigger that no SWMS in the pack covers.

You are helping me audit my project's document set for internal
consistency. You are not assessing compliance, and you must not
claim any document is compliant. Treat everything you produce as
a lead for a competent person to verify, not a finding.
 
The pack contains, in my organisation's order of document authority:
1. The WHS management plan
2. Corporate procedures [list them]
3. The SWMS library [list the titles]
4. The current risk register
 
Produce:
1. Every contradiction between documents: quote both passages, name
   both documents, and rank by the consequence if a crew relied on
   the wrong one.
2. Plant, equipment, roles or rescue arrangements referenced in one
   document but missing from the others.
3. Controls sitting at the weak end of the hierarchy (PPE,
   administrative) where a higher-order control looks feasible.
   Frame each as a question for a competent person, not a finding.
4. High-risk construction work categories the scope appears to
   trigger that no SWMS in the pack covers.
5. The assumptions you made, and everything you could not verify
   from the pack alone.
 
Where you are uncertain, say so plainly rather than smoothing it over.

What comes back, and what you do with it. Expect a long list, and expect some of it to be wrong. That's fine, because every item is a lead, not a finding. Walk the site, ask the crew, pull the revision history, and close each item out as confirmed, false or fixed. The output earns its keep by turning an unreadable document system into a finite checklist a competent person can actually work through. The SWMS rules hold the whole way: consultation and sign-off stay with people.
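
One way to keep that close-out honest is to track each lead explicitly. The sketch below is an assumption about how you might structure it, not a prescribed register: the class and field names are hypothetical, and the three closing statuses are the ones named above. The rule it enforces is that a lead cannot be closed without a named person.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open"
    CONFIRMED = "confirmed"
    FALSE = "false"
    FIXED = "fixed"


@dataclass
class Lead:
    description: str
    documents: tuple[str, str]  # the two documents that appear to disagree
    status: Status = Status.OPEN
    verified_by: str = ""       # the competent person who closed the lead out


def close_out(lead: Lead, status: Status, verified_by: str) -> None:
    """Close a lead, refusing to do so without a named verifier."""
    if status is Status.OPEN:
        raise ValueError("close_out needs a closing status")
    if not verified_by.strip():
        raise ValueError("a lead can only be closed by a named person")
    lead.status = status
    lead.verified_by = verified_by


def outstanding(leads: list[Lead]) -> list[Lead]:
    """Return the leads nobody has verified yet."""
    return [lead for lead in leads if lead.status is Status.OPEN]
```

A spreadsheet with the same four columns does the job just as well. The point is the constraint, not the tooling.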

Worked example 2: the challenge pass on an investigation

The scenario. An ICAM investigation into a mobile plant near miss: an excavator slewed through an exclusion zone a spotter had stepped out of minutes earlier. You've built the timeline, taken statements, photographed the scene, pulled the SWMS and the prestart records, and drafted your contributing factors. The analysis is yours and it's finished. Now you want it attacked before the review panel, or a regulator, does it for you.

The gate before anything is uploaded. An investigation pack is the most sensitive document set a safety team holds: personal information, health information, and sometimes legal professional privilege. De-identify the statements and drop the medical detail, knowing de-identification is imperfect on a single incident, because a role like the spotter still identifies a person in a small crew. Confirm your enterprise AI controls before a single page goes in, and remember the 30-day retention rule on Mythos-class traffic. Above all: if privilege has been claimed, putting the material into an external AI service is a disclosure that can waive it. That's why legal owns the call, and why it comes before any workflow decision.

Why after, never instead. Causation is the line. Ask a model for causes and it will produce a plausible, generic causal story, and that story will anchor your thinking before the evidence does. That's exactly where AI quietly wrecks an ICAM investigation. So the sequence is non-negotiable: your timeline, your factors, your reasoning first. The model's job is to find the weak points in a finished analysis while there's still time to fix them.

What to ask for. Where the evidence doesn't support a draft factor. Alternative readings of the same evidence, framed as questions to test rather than conclusions to adopt. Organisational factors that are absent from the draft, since first-pass ICAM charts skew towards individual actions and go light on planning, supervision and change management. And the questions you didn't ask: the gaps a fresh set of eyes would circle in red.

You are challenging my completed draft incident analysis. The causal
analysis is mine, not yours. Do not propose your own causal narrative.
 
The pack contains a de-identified evidence file (timeline, witness
statement summaries, photo descriptions, procedure extracts, the
SWMS, prestart records) and my draft ICAM contributing factors.
 
Working only from the pack:
1. For each draft contributing factor, state whether the evidence
   supports it, contradicts it, or is silent, and quote the evidence.
2. For each factor, offer the strongest alternative interpretation
   of the evidence behind it, framed as a question for me to test.
   Do not assemble these into an alternative account of the incident.
3. List plausible organisational factors (planning, supervision,
   procedures, change management) absent from my draft, each tied
   to something specific in the evidence.
4. List the questions I appear not to have asked, and the evidence
   not yet collected, that would strengthen or kill each factor.
 
Be blunt. A weakness found now costs nothing. The same weakness
found by a review panel costs the investigation.

What stays yours. Every causal determination, every factual finding, every classification on the ICAM chart, every recommendation. The model never interviews a witness and never decides a cause. It pressure-tests an analysis a person already owns. That's what separates a sharper investigation from a fabricated one.

And a third, once you trust the pattern. The long analytics build. Five years of messy incident exports, inconsistent categories, no leading-indicator view, run as one supervised, multi-stage build. The long-horizon behaviour is what this tier is for, and the checkpoints you review at each stage are how you keep hold of it.
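
The first checkpoint in that build is usually category clean-up, and it is a good place to see what a review gate looks like in code. This is a minimal sketch under stated assumptions: the mapping, the category names and the sample export are all hypothetical, and your own taxonomy will differ. The design choice that matters is that unmapped labels are set aside for a person, never guessed.

```python
from typing import Optional

# Hypothetical mapping from raw export labels to one agreed category set.
CATEGORY_MAP = {
    "slip/trip": "Slips, trips and falls",
    "slips trips falls": "Slips, trips and falls",
    "stf": "Slips, trips and falls",
    "mobile plant": "Mobile plant",
    "plant - mobile": "Mobile plant",
}


def normalise(raw: str) -> Optional[str]:
    """Return the agreed category, or None when the label is not in the map."""
    key = " ".join(raw.lower().split())  # lower-case and collapse whitespace
    return CATEGORY_MAP.get(key)


def split_for_review(raw_labels: list[str]) -> tuple[list[str], list[str]]:
    """Separate labels that map cleanly from those a person must classify."""
    cleaned, needs_review = [], []
    for raw in raw_labels:
        category = normalise(raw)
        if category is None:
            needs_review.append(raw)
        else:
            cleaned.append(category)
    return cleaned, needs_review


# Hypothetical slice of an incident export.
export = ["Slip/Trip", "  STF ", "Plant - Mobile", "Other (see notes)"]
cleaned, needs_review = split_for_review(export)
print(cleaned)
print(needs_review)
```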

The ceiling will move again. Anthropic shipped two Mythos-class generations in two months, and nobody else is standing still. That's why the four-question test in my field guide outlasts any model review. What would a confident wrong answer cost? Who verifies the output? Can it be checked against a primary source? And is a person still making the decision that carries the duty? Fable 5 changes the economics of safety work, in places dramatically. The ceiling moved on 9 June, and the floor is still yours to hold. If you're working out where this tier fits your safety system, reach out…

Frequently asked questions

What is Claude Fable 5?
Fable 5 is Anthropic's most capable AI model, released on 9 June 2026. It is the first Mythos-class model made generally available, a tier above Opus, with a context window of around one million tokens, output up to 128,000 tokens, and API pricing of US$10 per million input tokens and US$50 per million output tokens (roughly A$14 and A$71).

Is Fable 5 safe to use for WHS work?
Used around the decision, yes: drafting, cross-checking documents, analysing data and challenging your thinking. Used on the decision, no. The duty of care under the model WHS Act is non-delegable, consultation cannot be simulated, and a competent person must still review and own anything the model touches.

Is Fable 5 worth the price for a safety team?
For long, complex, multi-document work, often yes. It costs double Opus 4.8 and more than triple Sonnet 4.6, so save it for whole-of-system reviews, investigation files and analytics builds, and run routine drafting on cheaper tiers. Most practitioners will first meet it inside a paid Claude plan rather than through API pricing.

Does Fable 5 still hallucinate?
Assume yes. Anthropic has published no hallucination or accuracy figures for Fable 5, and a May 2026 arXiv study by Churilov still measured hallucinated software package rates of 4.62% to 6.10% across frontier coding models. Verify every citation, clause and number against the source instrument before it goes near a safety decision.

What is a Mythos-class model?
Mythos is Anthropic's model tier above Opus. The first Mythos-class model went to a small group of partners as Claude Mythos Preview in April 2026 under Project Glasswing. Fable 5 is the second generation, hardened for general use, while Claude Mythos 5, the same model with fewer restrictions, stays limited to vetted partners and researchers.

Should I upload incident files to Fable 5?
Not without a deliberate decision. Investigation files carry personal and health information and sometimes legal professional privilege, and Mythos-class traffic carries a mandatory 30-day data retention period. Check your enterprise controls, your privacy obligations and your own policy before anything sensitive goes in.
