Claude Fable 5 lasted three days. Launched June 9, suspended June 12 under a US export control order, and as of July 1 it is back worldwide, on Claude.ai, Claude Code and the API. So the obvious question returns with it: what does the most capable model on the market actually cost to run? The sticker says $10 in and $50 out per million tokens, and the sticker is misleading. Fable 5 thinks on every request, and the effort dial (low, medium, high, xhigh, max) changes how many tokens it burns to answer you. Same task, same rates, and the bill moves by 7x between settings. Nobody prices that out level by level, so we did. Here is the story of the ban, the math of the dial, and which effort each kind of work deserves.
The short answer
Fable 5 is back worldwide after the US lifted its export controls, with a new safety classifier as the price of return. The per-token rate never changes: what changes is how many tokens each effort level burns. Same agentic turn, roughly $0.45 on low, $1.20 on high, $3.20 on xhigh. Default to high, drop to low for the mechanical stuff, and save xhigh for work that measurably needs it.
Three days on the market, nineteen in a vault
The timeline reads like fiction. Anthropic shipped Claude Fable 5 on June 9, its first Mythos-class model for the general public, state of the art on coding, knowledge work and computer use. On June 12 the US government applied export controls and global access went dark. The trigger: Amazon researchers had found a prompting technique that walked past the model’s safeguards and got it hunting software vulnerabilities, in at least one case producing code that demonstrated a working exploit. For a model this capable, that report was enough.
The controls were lifted on June 30 and Fable 5 came back worldwide on July 1, on Claude.ai, Claude Code, Claude Cowork and the API, with cloud platforms following. The return came with conditions worth knowing about: deeper collaboration with the US government, including pre-release access, and a new safety classifier that blocks the reported technique in over 99 percent of cases. The honest footnote is that the classifier also raises false positives on routine security and coding work, so if a benign request gets declined, that is why. The API ships an opt-in fallback that re-serves declined requests on Opus 4.8 inside the same call, which is worth turning on if you automate anything.
Availability on the plans is time-boxed generosity: Pro, Max, Team and select Enterprise get Fable 5 included for up to half their weekly usage limits through July 7, after which it moves to usage credits. The API is straight per-token pricing. Which brings us to the actual subject.
The rate is fixed. The bill is not.
Fable 5 costs $10 per million input tokens and $50 per million output, double
Opus 4.8’s $5 and $25. Most coverage stops there, and stopping there is how
people end up shocked by an invoice, because Fable 5 has a property that makes
the rate card almost decorative: thinking is always on. You cannot turn it
off. Every request reasons before it answers, the depth of that reasoning is
steered by one parameter, effort, and every thinking token bills as output
even when you never see it.
Effort takes five values: low, medium, high (the default), xhigh and
max, with xhigh reserved for the top models (Fable 5, Mythos 5, Opus 4.8 and
4.7). It is not a price multiplier. It is a behavior multiplier. At low effort
the model skips preamble, combines tool calls and answers tersely. At xhigh it
plans out loud, explores, double-checks, and spends tokens like a researcher on
a deadline. Rates fixed, volume anything but.
The same task, priced at every effort
Numbers make it concrete. Take a typical agentic turn with 20,000 tokens of input, the kind Claude Code fires constantly, and use the per-effort output volumes documented from real usage: about 5,000 output tokens on low, 20,000 on high, 60,000 on xhigh. Price them at $10 and $50 per million and you get, per turn:
The input costs $0.20 every time. On low, 5,000 output tokens add $0.25: about $0.45 a turn. On high, 20,000 add $1.00: about $1.20. On xhigh, 60,000 add $3.00: about $3.20. Medium lands between low and high, and max has no published ceiling at all, by design: Anthropic describes it as no constraints on token spending. Run a hundred-turn agent session and the gap stops being pocket change: about $45 on low, $120 on high, $320 on xhigh, for the same hundred turns.
Two levers cut all of these regardless of effort. Prompt caching takes up to 90 percent off repeated input, which in agent loops is most of it, and batch mode halves anything that can wait. But nothing you do to the input side changes the core fact: on a model that always thinks, the effort dial is the price tag, and it is set per request, by you.
Price it like a project, not a turn
Per-turn numbers stay abstract until you multiply them by a real job. So take four familiar projects and price them end to end, same assumptions as above: about 20,000 tokens of input per turn, the published per-effort output volumes. The turn counts are honest estimates from real Claude Code sessions; yours will vary, the ratios will not.
Writing a 1,500-word article, five turns of draft and revisions: about $2 on low, $6 on high, $16 on xhigh. Building a landing page, forty turns of scaffold, styling and fixes: $18, $48, $128. A small app, say a subnet calculator with a web UI, eighty turns: $36, $96, $256. A heavy build or migration, hundreds of files, an agent running overnight, three hundred turns and realistically the reason you rented Fable 5 at all: $360 at high, and pushing $960 at xhigh.
And here is the pick, per project, so nobody has to guess. The article: low. Fable 5’s low effort already writes at the level of yesterday’s flagships, and prose does not need sixty thousand tokens of deliberation. The landing page and the small app: high, the default, which is exactly what agentic coding was tuned for. The heavy build or migration: xhigh, the one job whose failure costs more than its tokens. Which makes the realistic bills $2, $48, $96 and $960, not four times the worst case.
Two things jump out of that table. First, the dial matters more than the project: an article at xhigh costs almost what a landing page costs at low. Second, at these volumes prompt caching stops being a nice-to-have, because in a real session most of that per-turn input is cache reads at a tenth of the rate. The output side, the thinking, is the part no cache can save you from.
Which effort for which work
Anthropic publishes recommendations, and they match what the math suggests.
Low is for the mechanical layer: subagents doing one bounded thing, classification, extraction, quick lookups, latency-sensitive paths where a fast terse answer beats a considered one. A striking detail from the docs: Fable 5’s lower efforts often outperform the xhigh of previous-generation models, so low here is not the discount bin. It is yesterday’s flagship at a sixth of the spend.
Medium fits everyday drafting, summaries and Q&A, work where you want some deliberation but the problem is not deep. High, the default, is the right home for daily coding and standard agent runs, and if you never touch the dial this is what you are already paying. xhigh is for capability-sensitive, long-horizon work: the multi-day migration, the refactor that has to hold a whole system in its head, deep research where a missed connection costs more than the tokens. And max is for frontier problems only. Anthropic’s own framing is that it adds significant cost for relatively small gains, which is a vendor telling you not to buy its most expensive setting unless you must.
The habit that costs the most
If one thing on this page saves you money, let it be this: the expensive
mistake is not picking xhigh for a hard problem. It is leaving xhigh on for
everything because it worked once. Blanket maximum effort turns a $45 agent
session into a $320 one and buys you longer preambles on tasks a terse answer
would have served. The discipline that works is boring: default to high, drop
to low wherever the task is mechanical, and promote a task to xhigh only when
a run at high measurably fell short. Check the usage field in the response,
thinking_tokens sits right there, and treat any task that did not need its
thinking budget as a candidate for demotion.
Fable 5 coming back is genuinely good news: it is the most capable model generally available, and for the hardest work nothing else currently touches it. Just remember that its real price is not on the rate card. It is on a dial, the dial defaults to sensible, and every notch above sensible should be a decision, not a habit. For whether this much model is worth five times the new mid-tier, our Sonnet 5 vs Fable 5 head-to-head prices that exact fork, and Sonnet 5 vs Opus 4.8 covers the middle of the ladder.
Sources: Anthropic’s Fable 5 redeployment announcement and Claude Platform docs; export-control timeline via VentureBeat and Forbes; per-effort token volumes collated by Developers Digest, July 2026. Dollar figures on this page are computed from those volumes at the published $10/$50 rates; your tasks will vary.
Frequently asked questions
Why was Claude Fable 5 suspended?
A US export control order landed on June 12, three days after launch, after Amazon researchers reported a prompting technique that bypassed its safeguards and got it to find software vulnerabilities, in one case producing working exploit code. Anthropic suspended global access, added a classifier that blocks the reported technique in over 99 percent of cases, and the controls were lifted on June 30.
How much does Claude Fable 5 cost?
The API rate is $10 per million input tokens and $50 per million output, double Opus 4.8. But the real bill depends on the effort setting, because thinking tokens are billed as output even when hidden: the same agentic turn that costs about $0.45 on low effort runs about $1.20 on high and about $3.20 on xhigh. On Claude plans, Fable 5 counts against usage limits, included up to 50 percent of weekly limits through July 7, then via usage credits.
What are the effort levels on Claude Fable 5?
Five: low, medium, high, xhigh and max. High is the default. xhigh exists only on Fable 5, Mythos 5, Opus 4.8 and Opus 4.7. Effort is not a price multiplier; it steers how many tokens the model spends thinking, calling tools and explaining, which is what actually moves the bill.
Which effort level should I use?
Anthropic's own guidance: start at high, the default, and reserve xhigh for capability-sensitive, long-horizon work. Use low for subagents, classification and quick lookups. Fable 5's lower efforts often beat the xhigh of previous models, so resist running everything at the ceiling out of habit. Max is for frontier problems and adds a lot of cost for small gains.
Does Claude Fable 5 always think?
Yes. Thinking is always on and adaptive; you cannot disable it, and explicit thinking budgets are rejected. You control depth through the effort parameter instead, and the thinking tokens bill as output even when the summary is hidden. That is exactly why effort, not the rate card, is the real cost lever.