General capability

Benchmark	Mythos 5	Fable 5	Mythos Preview	Opus 4.8	GPT-5.5
SWE-bench Pro	80.3	80.0	77.8	69.2	58.6
SWE-bench Verified	95.5	95.0	93.9	88.6	NA
HLE, no tools	59.0	*	56.8	49.8	41.4
HLE, with tools	64.5	*	64.7	57.9	52.2
BrowseComp	88.0 (single-agent); 93.3 (multi-agent)	*	87.9	84.3 (single-agent); 88.5 (multi-agent)	84.4
OSWorld-Verified	85.0	85.0	85.4	83.4	78.7
Terminal-Bench 2.1	88.0	84.3	NA	82.7	83.4 (with Codex CLI)

Sources: Anthropic’s Claude Fable 5 and Mythos 5 system card for the Mythos 5, Fable 5, and updated Opus 4.8 figures; Anthropic’s Claude Mythos Preview system card for the Mythos Preview figures; OpenAI’s GPT-5.5 launch materials for the GPT-5.5 figures. Fable 5 and Mythos 5 share the same underlying model weights. An asterisk in the Fable 5 column means that Fable 5’s score effectively matches Mythos 5’s. The exception is when a safety classifier fires and Fable 5 falls back to Opus 4.8 mid-trajectory, an effect visible on Terminal-Bench 2.1 (84.3 for Fable 5 vs. 88.0 for Mythos 5). See the methodology note for harness changes affecting the Opus 4.8 and OSWorld comparisons.
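As a rough illustration of the Mythos 5 vs. Fable 5 comparison, the sketch below computes the score gap on the rows where both models have a numeric entry in the table. The scores are copied from the table; the names `SCORES` and `score_gaps` are hypothetical and exist only for this example, and the asterisked rows are left out because no separate Fable 5 number is given for them.

```python
# Scores copied from the table above. Only rows where both Mythos 5 and
# Fable 5 have a numeric entry are included (asterisked rows are omitted).
SCORES = {
    "SWE-bench Pro": {"Mythos 5": 80.3, "Fable 5": 80.0},
    "SWE-bench Verified": {"Mythos 5": 95.5, "Fable 5": 95.0},
    "OSWorld-Verified": {"Mythos 5": 85.0, "Fable 5": 85.0},
    "Terminal-Bench 2.1": {"Mythos 5": 88.0, "Fable 5": 84.3},
}


def score_gaps(scores, a="Mythos 5", b="Fable 5"):
    """Return {benchmark: score of a minus score of b}, rounded to one decimal."""
    return {name: round(row[a] - row[b], 1) for name, row in scores.items()}


gaps = score_gaps(SCORES)
largest = max(gaps, key=gaps.get)  # benchmark with the widest gap

for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f} points")
print("Largest gap:", largest)
```

Under these inputs the widest gap falls on Terminal-Bench 2.1 (3.7 points), the row the note singles out; the other three gaps are 0.5 points or less.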