A pelican for GPT-5.5 via the semi-official Codex backdoor API

Simon Willison's Weblog

AI digest · technology · importance 3/5

GPT-5.5 has shipped without an official API, but the Codex backdoor already works. Pricing will be double GPT-5.4's, with Pro output running $180 per million tokens.

  • The Codex backdoor endpoint has OpenAI's official blessing: a three-step install lets you call GPT-5.5 directly through your existing subscription
  • xhigh reasoning mode consumed 9,322 reasoning tokens, roughly 239× the default's 39; generation took nearly four minutes but the quality was clearly better
  • GPT-5.5 will cost twice as much as GPT-5.4 ($5/$30 per million tokens); Pro output at $180/M is 12× the 5.4 rate

GPT-5.5's official API isn't live yet, but the backdoor endpoint behind a Codex subscription can already call it directly. Cranking reasoning to the maximum setting to generate a pelican SVG consumed roughly 239× the reasoning tokens of the default mode (9,322 vs 39), and the output quality was worlds apart. Simon Willison had Claude Code reverse-engineer the open-source openai/codex repo, opened the whole channel with three commands, and documented the process in this blog post.

GPT-5.5 is live, but the official API is missing

GPT-5.5 officially launched on April 23, 2026. It is currently available through OpenAI Codex (the company's AI coding assistant) and is rolling out to paid ChatGPT subscribers. Willison had preview access and called it "fast, effective and highly capable": ask it to build something and it builds exactly what you asked for.

But there is one notable omission: no official API yet. OpenAI's explanation is that API deployments require different safeguards, that it is working closely with partners, and that GPT-5.5 and GPT-5.5 Pro will reach the API "very soon." For developers who run benchmarks over the API, this is an obstacle: Willison prefers to run his pelican benchmark (asking a model to generate an SVG of a pelican riding a bicycle to gauge its ability) through an API, to keep ChatGPT's hidden system prompts from affecting the results.

The "semi-official backdoor" born of the OpenClaw controversy

Some background helps explain where this channel came from. For the past few months, a running tension in the AI world has been whether third-party agent harnesses like OpenClaw and Pi may bypass API billing and hook directly into OpenAI's and Anthropic's monthly subscriptions.

OpenClaw integrated directly with the subscription mechanism; Anthropic promptly blocked it, kicking off a controversy. OpenAI, which happened to have recently hired OpenClaw creator Peter Steinberger, seized the moment to announce that OpenClaw was welcome to keep integrating, using the very same mechanism as the Codex CLI. OpenAI's Romain Huet was explicit: "We want people to be able to use Codex, and their ChatGPT subscription, wherever they like: in the app, in the terminal, but also in JetBrains, Xcode, OpenCode, Pi, and now Claude Code." Steinberger replied directly: the OpenAI subscription is officially supported.

The key endpoint is /backend-api/codex/responses, and the Codex CLI itself is open source, so anyone can inspect the authentication mechanism and build their own integration. The door was never closed; it just wasn't advertised.
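To make that concrete, here is a hypothetical sketch of what a direct call to that endpoint might look like. The base URL, the token location, and the request-body shape are all assumptions for illustration, not documented behavior; the actual wire format is whatever the open-source Codex CLI implements, which is why Willison had Claude Code read that repo rather than guess.

```shell
# HYPOTHETICAL SKETCH ONLY. The host, the token file path, and the JSON body
# shape below are assumptions for illustration; consult the open-source
# openai/codex repo for the real authentication flow and request format.
TOKEN="$(jq -r '.tokens.access_token' ~/.codex/auth.json)"  # assumed token location

curl -s "https://chatgpt.com/backend-api/codex/responses" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5.5", "input": "Say hi"}'
```

The point is not this exact invocation but that nothing here is secret: the endpoint is known and the client that authenticates against it is public.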

llm-openai-via-codex: a three-step install after reverse engineering

Willison decided to act. He had Claude Code reverse-engineer the open-source openai/codex repo to work out how authentication tokens are stored, then built llm-openai-via-codex, a new plugin for LLM, the command-line tool for calling large language models that he maintains.

Installation takes three steps: install the Codex CLI, buy an OpenAI subscription, and log in; run uv tool install llm; then run llm install llm-openai-via-codex. After that, llm -m openai-codex/gpt-5.5 'your prompt' just works.

All of LLM's existing features keep working: attach an image with -a filepath.jpg, start an ongoing conversation with llm chat, review logged conversations with llm logs, and enable tool calls with --tool. The integration itself wasn't much engineering; the real complexity was in authentication, and Codex being open source is what made that tractable.
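Collected in one place, the setup and usage commands described above (as given in the post; the image filename is just a placeholder):

```shell
# One-time setup: requires a paid OpenAI plan and a Codex CLI login first
uv tool install llm
llm install llm-openai-via-codex

# Basic prompt through the Codex subscription
llm -m openai-codex/gpt-5.5 'Your prompt goes here'

# Existing LLM features keep working
llm -m openai-codex/gpt-5.5 -a photo.jpg 'Describe this image'  # attach an image
llm chat -m openai-codex/gpt-5.5                                # ongoing conversation
llm logs                                                        # review logged conversations
```

These are the same commands listed in the original post; only the prompt text and attachment filename are illustrative.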

xhigh mode's 9,322 tokens vs the default's 39: pelican SVGs compared

With the channel open, Willison ran his pelican benchmark: ask the model to generate an SVG of a pelican riding a bicycle. He judged the default-mode result worse than GPT-5.4's, so he reran with -o reasoning_effort xhigh.

The price of high reasoning is time: generation took nearly four minutes. But the result was visibly better. The xhigh version leans heavily on CSS, with elaborate gradients, taking a completely different technical approach from the much sparer default. The reasoning-token counts confirm the gap: xhigh used 9,322 reasoning tokens to the default's 39, roughly 239×. That number also explains why the two outputs diverge so sharply: xhigh is literally thinking more.
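The two runs from the post, side by side; the only difference is the reasoning-effort option:

```shell
# Default reasoning effort (used just 39 reasoning tokens in the post)
llm -m openai-codex/gpt-5.5 'Generate an SVG of a pelican riding a bicycle'

# Maximum reasoning effort (~4 minutes and 9,322 reasoning tokens in the post)
llm -m openai-codex/gpt-5.5 -o reasoning_effort xhigh \
  'Generate an SVG of a pelican riding a bicycle'
```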

Pelican SVG: xhigh vs default reasoning-token consumption

GPT-5.5 pricing: double GPT-5.4, with Pro output at 12×

Once GPT-5.5 reaches the API, it will cost twice as much as GPT-5.4: $5 per million input tokens and $30 per million output tokens, versus GPT-5.4's $2.50 and $15. GPT-5.5 Pro is pricier still: $30 per million input tokens and $180 per million output tokens, an output rate a full 12× GPT-5.4's.

Willison's read: GPT-5.4 relates to GPT-5.5 roughly as Claude Sonnet relates to Claude Opus, the former at half the price, the latter more capable, and GPT-5.4 isn't going away. Researcher Ethan Mollick's detailed evaluation concluded that "the jagged frontier still holds": GPT-5.5 is excellent at some tasks, still struggles at others, and the unevenness is hard to predict.

Bottom line: until the GPT-5.5 API opens, the Codex backdoor works today; but with Pro output at $180 per million tokens, any use at scale needs a careful cost estimate.

GPT-5.4 / 5.5 / 5.5 Pro pricing comparison

Model          Input ($/M tokens)   Output ($/M tokens)
GPT-5.4        $2.50                $15
GPT-5.5        $5                   $30
GPT-5.5 Pro    $30                  $180
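As a quick sanity check on what those rates mean in practice, here is the cost of the xhigh pelican run's 9,322 reasoning tokens at each output rate (OpenAI typically bills reasoning tokens at the output rate; the pricing above is announced, not yet live):

```shell
# Cost of 9,322 output-billed tokens at each announced output rate ($/million)
awk 'BEGIN {
  tokens = 9322
  printf "GPT-5.5     output: $%.4f\n", tokens / 1e6 * 30
  printf "GPT-5.5 Pro output: $%.4f\n", tokens / 1e6 * 180
}'
```

So a single maxed-out pelican is well under $2 even at Pro rates; the concern is runs at scale, not one-off experiments.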

Original post

GPT-5.5 is out. It's available in OpenAI Codex and is rolling out to paid ChatGPT subscribers. I've had some preview access and found it to be a fast, effective and highly capable model. As is usually the case these days, it's hard to put into words what's good about it - I ask it to build things and it builds exactly what I ask for!

There's one notable omission from today's release - the API:

  API deployments require different safeguards and we are working closely with partners and customers on the safety and security requirements for serving it at scale. We'll bring GPT‑5.5 and GPT‑5.5 Pro to the API very soon.

When I run my pelican benchmark I always prefer to use an API, to avoid hidden system prompts in ChatGPT or other agent harnesses from impacting the results.

The OpenClaw backdoor

One of the ongoing tension points in the AI world over the past few months has concerned how agent harnesses like OpenClaw and Pi interact with the APIs provided by the big providers. Both OpenAI and Anthropic offer popular monthly subscriptions which provide access to their models at a significant discount to their raw API.

OpenClaw integrated directly with this mechanism, and was then blocked from doing so by Anthropic. This kicked off a whole thing.

OpenAI - who recently hired OpenClaw creator Peter Steinberger - saw an opportunity for an easy karma win and announced that OpenClaw was welcome to continue integrating with OpenAI's subscriptions via the same mechanism used by their (open source) Codex CLI tool.

Does this mean anyone can write code that integrates with OpenAI's Codex-specific APIs to hook into those existing subscriptions? The other day Jeremy Howard asked:

  Anyone know whether OpenAI officially supports the use of the /backend-api/codex/responses endpoint that Pi and Opencode (IIUC) uses?

It turned out that on March 30th OpenAI's Romain Huet had tweeted:

  We want people to be able to use Codex, and their ChatGPT subscription, wherever they like! That means in the app, in the terminal, but also in JetBrains, Xcode, OpenCode, Pi, and now Claude Code. That's why Codex CLI and Codex app server are open source too! 🙂

And Peter Steinberger replied to Jeremy that:

  OpenAI sub is officially supported.

llm-openai-via-codex

So... I had Claude Code reverse-engineer the openai/codex repo, figure out how authentication tokens were stored and build me llm-openai-via-codex, a new plugin for LLM which picks up your existing Codex subscription and uses it to run prompts!

(With hindsight I wish I'd used GPT-5.4 or the GPT-5.5 preview, it would have been funnier. I genuinely considered rewriting the project from scratch using Codex and GPT-5.5 for the sake of the joke, but decided not to spend any more time on this!)

Here's how to use it:

  1. Install Codex CLI, buy an OpenAI plan, login to Codex
  2. Install LLM: uv tool install llm
  3. Install the new plugin: llm install llm-openai-via-codex
  4. Start prompting: llm -m openai-codex/gpt-5.5 'Your prompt goes here'

All existing LLM features should also work - use -a filepath.jpg/URL to attach an image, llm chat -m openai-codex/gpt-5.5 to start an ongoing chat, llm logs to view logged conversations and llm --tool ... to try it out with tool support.

And some pelicans

Let's generate a pelican!

  llm install llm-openai-via-codex
  llm -m openai-codex/gpt-5.5 'Generate an SVG of a pelican riding a bicycle'

Here's what I got back: I've seen better from GPT-5.4, so I tagged on -o reasoning_effort xhigh and tried again: That one took almost four minutes to generate, but I think it's a much better effort. If you compare the SVG code (default, xhigh) the xhigh one took a very different approach, which is much more CSS-heavy - as demonstrated by those gradients. xhigh used 9,322 reasoning tokens where the default used just 39.

Tags: ai, openai, generative-ai, chatgpt, llms, llm, pelican-riding-a-bicycle, llm-reasoning, llm-release, codex-cli