---
title: Migrating the Vision Script from OpenAI to Claude
description: 'How scripts/vision.ts was rewritten to use the Anthropic SDK with claude-opus-4-6 and tool use instead of OpenAI''s json_schema response format.'
pubDate: '2026-03-19T12:00:00+01:00'
category: en/development
tags:
  - astro
  - photography
  - openai
seriesParent: obsidian-to-vps-pipeline-with-sync-pull-and-redeploy
seriesOrder: 3
---

The script that generates photo sidecar files — `scripts/vision.ts` — was originally written against the OpenAI API. Moving it to Claude was mostly a rewrite of the request envelope; the surrounding infrastructure — EXIF extraction, concurrency control, batching, CLI flags — stayed exactly the same.

## What the script does

`scripts/vision.ts` processes new JPG files in the photo albums directory. For each image without a `.json` sidecar, it:

1. Extracts EXIF metadata with `exiftool` (camera, lens, aperture, ISO, focal length, shutter speed, GPS)
2. Sends the image to an AI vision API to generate alt text, title suggestions, and tags
3. Merges both into a JSON file written next to the image

The resulting sidecar drives the photo stream on this site — alt text for accessibility, titles for the detail page, EXIF for the metadata panel.

## Dependency and environment

```bash
# before
pnpm add openai

# after
pnpm add @anthropic-ai/sdk
```

```bash
# before
OPENAI_API_KEY=sk-...

# after
ANTHROPIC_API_KEY=sk-ant-...
```

## Client

```ts
// before
import OpenAI from "openai";
const openai = new OpenAI();

// after
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({ maxRetries: 0 });
```

`maxRetries: 0` disables the SDK's built-in retry behaviour. The script manages its own retry loop with configurable backoff, so double-retrying would be redundant.

## Structured output

**Problem:** The script needs a guaranteed-shape JSON response — title ideas, description, tags — with no parsing surprises. OpenAI and Claude approach this differently.
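Concretely, the shape the script expects back can be written as a TypeScript type. The field names mirror the tool schema used in the migration; the type name and the sample value are illustrative, not taken from `vision.ts`:

```typescript
// Shape of the AI-generated half of a sidecar file.
// Field names follow the tool's input_schema; the type name is illustrative.
interface VisionResult {
  title_ideas: string[]; // candidate titles for the detail page
  description: string;   // used as alt text for accessibility
  tags: string[];        // free-form tags for the photo stream
}

// Example of a value the constrained response is expected to match:
const example: VisionResult = {
  title_ideas: ["Evening light on the harbour"],
  description: "A small fishing boat at dusk, warm light on calm water.",
  tags: ["harbour", "dusk", "boat"],
};
```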
**Implementation (before):** OpenAI used a `json_schema` response format to constrain the model output:

```ts
const completion = await openai.chat.completions.create({
  model: "gpt-5.1-chat-latest",
  max_completion_tokens: 2048,
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "visionResponse",
      strict: true,
      schema: { ... },
    },
  },
  messages: [{ role: "user", content: [...] }],
});

const result = JSON.parse(completion.choices[0].message.content);
```

**Implementation (after):** Claude uses tool use with `tool_choice: { type: "tool" }` to force a specific tool call — the model always responds with arguments matching the tool's schema, so there is no JSON parsing step:

```ts
const response = await anthropic.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 2048,
  tools: [{
    name: "vision_response",
    description: "Return the vision analysis of the image.",
    input_schema: {
      type: "object",
      additionalProperties: false,
      properties: {
        title_ideas: { type: "array", items: { type: "string" } },
        description: { type: "string" },
        tags: { type: "array", items: { type: "string" } },
      },
      required: ["title_ideas", "description", "tags"],
    },
  }],
  tool_choice: { type: "tool", name: "vision_response" },
  messages: [{
    role: "user",
    content: [
      {
        type: "image",
        source: { type: "base64", media_type: "image/jpeg", data: encodedImage },
      },
      { type: "text", text: prompt },
    ],
  }],
});

const toolUseBlock = response.content.find((b) => b.type === "tool_use");
const result = toolUseBlock.input; // already a structured object, no JSON.parse needed
```

**Solution:** `toolUseBlock.input` arrives as a structured object — no `JSON.parse`, no schema re-validation. The image content block format also differs: OpenAI uses `{ type: "image_url", image_url: { url: "data:image/jpeg;base64,..." } }`, while Anthropic uses a dedicated `source` block with `type: "base64"` and a separate `media_type` field.

## Rate limit handling

**Problem:** The script catches rate limit errors and retries with exponential backoff.
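The loop itself didn't change in the migration, so it isn't shown in the diff. A minimal sketch of the idea — the name `withBackoff` and the default attempt count and delays are assumptions, not the actual `vision.ts` values:

```typescript
// Sketch of a generic retry wrapper with exponential backoff.
// The isRetryable / retryAfterMs hooks correspond to the script's
// isRateLimitError / extractRetryAfterMs helpers.
async function withBackoff<T>(
  fn: () => Promise<T>,
  opts: {
    isRetryable: (e: unknown) => boolean;
    retryAfterMs?: (e: unknown) => number | null;
    maxAttempts?: number;
    baseDelayMs?: number;
  },
): Promise<T> {
  const {
    isRetryable,
    retryAfterMs = () => null,
    maxAttempts = 5,
    baseDelayMs = 1000,
  } = opts;

  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up on non-retryable errors or once attempts are exhausted.
      if (!isRetryable(error) || attempt >= maxAttempts - 1) throw error;
      // Prefer the server's retry-after hint; fall back to exponential backoff.
      const delayMs = retryAfterMs(error) ?? baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A call site would then wrap the API request, e.g. `withBackoff(() => anthropic.messages.create(...), { isRetryable: isRateLimitError, retryAfterMs: extractRetryAfterMs })`.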
The old code detected 429s by poking at a raw status property and regex-parsing the error message for a retry-after hint — fragile in both directions.

**Implementation:** Switch to the SDK's typed exception class:

```ts
// before — checking a raw status property
function isRateLimitError(error: unknown): boolean {
  return (error as { status?: number }).status === 429;
}

function extractRetryAfterMs(error: unknown): number | null {
  // parsed "Please try again in Xs" from error message
}

// after — using Anthropic's typed exception
function isRateLimitError(error: unknown): boolean {
  return error instanceof Anthropic.RateLimitError;
}

function extractRetryAfterMs(error: unknown): number | null {
  if (!(error instanceof Anthropic.RateLimitError)) return null;
  const retryAfter = error.headers?.["retry-after"];
  if (retryAfter) {
    const seconds = Number.parseFloat(retryAfter);
    if (Number.isFinite(seconds) && seconds > 0) return Math.ceil(seconds * 1000);
  }
  return null;
}
```

**Solution:** `instanceof Anthropic.RateLimitError` is the detector, and the `retry-after` header is read off the exception directly. Everything else — EXIF extraction, concurrency control, batching, file writing, CLI flags — stayed exactly the same.

## Why claude-opus-4-6

`claude-opus-4-6` is Anthropic's most capable model and handles dense visual scenes, low-light photography, and culturally specific subjects well. For a batch script that runs offline before a deploy, quality matters more than latency.

## What to take away

- Claude's tool use with `tool_choice: { type: "tool" }` is a cleaner way to get structured output than OpenAI's `json_schema` — the result comes back as a structured object with no `JSON.parse` step.
- Image payloads differ: OpenAI takes a data URL in `image_url`; Anthropic wants a `source` block with an explicit `media_type`.
- Use the SDK's typed exception (`Anthropic.RateLimitError`) and read `retry-after` from its headers — don't regex the error message.
- Set `maxRetries: 0` on the SDK if you already have your own backoff loop. Double-retrying wastes tokens and quota.
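As a closing sketch of the image-payload point: building the Anthropic-style content block from raw JPG bytes might look like the following. The helper name and the file path are illustrative, not part of `vision.ts`:

```typescript
import { readFileSync } from "node:fs";

// Hypothetical helper: turn raw JPG bytes into the Anthropic image
// content block described above (base64 data plus explicit media_type).
function imageBlock(bytes: Buffer) {
  return {
    type: "image" as const,
    source: {
      type: "base64" as const,
      media_type: "image/jpeg" as const,
      data: bytes.toString("base64"),
    },
  };
}

// e.g. imageBlock(readFileSync("albums/harbour-dusk.jpg"))  // path illustrative
```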