Claude Vision and Multimodal Guide: Images, PDFs, and Documents (2026)

#claude #vision #multimodal #images

Originally published at claudeguide.io/claude-vision-multimodal-guide

Claude's vision capabilities let you send images, screenshots, PDFs, and documents directly to the API. Pass them as base64-encoded content or as URLs in the messages array, and Claude analyzes them alongside text with no additional configuration. Claude 3.5 models support vision natively with a 200K token context window. This guide covers every input type with working Python examples and production patterns.

What Claude Can Analyze

Claude's multimodal input handles:

Images: JPEG, PNG, GIF, WebP
PDFs: Native PDF support in Claude 3.5+ models
Screenshots: UI analysis, bug reports, test verification
Charts and graphs: Data extraction and interpretation
Diagrams: Architecture diagrams, flowcharts, ERDs
Handwriting: Reasonably accurate OCR of handwritten text
Code in images: Extract and explain code from screenshots
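
When you send an image as base64, the API also needs to know its media type. The sketch below maps the four image formats listed above to their standard media types by file extension. The names `IMAGE_MEDIA_TYPES` and `image_media_type` are hypothetical helpers for illustration, not part of the Anthropic SDK, and checking the extension is an assumption that the file is named honestly.

```python
from pathlib import Path

# Standard media types for the four supported image formats.
IMAGE_MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_media_type(path: str) -> str:
    """Return the media type for a supported image file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in IMAGE_MEDIA_TYPES:
        raise ValueError(f"Unsupported image format: {suffix or '(no extension)'}")
    return IMAGE_MEDIA_TYPES[suffix]


print(image_media_type("screenshot.PNG"))  # image/png
```

Rejecting unknown extensions up front gives a clear local error instead of a failed API request.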

Sending Images via URL

This is the simplest approach when your image is publicly accessible:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://example.com/chart.png"
                    }
                },
                {
                    "type": "text",
                    "text": "What does this chart show? Extract the key data points."
                }
            ]
        }
    ]
)

print(response.content[0].text)

Sending Images via Base64

For local files or private images:

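import base64
import anthropic

def encode_image(image_path: str) -> str:
    # Read the file as raw bytes and return its base64 encoding as text.
    with open(image_path, "rb") as f:
        return base64.standard_b64encode(f.read()).decode("utf-8")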
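
As a minimal sketch of how a local file ends up in the messages array, the helpers below build a base64 image content block and pair it with a text prompt. The function names `image_block_from_file` and `build_user_message` are hypothetical and chosen for illustration; the block shape mirrors the URL example earlier, with the source type switched to `base64` and an explicit `media_type`.

```python
import base64
from pathlib import Path


def image_block_from_file(image_path: str, media_type: str) -> dict:
    """Build a base64 image content block from a local file."""
    raw = Path(image_path).read_bytes()
    data = base64.standard_b64encode(raw).decode("utf-8")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,  # e.g. "image/png" or "image/jpeg"
            "data": data,
        },
    }


def build_user_message(image_path: str, media_type: str, prompt: str) -> dict:
    """Combine the image block and a text prompt into one user message."""
    return {
        "role": "user",
        "content": [
            image_block_from_file(image_path, media_type),
            {"type": "text", "text": prompt},
        ],
    }
```

The resulting dictionary can be passed as the single element of `messages` in `client.messages.create(...)`, in the same way as the URL example.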

PDF Analysis

Claude 3.5 models support native PDF input:

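import base64

def encode_pdf(pdf_path: str) -> str:
    # Read the PDF as raw bytes and return its base64 encoding as text.
    with open(pdf_path, "rb") as f:
        return base64.standard_b64encode(f.read()).decode("utf-8")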
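
A base64-encoded PDF is sent as a `document` content block rather than an `image` block. The sketch below shows one way to assemble that block and a user message around it; `pdf_block_from_file` and `build_pdf_message` are hypothetical helper names, and the example assumes the file is small enough to send inline in a single request.

```python
import base64
from pathlib import Path


def pdf_block_from_file(pdf_path: str) -> dict:
    """Build a base64 document content block from a local PDF."""
    raw = Path(pdf_path).read_bytes()
    data = base64.standard_b64encode(raw).decode("utf-8")
    return {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": data,
        },
    }


def build_pdf_message(pdf_path: str, prompt: str) -> dict:
    """Combine the PDF block and a text prompt into one user message."""
    return {
        "role": "user",
        "content": [
            pdf_block_from_file(pdf_path),
            {"type": "text", "text": prompt},
        ],
    }
```

As with images, the message goes into the `messages` list of `client.messages.create(...)`, and the reply text is read from `response.content[0].text`.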
