Image to Text API: Extract Text from Photos with Code Examples

#beginners #python #api #tutorial

Turning a document, receipt, or handwritten note into machine-readable text used to require heavy on-premise software. Today, an OCR API lets you extract text from an image with a single HTTP request.

Why Use an OCR API?

Open-source engines like Tesseract often need careful preprocessing (deskewing, binarization, language tuning) before they produce usable output. A cloud OCR API handles those steps behind the scenes.

No infrastructure — Skip GPU provisioning and model management
Multilingual — Dozens of languages and scripts out of the box
Handwriting recognition — Deep-learning models read cursive and messy handwriting
Structured output — Bounding boxes, line-level text, and confidence values

Code Example

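import requests

# RapidAPI endpoint and credentials; replace YOUR_API_KEY with your own key.
url = "https://ocr-wizard.p.rapidapi.com/ocr"
headers = {
    "x-rapidapi-host": "ocr-wizard.p.rapidapi.com",
    "x-rapidapi-key": "YOUR_API_KEY",
}

# Upload the image as multipart form data.
with open("document.jpg", "rb") as f:
    response = requests.post(
        url,
        headers=headers,
        files={"image": ("document.jpg", f, "image/jpeg")},
        timeout=30,  # avoid hanging indefinitely on a slow connection
    )

# Fail loudly on HTTP errors instead of parsing an error body as a result.
response.raise_for_status()

body = response.json()["body"]
print(f"Language: {body['detectedLanguage']}")
print(f"Text: {body['fullText']}")

# Each annotation holds one recognized word and its bounding polygon.
for word in body["annotations"]:
    print(f"  '{word['text']}' at {word['boundingPoly']}")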

The API returns the full extracted text, detected language, and word-level bounding boxes — useful for building searchable PDFs or overlaying highlights.
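To overlay a highlight you usually want an axis-aligned rectangle rather than a polygon. Here is a minimal sketch of that conversion. It assumes each boundingPoly looks like {"vertices": [{"x": ..., "y": ...}, ...]}, which this post does not confirm, so inspect a real response before relying on it. The helper names poly_to_rect and find_word_rects are made up for this example.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)


def poly_to_rect(bounding_poly: Dict) -> Rect:
    """Return the smallest axis-aligned rectangle enclosing the polygon."""
    # Assumed shape: {"vertices": [{"x": int, "y": int}, ...]}.
    # A missing coordinate is treated as 0.
    xs = [v.get("x", 0) for v in bounding_poly["vertices"]]
    ys = [v.get("y", 0) for v in bounding_poly["vertices"]]
    return min(xs), min(ys), max(xs), max(ys)


def find_word_rects(annotations: List[Dict], term: str) -> List[Rect]:
    """Rectangles for every word that equals `term`, ignoring case."""
    wanted = term.casefold()
    return [
        poly_to_rect(a["boundingPoly"])
        for a in annotations
        if a["text"].casefold() == wanted
    ]


# Hand-written sample in the assumed shape (not real API output).
sample = [
    {"text": "Total", "boundingPoly": {"vertices": [
        {"x": 10, "y": 20}, {"x": 70, "y": 20},
        {"x": 70, "y": 40}, {"x": 10, "y": 40}]}},
    {"text": "42.00", "boundingPoly": {"vertices": [
        {"x": 80, "y": 20}, {"x": 130, "y": 20},
        {"x": 130, "y": 40}, {"x": 80, "y": 40}]}},
]

print(find_word_rects(sample, "total"))  # [(10, 20, 70, 40)]
```

The resulting tuples can be passed to whatever drawing or PDF library you use to paint the highlight.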

Use Cases

Receipt/invoice scanning — Parse totals, dates, and vendor names directly into accounting software
Document digitization — Convert scanned contracts or medical records into searchable text at scale
Handwriting-to-text — Students photograph handwritten homework and get a typed transcript
License plate/ID reading — Automate identity verification or parking management
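For the receipt use case, the fullText string still has to be turned into fields. The sketch below shows one rough way to pull out a total and a date with regular expressions. The patterns and the parse_receipt name are illustrative assumptions: they expect amounts written like 1,234.56 and dates like 2024-03-15 or 3/15/2024, and real receipts will need more robust rules.

```python
import re
from decimal import Decimal
from typing import Dict, Optional

# "total" as a whole word (so "Subtotal" is skipped), then an amount like 1,234.56.
TOTAL_RE = re.compile(r"\btotal\b[^\d\n]*(\d[\d,]*\.\d{2})", re.IGNORECASE)
# ISO dates (2024-03-15) or slash dates (3/15/2024).
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{2,4})\b")


def parse_receipt(full_text: str) -> Dict[str, Optional[object]]:
    """Extract the last 'total' amount and the first date found in OCR text."""
    totals = TOTAL_RE.findall(full_text)
    date = DATE_RE.search(full_text)
    return {
        # Take the last match, since the grand total tends to appear at the bottom.
        "total": Decimal(totals[-1].replace(",", "")) if totals else None,
        "date": date.group(1) if date else None,
    }


# Hand-written sample text standing in for data["body"]["fullText"].
receipt_text = "ACME STORE\n2024-03-15\nCoffee 3.50\nSubtotal 3.50\nTax 0.28\nTOTAL $3.78\n"
print(parse_receipt(receipt_text))
```

Decimal is used instead of float so that currency amounts are not subject to binary rounding.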

Best Practices

Provide clear, well-lit images — shadows and glare degrade recognition quality
Let the API detect the language — works reliably across dozens of languages
Crop to the region of interest — reduces noise and bandwidth
Use word-level bounding boxes for searchable PDFs or region-specific extraction
Batch with concurrency (5-10 requests at a time) for large document sets
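The concurrency tip can be sketched with a thread pool from the standard library. This version takes the per-image function as an argument, so you can pass in a wrapper around the requests.post call from the code example. The names ocr_batch and fake_ocr are hypothetical, and fake_ocr only stands in for a real API call so the snippet runs offline.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple


def ocr_batch(
    paths: Iterable[str],
    ocr_one: Callable[[str], dict],
    max_workers: int = 5,
) -> Tuple[Dict[str, dict], Dict[str, Exception]]:
    """Run ocr_one on each path, with at most max_workers calls in flight."""
    results: Dict[str, dict] = {}
    errors: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(ocr_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                # Keep going so one bad image does not abort the whole batch.
                errors[path] = exc
    return results, errors


def fake_ocr(path: str) -> dict:
    """Stand-in for the real API call; fails on purpose for 'bad' files."""
    if "bad" in path:
        raise ValueError(f"could not read {path}")
    return {"fullText": path.upper()}


results, errors = ocr_batch(["a.jpg", "bad.jpg", "b.jpg"], fake_ocr, max_workers=2)
print(sorted(results), sorted(errors))
```

Collecting failures separately makes it easy to retry only the images that failed. Check your plan's rate limits before raising max_workers.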
