From Raw Text to Searchable Knowledge — One API
RAG (Retrieval-Augmented Generation) is how you give LLMs long-term memory. The idea is simple: split your documents into chunks, embed them into vectors, store them, and retrieve the most relevant ones at query time.
IteraTools now has every piece of that pipeline as a pay-per-use API endpoint. No infra. No self-hosting. Just HTTP.
The Full RAG Pipeline
Step 1: Chunk your document — POST /text/chunk ($0.001)
The newest endpoint. Splits any text (up to 500,000 chars) into overlapping chunks ready for embedding.
Three strategies:
- token — fixed character windows (1 token ≈ 4 chars), great for guaranteed context limits
- sentence — groups complete sentences up to chunk_size tokens; preserves readability (default)
- paragraph — groups \n\n-separated paragraphs; ideal for structured documents
curl -X POST https://api.iteratools.com/text/chunk \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "Your long document here...",
"chunk_size": 500,
"overlap": 50,
"strategy": "sentence"
}'
{
"ok": true,
"data": {
"chunks": ["First chunk...", "Overlapping second chunk..."],
"count": 12,
"strategy": "sentence",
"avg_length": 487
}
}
The overlap parameter prevents context loss at chunk boundaries — crucial for coherent retrieval.
Zero external API calls. Runs entirely in Node.js. No surprises.
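To make the overlap behavior concrete, here's a minimal Python sketch of what the token strategy roughly does, using the 1 token ≈ 4 chars heuristic above (the endpoint's actual implementation may differ):

```python
def chunk_tokens(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Fixed character windows; sizes are in 'tokens' (1 token ~= 4 chars)."""
    window = chunk_size * 4          # window width in characters
    step = window - overlap * 4      # advance less than a full window...
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + window])
        if start + window >= len(text):
            break                    # ...so each chunk repeats the tail of the previous one
    return chunks
```

Because each chunk begins with the last `overlap` tokens of its predecessor, a sentence cut at a window boundary still appears whole in at least one chunk.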
Step 2: Embed chunks — POST /embeddings ($0.001)
Convert your chunks to float vectors using OpenAI text-embedding-3-small (1536 dims) or text-embedding-3-large (3072 dims). Accepts up to 100 strings per request — so you can batch all your chunks in one call.
curl -X POST https://api.iteratools.com/embeddings \
-H "Authorization: Bearer YOUR_KEY" \
-d '{"text": ["chunk 1...", "chunk 2...", "chunk 3..."]}'
Returns { embeddings: [[...], [...]], model, dimensions, count, tokens }.
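Since each request is capped at 100 strings, larger documents need a batching loop. A sketch (the top-level `embeddings` key follows the response shape above; adjust if your responses come wrapped in `data`):

```python
import requests

API = "https://api.iteratools.com"
HEADERS = {"Authorization": "Bearer YOUR_KEY", "Content-Type": "application/json"}

def embed_all(chunks: list[str], embed_fn, batch_size: int = 100) -> list[list[float]]:
    """Embed any number of chunks, at most batch_size strings per call."""
    vectors: list[list[float]] = []
    for i in range(0, len(chunks), batch_size):
        vectors.extend(embed_fn(chunks[i:i + batch_size]))
    return vectors

def iteratools_embed(batch: list[str]) -> list[list[float]]:
    resp = requests.post(f"{API}/embeddings", headers=HEADERS, json={"text": batch}).json()
    return resp["embeddings"]

# vectors = embed_all(chunks, iteratools_embed)
```

Passing the embedding call as a function keeps the batching logic testable without hitting the network.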
Step 3: Store in vector memory — POST /memory/upsert ($0.003)
Stores a document with its embedding in an isolated, per-API-key namespace.
curl -X POST https://api.iteratools.com/memory/upsert \
-H "Authorization: Bearer YOUR_KEY" \
-d '{
"namespace": "my-docs",
"id": "chunk-001",
"text": "The first chunk of my document...",
"metadata": {"source": "manual.pdf", "page": 1}
}'
Namespaces are automatically isolated per API key — no user can access another user's data.
Step 4: Semantic search — POST /memory/search ($0.002)
At query time, pass a natural language question. The API embeds it, computes cosine similarity against all stored chunks, and returns the top-K most relevant ones.
curl -X POST https://api.iteratools.com/memory/search \
-H "Authorization: Bearer YOUR_KEY" \
-d '{"namespace": "my-docs", "query": "What is the refund policy?", "top_k": 5}'
Feed those chunks as context into /ai/chat or your own LLM call. That's RAG.
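Cosine similarity itself is simple; here's a self-contained sketch of the ranking the endpoint performs (illustrative only — the hosted version does this for you):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    """Rank stored (id, vector) pairs by similarity to the query vector."""
    ranked = sorted(stored, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```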
Other Recent Endpoints
While building out the RAG toolkit, a few other useful tools landed:
POST /barcode/generate ($0.001)
Generate barcode images locally — Code128, Code39, EAN-13, EAN-8, UPC-A, ITF14, DataMatrix. Returns PNG as base64. Zero external API.
curl -X POST https://api.iteratools.com/barcode/generate \
-d '{"data": "1234567890", "type": "ean13"}'
POST /document/ocr ($0.015)
AI-powered OCR via Mistral mistral-ocr-latest. Far superior to Tesseract for:
- Scanned PDFs and forms
- Tables and multi-column layouts
- Invoices, receipts, Brazilian notas fiscais
Returns plain text, markdown (with table structure preserved), and extracted table objects.
POST /json/validate ($0.001)
Five modes in one endpoint: validate, format (pretty-print), minify, stats (depth, key count, types), and get (extract value by dot-bracket path like user.name or items[0].id).
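For intuition, the `get` mode's dot-bracket paths can be resolved like this (a local sketch; the endpoint's exact parsing rules are an assumption here):

```python
import re

def get_path(obj, path: str):
    """Resolve a path like 'user.name' or 'items[0].id' against parsed JSON."""
    for part in re.findall(r"\[\d+\]|[^.\[\]]+", path):
        if part.startswith("["):
            obj = obj[int(part[1:-1])]   # bracket segment: list index
        else:
            obj = obj[part]              # dot segment: object key
    return obj
```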
The Full RAG Flow in One Script
import requests

API = "https://api.iteratools.com"
KEY = "YOUR_KEY"
HEADERS = {"Authorization": f"Bearer {KEY}", "Content-Type": "application/json"}

# 1. Read your document
with open("manual.txt") as f:
    document = f.read()

# 2. Chunk it
chunks_resp = requests.post(f"{API}/text/chunk", headers=HEADERS, json={
    "text": document, "chunk_size": 500, "overlap": 50, "strategy": "sentence"
}).json()
chunks = chunks_resp["data"]["chunks"]
print(f"Created {len(chunks)} chunks")

# 3. Store each chunk
for i, chunk in enumerate(chunks):
    requests.post(f"{API}/memory/upsert", headers=HEADERS, json={
        "namespace": "manual",
        "id": f"chunk-{i:04d}",
        "text": chunk,
        "metadata": {"index": i}
    })
print("All chunks stored!")

# 4. Query
question = "What is the return policy?"
results = requests.post(f"{API}/memory/search", headers=HEADERS, json={
    "namespace": "manual", "query": question, "top_k": 3
}).json()
context = "\n\n".join(r["text"] for r in results["data"]["results"])
print("Context retrieved:", context[:200])

# 5. Answer with LLM
answer = requests.post(f"{API}/ai/chat", headers=HEADERS, json={
    "message": question,
    "system": f"Answer based only on this context:\n\n{context}"
}).json()
print("Answer:", answer["data"]["response"])
Total cost for a 10-page document (~5,000 tokens, ~10 chunks): ~$0.06 for the full pipeline.
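Using the per-call prices listed above, the indexing-and-retrieval side of that estimate breaks down as follows (the /ai/chat call, whose price isn't listed in this post, accounts for the remainder):

```python
PRICES = {"chunk": 0.001, "upsert": 0.003, "search": 0.002}  # USD per call, from the post

n_chunks = 10
total = PRICES["chunk"] + n_chunks * PRICES["upsert"] + PRICES["search"]
print(f"${total:.3f} before the /ai/chat call")
```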
Current Tool Count: 58
IteraTools now has 58 tools across images, video, web, audio, text processing, code execution, external integrations, and full RAG pipelines — all pay-per-use with x402 micropayments on Base.
Browse all tools: iteratools.com/tools
Get your API key: iteratools.com/docs
MCP package (Claude, Cursor, Windsurf): npx mcp-iteratools