OCR.chat API
One HTTP request turns an image or PDF into plain text, Markdown, tables, and JSON, in more than 100 languages. Metered per page, no surprises.
Overview
The OCR.chat API is a small REST interface. You POST a file and get back a job with the recognized text and a per-page breakdown (text, bounding boxes, confidence). Jobs of 5 pages or fewer return inline; larger jobs return immediately with a pending status that you poll until done.
- Base URL: https://ocr.chat
- Formats in: PNG, JPG, WEBP, GIF, BMP, TIFF, and multi-page PDF
- Formats out: txt, md, docx, pdf, csv, json
- Engines: cpu (fast, printed docs) and vlm (premium AI, handwriting, complex layout, math)
Authentication
Authenticate with your API token (find it on your account page) as a Bearer header:
Authorization: Bearer YOUR_API_TOKEN
You can also pass ?api_token=… as a query parameter. Usage is metered against your account's page balance.
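As a minimal sketch of the two options above, the helper below builds either the Bearer header or the api_token query parameter. The function name auth_kwargs and its in_query flag are hypothetical and not part of the API; only the header format and the api_token parameter name come from this page.

```python
def auth_kwargs(token: str, in_query: bool = False) -> dict:
    """Return keyword arguments for a requests call that carry the token.

    Hypothetical helper: only the Authorization header format and the
    `api_token` query parameter are taken from the documentation.
    """
    if not token:
        raise ValueError("an API token is required")
    if in_query:
        # Query-string form: ...?api_token=YOUR_API_TOKEN
        return {"params": {"api_token": token}}
    # Header form: Authorization: Bearer YOUR_API_TOKEN
    return {"headers": {"Authorization": f"Bearer {token}"}}
```

A call would then look like requests.get(url, **auth_kwargs("YOUR_API_TOKEN")). The header form is usually the safer default, since query strings tend to end up in server and proxy logs.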
Submit a document
POST /api/v1/ocr/, multipart form upload.
curl -X POST https://ocr.chat/api/v1/ocr/ \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-F "file=@invoice.pdf" \
-F "tier=vlm" \
-F "language=auto"
Returns the job. For files of 5 pages or fewer, the job is already done and includes the text; larger files come back as pending or processing, so poll the status endpoint until they finish.
{
"uuid": "9f2c1b7e4a...",
"status": "done",
"tier": "vlm",
"language": "auto",
"page_count": 1,
"mean_confidence": 0.98,
"text": "INVOICE\nAcme Corp\nTotal: 215.00 USD",
"markdown": "# INVOICE\n\n**Acme Corp** ...",
"pages": [ { "index": 0, "text": "...", "blocks": [ { "text": "...", "bbox": [x0,y0,x1,y1], "confidence": 0.98 } ] } ]
}
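The per-page breakdown in this response can be used to flag text that may need a manual check. The sketch below assumes the job has the shape shown in the sample (pages, each with an index and a list of blocks carrying text and confidence); the function name and the 0.9 threshold are illustrative choices, not API defaults.

```python
from typing import Iterator, Tuple


def low_confidence_blocks(job: dict, threshold: float = 0.9) -> Iterator[Tuple[int, str, float]]:
    """Yield (page index, block text, confidence) for blocks below threshold.

    Assumes the response shape from the sample above; the threshold is an
    arbitrary illustrative value.
    """
    for page in job.get("pages", []):
        for block in page.get("blocks", []):
            if block["confidence"] < threshold:
                yield page["index"], block["text"], block["confidence"]
```

For example, list(low_confidence_blocks(job, threshold=0.95)) collects every block whose confidence falls under 0.95 so it can be reviewed by hand.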
Get a result
GET /api/v1/ocr/<uuid>/: poll until status is done or failed.
curl https://ocr.chat/api/v1/ocr/9f2c1b7e4a.../ \
-H "Authorization: Bearer YOUR_API_TOKEN"
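A polling loop usually benefits from an upper time bound so that a stuck job does not block the caller forever. The sketch below is one way to do that; wait_for_job, its interval and timeout defaults, and the injected fetch callable are all assumptions for illustration. Only the status values (pending, processing, done, failed) come from this page.

```python
import time
from typing import Callable


def wait_for_job(fetch: Callable[[], dict], interval: float = 2.0,
                 timeout: float = 300.0, sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until the job status is done or failed, or time runs out.

    `fetch` is any zero-argument callable returning the job JSON as a dict,
    for example a lambda wrapping requests.get(...).json().
    """
    deadline = time.monotonic() + timeout
    job = fetch()
    while job["status"] not in ("done", "failed"):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job['status']} after {timeout} seconds")
        sleep(interval)
        job = fetch()
    return job
```

With requests, fetch could be lambda: requests.get(url, headers=headers).json(), where url is the job's status URL. The caller still needs to check whether the returned status is done or failed.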
Get a format
GET /api/v1/ocr/<uuid>/download/?format=md: export the result in another format. format is one of txt, md, docx, pdf, csv, json.
curl -L "https://ocr.chat/api/v1/ocr/9f2c1b7e4a.../download/?format=docx" \
-H "Authorization: Bearer YOUR_API_TOKEN" -o result.docx
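Since format accepts only a fixed set of values, it can help to validate it client-side before making the request. This is a small sketch; the download_url helper and the FORMATS constant are hypothetical names, while the URL pattern and the list of formats are taken from this page.

```python
from urllib.parse import urlencode

BASE = "https://ocr.chat/api/v1/ocr/"
# Export formats listed in the documentation above.
FORMATS = ("txt", "md", "docx", "pdf", "csv", "json")


def download_url(uuid: str, fmt: str) -> str:
    """Build the download URL for a job, rejecting unknown formats early."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format {fmt!r}; expected one of {FORMATS}")
    return f"{BASE}{uuid}/download/?{urlencode({'format': fmt})}"
```

The resulting URL can be passed to any HTTP client together with the Authorization header.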
Chat with a document
Ask questions about a finished job. Answers are grounded only in the extracted text and cite the source page. Requires an account token; the chat feature is limited per account.
POST /api/v1/chat/<uuid>/, JSON body {"message": "your question"}.
curl -X POST https://ocr.chat/api/v1/chat/9f2c1b7e4a.../ \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"message": "What is the invoice total and due date?"}'
Returns the assistant message with its answer and a list of cited pages:
{"conversation": "a1b2…", "message": {
"role": "assistant",
"content": "The total is $42, due on March 3 (p. 1).",
"citations": [{"page": 1, "snippet": "The invoice total is $42…"}]
}}
GET /api/v1/chat/<uuid>/history/: get the full conversation transcript for a job.
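To show how the chat response might be consumed, the sketch below renders the answer followed by its cited snippets. It assumes the response shape from the sample (a message object with content and a citations list of page and snippet); render_answer is a hypothetical helper name.

```python
def render_answer(response: dict) -> str:
    """Format a chat response as the answer text plus one line per citation.

    Assumes the sample shape: response["message"] holds "content" and an
    optional "citations" list of {"page": ..., "snippet": ...} items.
    """
    message = response["message"]
    lines = [message["content"]]
    for citation in message.get("citations", []):
        lines.append(f'  [p. {citation["page"]}] {citation["snippet"]}')
    return "\n".join(lines)
```

Printing render_answer(response) gives the answer on the first line and each cited page with its snippet indented below it.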
Code examples
# Python: submit a file, poll until it finishes, then download a DOCX export
import time

import requests

API = "https://ocr.chat/api/v1/ocr/"
H = {"Authorization": "Bearer YOUR_API_TOKEN"}

# Submit
with open("invoice.pdf", "rb") as f:
    job = requests.post(API, headers=H,
                        files={"file": f}, data={"tier": "vlm"}).json()

# Poll until the job leaves the pending/processing states
while job["status"] in ("pending", "processing"):
    time.sleep(2)
    job = requests.get(API + job["uuid"] + "/", headers=H).json()
if job["status"] == "failed":
    raise SystemExit("OCR job failed")
print(job["markdown"])

# Download as DOCX
r = requests.get(API + job["uuid"] + "/download/",
                 headers=H, params={"format": "docx"})
with open("result.docx", "wb") as out:
    out.write(r.content)
// JavaScript (Node.js 18+, ES module): submit a file and poll until it finishes
import fs from "fs";

const API = "https://ocr.chat/api/v1/ocr/";
const H = { Authorization: "Bearer YOUR_API_TOKEN" };

const form = new FormData();
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");
form.append("tier", "vlm");

let job = await (await fetch(API, { method: "POST", headers: H, body: form })).json();
while (["pending", "processing"].includes(job.status)) {
  await new Promise(r => setTimeout(r, 2000));
  job = await (await fetch(API + job.uuid + "/", { headers: H })).json();
}
console.log(job.markdown);
# 1. Submit
curl -X POST https://ocr.chat/api/v1/ocr/ \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-F "file=@invoice.pdf" -F "tier=vlm"
# 2. Poll (use the uuid from step 1)
curl https://ocr.chat/api/v1/ocr/UUID/ \
-H "Authorization: Bearer YOUR_API_TOKEN"
# 3. Download
curl -L "https://ocr.chat/api/v1/ocr/UUID/download/?format=md" \
-H "Authorization: Bearer YOUR_API_TOKEN" -o result.md
Parameters
| Field | Type | Description |
|---|---|---|
| file | file | Required. The image or PDF to process. |
| tier | string | cpu (default, fast/printed) or vlm (premium AI: handwriting, layout, math). |
| language | string | auto (default) or a language code (en, ch, ja, ar, …). |
| tool | string | Optional tool slug (e.g. extract-tables, handwriting-to-text) to apply that tool's preset. |
| translate_to | string | For the translate tool, target language code. |
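The form fields in the table can be assembled with a small helper that applies the documented defaults and leaves out optional fields that were not set. This is only a sketch: build_form is a hypothetical name, and the only validation it performs is on tier, because that is the only field whose full set of values is listed here.

```python
from typing import Optional


def build_form(tier: str = "cpu", language: str = "auto",
               tool: Optional[str] = None, translate_to: Optional[str] = None) -> dict:
    """Return the non-file form fields for POST /api/v1/ocr/.

    Defaults mirror the table above; optional fields are omitted when unset.
    """
    if tier not in ("cpu", "vlm"):
        raise ValueError(f"tier must be 'cpu' or 'vlm', got {tier!r}")
    form = {"tier": tier, "language": language}
    if tool is not None:
        form["tool"] = tool
    if translate_to is not None:
        form["translate_to"] = translate_to
    return form
```

The result is meant for the data argument of a multipart upload, for example requests.post(url, headers=headers, files={"file": f}, data=build_form(tier="vlm")).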
Errors and limits
| Code | Meaning |
|---|---|
| 400 | No file, unsupported type, or file too large. |
| 401 | Missing or invalid API token. |
| 402 | Out of pages, daily/monthly free limit reached, or no credits. The body includes used/cap. |
| 404 | Job UUID not found. |
| 409 | Download requested before the job finished. |
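One way to handle these codes in client code is to turn them into a single exception type that carries the documented meaning. The sketch below is an assumption about client design rather than anything the API prescribes: OcrApiError, check_status, and the choice to treat 409 as retryable are all illustrative.

```python
# Meanings copied from the error table above.
ERROR_MEANINGS = {
    400: "No file, unsupported type, or file too large.",
    401: "Missing or invalid API token.",
    402: "Out of pages, free limit reached, or no credits.",
    404: "Job UUID not found.",
    409: "Download requested before the job finished.",
}


class OcrApiError(Exception):
    """Raised for HTTP error responses; exposes the status code."""

    def __init__(self, status_code: int, meaning: str):
        super().__init__(f"HTTP {status_code}: {meaning}")
        self.status_code = status_code
        # Assumption: a 409 just means the job is not finished yet,
        # so waiting and retrying the download is reasonable.
        self.retryable = status_code == 409


def check_status(status_code: int) -> None:
    """Raise OcrApiError for 4xx/5xx status codes, otherwise do nothing."""
    if status_code >= 400:
        raise OcrApiError(status_code, ERROR_MEANINGS.get(status_code, "Unexpected error."))
```

After a request, check_status(response.status_code) either returns silently or raises, and the caller can inspect retryable to decide whether to wait and try the download again.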
Each page processed costs credits (1/page on the fast tier, more on premium). Paid plans raise per-file page caps and add priority. See pricing.
Frequently asked questions
Use language=auto to detect the language automatically, or pass a specific language code.