OCR.chat API

Overview

The OCR.chat API is a small REST interface. You POST a file and get back a job with the recognized text and a per-page breakdown (text, bounding boxes, confidence). Jobs of 5 pages or fewer return inline; larger jobs return immediately with a pending status that you poll until done.

Base URL: https://ocr.chat
Formats in: PNG, JPG, WEBP, GIF, BMP, TIFF, and multi-page PDF
Formats out: txt, md, docx, pdf, csv, json
Engines: cpu (fast, printed docs) and vlm (premium AI, handwriting, complex layout, math)

Auth

Authenticate with your API token (find it on your account page) as a Bearer header:

Authorization: Bearer YOUR_API_TOKEN

You can also pass ?api_token=… as a query parameter. Usage is metered against your account's page balance.
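
The two ways of passing the token can be sketched in Python as follows. The helper names `bearer_headers` and `with_token_param` are illustrative and not part of the API; only the `Authorization: Bearer` header and the `api_token` query parameter come from the text above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def bearer_headers(token: str) -> dict:
    """Return the Authorization header in the Bearer form described above."""
    return {"Authorization": f"Bearer {token}"}


def with_token_param(url: str, token: str) -> str:
    """Append api_token to a URL, keeping any query parameters already present."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("api_token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(bearer_headers("YOUR_API_TOKEN"))
print(with_token_param("https://ocr.chat/api/v1/ocr/", "YOUR_API_TOKEN"))
```

The header form is usually the safer default, since URLs with query strings tend to end up in logs and browser history.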

Submit a document

POST /api/v1/ocr/, multipart form upload.

curl -X POST https://ocr.chat/api/v1/ocr/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@invoice.pdf" \
  -F "tier=vlm" \
  -F "language=auto"

Returns the job. Files of 5 pages or fewer come back already done, with the text included. Larger files come back as pending or processing, and you poll the status endpoint until they finish.

{
  "uuid": "9f2c1b7e4a...",
  "status": "done",
  "tier": "vlm",
  "language": "auto",
  "page_count": 1,
  "mean_confidence": 0.98,
  "text": "INVOICE\nAcme Corp\nTotal: 215.00 USD",
  "markdown": "# INVOICE\n\n**Acme Corp** ...",
  "pages": [ { "index": 0, "text": "...", "blocks": [ { "text": "...", "bbox": [x0,y0,x1,y1], "confidence": 0.98 } ] } ]
}
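
One use of the per-block confidence is to flag text that may need a manual check. The sketch below walks `pages` and `blocks` in the shape shown above. The function name, the 0.9 threshold, and the sample values (including the bbox numbers, whose units the response does not state) are assumptions made for illustration.

```python
def low_confidence_blocks(job: dict, threshold: float = 0.9) -> list:
    """Return (page_index, text, confidence) for every block below the threshold."""
    flagged = []
    for page in job.get("pages", []):
        for block in page.get("blocks", []):
            if block["confidence"] < threshold:
                flagged.append((page["index"], block["text"], block["confidence"]))
    return flagged


# Hand-written sample in the shape of the response above (values are made up).
sample_job = {
    "status": "done",
    "pages": [
        {"index": 0, "text": "INVOICE\nTotal: 215.00 USD", "blocks": [
            {"text": "INVOICE", "bbox": [10, 10, 200, 40], "confidence": 0.99},
            {"text": "Total: 215.00 USD", "bbox": [10, 60, 260, 90], "confidence": 0.72},
        ]},
    ],
}

for page_index, text, confidence in low_confidence_blocks(sample_job):
    print(f"page {page_index}: {text!r} ({confidence:.2f})")
```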

Get a result

GET /api/v1/ocr/<uuid>/ returns the job. Poll it until status is done or failed.

curl https://ocr.chat/api/v1/ocr/9f2c1b7e4a.../ \
  -H "Authorization: Bearer YOUR_API_TOKEN"
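
A polling loop usually wants a deadline and an explicit stop on failed as well as done. The following is a minimal sketch under those assumptions. `poll_job`, its `interval` and `timeout` defaults, and the injected `fetch_job` callable are hypothetical names; in real use `fetch_job` would perform the GET request shown above and return the parsed JSON.

```python
import time


def poll_job(fetch_job, uuid, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch_job(uuid) until the job is done or failed, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(uuid)
        if job["status"] in ("done", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {uuid} still {job['status']} after {timeout}s")
        sleep(interval)


# Stand-in fetcher that replays canned statuses instead of calling the API.
_statuses = iter(["pending", "processing", "done"])


def fake_fetch(uuid):
    return {"uuid": uuid, "status": next(_statuses)}


result = poll_job(fake_fetch, "example-uuid", sleep=lambda seconds: None)
print(result["status"])
```

Passing the fetcher in as an argument keeps the loop testable without network access.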

Download a format

GET /api/v1/ocr/<uuid>/download/?format=md exports the result. format is one of txt, md, docx, pdf, csv, json.

curl -L "https://ocr.chat/api/v1/ocr/9f2c1b7e4a.../download/?format=docx" \
  -H "Authorization: Bearer YOUR_API_TOKEN" -o result.docx
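
Since format accepts a fixed set of values, a client can reject typos before making a request. This is a small sketch, not part of the API: `download_url` and the constant names are hypothetical, and the format list is copied from the text above.

```python
from urllib.parse import urlencode

BASE_URL = "https://ocr.chat"
EXPORT_FORMATS = ("txt", "md", "docx", "pdf", "csv", "json")


def download_url(uuid: str, fmt: str) -> str:
    """Build the download URL for a job, rejecting unknown export formats."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"format must be one of {EXPORT_FORMATS}, got {fmt!r}")
    return f"{BASE_URL}/api/v1/ocr/{uuid}/download/?{urlencode({'format': fmt})}"


print(download_url("UUID", "docx"))
```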

Chat with a document

Ask questions about a finished job. Answers are grounded only in the extracted text and cite the source page. Requires an account: the chat feature is account-gated.

POST /api/v1/chat/<uuid>/, JSON body {"message": "your question"}.

curl -X POST https://ocr.chat/api/v1/chat/9f2c1b7e4a.../ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the invoice total and due date?"}'

Returns the assistant message with its answer and a list of cited pages:

{"conversation": "a1b2…", "message": {
   "role": "assistant",
   "content": "The total is $42, due on March 3 (p. 1).",
   "citations": [{"page": 1, "snippet": "The invoice total is $42…"}]
}}
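
To show an answer together with its sources, a client can walk the `citations` list. The sketch below uses the response shown above as its sample. `format_answer` is a hypothetical helper, and the layout of the output is only one possible choice.

```python
def format_answer(response: dict) -> str:
    """Render the assistant's answer followed by one line per cited page."""
    message = response["message"]
    lines = [message["content"]]
    for citation in message.get("citations", []):
        lines.append(f"  [p. {citation['page']}] {citation['snippet']}")
    return "\n".join(lines)


# Sample copied from the response above (the conversation id is truncated there too).
sample_response = {
    "conversation": "a1b2…",
    "message": {
        "role": "assistant",
        "content": "The total is $42, due on March 3 (p. 1).",
        "citations": [{"page": 1, "snippet": "The invoice total is $42…"}],
    },
}

print(format_answer(sample_response))
```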

GET /api/v1/chat/<uuid>/history/ returns the conversation history for that job.

Code examples

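Python

import time

import requests

API = "https://ocr.chat/api/v1/ocr/"
H = {"Authorization": "Bearer YOUR_API_TOKEN"}

# Submit the file as a multipart upload
with open("invoice.pdf", "rb") as f:
    resp = requests.post(API, headers=H, files={"file": f}, data={"tier": "vlm"})
resp.raise_for_status()
job = resp.json()

# Poll until the job leaves the pending/processing states
while job["status"] in ("pending", "processing"):
    time.sleep(2)
    resp = requests.get(API + job["uuid"] + "/", headers=H)
    resp.raise_for_status()
    job = resp.json()

# Stop here if the job ended in the failed state
if job["status"] != "done":
    raise RuntimeError(f"OCR job ended with status {job['status']!r}")

print(job["markdown"])

# Download as DOCX
r = requests.get(API + job["uuid"] + "/download/",
                 headers=H, params={"format": "docx"})
r.raise_for_status()
with open("result.docx", "wb") as out:
    out.write(r.content)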

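Node.js

// Needs Node 18+ (global fetch, FormData, Blob), run as an ES module for top-level await.
import fs from "fs";

const API = "https://ocr.chat/api/v1/ocr/";
const H = { Authorization: "Bearer YOUR_API_TOKEN" };

// Build the multipart form; fetch sets the Content-Type and boundary itself.
const form = new FormData();
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");
form.append("tier", "vlm");

// Submit
let job = await (await fetch(API, { method: "POST", headers: H, body: form })).json();

// Poll until the job leaves the pending/processing states
while (["pending", "processing"].includes(job.status)) {
  await new Promise(r => setTimeout(r, 2000));
  job = await (await fetch(API + job.uuid + "/", { headers: H })).json();
}

// A job that did not reach done (for example failed) has no result to print
if (job.status !== "done") throw new Error(`OCR job ended with status ${job.status}`);
console.log(job.markdown);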

cURL

# 1. Submit
curl -X POST https://ocr.chat/api/v1/ocr/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@invoice.pdf" -F "tier=vlm"

# 2. Poll  (use the uuid from step 1)
curl https://ocr.chat/api/v1/ocr/UUID/ \
  -H "Authorization: Bearer YOUR_API_TOKEN"

# 3. Download
curl -L "https://ocr.chat/api/v1/ocr/UUID/download/?format=md" \
  -H "Authorization: Bearer YOUR_API_TOKEN" -o result.md

Parameters

Field	Type	Description
`file`	file	Required. The image or PDF to process.
`tier`	string	`cpu` (default, fast/printed) or `vlm` (premium AI: handwriting, layout, math).
`language`	string	`auto` (default) or a language code (`en`, `ch`, `ja`, `ar`, …).
`tool`	string	Optional tool slug (e.g. `extract-tables`, `handwriting-to-text`) to apply that tool's preset.
`translate_to`	string	For the translate tool, target language code.
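
The optional fields in the table can be assembled into the form data for a submit request as sketched below. `build_form_fields` is a hypothetical helper. The defaults mirror the table, and the file itself would be attached separately as the multipart `file` part.

```python
def build_form_fields(tier="cpu", language="auto", tool=None, translate_to=None):
    """Return the non-file form fields for POST /api/v1/ocr/."""
    if tier not in ("cpu", "vlm"):
        raise ValueError(f"tier must be 'cpu' or 'vlm', got {tier!r}")
    fields = {"tier": tier, "language": language}
    # Optional fields are only sent when the caller sets them.
    if tool is not None:
        fields["tool"] = tool
    if translate_to is not None:
        fields["translate_to"] = translate_to
    return fields


print(build_form_fields(tier="vlm", tool="extract-tables"))
```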

Errors & limits

Code	Meaning
`400`	No file, unsupported type, or file too large.
`401`	Missing or invalid API token.
`402`	Out of pages, daily/monthly free limit reached, or no credits. The body includes `used`/`cap`.
`404`	Job UUID not found.
`409`	Download requested before the job finished.

Each page processed costs credits (1/page on the fast tier, more on premium). Paid plans raise per-file page caps and add priority. See pricing.
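
A client can turn the status codes from the table above into readable messages. The sketch below is one way to do that. `describe_error` is a hypothetical helper, and it assumes `used` and `cap` appear as top-level keys of the 402 JSON body, which the table implies but does not spell out.

```python
from typing import Optional

# Meanings copied from the table above.
ERROR_MEANINGS = {
    400: "no file, unsupported type, or file too large",
    401: "missing or invalid API token",
    402: "out of pages, free limit reached, or no credits",
    404: "job UUID not found",
    409: "download requested before the job finished",
}


def describe_error(status_code: int, body: Optional[dict] = None) -> str:
    """Return a one-line description of an error response."""
    meaning = ERROR_MEANINGS.get(status_code, "unexpected status")
    message = f"HTTP {status_code}: {meaning}"
    # Assumption: a 402 body carries used/cap as top-level keys.
    if status_code == 402 and body and "used" in body and "cap" in body:
        message += f" (used {body['used']} of {body['cap']})"
    return message


print(describe_error(402, {"used": 10, "cap": 10}))
print(describe_error(409))
```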

Frequently asked questions

How do I get an API token?

Create a free account and open your account page. Your token is shown there with a copy button.

Is there a synchronous (no-polling) mode?

Yes. Files of 5 pages or fewer return the full result inline in the POST response, so no polling is needed for most images and short PDFs.

What languages are supported?

Over 100, including Latin, CJK, Arabic, Cyrillic and Indic scripts. Use language=auto to detect, or pass a specific code.

Do you store my documents?

Uploads are processed for OCR and deleted automatically. We never sell, share, or train on your documents.

What are the rate limits and quotas?

Usage is metered in pages against your account balance: anonymous calls get a per-IP daily allowance, free accounts get a monthly one, and paid plans use purchased credits with a higher per-file page cap and priority. When you run out you get a 402 with used and cap in the body.

What import and export formats are supported?

You can send PNG, JPG, WEBP, GIF, BMP, TIFF, and multi-page PDF. Results download as txt, md, docx, pdf (searchable), csv, or json via the download endpoint's format parameter.

What do the error codes mean?

400 is a missing file, an unsupported type, or a file that is too large; 401 is a missing or invalid token; 402 is out of pages; 404 is an unknown job UUID; and 409 is a download requested before the job finished. Error responses include a short message.

What does the response look like?

A job object with status, tier, language, page_count, and mean_confidence, plus the full text and markdown. The pages array breaks each page into blocks, each with its text, bounding box (bbox), and confidence.

When should I use tier=cpu vs tier=vlm?

Use cpu (the default) for fast, low-cost recognition of printed documents. Use vlm, the premium AI engine, for handwriting, complex layout, math, and translation, where it is more accurate.

How do the tool and translate_to parameters work?

Pass tool with a slug (for example extract-tables or handwriting-to-text) to apply that tool's preset. For the translate tool, also pass translate_to with the target language code to get the recognized text translated.

What file sizes can I send, and how are large jobs handled?

Files of 5 pages or fewer return inline in the POST response. Larger files return immediately as pending or processing, and you poll GET /api/v1/ocr/<uuid>/ until the status is done or failed. Paid plans raise the per-file page cap.

Are there official SDKs?

The API is plain REST over HTTPS, so it works from any language with an HTTP client; see the Python, Node.js, and cURL examples above. There is no SDK to install; a few lines of standard HTTP code are all you need.