OCR.chat API

Overview

The OCR.chat API is a small REST interface. You POST a file and get back a job with the recognized text and a per-page breakdown (text, bounding boxes, confidence). Jobs of 5 pages or fewer return inline; larger jobs return immediately with a pending status that you poll until done.

Base URL: https://ocr.chat
Formats in: PNG, JPG, WEBP, GIF, BMP, TIFF, and multi-page PDF
Formats out: txt, md, docx, pdf, csv, json
Engines: cpu (fast, printed docs) and vlm (premium AI, handwriting, complex layout, math)

Authentication

Authenticate with your API token (find it on your account page) as a Bearer header:

Authorization: Bearer YOUR_API_TOKEN

You can also pass ?api_token=… as a query parameter. Usage is metered against your account's page balance.
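
Both options can be sketched in Python with the standard library alone. The helper names `auth_headers` and `with_token_param` are illustrative and not part of the API; only the Bearer header and the `api_token` query parameter come from the description above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def auth_headers(token: str) -> dict:
    """Build the Bearer header described above."""
    return {"Authorization": f"Bearer {token}"}


def with_token_param(url: str, token: str) -> str:
    """Append api_token to a URL, keeping any query string it already has."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("api_token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(auth_headers("YOUR_API_TOKEN"))
print(with_token_param("https://ocr.chat/api/v1/ocr/", "YOUR_API_TOKEN"))
```

The header form is usually the safer default, since query strings tend to end up in server and proxy logs.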

Submit a file

POST /api/v1/ocr/ as a multipart form upload.

curl -X POST https://ocr.chat/api/v1/ocr/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@invoice.pdf" \
  -F "tier=vlm" \
  -F "language=auto"

Returns the job. For files of 5 pages or fewer, the job is already done and includes the text. Larger files come back with status pending or processing; poll the status endpoint until they finish.

{
  "uuid": "9f2c1b7e4a...",
  "status": "done",
  "tier": "vlm",
  "language": "auto",
  "page_count": 1,
  "mean_confidence": 0.98,
  "text": "INVOICE\nAcme Corp\nTotal: 215.00 USD",
  "markdown": "# INVOICE\n\n**Acme Corp** ...",
  "pages": [ { "index": 0, "text": "...", "blocks": [ { "text": "...", "bbox": [x0,y0,x1,y1], "confidence": 0.98 } ] } ]
}
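
As a rough illustration of working with the per-page breakdown, the sketch below lists blocks whose confidence falls under a threshold so they can be reviewed by hand. It assumes `confidence` is a float between 0 and 1, as the sample suggests; the function name, the 0.9 threshold, and the sample values are made up for the example.

```python
def low_confidence_blocks(job: dict, threshold: float = 0.9) -> list:
    """Return (page_index, text, confidence) for each block below the threshold."""
    flagged = []
    for page in job.get("pages", []):
        for block in page.get("blocks", []):
            if block["confidence"] < threshold:
                flagged.append((page["index"], block["text"], block["confidence"]))
    return flagged


# Hand-written sample in the shape shown above (values are invented).
sample_job = {
    "status": "done",
    "pages": [
        {"index": 0, "text": "INVOICE\nTotal: 215.00 USD", "blocks": [
            {"text": "INVOICE", "bbox": [40, 30, 220, 70], "confidence": 0.99},
            {"text": "Total: 215.00 USD", "bbox": [40, 400, 300, 430], "confidence": 0.72},
        ]},
    ],
}

for page_index, text, confidence in low_confidence_blocks(sample_job):
    print(f"page {page_index}: {text!r} ({confidence:.2f})")
```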

Get the result

GET /api/v1/ocr/<uuid>/ returns the job. Poll it until status is done or failed.

curl https://ocr.chat/api/v1/ocr/9f2c1b7e4a.../ \
  -H "Authorization: Bearer YOUR_API_TOKEN"
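
A polling loop usually needs a stop condition beyond "done", so here is a minimal sketch with a timeout. `fetch_job` is a hypothetical callable that performs the GET above and returns the parsed JSON; it is injected so the loop can be run offline with a stand-in. The 2-second interval and 300-second timeout are arbitrary choices, not documented limits.

```python
import time


def wait_for_job(fetch_job, uuid, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll fetch_job(uuid) until status is done or failed, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(uuid)
        if job["status"] in ("done", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {uuid} still {job['status']} after {timeout}s")
        sleep(interval)


# Stand-in for a real GET /api/v1/ocr/<uuid>/ call, so the sketch runs offline.
_responses = iter([
    {"status": "pending"},
    {"status": "processing"},
    {"status": "done", "text": "hello"},
])


def fake_fetch(uuid):
    return next(_responses)


job = wait_for_job(fake_fetch, "UUID", sleep=lambda seconds: None)
print(job["status"])
```

Callers still need to check whether the returned status is done or failed before reading the text.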

Download a format

GET /api/v1/ocr/<uuid>/download/?format=md exports the result. format is one of txt, md, docx, pdf, csv, json.

curl -L "https://ocr.chat/api/v1/ocr/9f2c1b7e4a.../download/?format=docx" \
  -H "Authorization: Bearer YOUR_API_TOKEN" -o result.docx
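
Checking the format on the client before sending the request can be sketched as follows. The `download_url` helper is illustrative; the path and the six format values are taken from the description above.

```python
from urllib.parse import quote, urlencode

BASE = "https://ocr.chat/api/v1/ocr/"
FORMATS = ("txt", "md", "docx", "pdf", "csv", "json")


def download_url(uuid: str, fmt: str) -> str:
    """Build the download URL for a job, rejecting formats the API does not list."""
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {', '.join(FORMATS)}, got {fmt!r}")
    return f"{BASE}{quote(uuid, safe='')}/download/?{urlencode({'format': fmt})}"


print(download_url("UUID", "docx"))
```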

Chat with a file

Ask questions about a finished job. Answers are grounded in the extracted text and cite the source page. An account token is required: the chat endpoint is account-gated.

POST /api/v1/chat/<uuid>/ with a JSON body {"message": "your question"}.

curl -X POST https://ocr.chat/api/v1/chat/9f2c1b7e4a.../ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the invoice total and due date?"}'

Returns the assistant message with its answer and a list of the pages it cites:

{"conversation": "a1b2…", "message": {
   "role": "assistant",
   "content": "The total is $42, due on March 3 (p. 1).",
   "citations": [{"page": 1, "snippet": "The invoice total is $42…"}]
}}

GET /api/v1/chat/<uuid>/history/ returns the full chat transcript for the job.
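
The request body and the cited pages can be handled with a couple of small helpers. This is a sketch against the response shape shown above; `chat_payload` and `summarize_reply` are illustrative names, and the sample reply simply repeats the documented example.

```python
import json


def chat_payload(question: str) -> bytes:
    """Encode the JSON body for POST /api/v1/chat/<uuid>/."""
    return json.dumps({"message": question}).encode("utf-8")


def summarize_reply(reply: dict) -> str:
    """Render the answer followed by one line per cited page."""
    message = reply["message"]
    lines = [message["content"]]
    for citation in message.get("citations", []):
        lines.append(f'  [p. {citation["page"]}] {citation["snippet"]}')
    return "\n".join(lines)


sample_reply = {"conversation": "a1b2…", "message": {
    "role": "assistant",
    "content": "The total is $42, due on March 3 (p. 1).",
    "citations": [{"page": 1, "snippet": "The invoice total is $42…"}],
}}

print(summarize_reply(sample_reply))
```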

Code examples

Python

import requests, time

API = "https://ocr.chat/api/v1/ocr/"
H = {"Authorization": "Bearer YOUR_API_TOKEN"}

# Submit
with open("invoice.pdf", "rb") as f:
    job = requests.post(API, headers=H,
        files={"file": f}, data={"tier": "vlm"}).json()

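# Poll until the job leaves the pending/processing states
while job["status"] in ("pending", "processing"):
    time.sleep(2)
    job = requests.get(API + job["uuid"] + "/", headers=H).json()

# A job can also end as "failed", in which case there is no result to print
if job["status"] != "done":
    raise SystemExit("OCR job failed")

print(job["markdown"])

# Download as DOCX
r = requests.get(API + job["uuid"] + "/download/",
                 headers=H, params={"format": "docx"})
r.raise_for_status()  # surfaces 401/404/409 instead of saving an error body
with open("result.docx", "wb") as out:
    out.write(r.content)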

Node.js

import fs from "fs";

const API = "https://ocr.chat/api/v1/ocr/";
const H = { Authorization: "Bearer YOUR_API_TOKEN" };

const form = new FormData();
form.append("file", new Blob([fs.readFileSync("invoice.pdf")]), "invoice.pdf");
form.append("tier", "vlm");

let job = await (await fetch(API, { method: "POST", headers: H, body: form })).json();

while (["pending", "processing"].includes(job.status)) {
  await new Promise(r => setTimeout(r, 2000));
  job = await (await fetch(API + job.uuid + "/", { headers: H })).json();
}
console.log(job.markdown);

cURL

# 1. Submit
curl -X POST https://ocr.chat/api/v1/ocr/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@invoice.pdf" -F "tier=vlm"

# 2. Poll  (use the uuid from step 1)
curl https://ocr.chat/api/v1/ocr/UUID/ \
  -H "Authorization: Bearer YOUR_API_TOKEN"

# 3. Download
curl -L "https://ocr.chat/api/v1/ocr/UUID/download/?format=md" \
  -H "Authorization: Bearer YOUR_API_TOKEN" -o result.md

Parameters

Field	Type	Description
`file`	file	Required. The image or PDF to process.
`tier`	string	`cpu` (default, fast/printed) or `vlm` (premium AI: handwriting, layout, math).
`language`	string	`auto` (default) or a language code (`en`, `ch`, `ja`, `ar`, …).
`tool`	string	Optional tool slug (e.g. `extract-tables`, `handwriting-to-text`) to apply that tool's preset.
`translate_to`	string	For the translate tool, target language code.
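
One way to assemble the non-file form fields is sketched below. The `build_fields` helper is illustrative; it applies the documented defaults and omits optional fields that are not set. The resulting dict can be passed as the form data alongside the `file` upload.

```python
def build_fields(tier="cpu", language="auto", tool=None, translate_to=None) -> dict:
    """Assemble the form fields from the table above, leaving out unset optionals."""
    if tier not in ("cpu", "vlm"):
        raise ValueError(f"tier must be 'cpu' or 'vlm', got {tier!r}")
    fields = {"tier": tier, "language": language}
    if tool is not None:
        fields["tool"] = tool
    if translate_to is not None:
        fields["translate_to"] = translate_to
    return fields


print(build_fields())
print(build_fields(tier="vlm", tool="extract-tables"))
```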

Errors & limits

Code	Meaning
`400`	No file, unsupported type, or file too large.
`401`	Missing or invalid API token.
`402`	Out of pages, daily/monthly free limit reached, or no credits. The body includes `used`/`cap`.
`404`	Job UUID not found.
`409`	Download requested before the job finished.

Each page processed costs credits (1/page on the fast tier, more on premium). Paid plans raise per-file page caps and add priority. See pricing.
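
The status codes in the table above can be turned into readable messages on the client, for example as below. This is a sketch: the wording of each message is paraphrased from the table, and it assumes the 402 body is JSON with top-level `used` and `cap` keys, which the table implies but does not spell out.

```python
from typing import Optional

ERROR_MEANINGS = {
    400: "no file, unsupported type, or file too large",
    401: "missing or invalid API token",
    402: "out of pages, free limit reached, or no credits",
    404: "job UUID not found",
    409: "download requested before the job finished",
}


def describe_error(status: int, body: Optional[dict] = None) -> str:
    """Map an HTTP status code to a short message, adding usage details on 402."""
    meaning = ERROR_MEANINGS.get(status, "unexpected status")
    message = f"{status}: {meaning}"
    # Assumption: on 402 the parsed JSON body has top-level "used" and "cap" keys.
    if status == 402 and body and "used" in body and "cap" in body:
        message += f" (used {body['used']} of {body['cap']})"
    return message


print(describe_error(402, {"used": 50, "cap": 50}))
print(describe_error(409))
```

A 409 on download is a signal to keep polling the job rather than a permanent failure.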

Frequently asked questions

How do I get an API token?

Create a free account and open your account page; your token is shown there with a copy button.

Is there a synchronous (no-polling) mode?

Yes. Files of 5 pages or fewer return the full result inline in the POST response, so no polling is needed for most images and short PDFs.

What languages are supported?

Over 100, including Latin, CJK, Arabic, Cyrillic and Indic scripts. Use language=auto to detect, or pass a specific code.

Do you store my documents?

Uploads are processed for OCR and deleted automatically. We never sell, share, or train on your documents.

What are the rate limits and quotas?

Usage is metered per page against your account balance: anonymous sessions get a daily per-IP allowance, free accounts get a monthly allowance, and paid plans use purchased credits with higher per-file page caps and priority.

Which input and output formats are supported?

You can upload PNG, JPG, WEBP, GIF, BMP, TIFF, and multi-page PDF. Results download as txt, md, docx, pdf (searchable), csv, or json via the format parameter of the download endpoint.

What do the error codes mean?

400 means a missing file, an unsupported type, or a file that is too large; 401 means a missing or invalid token; 402 means you are out of pages; 404 means an unknown job UUID; and 409 means a download was requested before the job finished.

What does the response look like?

A job object with status, tier, language, page_count, and mean_confidence, plus the full text and markdown. The pages array breaks each page down into blocks with their text, bounding box (bbox), and per-block confidence.

When should I use tier=cpu vs tier=vlm?

Use cpu (the default) for fast, low-cost recognition of clean printed documents. Use vlm, the premium AI engine, for handwriting, complex or multi-column layouts, math, and translation, where it is far more accurate.

How do the tool and translate_to parameters work?

Pass tool with a slug (for example extract-tables or handwriting-to-text) to apply that tool's preset. For the translate tool, also pass translate_to with the target language code to receive the recognized text translated.

How large a file can I send, and what about large jobs?

Files of 5 pages or fewer return inline in the POST response. Larger files return immediately as pending or processing, and you poll GET /api/v1/ocr/<uuid>/ until the status is done or failed. Paid plans raise the per-file page cap.

Are there official SDKs?

The API is plain REST over HTTPS and works from anything with an HTTP client; see the Python, Node.js, and cURL examples above. There is no SDK to install; a few standard HTTP calls are all you need.