Up to 6x more accurate and 5x cheaper

Aryn's document parsing (DocParse) runs a compound deep learning model trained on 80k+ enterprise documents, followed by powerful post-processing steps. It's up to 6x more accurate and 5x cheaper than alternative systems, and it outputs JSON or Markdown.

Supports 30+ file formats, including PDF and Microsoft Office

Document layout parsing with bounding boxes labeled by type (e.g., header, text, table)

Scales to documents with thousands of pages

Supports OCR in 60+ languages
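
To make the idea of typed, labeled bounding boxes in JSON output concrete, here is a minimal sketch of how downstream code might consume such a result. The sample payload and its field names (`elements`, `type`, `page`, `bbox`, `text`) are assumptions made for illustration, not DocParse's documented schema; check the actual output before relying on any of them.

```python
import json
from collections import defaultdict

# Hypothetical sample shaped like a layout-parsing result.
# Field names and the normalized [x1, y1, x2, y2] bbox convention are assumptions.
SAMPLE_OUTPUT = """
{
  "elements": [
    {"type": "header", "page": 1, "bbox": [0.10, 0.05, 0.90, 0.10], "text": "Quarterly Report"},
    {"type": "text", "page": 1, "bbox": [0.10, 0.12, 0.90, 0.30], "text": "Revenue grew in Q3."},
    {"type": "table", "page": 2, "bbox": [0.10, 0.20, 0.90, 0.60], "text": "Region | Revenue"}
  ]
}
"""


def group_by_type(raw_json):
    """Parse the JSON string and bucket elements by their type label."""
    doc = json.loads(raw_json)
    grouped = defaultdict(list)
    for element in doc["elements"]:
        grouped[element["type"]].append(element)
    return dict(grouped)


def bbox_area(bbox):
    """Area of an [x1, y1, x2, y2] box; returns 0 for degenerate boxes."""
    x1, y1, x2, y2 = bbox
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


grouped = group_by_type(SAMPLE_OUTPUT)
for kind, items in grouped.items():
    # Print each element type with how many elements carry that label.
    print(kind, len(items))
```

Grouping by type first keeps later steps simple: tables can go to a table-specific handler while headers and text feed a chunking or indexing step.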

Tame your table and image extraction

Complex tables with odd layouts, spanning rows, and lots of text? Need to extract data from your documents? DocParse can handle it! Best-in-class compound table extraction and LLM-powered image extraction and summarization pull accurate information from your documents.

Complex Tables

Preserve complex table formatting

Image Extraction

Leverage GenAI to extract and summarize images

Easily integrate with only a few lines of code

Add DocParse to your document processing workflows with a few lines of code using the Aryn SDK, or use the Playground UI to visually inspect parsing and extraction.

Use sync or async APIs with the Aryn SDK

Use the DocParse Playground UI to visualize parsing and extraction

Supports the open source Sycamore document ETL library

Available as SaaS, private cloud, or on-prem deployment
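
As a rough illustration of why an async API matters for batch workloads, the sketch below parses several documents concurrently with a cap on in-flight requests. It does not use the Aryn SDK: `parse_document` is a hypothetical stand-in that only simulates latency, and `parse_all` and `max_concurrent` are names invented for this example. Swap in the real SDK call per its documentation.

```python
import asyncio


async def parse_document(path):
    """Hypothetical stand-in for an async parsing call; does no real parsing."""
    await asyncio.sleep(0.01)  # simulate network and processing latency
    return {"path": path, "elements": []}


async def parse_all(paths, max_concurrent=4):
    """Parse many documents concurrently, limiting simultaneous requests."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(path):
        # Wait for a free slot before starting the request.
        async with semaphore:
            return await parse_document(path)

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(bounded(p) for p in paths))


results = asyncio.run(parse_all(["a.pdf", "b.docx", "c.pdf"]))
print([r["path"] for r in results])
```

A synchronous call is simpler for one-off documents; the bounded-concurrency pattern pays off when a workflow submits many files and should not overwhelm the service or a rate limit.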

High quality AI-powered document parsing and data extraction
