read_barcode
Read and decode barcodes from images or PDF documents. Supports common barcode types, including QR Code, Code 128, EAN, DataMatrix, and PDF417, among others.
Returns all detected barcodes with their values, types, confidence scores, and page locations.
classify_document
Automatically classify a document using keyword-based rules. Useful for sorting incoming documents (e.g., identifying which vendor provided a document) to determine the appropriate extraction template.
Provide custom classification rules in CSV format, or use a URL to an external CSV rules file.
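As a rough illustration of keyword-based classification, the sketch below parses rules from CSV text and returns the first class whose keywords all appear in the document text. The two-column layout (class name, then semicolon-separated keywords), the function names, and the sample rules are assumptions made for this example; the tool's actual rules format may differ.

```python
import csv
import io

# Hypothetical rule layout, for illustration only: class_name,keywords
# with keywords separated by semicolons. The real rules format may differ.
SAMPLE_RULES = """\
Acme Invoice,acme corp;invoice
Globex Statement,globex;statement
"""

def load_rules(rules_csv):
    """Parse CSV text into a list of (class_name, [keywords]) pairs."""
    rules = []
    for row in csv.reader(io.StringIO(rules_csv)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        name = row[0].strip()
        keywords = [k.strip().lower() for k in row[1].split(";") if k.strip()]
        if name and keywords:
            rules.append((name, keywords))
    return rules

def classify(text, rules):
    """Return the first class whose keywords all occur in text, else None."""
    haystack = text.lower()
    for name, keywords in rules:
        if all(k in haystack for k in keywords):
            return name
    return None

rules = load_rules(SAMPLE_RULES)
label = classify("INVOICE #1001 from Acme Corp", rules)
```

Matching here is case-insensitive and requires every keyword in a rule to be present; a rule engine could just as reasonably match on any single keyword, so treat that choice as part of the sketch.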
pdf_ocr
Make a PDF text-searchable using OCR, or make it non-searchable by removing the text layer.
Use "searchable" mode to apply OCR to scanned PDFs so text can be selected and searched. Use "unsearchable" mode to remove the text layer.
search_pdf_text
Search for text within a PDF document and return all matches with their positions. Supports exact text matching, smart matching, and regular expressions.
Returns the text, coordinates, and page index for each match found.
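To make the difference between exact and regular-expression matching concrete, here is a minimal sketch that searches already-extracted page text. It assumes the pages are available as a list of strings, uses a 0-based page index, and reports character offsets as a stand-in for page coordinates; the `search_pages` function and its result keys are hypothetical and do not describe the tool's actual response.

```python
import re

def search_pages(pages, pattern, use_regex=False):
    """Return matches with their text, page index, and character offsets."""
    # For exact matching, escape the pattern so special characters are literal.
    regex = re.compile(pattern if use_regex else re.escape(pattern))
    results = []
    for page_index, page_text in enumerate(pages):
        for m in regex.finditer(page_text):
            results.append({
                "text": m.group(0),
                "page_index": page_index,  # 0-based, an assumption of this sketch
                "start": m.start(),
                "end": m.end(),
            })
    return results

pages = ["Invoice No. 1042", "Total due: 99.50 USD", "Ref 1042-A"]
exact = search_pages(pages, "1042")
amounts = search_pages(pages, r"\d+\.\d{2}", use_regex=True)
```

Here `exact` holds one match on the first page and one on the third, while `amounts` picks up the decimal amount on the second page.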
merge_documents
Merge multiple documents into a single PDF file. Source documents can be PDF, DOC, text, Excel, or image files, or ZIP archives containing documents.
Provide multiple file URLs separated by commas. All documents will be combined into one PDF in the order specified.
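Because the URLs are passed as one comma-separated string, a URL that itself contains a comma would be split in the wrong place. The helper below is a small sketch of how a caller might build that string safely; `join_urls` and the example.com URLs are hypothetical and not part of the tool.

```python
def join_urls(urls):
    """Join URLs into one comma-separated string, preserving their order."""
    cleaned = [u.strip() for u in urls if u and u.strip()]
    for u in cleaned:
        if "," in u:
            # A literal comma would be read as a separator between two URLs.
            raise ValueError(f"URL contains a comma: {u!r}")
    return ",".join(cleaned)

# Placeholder URLs for illustration only.
merged_input = join_urls([
    "https://example.com/cover.pdf",
    " https://example.com/report.docx ",
    "https://example.com/scan.png",
])
```

The order of the list is the order in which the documents would appear in the merged PDF.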
pdf_security
Add or remove password protection from a PDF document. When adding a password, you can configure encryption algorithms and set granular permissions for printing, copying, form filling, and modification.
Use "add" action to protect a PDF, or "remove" action to unlock a protected PDF.
convert_pdf
Convert a PDF document to another format such as CSV, JSON, text, Excel, XML, HTML, or image formats.
Supports OCR for scanned documents, page selection, and password-protected PDFs.
Use this to extract tabular data (CSV/Excel), structured content (JSON/XML), plain text, or render pages as images.
parse_document
Extract structured data from PDF, JPG, or PNG documents using a document parser template. Templates define which fields, tables, and values to extract.
Use template ID "1" for the built-in general invoice template, or specify a custom template ID for your own extraction rules.
get_pdf_info
Read PDF metadata and document information including page count, author, title, creation date, encryption status, and security permissions.
Use this to inspect a PDF's properties before performing other operations.
parse_invoice
Automatically extract structured data from invoice documents using AI. Detects invoice layouts without manual template configuration.
Extracts vendor info, customer info, invoice details, payment information, line items, and more.
generate_barcode
Generate a barcode image in various formats including QR Code, DataMatrix, Code 39, Code 128, PDF417, and many others.
Returns a URL to the generated barcode image. Optionally embed a logo image inside QR codes.
generate_pdf
Create a PDF document from various sources including HTML content, a web URL, a document file (DOC, DOCX, RTF, TXT, XPS), or an image file (JPG, PNG, TIFF).
Supports configurable paper size, margins, orientation, and custom headers/footers when generating from HTML or URL.
split_pdf
Split a PDF document into multiple files. Supports splitting by page ranges or by text/barcode content found in pages.
When splitting by pages, specify comma-separated page ranges. When splitting by text or barcode, specify a search string to determine split points.
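A comma-separated page-range string can be checked before it is sent. The sketch below assumes 1-based page numbers and ranges written as `first-last` or a single page number; the tool's actual range syntax may accept other forms, and `parse_page_ranges` is a hypothetical helper.

```python
def parse_page_ranges(spec):
    """Parse "1-3,5" into inclusive (first, last) pairs: [(1, 3), (5, 5)]."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first_s, last_s = part.split("-", 1)
            first, last = int(first_s), int(last_s)
        else:
            first = last = int(part)
        if first < 1 or last < first:
            raise ValueError(f"invalid page range: {part!r}")
        ranges.append((first, last))
    return ranges

chunks = parse_page_ranges("1-3, 5, 8-10")
```

Each pair in `chunks` corresponds to one output file, in the order the ranges were written.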
edit_pdf
Edit an existing PDF by adding text annotations, images, or filling form fields. You can also search and replace text, delete text, or remove specific pages.
Combines multiple editing capabilities in a single tool. Specify which operations to perform using the appropriate parameters.