Maitreya Mishra/FILE-CONVERTER-MCP
Server Summary
Convert Markdown to DOCX
Convert HTML to PDF
Convert DOCX to EPUB
Convert PDF to Markdown
Batch document conversions
Extract text from documents
A Python-based MCP (Model Context Protocol) server that provides powerful document conversion capabilities via Pandoc. This server allows AI agents (like Claude via LangChain/LangGraph) to request file conversions between various formats such as Markdown, DOCX, HTML, PDF, EPUB, and many more.
This project uses:
Docker (Dockerfile) for creating a self-contained server environment including Pandoc and the LaTeX components needed for PDF generation.

convert_document
Converts a document from one format to another using Pandoc.
Arguments:
input_file_path (str, required): The path accessible by the server to the input document file. If running in Docker with a volume mount, this should be the path inside the container (e.g., /data/my_doc.docx).
output_file_path (str, required): The path accessible by the server where the converted output file should be saved. If running in Docker, this should be the path inside the container (e.g., /data/my_output.pdf). The directory will be created if it doesn't exist within the server's accessible filesystem.
to_format (str, required): The target format for the conversion (e.g., 'markdown', 'docx', 'pdf', 'html', 'rst', 'epub'). See the Pandoc documentation for a full list (--list-output-formats).
from_format (str, optional): The format of the input file. If None, pandoc will try to guess from the file extension. Specify it if the extension is ambiguous or missing (e.g., 'md', 'docx', 'html'). Defaults to None.
extra_args (List[str], optional): A list of additional command-line arguments to pass directly to pandoc (e.g., ['--toc'], ['-V', 'geometry:margin=1.5cm'], ['--standalone']). Defaults to None.
Returns:
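To make the argument list concrete, here is a minimal sketch of how those arguments could translate into a pandoc command line. The helper name build_pandoc_command is hypothetical and this is not taken from the server's source, which may invoke Pandoc differently (for example through a wrapper library). The sketch assumes PDF output is requested by giving the output file a .pdf extension, which is the approach the Pandoc manual describes.

```python
from typing import List, Optional


def build_pandoc_command(
    input_file_path: str,
    output_file_path: str,
    to_format: str,
    from_format: Optional[str] = None,
    extra_args: Optional[List[str]] = None,
) -> List[str]:
    """Assemble a pandoc argv list from convert_document-style arguments.

    Hypothetical helper for illustration only; it builds the command
    but does not run it.
    """
    command = ["pandoc", input_file_path, "--output", output_file_path]

    # Assumption: for PDF, rely on the .pdf extension of the output path
    # instead of passing an explicit --to value.
    if to_format != "pdf":
        command += ["--to", to_format]

    # Only pass --from when the caller specified an input format;
    # otherwise pandoc guesses from the file extension.
    if from_format is not None:
        command += ["--from", from_format]

    # Extra arguments are appended unchanged, e.g. ["--toc"].
    command += list(extra_args or [])
    return command
```

The resulting list could be handed to subprocess.run on a machine where pandoc is installed; keeping the command as a list avoids shell quoting problems with paths that contain spaces.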
You can run this server either locally (requires manual installation of dependencies) or using the provided Docker configuration (recommended for ease of use and deployment).
To install Pandoc Document Converter for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @MaitreyaM/file-converter-mcp --client claude
This method bundles Python, Pandoc, LaTeX, and required libraries into a container. You only need Docker Desktop installed locally.
git clone https://github.com/your-username/pandoc-mcp-server.git # Replace with your repo URL
cd pandoc-mcp-server
Build the Docker image. This uses the Dockerfile, which installs Pandoc, a capable TeX Live distribution (for PDF support), and Python dependencies inside the image. This step might take several minutes the first time.
docker build -t pandoc-converter-server .
Run the container, mounting your project directory to /data inside the container and mapping port 8000. Replace /path/to/your/local/project with the actual absolute path to the project directory on your machine.
# Example using the current directory (.) as the host path:
docker run -it --rm -p 8000:8000 -v "$(pwd)":/data pandoc-converter-server
# Or using an absolute path (replace):
# docker run -it --rm -p 8000:8000 -v "/path/to/your/local/project":/data pandoc-converter-server
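Because the commands above mount a host directory at /data, the server sees every file under a different path than the host does. The following sketch shows one way to translate a host path into the path to give the server. The function name to_container_path is hypothetical, and /srv/project in the usage note is a made-up host directory.

```python
from pathlib import Path, PurePosixPath


def to_container_path(host_file: str, host_root: str, container_root: str = "/data") -> str:
    """Translate a host path under the mounted directory to its container path.

    Raises ValueError if host_file is not inside host_root, since such a
    file would not be visible through the volume mount.
    """
    relative = Path(host_file).resolve().relative_to(Path(host_root).resolve())
    # Container paths are always POSIX-style, whatever the host OS is.
    return str(PurePosixPath(container_root).joinpath(*relative.parts))
```

For example, with /srv/project mounted at /data, the host file /srv/project/docs/report.md would be passed to the server as /data/docs/report.md.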
-it: Runs interactively (shows logs, allows Ctrl+C).
--rm: Removes the container when stopped.
-p 8000:8000: Maps port 8000 on your host to port 8000 in the container.
-v "$(pwd)":/data: Mounts the current working directory on your host to /data inside the container. Files placed in your local project directory will appear in /data inside the container, and files saved to /data by the server will appear in your local project directory.
pandoc-converter-server: The name of the image you built.
The server starts inside the container (listening on http://0.0.0.0:8000). It's ready to accept connections from your MCP client (like the LangChain agent).
Configure your MCP client (e.g., MultiServerMCPClient) to connect to http://127.0.0.1:8000/sse with transport: "sse".
When asking the agent to convert files, use paths under /data/. For example: convert /data/my_input.docx to pdf at /data/my_output.pdf. The output file will appear in your local project directory due to the volume mapping.

Running locally without Docker requires you to install Python, Pandoc, and a LaTeX distribution directly onto your host machine.
Install Pandoc, then verify the installation by running pandoc --version in a new terminal.
Install a LaTeX distribution (needed for PDF output):
macOS: brew install --cask mactex-no-gui (Recommended via Homebrew)
Debian/Ubuntu: sudo apt-get update && sudo apt-get install texlive-latex-base texlive-fonts-recommended texlive-latex-extra texlive-fonts-extra (or texlive-full for everything, but large).
Windows: make sure the bin directory containing pdflatex.exe is added to your system's PATH.
Verify the installation by running pdflatex --version in a new terminal.
git clone https://github.com/your-username/pandoc-mcp-server.git # Replace with your repo URL
cd pandoc-mcp-server
python -m venv venv
source venv/bin/activate # Linux/macOS
# venv\Scripts\activate # Windows
(Or use Conda: conda create --name pandoc-env python=3.11 && conda activate pandoc-env)
pip install -r requirements.txt
python pandoc_mcp_server.py
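The server starts and serves at http://127.0.0.1:8000/sse.
Configure your MCP client to connect to http://127.0.0.1:8000/sse.
When asking the agent to convert files, use paths the server process can access (e.g., convert my_input.docx to pdf at my_output.pdf, assuming files are in the same directory, or use absolute paths).

Assuming the server container is running with the volume mount: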
You: convert /data/report.md to pdf
Agent: Thinking...
[Agent calls convert_document tool with input='/data/report.md', output='/data/report.pdf', to='pdf']
Agent: Successfully converted document to '/data/report.pdf'
[The bot may then attempt to upload report.pdf from the local project directory]
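An exchange like the one above depends on the client knowing where the server's SSE endpoint is. The sketch below builds the connection settings and the tool arguments as plain Python data. It assumes a client, such as MultiServerMCPClient from langchain-mcp-adapters, that accepts a mapping of server names to connection settings; the label pandoc_converter and the helper sse_connection are hypothetical names chosen for this example.

```python
from typing import Dict


def sse_connection(host: str = "127.0.0.1", port: int = 8000) -> Dict[str, str]:
    """Connection settings for the converter's SSE endpoint."""
    return {"url": f"http://{host}:{port}/sse", "transport": "sse"}


# "pandoc_converter" is an arbitrary label picked for this sketch.
SERVERS = {"pandoc_converter": sse_connection()}

# Arguments for the convert_document call in the exchange above,
# written with the documented parameter names.
EXAMPLE_TOOL_CALL = {
    "input_file_path": "/data/report.md",
    "output_file_path": "/data/report.pdf",
    "to_format": "pdf",
}
```

With a client library installed, a mapping shaped like SERVERS is the kind of value supplied when constructing the client; check that library's documentation for the exact constructor and for how tools are loaded into an agent.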
pandoc_mcp_server.py: The main Python script for the MCP server.
Dockerfile: Instructions for building the Docker container image.
requirements.txt: Python dependencies needed inside the Docker container (or local venv).
.gitignore: Specifies intentionally untracked files for Git.
README.md: This file.

Contributions are welcome! Please feel free to submit a Pull Request or open an Issue.