GroupDocs.Merger for Python via .NET provides a comprehensive set of document manipulation features across 70+ supported formats — Microsoft Office, PDF, OpenDocument, images, Visio diagrams, eBooks, archives, and more. It runs entirely on-premise, requires no third-party office applications, and ships as a pre-built wheel on Windows, Linux, and macOS.
Merge
The core capability is combining multiple documents into one. You can merge entire documents or select specific pages or page ranges from each source document. Format-specific join options let you control how the documents are stitched together, for example continuous section breaks for Word documents or bookmark preservation for PDFs.
To merge specific pages from each source, see Merge PDF or any per-format merge page.
Word-specific options (WordJoinOptions) and PDF-specific options (PdfJoinOptions) are documented per format in Merge Files.
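The page selection described above can be sketched in plain Python. This is only an illustration of how whole documents, page ranges, and explicit page lists combine into one output order. It does not call GroupDocs.Merger, and the names `select_pages` and `plan_merge` are hypothetical helpers, not part of the library.

```python
def select_pages(page_count, selection=None):
    """Resolve a selection into a list of 1-based page numbers.

    selection is None (all pages), a (start, end) tuple (inclusive range),
    or a list of explicit page numbers. Hypothetical helper for illustration.
    """
    if selection is None:
        pages = list(range(1, page_count + 1))
    elif isinstance(selection, tuple):
        start, end = selection
        pages = list(range(start, end + 1))
    else:
        pages = list(selection)
    for page in pages:
        if not 1 <= page <= page_count:
            raise ValueError(f"page {page} is outside 1..{page_count}")
    return pages


def plan_merge(sources):
    """Return the (source name, page number) pairs in merged output order."""
    plan = []
    for name, page_count, selection in sources:
        plan.extend((name, page) for page in select_pages(page_count, selection))
    return plan


merge_plan = plan_merge([
    ("a.pdf", 3, None),      # whole document
    ("b.pdf", 5, (2, 3)),    # page range 2..3
    ("c.pdf", 4, [4, 1]),    # explicit pages, in the order given
])
```

Sources are appended in the order they are listed, and an out-of-range page raises an error before any output is planned.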
Split
Split a document into a collection of smaller documents. You can emit one file per listed page, split by interval (every N pages), or split a plain-text file by line numbers.
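The three split modes above can be pictured with a small plain-Python sketch. It shows only the grouping arithmetic and does not use the GroupDocs.Merger API. All function names are hypothetical, and the text-splitting rule (each listed line number starts a new part) is one assumed reading of "by line numbers"; the library's actual behavior may differ.

```python
def split_by_pages(pages):
    """One output document per listed page number."""
    return [[page] for page in pages]


def split_by_interval(total_pages, every_n):
    """Chunk pages 1..total_pages into consecutive groups of at most every_n."""
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    return [
        list(range(start, min(start + every_n, total_pages + 1)))
        for start in range(1, total_pages + 1, every_n)
    ]


def split_text_by_lines(text, line_numbers):
    """Split text so that each listed 1-based line number starts a new part.

    Assumed interpretation, for illustration only.
    """
    lines = text.splitlines(keepends=True)
    starts = sorted({n for n in line_numbers if 1 < n <= len(lines)})
    bounds = [1] + starts + [len(lines) + 1]
    return ["".join(lines[a - 1:b - 1]) for a, b in zip(bounds, bounds[1:])]


per_page = split_by_pages([2, 5])        # two single-page outputs
chunks = split_by_interval(10, 4)        # pages grouped 4 at a time
parts = split_text_by_lines("a\nb\nc\nd\n", [3])
```

With a 10-page document and an interval of 4, the last chunk is shorter than the others because only two pages remain.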
Document Information
Read metadata from a document without modifying it: file type, page count, page dimensions, visibility flags, and file size. Enumerate all formats supported at runtime via FileType.get_supported_file_types().
Page Preview
Generate raster image previews (PNG, JPEG, or BMP) of individual document pages. Previews are useful for displaying document thumbnails in a UI, validating page content before merging, or feeding page images into a vision model.
See Page Preview for a runnable example using the file-stream callback pattern.
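For readers unfamiliar with the file-stream callback pattern mentioned above, here is a minimal, library-free sketch of the idea: the caller supplies one function that creates a writable stream for each page and another that is called once the page has been written. `render_previews` is a hypothetical stand-in for a renderer and writes placeholder bytes rather than real image data. The callback names and signatures are assumptions, not the GroupDocs.Merger API.

```python
import io

previews = {}  # page number -> bytes captured when the stream is released


def create_page_stream(page_number):
    """Return a writable binary stream for the given page (in memory here)."""
    return io.BytesIO()


def release_page_stream(page_number, stream):
    """Collect the written bytes, then close the stream."""
    previews[page_number] = stream.getvalue()
    stream.close()


def render_previews(page_numbers, create_stream, release_stream):
    """Hypothetical renderer: asks the caller for a stream per page."""
    for page in page_numbers:
        stream = create_stream(page)
        try:
            stream.write(b"PAGE-%d" % page)  # placeholder for image bytes
        finally:
            release_stream(page, stream)


render_previews([1, 3], create_page_stream, release_page_stream)
```

To write previews to disk instead, the create callback could return a file opened in `"wb"` mode with a per-page file name. The renderer never needs to know where the bytes end up.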
Loading Documents
GroupDocs.Merger accepts documents from local disk, binary streams (file handles, in-memory buffers), and password-protected files via LoadOptions.
Agents and LLM Integration
GroupDocs.Merger is designed to be a first-class building block for AI document pipelines. The groupdocs-merger-net pip package ships an AGENTS.md file inside the wheel so AI coding assistants can discover the API surface automatically, and GroupDocs runs a public MCP server for on-demand documentation lookups.
See Agents and LLM Integration for the full story — including a runnable pipeline that merges PDFs and extracts a page subset for downstream AI processing.
On-Premise Deployment
No cloud calls, no outbound network traffic, no Microsoft Office or Adobe Acrobat installation required. The wheel is self-contained on Windows and ships its own native runtime libraries on Linux and macOS. See System Requirements for the short list of optional native packages (libgdiplus, libfontconfig1).