GroupDocs.Merger for Python via .NET is designed to work seamlessly with AI agents, LLMs, and automated code generation tools. The library ships machine-readable documentation in multiple formats — including an AGENTS.md file inside the pip package itself — so that AI assistants can discover and use the API without manual guidance.
MCP server
GroupDocs provides an MCP (Model Context Protocol) server that enables LLMs to query the documentation on demand instead of loading it all at once. This saves tokens and lets your AI assistant fetch only the information it needs for the current task.
To connect your AI tool to the MCP server, add the GroupDocs endpoint to your MCP configuration:
// Claude Code: ~/.claude/settings.json (or project .mcp.json)
// Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json
{"mcpServers":{"groupdocs-docs":{"url":"https://docs.groupdocs.com/mcp"}}}
// .cursor/mcp.json in your project root
{"mcpServers":{"groupdocs-docs":{"url":"https://docs.groupdocs.com/mcp"}}}
// .vscode/mcp.json in your project root
{"servers":{"groupdocs-docs":{"url":"https://docs.groupdocs.com/mcp"}}}
// Any MCP-compatible client
{"mcpServers":{"groupdocs-docs":{"url":"https://docs.groupdocs.com/mcp"}}}
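If you would rather script this step than edit the file by hand, the sketch below merges the entry into an existing configuration file. It assumes the file is plain JSON without comments (the `//` lines above are labels, not part of the file). `add_mcp_server` is a hypothetical helper name, not part of any GroupDocs or MCP tooling.

```python
import json
from pathlib import Path

# Endpoint and server name taken from the configuration snippets above.
GROUPDOCS_MCP_URL = "https://docs.groupdocs.com/mcp"


def add_mcp_server(config_path, top_level_key="mcpServers",
                   name="groupdocs-docs", url=GROUPDOCS_MCP_URL):
    """Add or update one MCP server entry, keeping existing settings intact."""
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from an empty object.
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault(top_level_key, {})[name] = {"url": url}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For `.vscode/mcp.json`, pass `top_level_key="servers"` to match the VS Code snippet above.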
The groupdocs-merger-net pip package includes an AGENTS.md file at groupdocs/merger/AGENTS.md. AI coding assistants that scan installed packages (such as Claude Code, Cursor, GitHub Copilot) can automatically discover the API surface, usage patterns, and troubleshooting tips.
After installing the package, locate the file with:
pip show -f groupdocs-merger-net | grep AGENTS
The full content of that file is reproduced in the AGENTS.md reference section below.
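Where `grep` is not available (for example in a Windows shell), the same lookup can be sketched with the standard library's `importlib.metadata` (Python 3.8+). `find_packaged_file` is a hypothetical helper; it returns an empty list when the distribution is not installed.

```python
from importlib import metadata


def find_packaged_file(distribution, filename):
    """Return absolute paths of files named `filename` in an installed distribution."""
    try:
        files = metadata.files(distribution)
    except metadata.PackageNotFoundError:
        return []  # the distribution is not installed in this environment
    # metadata.files() returns None when the installer recorded no file list.
    return [str(f.locate()) for f in (files or []) if f.name == filename]


if __name__ == "__main__":
    for path in find_packaged_file("groupdocs-merger-net", "AGENTS.md"):
        print(path)
```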
Machine-readable documentation
Every documentation page is available as a plain Markdown file that AI tools can fetch and process directly.
Point your AI assistant to the full documentation file for comprehensive context:
Fetch https://docs.groupdocs.com/merger/python-net/llms-full.txt and use it
as a reference for GroupDocs.Merger for Python via .NET API.
Or reference individual pages for focused tasks:
Read https://docs.groupdocs.com/merger/python-net/getting-started/quick-start-guide.md
and help me merge two PDF files in Python.
Why GroupDocs.Merger is a good building block for AI document pipelines
AI document pipelines often need to assemble, restructure, or subset documents before feeding them to a model or storing them in a vector database. GroupDocs.Merger covers these structural manipulation steps:
Merge multiple PDFs — consolidate retrieved or generated report sections into a single document for archiving or further processing.
Extract a page subset — pull the relevant pages from a large source document to reduce the context window needed by a downstream model.
Split — divide a large document into page-level chunks for per-chunk embedding or classification.
Rotate / reorder — normalise page orientation before OCR or vision-model ingestion.
A typical AI pipeline step that merges several PDFs and then extracts a focused subset:
from groupdocs.merger import Merger
from groupdocs.merger.domain.options import ExtractOptions


def assemble_and_subset_pdfs():
    """Merge multiple PDF sections, then extract a focused page subset
    for downstream AI processing (embedding, classification, etc.)."""

    # Step 1: Merge three PDF sections into one consolidated document
    with Merger("./section_intro.pdf") as merger:
        # Append the methodology section
        merger.join("./section_methods.pdf")
        # Append the results section
        merger.join("./section_results.pdf")
        # Save the consolidated document
        merger.save("./consolidated.pdf")

    # Step 2: Extract only the pages relevant to the AI task (pages 2-4)
    with Merger("./consolidated.pdf") as merger:
        # Keep only pages 2, 3, and 4; discard boilerplate intro/conclusion
        merger.extract_pages(ExtractOptions([2, 3, 4]))
        # Save the focused subset ready for embedding or LLM ingestion
        merger.save("./ai_input_subset.pdf")


if __name__ == "__main__":
    assemble_and_subset_pdfs()
section_intro.pdf is a sample file used in this example. Click here to download it.
For end-to-end examples covering every documented scenario — including merging by format, splitting, security, and page-level operations — see the Developer Guide. Every code block on those pages has a runnable counterpart in the examples repository.
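For the chunking use case listed above, a pipeline usually needs to decide which pages go into which chunk before calling the library. The following is a minimal pure-Python sketch of that planning step. `page_windows` and its `overlap` parameter are illustrative names, not part of GroupDocs.Merger.

```python
def page_windows(page_count, window_size, overlap=0):
    """Return lists of 1-based page numbers, one list per chunk."""
    if page_count < 0 or window_size < 1 or not 0 <= overlap < window_size:
        raise ValueError("need page_count >= 0, window_size >= 1, 0 <= overlap < window_size")
    step = window_size - overlap
    windows = []
    for start in range(1, page_count + 1, step):
        end = min(start + window_size - 1, page_count)
        windows.append(list(range(start, end + 1)))
        if end == page_count:
            break  # the last page is covered, so stop before emitting a redundant tail
    return windows
```

Each returned list has the same shape as the `[2, 3, 4]` argument passed to `ExtractOptions` in the example above. Because extraction changes the loaded document, a pipeline would presumably open a fresh `Merger` for each window.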
AGENTS.md reference
The content below is the same AGENTS.md file that ships inside the groupdocs-merger-net package. Copy it into your project as AGENTS.md or point your AI assistant to this page.
# GroupDocs.Merger for Python via .NET -- AGENTS.md
> Instructions for AI agents working with this package.
Merge, split, reorder, swap, move, rotate, and extract pages across documents -- Word, Excel, PowerPoint, PDF, Visio, images, eBooks, email, and text formats, all without MS Office or any external software installed. Add, remove, or change passwords, change page orientation, and render page previews through one unified API.
## Install
```bash
pip install groupdocs-merger-net
```

**Python**: 3.5 - 3.14 | **Platforms**: Windows, Linux, macOS
## Resources
| Resource | URL |
|---|---|
| Documentation | https://docs.groupdocs.com/merger/python-net/ |
| LLM-optimized docs | https://docs.groupdocs.com/merger/python-net/llms-full.txt |
| API reference | https://reference.groupdocs.com/merger/python-net/ |
| Code examples | https://docs.groupdocs.com/merger/python-net/developer-guide/ |
| Release notes | https://releases.groupdocs.com/merger/python-net/release-notes/ |
| PyPI | https://pypi.org/project/groupdocs-merger-net/ |
| Free support forum | https://forum.groupdocs.com/c/merger/ |
| Temporary license | https://purchase.groupdocs.com/temporary-license |
## MCP Server
If your environment has MCP configured, you can connect your AI tool to the GroupDocs documentation server for on-demand API lookups:
```json
{"mcpServers":{"groupdocs-docs":{"url":"https://docs.groupdocs.com/mcp"}}}
```

Works with Claude Code (`~/.claude/settings.json`), Cursor (`.cursor/mcp.json`), VS Code Copilot (`.vscode/mcp.json`, which uses `servers` as the top-level key instead of `mcpServers`), and any MCP-compatible client. If MCP is unavailable, fall back to the LLM-optimized docs URL above and this file -- both are shipped inside the wheel.
## Imports
```python
from groupdocs.merger import (
    License,
    Metered,
    Merger,
    MergerSettings,
)
from groupdocs.merger.domain import FileType
from groupdocs.merger.domain.options import (
    # Join
    JoinOptions, PageJoinOptions, ImageJoinOptions, ImageJoinMode,
    WordJoinOptions, WordJoinMode, PdfJoinOptions,
    # Split
    SplitOptions, SplitMode, TextSplitOptions, TextSplitMode,
    # Page operations
    ExtractOptions, RemoveOptions, SwapOptions, MoveOptions,
    RotateOptions, RotateMode, OrientationOptions, OrientationMode, RangeMode,
    # Security
    AddPasswordOptions, UpdatePasswordOptions,
    PdfSecurityOptions, PdfSecurityPermissions,
    # Load / save / preview
    LoadOptions, SaveOptions, PdfSaveOptions, PreviewOptions, PreviewMode,
)
from groupdocs.merger.domain.result import IDocumentInfo, IPageInfo
from groupdocs.merger.exceptions import (
    GroupDocsMergerException,
    FileCorruptedException,
    IncorrectPasswordException,
    PasswordRequiredException,
    FileTypeNotSupportedException,
)
from groupdocs.merger.logging import ConsoleLogger
```

## Load + Operate + Save (the core workflow)
`Merger` is the entry point. The flow is always: **open → one or more operations → `save()`**. Operations mutate the in-memory document, so you can chain several before a single `save`. Use `Merger` as a context manager so the native document handle is released.
```python
from groupdocs.merger import Merger

with Merger("document1.docx") as merger:
    merger.join("document2.docx")  # append a second DOCX
    merger.save("merged.docx")     # write the result
```

**Merger constructor.** `Merger(file_path)`, `Merger(stream)`, optionally with `load_options` and/or `MergerSettings`: `Merger(file_path, LoadOptions(...))`, `Merger(stream, LoadOptions(...), MergerSettings(...))`. Pass `LoadOptions(password=...)` to open a protected source.
**`merger.save(file_path_or_stream[, save_options])`** writes the current state to disk or a stream. Saving does not consume the `Merger` — you may continue operating and save again.
## Operations
### Merge / join documents
`join` appends another document to the one already loaded. The documents should share a format family (DOCX+DOCX, PDF+PDF, …). Pass a path or a binary stream; optional `*JoinOptions` tune the result.
```python
from groupdocs.merger import Merger
from groupdocs.merger.domain.options import PageJoinOptions, WordJoinOptions, WordJoinMode

with Merger("source.docx") as merger:
    merger.join("appendix.docx")                        # whole document
    merger.join("notes.docx", PageJoinOptions([1, 3]))  # only pages 1 and 3
    merger.save("combined.docx")

# Word-specific: continuous join, no section breaks
with Merger("a.docx") as merger:
    wj = WordJoinOptions()
    wj.mode = WordJoinMode.CONTINUOUS
    merger.join("b.docx", wj)
    merger.save("continuous.docx")
```

`ImageJoinOptions(FileType.PNG, ImageJoinMode.VERTICAL)` stacks images; `PdfJoinOptions` exposes `use_bookmarks` / `preserve_accessibility`.
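When the list of inputs is only known at run time, the open, join, save flow can be wrapped in a small helper. The sketch below is an illustration, not library code: `merge_all` is a hypothetical name, the `Merger` class is passed in as a parameter so the helper stays testable without the package, and the extension check is only a rough stand-in for "same format family".

```python
from pathlib import Path


def merge_all(paths, output_path, merger_cls):
    """Join `paths` in order into `output_path` using a Merger-like class."""
    if not paths:
        raise ValueError("at least one input document is required")
    # Assumption: a shared file extension approximates "same format family".
    extensions = {Path(p).suffix.lower() for p in paths}
    if len(extensions) > 1:
        raise ValueError(f"mixed formats: {sorted(extensions)}")
    with merger_cls(paths[0]) as merger:  # open the first document
        for extra in paths[1:]:
            merger.join(extra)            # append each remaining document
        merger.save(output_path)          # write once, after all joins
    return output_path
```

With the real library this would be called as `merge_all(["a.pdf", "b.pdf"], "out.pdf", Merger)`.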
### Split a document
`split` writes each output via a path-format template (`{0}` is substituted with the index). `SplitOptions` works for paged documents; `TextSplitOptions` splits plain text by lines or intervals.
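After a split has run, a pipeline typically needs the list of files that were produced. The sketch below finds them by turning the `{0}` placeholder into a wildcard, which avoids guessing what index values the library substitutes. `collect_split_outputs` is a hypothetical helper that uses only the standard library.

```python
import glob


def collect_split_outputs(path_format):
    """Return the files on disk that match a split template, sorted by name."""
    # Escape glob metacharacters in the template, then turn {0} into a wildcard.
    pattern = glob.escape(path_format).replace("{0}", "*")
    return sorted(glob.glob(pattern))
```

The sort is lexicographic, so `page_10.pdf` comes before `page_2.pdf`; sort on the parsed number instead if order matters.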
```python
from groupdocs.merger import Merger
from groupdocs.merger.domain.options import SplitOptions, SplitMode

with Merger("multipage.pdf") as merger:
    # one output file per listed page
    merger.split(SplitOptions("page_{0}.pdf", [1, 3, 5]))

with Merger("multipage.pdf") as merger:
    # split into intervals
    merger.split(SplitOptions("part_{0}.pdf", [2, 4], split_mode=SplitMode.INTERVAL))
```

### Page operations
All page numbers are **1-based**. `RangeMode.ODD_PAGES` / `EVEN_PAGES` narrow a start/end range.
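To make the range modes concrete, the pure-Python sketch below lists which page numbers a start/end range would cover. It is an illustration only and does not call the library. It assumes that `end_number` is inclusive and that odd/even refers to the page number itself rather than its position within the range; neither is stated above. `expand_range` is a hypothetical name, and the modes are passed as plain strings.

```python
def expand_range(start_number, end_number, mode="ALL_PAGES"):
    """List the 1-based page numbers a start/end range covers under a range mode."""
    if start_number < 1 or end_number < start_number:
        raise ValueError("need 1 <= start_number <= end_number")
    pages = range(start_number, end_number + 1)  # assumption: end is inclusive
    if mode == "ODD_PAGES":
        return [p for p in pages if p % 2 == 1]
    if mode == "EVEN_PAGES":
        return [p for p in pages if p % 2 == 0]
    if mode == "ALL_PAGES":
        return list(pages)
    raise ValueError(f"unknown mode: {mode!r}")
```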
```python
from groupdocs.merger import Merger
from groupdocs.merger.domain.options import (
    ExtractOptions, RemoveOptions, SwapOptions, MoveOptions,
    RotateOptions, RotateMode, OrientationOptions, OrientationMode, RangeMode,
)

with Merger("doc.pdf") as merger:
    merger.extract_pages(ExtractOptions([1, 2, 3]))  # keep only these pages
    merger.save("first_three.pdf")

with Merger("doc.pdf") as merger:
    merger.remove_pages(RemoveOptions([2]))  # drop page 2
    merger.swap_pages(SwapOptions(1, 3))     # swap pages 1 and 3
    merger.move_page(MoveOptions(4, 1))      # move page 4 to position 1
    merger.rotate(RotateOptions(RotateMode.ROTATE90, [1]))
    merger.change_orientation(
        OrientationOptions(OrientationMode.LANDSCAPE, start_number=1, end_number=2)
    )
    merger.save("reordered.pdf")

# extract a range, even pages only
ExtractOptions(start_number=1, end_number=10, mode=RangeMode.EVEN_PAGES)
```

### Password protection
```python
from groupdocs.merger import Merger
from groupdocs.merger.domain.options import (
    AddPasswordOptions, UpdatePasswordOptions, LoadOptions,
)

# Add a password
with Merger("doc.pdf") as merger:
    merger.add_password(AddPasswordOptions("secret"))
    merger.save("protected.pdf")

# Open a protected file, change or remove its password
with Merger("protected.pdf", LoadOptions(password="secret")) as merger:
    if merger.is_password_set():
        merger.update_password(UpdatePasswordOptions("new-secret"))
        merger.save("rekeyed.pdf")

with Merger("protected.pdf", LoadOptions(password="secret")) as merger:
    merger.remove_password()
    merger.save("unprotected.pdf")
```

### Page builder
`create_page_builder` / `apply_page_builder` assemble a new document from selected pages of multiple loaded documents.
### Page preview
`generate_preview(PreviewOptions(...))` renders pages to images (PNG / JPEG / BMP). Pass a plain Python `create_page_stream(page_number)` callable that returns a writable stream for each page — the binding wraps it as the .NET delegate automatically.
```python
from groupdocs.merger import Merger
from groupdocs.merger.domain.options import PreviewOptions, PreviewMode

def create_page_stream(page_number):
    return open(f"page-{page_number}.png", "wb")  # return a file/path stream

with Merger("doc.pdf") as merger:
    merger.generate_preview(PreviewOptions(create_page_stream, PreviewMode.PNG, [1, 2]))
```

The same page-stream callback pattern powers the stream-callback overloads of `split` -- pass a `create_split_stream(number)` callable instead of a path-format template:
```python
from groupdocs.merger import Merger
from groupdocs.merger.domain.options import SplitOptions, SplitMode

def create_split_stream(number):
    return open(f"chunk-{number}.pdf", "wb")

with Merger("multipage.pdf") as merger:
    merger.split(SplitOptions(create_split_stream, [1, 2], SplitMode.PAGES))
```

> **Return a file/path stream, not `io.BytesIO()`, from these callbacks.** The engine writes into the stream you return, but in-memory `BytesIO` returned from a *callback* is not copied back to Python (unlike `save(BytesIO)`). Use `open(path, "wb")` (or any path-backed stream) so the rendered bytes land on disk.
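Because the callbacks hand back open file objects, the caller needs some way to close them and find the resulting paths afterwards. The sketch below is one way to do that with the standard library. `FileStreamCollector` is a hypothetical class, and whether the binding closes the streams itself is not stated above, so `close_all` tolerates streams that are already closed.

```python
import os


class FileStreamCollector:
    """Callable that opens one file per requested number and remembers it."""

    def __init__(self, directory, name_format="page-{0}.png"):
        self.directory = directory
        self.name_format = name_format
        self._streams = {}  # number -> open binary file object

    def __call__(self, number):
        os.makedirs(self.directory, exist_ok=True)
        path = os.path.join(self.directory, self.name_format.format(number))
        stream = open(path, "wb")  # path-backed stream, not BytesIO
        self._streams[number] = stream
        return stream

    def close_all(self):
        """Close every stream handed out and return {number: path}."""
        paths = {}
        for number, stream in self._streams.items():
            if not stream.closed:
                stream.close()
            paths[number] = stream.name
        self._streams.clear()
        return paths
```

An instance would be passed where `create_page_stream` or `create_split_stream` appears above, assuming the binding accepts any Python callable and not only plain functions. If it does not, wrap the instance in a `def` that forwards to it.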
### Document info (no full processing)
`get_document_info()` returns an `IDocumentInfo`: `page_count`, `size`, `type` (a `FileType`), and `pages` (a list of `IPageInfo` with `number`, `visible`, `width`, `height`).
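The page metadata is enough to drive simple pre-processing decisions, such as finding pages to rotate before OCR. The sketch below relies only on the `pages`, `number`, `width` and `height` attributes named above, so it works on any object with that shape. `landscape_pages` is a hypothetical helper.

```python
def landscape_pages(info):
    """Return the 1-based numbers of pages that are wider than they are tall."""
    return [page.number for page in info.pages if page.width > page.height]
```

The returned list has the same shape as the page-number lists accepted by the page operations.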
```python
from groupdocs.merger import Merger

with Merger("doc.pdf") as merger:
    info = merger.get_document_info()
    print("pages:", info.page_count, "size:", info.size, "type:", info.type.file_format)
    for page in info.pages:
        print(f"  page {page.number}: {page.width}x{page.height}")
```

## Licensing
```python
from groupdocs.merger import License

# From file
License().set_license("path/to/license.lic")

# From stream
with open("license.lic", "rb") as f:
    License().set_license(f)
```

Or auto-apply: `export GROUPDOCS_LIC_PATH="path/to/license.lic"`

Metered licensing is also available:
```python
from groupdocs.merger import Metered

Metered().set_metered_key("public-key", "private-key")
print(Metered().get_consumption_quantity(), Metered().get_consumption_credit())
```

**Evaluation vs licensed.** Without a license the library still runs, but PDF output carries an evaluation watermark stamp and non-PDF targets show an equivalent evaluation mark; there is also a page/document count cap. Set `GROUPDOCS_LIC_PATH` (or call `License().set_license(...)`) and re-run to clear both. A 30-day full license is free: https://purchase.groupdocs.com/temporary-license
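An automated pipeline may want to fail fast rather than silently produce watermarked output. The sketch below checks whether a license file is actually reachable before any document is processed. `resolve_license_path` is a hypothetical helper, and the precedence (explicit path first, then `GROUPDOCS_LIC_PATH`) is a choice made for this sketch, not documented library behaviour.

```python
import os


def resolve_license_path(explicit_path=None, environ=None):
    """Return the first existing license file, or None if none is reachable."""
    environ = os.environ if environ is None else environ
    for candidate in (explicit_path, environ.get("GROUPDOCS_LIC_PATH")):
        if candidate and os.path.isfile(candidate):
            return candidate
    return None
```

A `None` result means the run would fall back to evaluation mode as described above. A non-`None` result can be passed to `License().set_license(...)`.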
## API Reference
### Merger
| Method | Returns | Description |
|---|---|---|
| `Merger(file_path / stream [, load_options [, settings]])` | | Open by path or binary stream; optional `LoadOptions` and `MergerSettings`. Use as a context manager. |
| `join(file_path / stream [, join_options])` | `IMerger` | Append another document; optional `JoinOptions` / `PageJoinOptions` / `ImageJoinOptions` / `WordJoinOptions` / `PdfJoinOptions`. |
| `split(split_options)` | `IMerger` | Split into multiple files via `SplitOptions` or `TextSplitOptions` (path-format template with `{0}`). |
| `extract_pages(extract_options)` | `IMerger` | Keep only the pages named by `ExtractOptions`. |
| `remove_pages(remove_options)` | `IMerger` | Delete the pages named by `RemoveOptions`. |
| `swap_pages(swap_options)` | `IMerger` | Swap two pages (`SwapOptions(first, second)`). |
| `move_page(move_options)` | `IMerger` | Move a page to a new position (`MoveOptions(from, to)`). |
| `rotate(rotate_options)` | `IMerger` | Rotate pages (`RotateOptions(RotateMode.ROTATE90, [pages])`). |
| `change_orientation(orientation_options)` | `IMerger` | Set portrait/landscape (`OrientationOptions`). |
| `add_password(add_password_options)` | `IMerger` | Protect output with a password. |
| `is_password_set()` | `bool` | Whether the loaded document is protected. |
| `remove_password()` | `IMerger` | Strip protection. |
| `update_password(update_password_options)` | `IMerger` | Change the password. |
| `import_document(import_document_options)` | `IMerger` | Embed an OLE object / attachment (`Ole*Options`, `PdfAttachmentOptions`). |
| `create_page_builder([page_builder_options])` | `PageBuilder` | Start assembling a document from selected pages. |
| `apply_page_builder(page_builder)` | `None` | Apply a built page selection. |
| `generate_preview(preview_options)` | `None` | Render pages to images via a `create_page_stream(page_number)` callable (`PreviewMode.PNG`/`JPEG`/`BMP`). Return a file/path stream, not `BytesIO`. |
| `get_document_info()` | `IDocumentInfo` | `page_count`, `size`, `type`, `pages` (list of `IPageInfo`). |
| `save(file_path / stream [, save_options])` | `IMerger` | Write current state; optional `SaveOptions` / `PdfSaveOptions`. |
| `dispose()` | `None` | Release native resources (handled by `with`). |
### Options & enums
| Type | Notes |
|---|---|
| `JoinOptions(file_type)` / `PageJoinOptions(page_numbers, …)` | Whole-document or page-subset join. |
| `ImageJoinOptions(file_type, image_join_mode)` | `ImageJoinMode.HORIZONTAL` / `VERTICAL`. |
| `WordJoinOptions` / `PdfJoinOptions` | `WordJoinMode`, `WordJoinCompliance`; `use_bookmarks`, `preserve_accessibility`. |
| `SplitOptions(file_path_format, page_numbers, split_mode=…)` | `SplitMode.PAGES` / `INTERVAL`. |
| `TextSplitOptions(file_path_format, line_numbers, mode=…)` | `TextSplitMode.LINES` / `INTERVAL`. |
| `ExtractOptions` / `RemoveOptions` / `RotateOptions` / `OrientationOptions` | Accept `page_numbers=[…]` or `start_number`/`end_number` + `RangeMode`. |
| `SwapOptions(first_page_number, second_page_number)` | 1-based page numbers. |
| `MoveOptions(page_number_to_move, new_page_number)` | 1-based page numbers. |
| `RotateMode` | `ROTATE90`, `ROTATE180`, `ROTATE270`. |
| `OrientationMode` | `PORTRAIT`, `LANDSCAPE`. |
| `RangeMode` | `ALL_PAGES`, `ODD_PAGES`, `EVEN_PAGES`. |
| `AddPasswordOptions(password)` / `UpdatePasswordOptions(new_password)` | Document password management. |
| `PdfSecurityOptions(password)` | `owner_password`, `permissions` (`PdfSecurityPermissions`). |
| `LoadOptions(file_type, password, encoding, extension, …)` | Force type / decode protected or extension-less input. |
| `SaveOptions` / `PdfSaveOptions` | `PdfSaveOptions.accesibility_settings` for tagged PDF. |
| `PreviewOptions(create_page_stream, preview_mode, …)` | `PreviewMode.PNG` / `JPEG` / `BMP`. |
### License / Metered
`License().set_license(path_or_stream)` · `Metered().set_metered_key(public, private)` · `Metered().get_consumption_quantity()` · `Metered().get_consumption_credit()`

## Key Patterns
- **Properties**: use `snake_case` -- auto-mapped to .NET `PascalCase`
- **Context managers**: `with Merger(...) as m:` ensures the document handle is released
- **Chaining**: operations mutate in place; run several before one `save()`
- **1-based pages**: every page number argument starts at 1, not 0
- **Path templates**: `split` outputs use a `{0}` placeholder, e.g. `"page_{0}.pdf"`
- **Streams**: pass `open("file", "rb")` or `io.BytesIO(data)` where .NET expects a Stream; `BytesIO` is updated after `save(stream)`
- **Enums**: case-insensitive, lazy-loaded (e.g., `FileType.PDF`, `RotateMode.ROTATE90`)
- **Collections**: `for page in info.pages` and `len(...)` work on .NET collections
- **Exceptions**: catch `PasswordRequiredException` / `IncorrectPasswordException` / `FileCorruptedException` / `FileTypeNotSupportedException` (all subclass `GroupDocsMergerException`)
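One way to apply the exception pattern is a small retry loop over candidate passwords. The sketch below is generic: the function that opens the document and the exception classes to retry on are both passed in, so nothing here is library code. `open_with_passwords` is a hypothetical name, and the sketch assumes the password exceptions are raised when the opening callable runs.

```python
def open_with_passwords(open_document, passwords, password_errors):
    """Try each password in turn; return the first successfully opened document."""
    last_error = None
    for password in passwords:
        try:
            return open_document(password)
        except password_errors as error:  # tuple of exception classes
            last_error = error            # wrong or missing password: try the next
    if last_error is None:
        raise ValueError("no passwords supplied")
    raise last_error
```

With the library this might be called as `open_with_passwords(lambda pw: Merger(path, LoadOptions(password=pw)), candidates, (PasswordRequiredException, IncorrectPasswordException))`.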
## Platform Requirements
| Platform | Requirements |
|---|---|
| Windows | None |
| Linux | `apt install libgdiplus libfontconfig1 ttf-mscorefonts-installer` |
| macOS | `brew install mono-libgdiplus` |
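A quick preflight check for the native libraries in the table can be sketched with `ctypes.util.find_library`. The mapping from platform to library names is derived from the table above; `missing_native_libraries` and `REQUIRED_LIBRARIES` are hypothetical names. Treat a non-empty result as a hint rather than proof, since `find_library` may not search every directory that the runtime loader does (for example, some Homebrew prefixes).

```python
import sys
from ctypes.util import find_library

# Library names derived from the packages in the table above (an approximation).
REQUIRED_LIBRARIES = {
    "linux": ("gdiplus", "fontconfig"),
    "darwin": ("gdiplus",),
}


def missing_native_libraries(platform=sys.platform):
    """Return the expected native libraries that find_library cannot locate."""
    names = ()
    for prefix, libraries in REQUIRED_LIBRARIES.items():
        if platform.startswith(prefix):
            names = libraries
    return [name for name in names if find_library(name) is None]
```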
## Troubleshooting
**`PasswordRequiredException` / `IncorrectPasswordException`** -- the source is encrypted. Open it with `Merger(path, LoadOptions(password="..."))`.
**`FileTypeNotSupportedException` when joining** -- the two documents are different format families. Join documents of the same family, or convert first.
**`System.Drawing.Common is not supported`** -- install libgdiplus: `sudo apt install libgdiplus` (Linux) / `brew install mono-libgdiplus` (macOS)
**`Gdip` type initializer exception** -- outdated libgdiplus: `brew reinstall mono-libgdiplus` (macOS)
**Garbled text / missing fonts** -- install fonts: `sudo apt install ttf-mscorefonts-installer fontconfig && sudo fc-cache -f`
**`DllNotFoundException: libSkiaSharp`** -- stale system copy conflicts with bundled version. Rename it: `sudo mv /usr/local/lib/libSkiaSharp.dylib /usr/local/lib/libSkiaSharp.dylib.bak`
**`DOTNET_SYSTEM_GLOBALIZATION_INVARIANT` errors** -- do NOT set this. Install ICU: `sudo apt install libicu-dev`
**`TypeLoadException`** -- reinstall: `pip install --force-reinstall groupdocs-merger-net`
**Still stuck?** Post your question at https://forum.groupdocs.com/c/merger/ -- the development team responds directly.