GroupDocs.Redaction for Python via .NET Overview

What is GroupDocs.Redaction?

GroupDocs.Redaction for Python via .NET is a native Python library that permanently removes or obscures sensitive content in PDF, Microsoft Word, Excel, PowerPoint, and image files through a single, format-independent API. It runs entirely on-premise, requires no Microsoft Office or Adobe Acrobat installation, and ships as a pre-built wheel for Windows, Linux, and macOS.

Typical uses include:

PII / PHI removal — strip names, SSNs, emails, and other personal data from a document before it is shared, published, or archived (GDPR, HIPAA, CCPA).
Legal & e-discovery redaction — black out privileged phrases and annotations across every page of a production set.
Metadata sanitization — erase or rewrite author, company, and other hidden metadata that leaks information.
Irreversible redaction — rasterize the cleaned document to a PDF so the removed content can never be recovered.
Policy-driven batch redaction — define a reusable set of redaction rules once and apply it across many documents in a pipeline.
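
As a rough illustration of the PII / PHI removal use case above, the sketch below uses only Python's standard re module to try out candidate patterns on sample text before they are handed to a regex-based redaction. The PII_PATTERNS table, the patterns themselves, and the mask_pii helper are assumptions made for this example and are not part of GroupDocs.Redaction. Real SSN and email formats vary, so treat the patterns as a starting point.

```python
import re

# Hypothetical, deliberately simple patterns; tune them for your own data.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def mask_pii(text: str) -> str:
    """Replace each match with a bracketed label such as [ssn]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(mask_pii("Contact jane.roe@example.com, SSN 123-45-6789."))
```

Once a pattern behaves as expected on plain text, the same pattern string can be tried in a regex redaction on a copy of a real document.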

Key Capabilities

Capability	Description
Text Redaction	Replace or black out text that matches an exact phrase (optionally case-sensitive or right-to-left aware) or a regular expression. See Text Redactions.
Metadata Redaction	Erase metadata wholesale or by filter, or rewrite values that match a pattern. See Metadata Redactions.
Image Redaction	Black out a rectangular area of an image or scanned page and clean embedded image metadata. See Image Redactions.
Annotation Redaction	Rewrite or delete annotations, comments, and notes by pattern. See Annotation Redactions.
Page Removal	Remove whole pages, slides, or worksheets from a document. See Remove Page Redactions.
Rasterization & Saving	Save in the original format, or rasterize to a PDF (optionally PDF/A) so redactions are irreversible. See Saving Documents.
Redaction Policies	Bundle several redactions into a reusable policy and apply it across many documents. See Use Redaction Policies.
Document Inspection	Read file type, page count, and size without modifying the document. See Get File Info.
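
The batch workflow described in the Redaction Policies row can be approximated in plain Python by defining the per-document routine once and looping over files. The sketch below is only an illustration under stated assumptions: redact_folder and the redact_one callback are hypothetical names, the extension list is an arbitrary choice, and the callback is where a routine built on Redactor would be plugged in. It does not use the library's own policy classes.

```python
from pathlib import Path
from typing import Callable, Iterable


def redact_folder(
    folder: Path,
    redact_one: Callable[[Path], None],
    extensions: Iterable[str] = (".docx", ".pdf", ".xlsx", ".pptx"),
) -> dict:
    """Run redact_one on every matching file and report a per-file outcome."""
    wanted = {ext.lower() for ext in extensions}
    results = {}
    for path in sorted(folder.iterdir()):
        # Skip subfolders and files with other extensions
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        try:
            redact_one(path)  # hypothetical hook: open, apply rules, save
            results[path.name] = "ok"
        except Exception as exc:  # keep going so one bad file does not stop the batch
            results[path.name] = f"failed: {exc}"
    return results
```

Passing the routine in as a callback keeps the loop testable without any documents: a stub function can stand in for the real redaction step.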

Quick Example

Redact every occurrence of a phrase and save the result with just a few lines of code. The example rasterizes the result to a PDF named sample_redacted.pdf, so the removed content cannot be recovered:

redact_text.py

from groupdocs.redaction import Redactor
from groupdocs.redaction.options import SaveOptions
from groupdocs.redaction.redactions import ExactPhraseRedaction, ReplacementOptions

def redact_text():
    # Open the document
    with Redactor("./sample.docx") as redactor:
        # Replace every occurrence of "John Doe" with "[personal]"
        redactor.apply(ExactPhraseRedaction("John Doe", ReplacementOptions("[personal]")))
        # Rasterize the result to a PDF named sample_redacted.pdf
        save_options = SaveOptions()
        save_options.add_suffix = True
        save_options.rasterize_to_pdf = True
        save_options.redacted_file_suffix = "redacted"
        redactor.save(save_options)

if __name__ == "__main__":
    redact_text()

Input file: sample.docx, the sample file used in this example.

Output file: sample_redacted.pdf (PDF, 1.0 MB).

For finer control, apply several redactions and keep the original format by setting rasterize_to_pdf to False on the SaveOptions object:

redact_with_options.py

from groupdocs.redaction import Redactor
from groupdocs.redaction.redactions import ExactPhraseRedaction, RegexRedaction, ReplacementOptions
from groupdocs.redaction.options import SaveOptions

def redact_with_options():
    with Redactor("./sample.docx") as redactor:
        # Redact a name and every run of two or more digits
        redactor.apply(ExactPhraseRedaction("John Doe", ReplacementOptions("[personal]")))
        redactor.apply(RegexRedaction(r"\d{2,}", ReplacementOptions("[number]")))

        # Keep the original DOCX format instead of rasterizing to PDF
        options = SaveOptions()
        options.add_suffix = True
        options.rasterize_to_pdf = False
        options.redacted_file_suffix = "redacted"
        redactor.save(options)

if __name__ == "__main__":
    redact_with_options()

Input file: sample.docx, the sample file used in this example.

Output file: sample_redacted.docx (DOCX, 16 KB).
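
Before running a pattern such as \d{2,} against real documents, it can help to preview what it matches. The sketch below uses Python's re module on a plain string. It is an approximation only: it assumes the library evaluates patterns with the .NET regex engine, which can differ from Python's in edge cases. The preview helper and the sample text are hypothetical.

```python
import re


def preview(pattern: str, replacement: str, text: str):
    """Return the matched substrings and the text after substitution."""
    compiled = re.compile(pattern)
    matches = [m.group(0) for m in compiled.finditer(text)]
    return matches, compiled.sub(replacement, text)


if __name__ == "__main__":
    found, result = preview(r"\d{2,}", "[number]", "Room 7, invoice 20431, ext. 55")
    print(found)
    print(result)
```

With this pattern a single digit such as the 7 in "Room 7" is left in place, which is worth checking against what the redaction is meant to cover.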

Where to next

Install the package — Installation walks through PyPI and offline wheel installation for Windows, Linux, and macOS.
Run your first redaction — Hello, World! redacts a document in under five minutes.
Explore runnable examples — How to Run Examples clones the GitHub repository and runs every documented scenario locally or in Docker.
Use it in depth — the Developer Guide covers every API surface with runnable, copy-paste code examples.
Plug it into AI pipelines — AI Agents & LLM Integration explains the bundled AGENTS.md, the MCP server, and machine-readable docs.
