Redaction basics

GroupDocs.Redaction provides a set of document redaction features. It lets you apply redactions to text, metadata, annotations, and images.

A wide range of document formats is supported, including PDF, DOC, DOCX, PPT, PPTX, XLS, and XLSX. See the supported document formats article for the full list.

Redaction types

GroupDocs.Redaction comes with the following redaction types:

| Type | Description | Classes |
| --- | --- | --- |
| Text | Replaces a portion of text in the document body, or hides it with a color block | ExactPhraseRedaction, RegexRedaction |
| Metadata | Replaces metadata values with empty ones, or redacts metadata text | EraseMetadataRedaction, MetadataSearchRedaction |
| Annotations | Deletes annotations from the document, or redacts their text | DeleteAnnotationRedaction, AnnotationRedaction |
| Images | Replaces a specific area of an image with a colored box | ImageAreaRedaction |
| Pages | Removes a specific range of pages (slides, worksheets, etc.) | RemovePageRedaction |

Apply redaction

To apply a redaction to a document, call the Redactor.apply method. It returns a RedactorChangeLog instance that contains a log entry for each redaction applied. Each entry holds a reference to the Redaction instance, including its options, the status of the operation (see the status table below), and a textual description where applicable. If at least one redaction failed, the overall status is RedactionStatus.FAILED:

from groupdocs.redaction import Redactor, RedactionStatus
from groupdocs.redaction.options import SaveOptions
from groupdocs.redaction.redactions import ExactPhraseRedaction, ReplacementOptions


def apply_redaction():
    # Specify the redaction options
    repl_opt = ReplacementOptions("[personal]")
    ex_red = ExactPhraseRedaction("John Doe", repl_opt)

    # Load the document to be redacted
    with Redactor("./sample.docx") as redactor:
        # Apply the redaction
        result = redactor.apply(ex_red)

        if result.status != RedactionStatus.FAILED:
            # Save the redacted document next to the source file
            so = SaveOptions()
            so.add_suffix = True
            so.rasterize_to_pdf = False
            so.redacted_file_suffix = "redacted"
            redactor.save(so)


if __name__ == "__main__":
    apply_redaction()

sample.docx is the sample file used in this example.

All possible statuses of the RedactionStatus enumeration are listed in this table:

| Status | Description | Possible reasons |
| --- | --- | --- |
| Applied | Redaction was fully and successfully applied | All operations within the redaction process were applied successfully |
| PartiallyApplied | Redaction was applied only to a part of its matches | 1) Trial limitations for replacements were exceeded; 2) At least one change was rejected by the user |
| Skipped | Redaction was skipped (not applied) | 1) Trial limitations for redactions were exceeded; 2) Redaction cannot be applied to this type of document; 3) All replacements were rejected by the user and no changes were made |
| Failed | Redaction failed with an exception | An exception occurred during redaction |

For details, iterate through the log entries in RedactorChangeLog.redaction_log and check result.error_message on every entry whose status is anything other than Applied:

result = redactor.apply(redaction)
if result.status != RedactionStatus.FAILED:
    # By default, the redacted document is saved in PDF format
    save_options = SaveOptions()
    save_options.add_suffix = True
    save_options.rasterize_to_pdf = True
    save_options.redacted_file_suffix = "redacted"
    result_path = redactor.save(save_options)
    print(f"Document redacted successfully.\nCheck output in {result_path}")
else:
    # Dump all failed or skipped redactions
    print("Redaction failed!")
    for log_entry in result.redaction_log:
        if log_entry.result.status != RedactionStatus.APPLIED:
            print(f"Status is {log_entry.result.status}, details: {log_entry.result.error_message}")
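
The log check above runs only when the overall status is FAILED, so partially applied or skipped redactions go unreported. A minimal sketch of a helper that reports them in every case follows. It relies only on the attribute names used in the snippet above (redaction_log, result.status, result.error_message). The helper name unapplied_entries, the fake_entry stand-ins, the string statuses, and the sample messages are hypothetical and are not part of GroupDocs.Redaction.

```python
from types import SimpleNamespace


def unapplied_entries(change_log, applied_status):
    """Return (status, error_message) pairs for entries whose status differs from applied_status."""
    problems = []
    for entry in change_log.redaction_log:
        status = entry.result.status
        if status != applied_status:
            problems.append((status, entry.result.error_message))
    return problems


def fake_entry(status, message=None):
    # Stand-in that mimics only the attributes read above; not a library type.
    return SimpleNamespace(result=SimpleNamespace(status=status, error_message=message))


# Made-up change log used purely to exercise the helper without the library.
log = SimpleNamespace(redaction_log=[
    fake_entry("APPLIED"),
    fake_entry("SKIPPED", "made-up message for a skipped redaction"),
    fake_entry("PARTIALLY_APPLIED", "made-up message for a partial redaction"),
])

for status, message in unapplied_entries(log, "APPLIED"):
    print(f"Status is {status}, details: {message}")
```

With the real library, the call would presumably be unapplied_entries(result, RedactionStatus.APPLIED), made right after redactor.apply and regardless of the overall status.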

Apply multiple redactions

You can apply as many redactions as you need in a single call to Redactor.apply, since its overloads accept a list of redactions or a redaction policy. In this case, the redactions are applied in the order in which they appear in the list. As an alternative to specifying redaction sets in your code, you can create an XML file with a redaction policy.

from groupdocs.redaction import Redactor, RedactionStatus
from groupdocs.redaction.options import SaveOptions
from groupdocs.redaction.redactions import (
    ExactPhraseRedaction,
    RegexRedaction,
    ReplacementOptions,
    DeleteAnnotationRedaction,
    EraseMetadataRedaction,
    MetadataFilters,
)
from groupdocs.pydrawing import Color


def apply_multiple_redactions():
    # Define the color of the redaction box
    color = Color.from_argb(255, 220, 20, 60)

    # Provide a list of redactions to apply in order
    redaction_list = [
        ExactPhraseRedaction("John Doe", ReplacementOptions("[Client]")),
        RegexRedaction("Redaction", ReplacementOptions("[Product]")),
        RegexRedaction(r"\d{2}\s*\d{2}[^\d]*\d{6}", ReplacementOptions(color)),
        DeleteAnnotationRedaction(),
        EraseMetadataRedaction(MetadataFilters.ALL),
    ]

    # Load the document to be redacted
    with Redactor("./sample.docx") as redactor:
        # Apply the list of redactions
        result = redactor.apply(redaction_list)

        if result.status != RedactionStatus.FAILED:
            # By default, the redacted document is saved in PDF format
            save_options = SaveOptions()
            save_options.add_suffix = True
            save_options.rasterize_to_pdf = True
            save_options.redacted_file_suffix = "redacted"
            redactor.save(save_options)
        else:
            # Dump all failed or skipped redactions
            print("Redaction failed!")
            for log_entry in result.redaction_log:
                if log_entry.result.status != RedactionStatus.APPLIED:
                    print(f"Status is {log_entry.result.status}, details: {log_entry.result.error_message}")


if __name__ == "__main__":
    apply_multiple_redactions()

sample.docx is the sample file used in this example.
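
To make the ordering rule concrete, here is a library-free sketch that mimics the first three redactions in the list with Python's re module. It is an illustration under assumptions, not the library's implementation: apply_in_order, STEPS, and the sample strings are hypothetical, the colored box is imitated by masking each matched character with an asterisk, and details such as case sensitivity or differences between regex engines are ignored.

```python
import re

# Stand-ins for the first three redactions in the list: (pattern, replacement).
# The callable replacement imitates the colored box by masking every matched character.
STEPS = [
    (re.escape("John Doe"), "[Client]"),
    (r"Redaction", "[Product]"),
    (r"\d{2}\s*\d{2}[^\d]*\d{6}", lambda m: "*" * len(m.group(0))),
]


def apply_in_order(text, steps):
    """Apply each (pattern, replacement) pair to the output of the previous step."""
    for pattern, replacement in steps:
        text = re.sub(pattern, replacement, text)
    return text


# Made-up input: the digit pattern matches two digits, optional whitespace,
# two digits, any run of non-digits, then six digits.
sample = "John Doe uses Redaction. Card: 12 34-567890."
print(apply_in_order(sample, STEPS))

# Order matters: once "Doe" has been replaced, the phrase "John Doe" is gone,
# so the second step finds nothing to replace.
reordered = [(r"Doe", "[Name]"), (re.escape("John Doe"), "[Client]")]
print(apply_in_order("John Doe", reordered))
```

Because each step sees the output of the previous one, broader phrases are best listed before the narrower patterns they contain.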