GroupDocs.Redaction for Java 21.6 Release Notes

Major Features

This release introduces the following improvement:

  • Enable OCR Processing

Full List of Issues Covering all Changes in this Release

REDACTIONJAVA-126: Enable OCR Processing (Feature)

Public API and Backward Incompatible Changes

Enable OCR Processing

This feature makes it possible to redact text in image documents and embedded images by using Optical Character Recognition (OCR) tools.

Public API changes

Interface IOcrConnector, which provides the methods required to apply textual redactions to image documents and embedded images, has been added.
Class RecognizedImage, which represents the text extracted from an image, has been added.
Class TextLine, which represents a line of text extracted by an OCR engine, has been added.
Class TextFragment, which represents a part of the recognized text (a word, a symbol, etc.), has been added.
Class RedactableImage, which represents a standalone or an embedded image, has been added.
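
To illustrate why recognized text is broken down into lines and fragments, the standalone sketch below shows one way a textual match could be mapped back to areas of an image. It does not use the GroupDocs.Redaction API: the Fragment class, its box field, and the boxesToCover method are hypothetical stand-ins, and the assumption that each fragment carries a bounding rectangle is ours, not something these notes state.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OcrRedactionSketch {

    /** Hypothetical stand-in for one recognized word and its assumed position on the image. */
    public static final class Fragment {
        final String text;
        final Rectangle box;

        public Fragment(String text, Rectangle box) {
            this.text = text;
            this.box = box;
        }
    }

    /** Returns the boxes of all fragments that overlap a match of regex in the line text. */
    public static List<Rectangle> boxesToCover(List<Fragment> line, String regex) {
        // Join the words with single spaces and remember where each one starts.
        StringBuilder joined = new StringBuilder();
        int[] starts = new int[line.size()];
        for (int i = 0; i < line.size(); i++) {
            if (i > 0) {
                joined.append(' ');
            }
            starts[i] = joined.length();
            joined.append(line.get(i).text);
        }

        List<Rectangle> boxes = new ArrayList<>();
        boolean[] alreadyAdded = new boolean[line.size()];
        Matcher m = Pattern.compile(regex).matcher(joined);
        while (m.find()) {
            for (int i = 0; i < line.size(); i++) {
                int end = starts[i] + line.get(i).text.length();
                // Half-open interval overlap between the match and the word.
                if (!alreadyAdded[i] && m.start() < end && m.end() > starts[i]) {
                    alreadyAdded[i] = true;
                    boxes.add(line.get(i).box);
                }
            }
        }
        return boxes;
    }

    public static void main(String[] args) {
        // Hypothetical OCR output for the start of a letter.
        List<Fragment> line = Arrays.asList(
                new Fragment("Dear", new Rectangle(10, 10, 40, 12)),
                new Fragment("John", new Rectangle(55, 10, 38, 12)),
                new Fragment("Smith,", new Rectangle(98, 10, 50, 12)));

        // "John Smith" matches, so the boxes of "John" and "Smith," are selected.
        System.out.println(boxesToCover(line, "(?<=Dear\\s)([^,]+)").size()); // 2
    }
}
```

This sketch covers whole fragments, so the trailing comma in "Smith," would be hidden as well; a real connector and redaction engine may work at a finer granularity.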


The following example demonstrates how to use an implementation of IOcrConnector (for example, AsposeCloudOcrConnector or a connector for any other OCR toolkit) to redact text in embedded images. MyOwnOcrConnector stands for a user-supplied implementation.


    // Color is java.awt.Color; MyOwnOcrConnector is your own IOcrConnector implementation
    RedactorSettings settings = new RedactorSettings(new MyOwnOcrConnector());
    try (Redactor redactor = new Redactor("FileWithEmbeddedImages.pdf", new LoadOptions(), settings)) {
        // Matched text is covered with a black box
        ReplacementOptions marker = new ReplacementOptions(Color.BLACK);
        RedactorChangeLog result = redactor.apply(new Redaction[] {
            new RegexRedaction("(?<=Dear\\s)([^,]+)", marker), // person name
            new RegexRedaction("\\d{4}", marker)  // card number parts, etc.
        });
        // Save the document only if the redactions did not fail
        if (result.getStatus() != RedactionStatus.Failed) {
            redactor.save(new SaveOptions(false, "Redacted"));
        }
    }
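
To check what the two patterns in the example select before running them against a document, they can be tried on plain text. The sketch below uses only java.util.regex and assumes the library evaluates these patterns with standard Java regex semantics; the class name RedactionPatternDemo, the findAll helper, and the sample sentence are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RedactionPatternDemo {

    /** Returns every substring of text matched by regex, in order of appearance. */
    public static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical text, as an OCR engine might return it for a scanned letter.
        String recognized = "Dear John Smith, your card 1234 5678 9012 3456 has been issued.";

        // The lookbehind keeps "Dear " out of the match; [^,]+ stops at the first comma.
        System.out.println(findAll("(?<=Dear\\s)([^,]+)", recognized)); // [John Smith]

        // Every run of four consecutive digits is matched separately.
        System.out.println(findAll("\\d{4}", recognized)); // [1234, 5678, 9012, 3456]
    }
}
```

Note that \d{4} also matches any other four-digit run, such as a year, so a stricter pattern may be preferable for real documents.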