Using OCR to redact image documents
GroupDocs.Redaction supports Optical Character Recognition (OCR) for two kinds of image content:
- image files, such as printed document scans (PNG, JPG, etc.)
- embedded images within office documents (PDF, DOCX, etc.)
To enable OCR, implement the IOcrConnector interface and pass an instance of your implementation to the RedactorSettings constructor.
For more details, see the OCR Usage Basics article.
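The connector pattern described above can be sketched in plain Java. This is a minimal illustration and not the GroupDocs.Redaction API: `OcrConnector`, `Word`, `Settings` and `CannedConnector` are hypothetical stand-ins defined here so the example compiles on its own. They only mirror the roles that IOcrConnector and RedactorSettings play; the real types and method signatures are documented in the OCR Usage Basics article.

```java
import java.awt.Rectangle;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class OcrConnectorSketch {

    /** Hypothetical stand-in: one recognized word and its bounding box in pixels. */
    static final class Word {
        final String text;
        final Rectangle box;

        Word(String text, Rectangle box) {
            this.text = text;
            this.box = box;
        }
    }

    /** Hypothetical stand-in for the connector role: image stream in, positioned words out. */
    interface OcrConnector {
        List<Word> recognize(InputStream imageStream);
    }

    /** Hypothetical stand-in for a settings object that carries the connector. */
    static final class Settings {
        private final OcrConnector connector;

        Settings(OcrConnector connector) {
            this.connector = Objects.requireNonNull(connector, "connector");
        }

        OcrConnector connector() {
            return connector;
        }
    }

    /** Returns canned results; a real connector would call an OCR engine here. */
    static final class CannedConnector implements OcrConnector {
        @Override
        public List<Word> recognize(InputStream imageStream) {
            List<Word> words = new ArrayList<>();
            words.add(new Word("John", new Rectangle(10, 10, 60, 20)));
            words.add(new Word("Doe", new Rectangle(80, 10, 50, 20)));
            return words;
        }
    }

    public static void main(String[] args) {
        // The connector is created once and handed to the settings object.
        Settings settings = new Settings(new CannedConnector());
        InputStream fakeImage = new ByteArrayInputStream(new byte[0]);
        for (Word w : settings.connector().recognize(fakeImage)) {
            System.out.println(w.text + " at " + w.box);
        }
    }
}
```

Keeping the OCR engine behind a single interface means the engine can be swapped without touching the redaction code that consumes the recognized words.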
OCR in GroupDocs.Redaction for Java v21.6 has the following limitations:
- Textual replacements are not supported, so you have to use color box replacements to redact text in images.
- Spreadsheets, HTML and Markdown document types are not supported.
We are working on removing these limitations in future releases of GroupDocs.Redaction.
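To illustrate what a color box replacement means for image text, the sketch below fills the bounding box of every recognized word that matches a pattern. It uses only the Java standard library and is an assumption about the general technique, not the library's internal implementation; `Word`, `redact` and the sample coordinates are hypothetical.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ColorBoxSketch {

    /** Hypothetical OCR result: one word and its bounding box in pixels. */
    static final class Word {
        final String text;
        final Rectangle box;

        Word(String text, Rectangle box) {
            this.text = text;
            this.box = box;
        }
    }

    /** Fills the box of every word matching the pattern; returns the number of boxes drawn. */
    static int redact(BufferedImage image, List<Word> words, Pattern pattern, Color fill) {
        Graphics2D g = image.createGraphics();
        int count = 0;
        try {
            g.setColor(fill);
            for (Word w : words) {
                if (pattern.matcher(w.text).find()) {
                    g.fillRect(w.box.x, w.box.y, w.box.width, w.box.height);
                    count++;
                }
            }
        } finally {
            g.dispose(); // release the graphics context even if drawing fails
        }
        return count;
    }

    /** Creates a blank white image standing in for a scanned page. */
    static BufferedImage whitePage(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        g.dispose();
        return image;
    }

    public static void main(String[] args) {
        BufferedImage page = whitePage(200, 50);
        List<Word> words = Arrays.asList(
                new Word("John", new Rectangle(10, 10, 60, 20)),
                new Word("Doe", new Rectangle(80, 10, 50, 20)));
        int covered = redact(page, words, Pattern.compile("John"), Color.BLACK);
        System.out.println("Boxes drawn: " + covered);
    }
}
```

Because the text in a scan exists only as pixels, covering the region is the reliable way to remove it; replacing it with other text would require re-rendering in a matching font, which is consistent with the limitation listed above.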
You can find details and examples of using OCR with GroupDocs.Redaction in one of these guides: