Extract data from Microsoft Office Word documents
GroupDocs.Parser can extract data from Microsoft Office Word documents. Both classic (DOC, DOT) and Open XML (DOCX, DOTX) formats are supported, along with LibreOffice Writer (OpenOffice.org Writer) formats and RTF.
The following table lists the supported formats:
Format | Description |
---|---|
DOC | Microsoft Office Word Document |
DOT | Microsoft Office Word Document Template |
DOCX | Microsoft Office Open XML Document |
DOCM | Microsoft Office Open XML Macro-Enabled Document |
DOTX | Microsoft Office Open XML Document Template |
DOTM | Microsoft Office Open XML Document Macro-Enabled Template |
TXT | Plain text |
ODT | Open Document Text |
OTT | Open Document Text Template |
RTF | Rich Text Format |
Runnable code for these features is available in our GitHub examples.
Along with the full-featured .NET library, we provide simple but powerful free apps.
You can parse documents and extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, and more with our Free Online Document Parser App.
The following examples demonstrate how to extract data from Microsoft Office Word documents:
- Extract text from Microsoft Office Word documents
- Extract metadata from Microsoft Office Word documents
- Extract images from Microsoft Office Word documents
- Extract hyperlinks from Microsoft Office Word documents
- Extract tables from Microsoft Office Word documents
- Extract table of contents from Microsoft Office Word documents
- Search text in Microsoft Office Word documents
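
Before calling any of the extraction features listed above, it can help to check that a file's extension is one of the word-processing formats in the table on this page. The following is a minimal, standard-library Python sketch of such a pre-check. It is not part of the GroupDocs.Parser API: the names `WORD_PROCESSING_FORMATS` and `describe_format` are hypothetical, and matching on the extension alone is an assumption, since it says nothing about the file's actual content.

```python
from pathlib import Path
from typing import Optional

# Extension -> description, taken from the supported-formats table on this page.
# Hypothetical helper data, not something GroupDocs.Parser exposes.
WORD_PROCESSING_FORMATS = {
    ".doc": "Microsoft Office Word Document",
    ".dot": "Microsoft Office Word Document Template",
    ".docx": "Microsoft Office Open XML Document",
    ".docm": "Microsoft Office Open XML Macro-Enabled Document",
    ".dotx": "Microsoft Office Open XML Document Template",
    ".dotm": "Microsoft Office Open XML Document Macro-Enabled Template",
    ".txt": "Plain text",
    ".odt": "Open Document Text",
    ".ott": "Open Document Text Template",
    ".rtf": "Rich Text Format",
}


def describe_format(file_name: str) -> Optional[str]:
    """Return the table description for the file's extension, or None if unlisted.

    Only the extension is inspected (case-insensitively); the file is never opened.
    """
    return WORD_PROCESSING_FORMATS.get(Path(file_name).suffix.lower())


if __name__ == "__main__":
    for name in ("report.DOCX", "notes.odt", "slides.pptx"):
        description = describe_format(name)
        print(f"{name}: {description or 'not a listed word-processing format'}")
```

A `None` result only means the extension is not in this table. The file may still be handled as one of the other document families the library supports.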