
Working with formatted text

  • Extract formatted text from document
  • Extract formatted text from document page
  • HTML
  • Markdown
  • Plain text