Extract highlights

Prerequisites

GroupDocs.Parser for Python via .NET installed
Sample documents for testing
Understanding of text positioning concepts

What are highlights?

Highlights are text fragments extracted from a document at a specific position, including:

Text: The main text content
Position: Character position in the document
Context: Surrounding text before and after
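To make the idea concrete, here is a minimal pure-Python sketch of a fixed-length fragment taken from a plain string. The `Fragment` class and `fragment_at` function are hypothetical names used only for illustration. They mimic the position and text fields described above and are not part of GroupDocs.Parser.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fragment:
    """Hypothetical stand-in for a highlight: a start position and its text."""
    position: int
    text: str


def fragment_at(content: str, position: int, max_length: int) -> Optional[Fragment]:
    """Return up to max_length characters starting at position, or None if out of range."""
    if position < 0 or position >= len(content) or max_length <= 0:
        return None
    return Fragment(position, content[position:position + max_length])


sample = "Highlights are short text fragments taken from a document."
fragment = fragment_at(sample, 15, 10)
print(fragment)
```

The sketch returns `None` for an out-of-range position, which mirrors the "no highlight available" case that the library examples check for.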

Extract highlights

To extract a highlight from a specific position:

Python

from groupdocs.parser import Parser
from groupdocs.parser.options import HighlightOptions

# Create an instance of Parser class
with Parser("./sample.pdf") as parser:
    # Create highlight options (extract 20 characters)
    options = HighlightOptions(20)
    
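    # Extract highlight at position 100; the second argument (True) requests
    # direct extraction, i.e. the text to the right of the position
    highlight = parser.get_highlight(100, True, options)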
    
    if highlight:
        print(f"Position: {highlight.position}")
        print(f"Text: {highlight.text}")
    else:
        print("Highlight extraction not supported or position out of range")

The following sample file is used in this example: sample.pdf

Expected behavior: Returns a HighlightItem object containing the text fragment starting at the specified position, or None if extraction is not supported.
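Because the call may return None, it can help to wrap it in a small helper that falls back to a default value. The following is a sketch, not part of the library: `highlight_text` is a hypothetical name, and it assumes only what the example above shows, namely that the parser object exposes `get_highlight(position, True, options)` and that the result has a `text` attribute.

```python
def highlight_text(parser, position, options, default=""):
    """Return the highlight text at position, or default when no highlight is returned.

    Assumption: parser provides get_highlight(position, is_direct, options)
    as used in the example above, and the result exposes a text attribute.
    """
    highlight = parser.get_highlight(position, True, options)
    if highlight is None:
        return default
    return highlight.text
```

With a real parser this could be called as `highlight_text(parser, 100, HighlightOptions(20), default="(no preview)")`, which keeps the None check out of the calling code.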

Extract highlights with fixed length

To extract highlights with a specific character count:

Python

from groupdocs.parser import Parser
from groupdocs.parser.options import HighlightOptions

# Create an instance of Parser class
with Parser("sample.docx") as parser:
    # Define positions to extract highlights from
    positions = [0, 100, 500, 1000]
    
    # Create highlight options (50 characters)
    options = HighlightOptions(50)
    
    for pos in positions:
        # Extract highlight
        highlight = parser.get_highlight(pos, True, options)
        
        if highlight:
            print(f"\nPosition {pos}:")
            print(f"Text: {highlight.text}")

The following sample file is used in this example: sample.docx

Expected behavior: Extracts 50-character text fragments from specified positions.
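If the fragments are needed for further processing rather than printing, the loop above can be turned into a function that returns the results. This is a sketch under the same assumptions as the example: `collect_highlights` is a hypothetical helper name, and the parser is expected to expose `get_highlight(position, True, options)` returning either None or an object with a `text` attribute.

```python
def collect_highlights(parser, positions, options):
    """Return (position, text) pairs, skipping positions that yield no highlight.

    Assumption: parser.get_highlight(position, True, options) behaves as in
    the example above and returns None when no highlight is available.
    """
    results = []
    for pos in positions:
        highlight = parser.get_highlight(pos, True, options)
        if highlight is not None:
            results.append((pos, highlight.text))
    return results
```

Positions for which no highlight is returned are dropped, so the result may be shorter than the input list.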

Extract highlights for search results

To create search result previews with
