Extract highlights

GroupDocs.Parser can extract a highlight from a document. A highlight is a fragment of text that is typically used in search functionality to show the context around the found text. Highlights are extracted with the GetHighlight method:

HighlightItem GetHighlight(int position, bool isDirect, HighlightOptions options);

The position parameter defines the start position from which the highlight is extracted. The isDirect parameter sets the direction of extraction: true if the highlight is extracted to the right of the position; false if it is extracted to the left. The options parameter defines where the highlight ends.
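The two directions can be combined to show the text on both sides of a position, which is the typical search-result use case. The following is a minimal sketch rather than an official sample: the file path, the position 20, the 15-character limit and the FormatContext helper are hypothetical, and it assumes that HighlightItem lives in the GroupDocs.Parser.Data namespace and HighlightOptions in GroupDocs.Parser.Options.

```csharp
using System;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

class HighlightDirections
{
    // Hypothetical helper: joins the left and right context around a "|" marker.
    public static string FormatContext(string left, string right)
    {
        return left + "|" + right;
    }

    static void Main()
    {
        string filePath = "sample.docx"; // hypothetical path
        int position = 20;               // hypothetical position in the document text

        using (Parser parser = new Parser(filePath))
        {
            // isDirect = false: extract up to 15 characters to the left of the position
            HighlightItem left = parser.GetHighlight(position, false, new HighlightOptions(15));
            // isDirect = true: extract up to 15 characters to the right of the position
            HighlightItem right = parser.GetHighlight(position, true, new HighlightOptions(15));

            // null means highlight extraction isn't supported for this document
            if (left == null || right == null)
            {
                Console.WriteLine("Highlight extraction isn't supported");
                return;
            }

            // Print the surrounding text with a marker at the position
            Console.WriteLine(FormatContext(left.Text, right.Text));
        }
    }
}
```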

The HighlightOptions class has the following constructors:

// Highlight is limited to maxLength text length.
HighlightOptions(int maxLength);
// Highlight is limited to the start (or the end) of a text line (or maxLength text length - if set).
HighlightOptions(int? maxLength, bool isLineLimited);
// Highlight is limited to word count (or maxLength text length - if set).
HighlightOptions(int? maxLength, int wordCount);
// General constructor
HighlightOptions(int? maxLength, int? wordCount, bool isLineLimited);
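To compare the constructors side by side, the sketch below builds one options object per constructor and extracts a highlight with each from the same position. It is an illustration under assumptions, not an official sample: the file path, the position 0, the limit values and the Describe helper are hypothetical, and the namespaces GroupDocs.Parser.Data and GroupDocs.Parser.Options are assumed.

```csharp
using System;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

class HighlightOptionsDemo
{
    // Hypothetical helper: formats a labelled result and tolerates a missing highlight.
    public static string Describe(string label, string text)
    {
        return label + ": " + (text ?? "(not supported)");
    }

    static void Main()
    {
        string filePath = "sample.docx"; // hypothetical path
        int position = 0;                // hypothetical start position

        string[] labels =
        {
            "maxLength 20",
            "line limited",
            "4 words (maxLength 50)",
            "general constructor"
        };
        HighlightOptions[] options =
        {
            new HighlightOptions(20),          // HighlightOptions(int maxLength)
            new HighlightOptions(null, true),  // HighlightOptions(int? maxLength, bool isLineLimited)
            new HighlightOptions(50, 4),       // HighlightOptions(int? maxLength, int wordCount)
            new HighlightOptions(50, 4, true)  // HighlightOptions(int? maxLength, int? wordCount, bool isLineLimited)
        };

        using (Parser parser = new Parser(filePath))
        {
            for (int i = 0; i < options.Length; i++)
            {
                // Extract to the right of the position with the current options
                HighlightItem hl = parser.GetHighlight(position, true, options[i]);
                // hl is null when highlight extraction isn't supported
                Console.WriteLine(Describe(labels[i], hl?.Text));
            }
        }
    }
}
```

Passing null as maxLength leaves the length unlimited, so only the other limit of that constructor applies.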

The HighlightItem class has the following members:

Member      Description
Position    The position in the document text.
Text        The highlight text.

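Here are the steps to extract a highlight from the document:

1. Instantiate a Parser object for the document.
2. Call the GetHighlight method to obtain a HighlightItem object.
3. Check that the returned object isn't null (null means highlight extraction isn't supported for the document).
4. Read the Position and Text members of the HighlightItem object.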

The following example shows how to extract a highlight that contains 3 words:

// Create an instance of Parser class
using (Parser parser = new Parser(filePath))
{
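    // Extract a highlight of 3 words to the right of position 2
    // (maxLength is null, so only the word count limits the highlight):
    HighlightItem hl = parser.GetHighlight(2, true, new HighlightOptions(null, 3));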
    // Check if highlight extraction is supported
    if (hl == null)
    {
        Console.WriteLine("Highlight extraction isn't supported");
        return;
    }
    // Print an extracted highlight
    Console.WriteLine("At {0}: {1}", hl.Position, hl.Text);
}

More resources

GitHub examples

You can run the code above and see the feature in action in our GitHub examples.

Free online document parser App

Along with the full-featured .NET library, we provide simple but powerful free apps.

You are welcome to parse documents and extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails and more with our Free Online Document Parser App.