The code examples below use helper methods defined in Common Utilities.

Extract Highlight from Documents

This feature is supported in version 17.1.0 or later.

GroupDocs.Parser for .NET can extract highlights from documents. A highlight is a fragment of text taken from around a given position in the document.

The Recipe

Highlights are extracted through the IHighlightExtractor interface, which most of the raw text extractors implement.

The IHighlightExtractor interface contains only one method, ExtractHighlights.

This method accepts one or more HighlightOptions parameters. The HighlightOptions class describes a highlight:

  • Position: the start position of the highlight
  • Direction: the direction of the highlight
  • Length: the length of the highlight

Position indicates where the highlight begins. For example, if Direction is left, the ExtractHighlights method returns the text from Position - Length to Position.

The ExtractHighlights method returns a collection of strings. The collection contains one string for each HighlightOptions parameter passed in.
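
The rule described above can be illustrated with a minimal, self-contained sketch. It does not call GroupDocs.Parser: SketchDirection, SketchHighlight and HighlightSketch are hypothetical stand-ins invented for this illustration. The behavior for the right direction (text from Position to Position + Length) and the clamping to the text bounds are assumptions, since the description above only spells out the left direction.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are NOT GroupDocs.Parser types.
public enum SketchDirection { Left, Right }

public sealed class SketchHighlight
{
    public SketchDirection Direction { get; }
    public int Position { get; }
    public int Length { get; }

    public SketchHighlight(SketchDirection direction, int position, int length)
    {
        Direction = direction;
        Position = position;
        Length = length;
    }
}

public static class HighlightSketch
{
    // Returns one string per option, in the same order as the options.
    public static IList<string> Extract(string text, params SketchHighlight[] options)
    {
        var results = new List<string>(options.Length);
        foreach (var o in options)
        {
            // Left:  [Position - Length, Position)
            // Right: [Position, Position + Length)   (assumed by symmetry)
            int start = o.Direction == SketchDirection.Left ? o.Position - o.Length : o.Position;
            int end = o.Direction == SketchDirection.Left ? o.Position : o.Position + o.Length;

            // Clamp to the text bounds (an assumption; the library may behave differently).
            start = Math.Max(0, Math.Min(start, text.Length));
            end = Math.Max(start, Math.Min(end, text.Length));

            results.Add(text.Substring(start, end - start));
        }
        return results;
    }
}
```

For the text "The quick brown fox", a left highlight with Position 9 and Length 5 yields "quick", and a right highlight with Position 10 and Length 5 yields "brown". Two options produce a collection of two strings.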

The steps involved in extracting highlights from a document are given below:

  • Get the file's path
  • Create a WordsTextExtractor object, passing the file's path to the constructor, and name it extractor
  • Extract the highlights by calling extractor.ExtractHighlights with the required HighlightOptions

The Code

Extract Formatted Highlight from Documents

This feature is supported in version 17.6.0 or later.

GroupDocs.Parser for .NET can also extract formatted highlights. The API supports this for all supported document types, including Words, Slides, Cells, EPUB, FB2, and Email file formats.

The Recipe

The steps involved in extracting formatted highlights from a document are given below:

  • Get the file's path
  • Create a WordsFormattedTextExtractor object, passing the file's path to the constructor, and name it extractor
  • Extract the formatted highlights by calling extractor.ExtractHighlights with the required HighlightOptions

The Code

Extract a Highlight to Line's Start/End

This feature is supported in version 17.2.0 or later.

The GroupDocs.Parser API can limit a highlight by the start or end of the line.

The Recipe

The steps involved in limiting highlights by a line's start or end are given below:

  • Get the file's path
  • Create a WordsTextExtractor object, passing the file's path to the constructor, and name it extractor
  • Extract the highlights by calling extractor.ExtractHighlights with the required HighlightOptions
  • To limit the highlight by the start or end of the line, create the options with the CreateLineOptions method of the HighlightOptions class, specifying the highlight direction and length
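
To make the idea of a line-limited highlight concrete, here is a small self-contained sketch. It does not use GroupDocs.Parser; LineHighlightSketch and ExtractToLineBoundary are hypothetical names, and the exact semantics are assumptions: lines are taken to be separated by '\n', a left highlight runs from the line's start to the position, a right highlight runs from the position to the line's end, and an optional maximum length caps the result.

```csharp
using System;

// Hypothetical helper for illustration only; NOT part of GroupDocs.Parser.
public static class LineHighlightSketch
{
    // toLeft == true:  text between the start of the line and position.
    // toLeft == false: text between position and the end of the line.
    // maxLength, when given, caps the number of characters returned.
    public static string ExtractToLineBoundary(string text, int position, bool toLeft, int? maxLength = null)
    {
        position = Math.Max(0, Math.Min(position, text.Length));

        if (toLeft)
        {
            // Search backwards for the previous line break; -1 + 1 == 0 when none is found.
            int lineStart = position == 0 ? 0 : text.LastIndexOf('\n', position - 1) + 1;
            int start = maxLength.HasValue
                ? Math.Max(lineStart, position - Math.Max(0, maxLength.Value))
                : lineStart;
            return text.Substring(start, position - start);
        }
        else
        {
            // Search forwards for the next line break; fall back to the end of the text.
            int newline = text.IndexOf('\n', position);
            int lineEnd = newline < 0 ? text.Length : newline;
            int end = maxLength.HasValue
                ? Math.Min(lineEnd, position + Math.Max(0, maxLength.Value))
                : lineEnd;
            return text.Substring(position, end - position);
        }
    }
}
```

With the text "first line\nsecond line\nthird", a left highlight at position 17 returns "second" and a right highlight at position 18 returns "line". Passing a maximum length of 3 to the left highlight returns "ond", because the cap is reached before the line start.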

The Code

Extract a Highlight With the Limited Words Count

This feature is supported in version 17.2.0 or later.

The GroupDocs.Parser API can limit a highlight by a number of words.

The Recipe

The steps involved in limiting a highlight by word count are given below:

  • Get the file's path
  • Create a WordsTextExtractor object, passing the file's path to the constructor, and name it extractor
  • Extract the highlights by calling extractor.ExtractHighlights with the required HighlightOptions
  • To limit the highlight by word count, create the options with the CreateWordsCountOptions method of the HighlightOptions class, specifying the highlight direction, the position, and the number of words to include
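
A word-count-limited highlight can be sketched in the same spirit. The code below is self-contained and does not use GroupDocs.Parser; WordsHighlightSketch and ExtractWords are hypothetical names. It assumes that words are runs of non-whitespace characters, that a partial word at the position counts as one word, and that surrounding whitespace is trimmed from the result. The library's own definition of a word may differ.

```csharp
using System;

// Hypothetical helper for illustration only; NOT part of GroupDocs.Parser.
public static class WordsHighlightSketch
{
    // Collects up to wordCount words to the left or right of position.
    public static string ExtractWords(string text, int position, bool toLeft, int wordCount)
    {
        position = Math.Max(0, Math.Min(position, text.Length));
        if (wordCount <= 0) return string.Empty;

        int i = position;
        int words = 0;

        if (toLeft)
        {
            while (i > 0 && words < wordCount)
            {
                // Skip whitespace, then consume one word, moving towards the start.
                while (i > 0 && char.IsWhiteSpace(text[i - 1])) i--;
                if (i == 0) break;
                while (i > 0 && !char.IsWhiteSpace(text[i - 1])) i--;
                words++;
            }
            return text.Substring(i, position - i).Trim();
        }

        while (i < text.Length && words < wordCount)
        {
            // Skip whitespace, then consume one word, moving towards the end.
            while (i < text.Length && char.IsWhiteSpace(text[i])) i++;
            if (i == text.Length) break;
            while (i < text.Length && !char.IsWhiteSpace(text[i])) i++;
            words++;
        }
        return text.Substring(position, i - position).Trim();
    }
}
```

For the text "The quick brown fox jumps" and position 15, two words to the left give "quick brown" and two words to the right give "fox jumps". Asking for more words than are available returns whatever text exists in that direction.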

The Code

Extract Highlights from EPUB Files

This feature is supported in version 17.3.0 or later.

The GroupDocs.Parser for .NET API can extract highlights from EPUB documents.

The Recipe

The steps involved in extracting highlights from an EPUB document are given below:

  • Get the file's path
  • Create an EpubTextExtractor object, passing the file's path to the constructor, and name it extractor
  • Extract the highlights by calling extractor.ExtractHighlights with the required HighlightOptions

The Code

Extract Highlights from FB2 Files

This feature is supported in version 17.5.0 or later.

The GroupDocs.Parser for .NET API can extract highlights from FB2 documents.

The Recipe

The steps involved in extracting highlights from an FB2 document are given below:

  • Get the file's path
  • Create a FictionBookTextExtractor object, passing the file's path to the constructor, and name it extractor
  • Extract the highlights by calling extractor.ExtractHighlights with the required HighlightOptions

The Code
