GroupDocs.Parser for .NET 17.08 Release Notes

Major Features

This release includes the following feature:

  • Implement the support for CHM files

All Changes

Key | Summary | Issue Type
TEXTNET-653 | Implement the support for CHM files | New feature

Public API and Backward Incompatible Changes

Implement the support for CHM files

This feature allows extracting raw text from CHM files.

Public API Changes
Added the ChmTextExtractor class.

Extracting text from a document line by line:

C#

// Create a text extractor for CHM documents
// (stream is assumed to be an open, readable stream containing the CHM file)
using (var extractor = new ChmTextExtractor(stream)) {
  // Extract a line of the text
  string line = extractor.ExtractLine();
  // If the line is null, then the end of the file is reached
  while (line != null) {
    // Print a line to the console
    Console.WriteLine(line);
    // Extract another line
    line = extractor.ExtractLine();
  }
}

Extracting all text from a document at once:

C#

// Create a text extractor for CHM documents
// (stream is assumed to be an open, readable stream containing the CHM file)
using (var extractor = new ChmTextExtractor(stream)) {
  // Extract the whole text and print it to the console
  Console.WriteLine(extractor.ExtractAll());
}