
The code examples below use some methods defined in Common Utilities.

Media Type Detectors

Each media type detector class inherits from the MediaTypeDetector abstract class. This class provides the interface for detecting media types through the following methods:

Method | Description
Detect(Stream stream) | Detects media type by the content of the stream (document)
Detect(String fileName) | Detects media type by the extension of the file
Supports(String mediaType) | Detects whether the media type is supported by the detector

Each media type detector corresponds to a specific text extractor:

Class | Description
CellsMediaTypeDetector | Detects spreadsheet media types
CsvMediaTypeDetector | Detects the CSV media type
EmailMediaTypeDetector | Detects email message media types
PdfMediaTypeDetector | Detects the PDF media type
PersonalStorageMediaTypeDetector | Detects Outlook Personal Storage media types
SlidesMediaTypeDetector | Detects presentation media types
WordsMediaTypeDetector | Detects text document media types
CompositeMediaTypeDetector | Special media type detector that can combine other media type detectors
ChmMediaTypeDetector | Detects the Microsoft Compiled HTML Help (chm) media type
EpubMediaTypeDetector | Detects the Electronic Publication (EPUB) media type
FictionBookMediaTypeDetector | Detects the FictionBook (fb2) media type
ZipMediaTypeDetector | Detects the Zip archive media type

Detecting Media by Content
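
As a rough illustration of what content-based detection involves, the Python sketch below inspects the leading bytes of a stream and compares them against known file signatures. It is not the library's API or its internal logic: the function name detect_by_content and the SIGNATURES table are hypothetical names introduced for this example only.

```python
import io

# Hypothetical signature table for illustration only (an assumption,
# not the library's internal detection logic).
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
]


def detect_by_content(stream):
    """Return a media type guessed from the leading bytes, or None."""
    position = stream.tell()
    header = stream.read(8)
    stream.seek(position)  # leave the stream where the caller had it
    for magic, media_type in SIGNATURES:
        if header.startswith(magic):
            return media_type
    return None


print(detect_by_content(io.BytesIO(b"%PDF-1.7 sample")))  # application/pdf
print(detect_by_content(io.BytesIO(b"plain text")))       # None
```

The sketch restores the stream position after reading so that the same stream can still be passed on to an extractor afterwards.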

 

Detecting Media by Extension
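
Extension-based detection is essentially a lookup from the file extension to a media type. The Python sketch below shows the idea under stated assumptions: detect_by_extension and EXTENSION_MAP are hypothetical names for this example, and the mapping is a small sample rather than the set of types the library recognizes.

```python
import os

# Hypothetical sample mapping (an assumption for illustration only).
EXTENSION_MAP = {
    ".pdf": "application/pdf",
    ".csv": "text/csv",
    ".epub": "application/epub+zip",
    ".zip": "application/zip",
}


def detect_by_extension(file_name):
    """Return a media type based on the file extension, or None."""
    _, extension = os.path.splitext(file_name)
    # Extensions are compared case-insensitively: "Report.PDF" matches ".pdf".
    return EXTENSION_MAP.get(extension.lower())


print(detect_by_extension("Report.PDF"))  # application/pdf
print(detect_by_extension("notes"))       # None
```

Because only the name is examined, this approach is fast but can be wrong for renamed files, which is where content-based detection helps.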

 

Detecting Media Type Support

The API can also check whether a detector supports a given media type. The following code sample checks media type support.
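
To illustrate the idea in a language-neutral way, a support check can be thought of as a membership test against the set of media types a detector knows about. The Python sketch below assumes a hypothetical SketchPdfDetector class; it does not reproduce the library's Supports method.

```python
class SketchPdfDetector:
    """Hypothetical detector used only to illustrate a support check."""

    # Assumed set of supported media types for this sketch.
    SUPPORTED = {"application/pdf"}

    def supports(self, media_type):
        """Return True if this detector handles the given media type."""
        if media_type is None:
            return False
        # Media type names are case-insensitive, so normalize before comparing.
        return media_type.lower() in self.SUPPORTED


detector = SketchPdfDetector()
print(detector.supports("APPLICATION/PDF"))  # True
print(detector.supports("text/csv"))         # False
```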

 

Detecting Media Type using CompositeMediaTypeDetector 

The CompositeMediaTypeDetector class is used for detecting any supported media type. It returns the media type (e.g. APPLICATION/EPUB+ZIP for an EPUB document) or null if the media type can't be detected.
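
The composite behavior described above can be sketched as a detector that asks each child detector in turn and returns the first non-empty answer. The Python code below is a minimal illustration under stated assumptions: PrefixDetector and CompositeDetector are hypothetical classes written for this example, not the library's implementation.

```python
import io


class PrefixDetector:
    """Hypothetical child detector that matches one leading byte signature."""

    def __init__(self, magic, media_type):
        self.magic = magic
        self.media_type = media_type

    def detect(self, stream):
        position = stream.tell()
        header = stream.read(len(self.magic))
        stream.seek(position)  # let the next detector read from the same place
        return self.media_type if header == self.magic else None


class CompositeDetector:
    """Hypothetical composite: the first child that recognizes the stream wins."""

    def __init__(self, *detectors):
        self.detectors = detectors

    def detect(self, stream):
        for detector in self.detectors:
            result = detector.detect(stream)
            if result is not None:
                return result
        return None  # mirrors the "null if not detected" behavior


composite = CompositeDetector(
    PrefixDetector(b"%PDF-", "application/pdf"),
    PrefixDetector(b"PK\x03\x04", "application/zip"),
)
print(composite.detect(io.BytesIO(b"PK\x03\x04...")))  # application/zip
print(composite.detect(io.BytesIO(b"unknown data")))   # None
```

In a design like this the order of the child detectors matters: more specific detectors should come before more general ones, since a specialized format such as EPUB is also a valid ZIP archive.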

Detecting Media Type of Password-protected Office Open XML Documents

This feature is supported in version 18.12 and later.

This feature allows detecting the media type of password-protected Office Open XML documents by content. To detect the media type of an encrypted Office Open XML document, use the detect(Stream, LoadOptions) method as shown in the following code sample.
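
Some background on why a password is needed here: an unencrypted Office Open XML document is a ZIP package, whereas a password-encrypted one is typically wrapped in an OLE compound file, so the real content cannot be inspected until it is decrypted. The Python sketch below only distinguishes the two outer containers by their signatures. It performs no decryption, and container_kind is a hypothetical helper for this example, not part of the library.

```python
import io

# Well-known file signatures for the two container formats.
ZIP_MAGIC = b"PK\x03\x04"
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file header


def container_kind(stream):
    """Classify the outer container of a document stream."""
    position = stream.tell()
    header = stream.read(8)
    stream.seek(position)  # keep the stream usable for later processing
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.startswith(CFB_MAGIC):
        return "compound-file"
    return "unknown"


print(container_kind(io.BytesIO(ZIP_MAGIC + b"...")))       # zip
print(container_kind(io.BytesIO(CFB_MAGIC + b"\x00" * 8)))  # compound-file
```

A "compound-file" result does not by itself prove the document is an encrypted Office Open XML file, since legacy binary Office formats use the same container. It only indicates that extension-free detection needs to look inside, which for encrypted documents requires the password.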

You can also perform batch document processing using PasswordProvider as shown in the following code sample.
