Specify File Type for Comparison Manually

GroupDocs.Comparison for Python via .NET detects the file type from the file extension by default. When the extension is missing, wrong, or ambiguous, set the file type explicitly via LoadOptions.file_type.

Example: Specify the file type manually

from groupdocs.comparison import Comparer
from groupdocs.comparison.options import LoadOptions
from groupdocs.comparison.result import FileType

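def specify_file_type_manually():
    # Force the DOCX type instead of relying on extension-based detection
    load_options = LoadOptions()
    load_options.file_type = FileType.DOCX

    # Pass the same load options for both the source and the target document
    with Comparer("./source.docx", load_options) as comparer:
        comparer.add("./target.docx", load_options)
        # Write the comparison result to result.docx
        comparer.compare("./result.docx")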

if __name__ == "__main__":
    specify_file_type_manually()

source.docx and target.docx are the source and target files used in this example.

Use case: process files with incorrect or missing extensions by forcing the correct type.

Common FileType values

A few commonly-used members of groupdocs.comparison.result.FileType:

Format family    FileType value
Word             FileType.DOCX, FileType.DOC, FileType.DOT, FileType.DOTX, FileType.ODT, FileType.RTF
PDF              FileType.PDF
Spreadsheet      FileType.XLSX, FileType.XLS, FileType.ODS, FileType.CSV
Presentation     FileType.PPTX, FileType.PPT, FileType.ODP
Email            FileType.EML, FileType.MSG, FileType.EMLX
Image            FileType.PNG, FileType.JPG, FileType.BMP, FileType.TIFF, FileType.GIF
Web              FileType.HTML, FileType.MHTML
Markdown         FileType.MD, FileType.MARKDOWN
Text & code      FileType.TXT, FileType.JSON, FileType.XML, FileType.YAML, FileType.PY, FileType.JS, FileType.CS, …

To enumerate every supported value at runtime, call FileType.get_supported_file_types().

When to use

  • Files with no extension. Documents downloaded from S3 or content systems that strip extensions still need a file-type hint.
  • Wrong extension. Files renamed by content-management systems (e.g., .bin, .dat) that hold valid Word / PDF bytes.
  • Custom content-type detection. Pipelines where your Python code has already inspected the file (via python-magic or similar) and should bypass the library’s extension-based detection.
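
The custom-detection scenario above can be sketched with the standard library alone. The helper below is a minimal illustration, not part of GroupDocs.Comparison: the names sniff_type_name, sniff_path, and _OOXML_MARKERS are hypothetical, and the sketch only recognizes a handful of formats by their leading bytes. It returns a string that matches a FileType member name from the table above, or None when the content is not recognized.

```python
import io
import zipfile
from pathlib import Path
from typing import Optional

# Hypothetical mapping: marker entries inside Office Open XML (ZIP) packages,
# mapped to the FileType member names listed in the table above.
_OOXML_MARKERS = {
    "word/document.xml": "DOCX",
    "xl/workbook.xml": "XLSX",
    "ppt/presentation.xml": "PPTX",
}


def sniff_type_name(data: bytes) -> Optional[str]:
    """Guess a FileType member name from raw file bytes, or return None."""
    if data.startswith(b"%PDF-"):
        return "PDF"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if data.startswith(b"PK\x03\x04"):
        # ZIP container: look for the part that identifies the Office format.
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                names = set(archive.namelist())
        except zipfile.BadZipFile:
            return None
        for marker, type_name in _OOXML_MARKERS.items():
            if marker in names:
                return type_name
    return None


def sniff_path(path: str) -> Optional[str]:
    """Convenience wrapper (hypothetical) that reads the whole file from disk."""
    return sniff_type_name(Path(path).read_bytes())
```

Assuming the returned name matches a FileType member, it could be resolved with getattr(FileType, name) and assigned to load_options.file_type before constructing the Comparer. When the helper returns None, falling back to the library's default detection is probably the safer choice.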