Extract Metadata from Documents

GroupDocs.Parser extracts metadata such as author, title, creation date, and custom properties from supported formats (see supported formats).

Extract metadata

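from groupdocs.parser import Parser

# Open the document; the with block releases the parser when done
with Parser("./sample.pdf") as parser:
    # Request the collection of metadata items
    metadata_items = parser.get_metadata()
    # None means metadata extraction is not supported for this format
    if metadata_items is None:
        print("Metadata extraction is not supported for this format.")
    else:
        # Print each metadata item as "name: value"
        for item in metadata_items:
            print(f"{item.name}: {item.value}")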

The following sample file is used in this example: sample.pdf

Steps

  1. Create a Parser for the target document.
  2. Call get_metadata() to receive a collection of metadata items, or None if metadata extraction is not supported for the format.
  3. Iterate over the items, reading each item's name and value, and process them as needed.
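
As one possible way to "process them as needed", the sketch below collects the name and value pairs into a dictionary so that individual properties can be looked up by name. The helper metadata_to_dict and the sample items are hypothetical and not part of GroupDocs.Parser; the stand-in objects only mimic the name and value attributes used in the example above, so the sketch runs without the library installed.

```python
from types import SimpleNamespace


def metadata_to_dict(metadata_items):
    """Collect name/value pairs into a dict.

    Hypothetical helper (not part of GroupDocs.Parser). Returns an empty
    dict when metadata_items is None, i.e. extraction is not supported.
    """
    if metadata_items is None:
        return {}
    return {item.name: item.value for item in metadata_items}


# Stand-in items with made-up values; real items would come from
# parser.get_metadata() as shown in the example above.
sample_items = [
    SimpleNamespace(name="author", value="Jane Doe"),
    SimpleNamespace(name="title", value="Quarterly Report"),
]

properties = metadata_to_dict(sample_items)
print(properties.get("author", "(not set)"))
```

With a real document, the same helper could be called as metadata_to_dict(parser.get_metadata()) inside the with block. Note that a dictionary keeps only the last value if two items share a name, so iterate over the items directly when duplicates matter.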

For deeper parsing (attachments, text, images), combine metadata extraction with the techniques described in the other basic usage topics.

id: extract-metadata-from-documents
url: parser/python-net/extract-metadata-from-documents
title: Extract Metadata from Documents
weight: 7
version: 25.12
description: "Extract metadata (author, title, custom properties) from PDF, Office, images, emails, and other formats using GroupDocs.Parser for Python via .NET."
productName: GroupDocs.Parser for Python via .NET
hideChildren: false
toc: true
tags: python, parser, metadata, document-properties, v25.12
