Quick Start Guide

This guide gives a quick overview of how to set up and start using GroupDocs.Metadata for Python via .NET. The library lets you read, edit, and remove metadata from documents, images, audio, and video with minimal configuration.

Prerequisites

To proceed, make sure you have:

  1. A configured environment as described in the System Requirements topic.
  2. Optionally, a Temporary License to test all product features (saving files requires a license).

Set Up Your Development Environment

For best practices, use a virtual environment to manage dependencies.

Create and Activate a Virtual Environment

Create a virtual environment:

py -m venv .venv
python3 -m venv .venv
python3 -m venv .venv

Activate it:

.venv\Scripts\activate
source .venv/bin/activate
source .venv/bin/activate

Install the groupdocs-metadata-net package

py -m pip install groupdocs-metadata-net
python3 -m pip install groupdocs-metadata-net
python3 -m pip install groupdocs-metadata-net

Example 1: Read metadata

Open a file and print all detected metadata properties.

import os

from groupdocs.metadata import License, Metadata


def read_metadata():
    # Apply a license if one is present next to the script
    license_path = os.path.abspath("./GroupDocs.Metadata.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Metadata("./input.docx") as metadata:
        # `lambda p: True` matches every property in the file
        for prop in metadata.find_properties(lambda p: True):
            print(f"{prop.name} = {prop.value}")


if __name__ == "__main__":
    read_metadata()

input.docx is the sample file used in this example. Click here to download it.

FileType = GroupDocs.Metadata.Formats.Document.WordProcessingTypePackage
DocumentProperties = GroupDocs.Metadata.Formats.Document.WordProcessingPackage
DublinCore = GroupDocs.Metadata.Standards.DublinCore.DublinCorePackage
InspectionPackage = GroupDocs.Metadata.Formats.Document.WordProcessingInspectionPackage
DocumentStatistics = GroupDocs.Metadata.Formats.Document.DocumentStatistics
FileFormat = 3
MimeType = application/vnd.openxmlformats-officedocument.wordprocessingml.document
Extension = .do
[TRUNCATED]

Download full output

Example 2: Remove all metadata

Strip every detected property and save a clean copy. Saving requires a license.

import os

from groupdocs.metadata import License, Metadata


def remove_metadata():
    license_path = os.path.abspath("./GroupDocs.Metadata.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Metadata("./input.docx") as metadata:
        # sanitize() strips every detected property; save() needs a license
        removed = metadata.sanitize()
        print(f"Removed {removed} properties")
        metadata.save("./clean.docx")


if __name__ == "__main__":
    remove_metadata()

input.docx is the sample file used in this example. Click here to download it.

Binary file (DOCX, 14 KB)

Download full output

Example 3: Inspect a document

Read basic file information without parsing the whole metadata tree.

from groupdocs.metadata import Metadata


def document_info():
    with Metadata("./input.docx") as metadata:
        # get_document_info() reads basic facts without walking the metadata tree
        info = metadata.get_document_info()
        print(f"Format: {info.file_type.file_format}")
        print(f"MIME type: {info.file_type.mime_type}")
        print(f"Pages: {info.page_count}")
        print(f"Size: {info.size} bytes")
        print(f"Encrypted: {info.is_encrypted}")


if __name__ == "__main__":
    document_info()

input.docx is the sample file used in this example. Click here to download it.

Format: 3
MIME type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Pages: 1
Size: 17952 bytes
Encrypted: False

Download full output

Next Steps

After completing the basics, explore additional resources: