Clean metadata

Remove all recognized metadata properties from a file

Sometimes you need to remove every metadata property without applying any filters. The best way to do this is to call the sanitize method.

This example demonstrates how to remove all detected metadata packages/properties.

  1. Load a file to clean
  2. Call the sanitize method
  3. Check the actual number of removed packages/properties
  4. Save the changes

from groupdocs.metadata import Metadata


def clean_metadata():
    # Open the file to clean
    with Metadata("input.pdf") as metadata:
        # sanitize() removes every detected metadata property in one call
        affected = metadata.sanitize()
        print(f"Properties removed: {affected}")
        # Write the cleaned document to a new file
        metadata.save("output.pdf")


if __name__ == "__main__":
    clean_metadata()

input.pdf is the sample file used in this example.

As a result, we get a sanitized version of the original file.
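The same four steps can be applied to a whole folder. The following is a minimal sketch rather than an official recipe: the helper names cleaned_path, sanitize_file and sanitize_folder, the ".cleaned" naming scheme, and the "*.pdf" default pattern are assumptions made for illustration. Only Metadata, sanitize and save come from the example above.

```python
from pathlib import Path
from typing import Callable, Dict


def cleaned_path(source: Path, out_dir: Path) -> Path:
    """Hypothetical naming helper: report.pdf -> <out_dir>/report.cleaned.pdf."""
    return out_dir / f"{source.stem}.cleaned{source.suffix}"


def sanitize_file(source: Path, target: Path) -> int:
    """Sanitize one file using the same calls as the example above."""
    # Imported lazily so the other helpers load even without the library.
    from groupdocs.metadata import Metadata

    with Metadata(str(source)) as metadata:
        # Number of removed packages/properties, as in the example above
        affected = metadata.sanitize()
        metadata.save(str(target))
    return affected


def sanitize_folder(
    in_dir: Path,
    out_dir: Path,
    pattern: str = "*.pdf",
    cleaner: Callable[[Path, Path], int] = sanitize_file,
) -> Dict[str, int]:
    """Sanitize every matching file and report the removed count per file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    report: Dict[str, int] = {}
    for source in sorted(in_dir.glob(pattern)):
        report[source.name] = cleaner(source, cleaned_path(source, out_dir))
    return report


if __name__ == "__main__":
    # "input" and "output" are placeholder folder names.
    for name, removed in sanitize_folder(Path("input"), Path("output")).items():
        print(f"{name}: {removed} properties removed")
```

Writing each result to a separate file keeps the originals untouched, and the cleaner parameter allows the per-file step to be replaced, for example with a stub when testing the folder logic without the library installed.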

More resources

Advanced usage topics

To learn more about the library's features and how to manage metadata, please refer to the advanced usage section.

GitHub examples

You can run the code above and see the feature in action in our GitHub examples.

Free online document metadata management App

You are welcome to view and edit metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images and more with our Free Online Document Metadata Viewing and Editing App.