Removing metadata

Not all metadata properties extracted from a file are marked with tags. Some file formats and metadata standards allow fully custom properties that the library can’t tag, because their purpose is not clearly defined in the corresponding specification. In such cases, you can locate and remove a property by its name or by any other attribute. The following example demonstrates an advanced removal scenario.

  1. Load a file to modify
  2. Pass a search predicate to the remove_properties method
  3. Check the number of properties that were actually removed
  4. Save the changes

from groupdocs.metadata import Metadata
from groupdocs.metadata.tagging import Tags


def removing_metadata():
    with Metadata("input.docx") as metadata:
        # Remove every property whose tags fall into the "content" category
        affected = metadata.remove_properties(
            lambda p: any(tag.category == Tags.content for tag in p.tags)
        )
        print(f"Affected properties: {affected}")
        metadata.save("output.docx")


if __name__ == "__main__":
    removing_metadata()
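
The example above filters by tag category. For custom properties that carry no tags, the same predicate idea can match on the property name instead. The following is a library-independent sketch of that pattern: `Property` and `PropertyStore` are hypothetical stand-ins written for illustration, not GroupDocs classes, and the `x-` name prefix is an assumed naming convention for custom properties.

```python
from dataclasses import dataclass, field


@dataclass
class Property:
    """Hypothetical stand-in for a metadata property (not a GroupDocs class)."""
    name: str
    value: object
    tags: list = field(default_factory=list)


class PropertyStore:
    """Hypothetical stand-in for the metadata loaded from a file."""

    def __init__(self, properties):
        self.properties = list(properties)

    def remove_properties(self, predicate):
        """Remove every property the predicate matches; return the count removed."""
        kept = [p for p in self.properties if not predicate(p)]
        removed = len(self.properties) - len(kept)
        self.properties = kept
        return removed


store = PropertyStore([
    Property("Author", "John", tags=["creator"]),
    Property("x-internal-id", "42"),        # custom property, no tags
    Property("x-review-state", "draft"),    # custom property, no tags
    Property("Title", "Report", tags=["title"]),
])

# Untagged custom properties cannot be found by tag, so match on the name
affected = store.remove_properties(lambda p: p.name.startswith("x-"))
remaining = [p.name for p in store.properties]

print(f"Affected properties: {affected}")
print(f"Remaining properties: {remaining}")
```

Returning the number of removed properties mirrors step 3 of the list above: a count of zero signals that the predicate matched nothing, which is worth checking before saving.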

input.docx is the sample file used in this example.

More resources

Advanced usage topics

To learn more about the library's features and how to manage metadata, refer to the advanced usage section.

GitHub examples

You can run the code above and see the feature in action in our GitHub examples.

Free online document metadata management App

You can view and edit the metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images, and more with our Free Online Document Metadata Viewing and Editing App.