Extracting metadata

With GroupDocs.Metadata you can extract the metadata properties you need from files of different types. You don’t have to worry about the exact file format or the metadata standards it uses: the same code works for all supported formats. The most commonly used metadata properties are marked with tags, and the tags are grouped into categories that make it easier to find the one you need. The code sample below demonstrates how to use tags, categories and other property attributes.

  1. Load a file to search for metadata properties
  2. Build a predicate to examine every extracted metadata property
  3. Pass the predicate to the find_properties method
  4. Iterate through the found properties

from groupdocs.metadata import Metadata
from groupdocs.metadata.tagging import Tags


def extracting_metadata():
    with Metadata("input.docx") as metadata:
        # Find every property whose tags fall into the "content" category
        properties = metadata.find_properties(
            lambda p: any(tag.category == Tags.content for tag in p.tags)
        )
        for prop in properties:
            print(f"Property name: {prop.name}, Property value: {prop.value}")


if __name__ == "__main__":
    extracting_metadata()

input.docx is the sample file used in this example.

Property name: FileFormat, Property value: 3
Property name: MimeType, Property value: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Property name: WordProcessingFileFormat, Property value: 3
Property name: Category, Property value: 
Property name: Comments, Property value: 
Property name: ContentStatus, Property value: 
Property name: Keywords, Property value: 
Property name: RevisionNumber, Property value: 9
Property name: Subject, Property value: 
Property name: Title
[TRUNCATED]

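The predicate-based search used in the sample can be illustrated without the library installed. The sketch below is a simplified model in plain Python, not the GroupDocs.Metadata implementation: `Tag`, `Property`, `find_properties` and `in_category` are hypothetical stand-ins, and categories are represented as plain strings purely for illustration. It only shows the idea that a property matches when at least one of its tags belongs to the requested category.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; these are NOT library classes.
@dataclass(frozen=True)
class Tag:
    name: str
    category: str  # assumption: a plain string stands in for a tag category


@dataclass(frozen=True)
class Property:
    name: str
    value: object
    tags: tuple = ()  # a property may carry zero or more tags


def find_properties(properties, predicate):
    """Lazily yield every property for which the predicate returns True."""
    return (p for p in properties if predicate(p))


def in_category(category):
    """Build a predicate that matches properties with at least one tag in `category`."""
    return lambda p: any(tag.category == category for tag in p.tags)


# Made-up sample data, unrelated to the contents of input.docx.
SAMPLE = [
    Property("Title", "Quarterly report", (Tag("title", "content"),)),
    Property("Author", "J. Doe", (Tag("creator", "person"),)),
    Property("Keywords", "", (Tag("keywords", "content"),)),
    Property("PageCount", 12),  # untagged, so no category predicate matches it
]

content_names = [p.name for p in find_properties(SAMPLE, in_category("content"))]
print(content_names)
```

Building the predicate with a small factory such as `in_category` keeps the search call short and makes it easy to reuse the same filter for several files or to combine it with other conditions.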

More resources

Advanced usage topics

To learn more about the library's features and how to manage metadata, refer to the advanced usage section.

GitHub examples

You can run the code above and see the feature in action in our GitHub examples.

Free online document metadata management App

You can view and edit the metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, email, image and other files with our Free Online Document Metadata Viewing and Editing App.