Using GroupDocs.Metadata for .NET, you can easily extract metadata from PDF, DOC, PPT, XLS and many other file types in your .NET solution.
GroupDocs.Metadata for .NET supports many file formats. See the full list in the supported file formats article.
You don't need to worry about the exact file format or the metadata standards it uses. The same code works in the same way for all supported formats.
The most commonly used metadata properties are marked with tags, which let you search for them across all supported files and metadata packages. All tags defined in GroupDocs.Metadata are grouped into categories, which makes it easier to find the tag you need.
This article demonstrates some advanced uses of tags, categories and other attributes of metadata properties.
The following steps and the C# code samples below show how to extract metadata properties from files in your .NET solution:
1. Load a file to be searched for metadata properties
2. Make up a predicate to examine all extracted metadata properties
3. Pass the predicate to the FindProperties method
4. Iterate through the found properties
    foreach (string file in Directory.GetFiles(Constants.InputPath))
    {
        using (Metadata metadata = new Metadata(file))
        {
            if (metadata.FileFormat != FileFormat.Unknown && !metadata.GetDocumentInfo().IsEncrypted)
            {
                // Fetch all metadata properties that fall into a particular category
                var properties = metadata.FindProperties(p => p.Tags.Any(t => t.Category == Tags.Content));
                Console.WriteLine("The metadata properties describing some characteristics of the file content: title, keywords, language, etc.");
                foreach (var property in properties)
                {
                    Console.WriteLine("{0} = {1}", property.Name, property.Value);
                }
            }
        }
    }
Extracting Metadata Properties By Type And Value
    foreach (string file in Directory.GetFiles(Constants.InputPath))
    {
        using (Metadata metadata = new Metadata(file))
        {
            if (metadata.FileFormat != FileFormat.Unknown && !metadata.GetDocumentInfo().IsEncrypted)
            {
                // Fetch all properties having a specific type and value
                var year = DateTime.Today.Year;
                var properties = metadata.FindProperties(p => p.Value.Type == MetadataPropertyType.DateTime &&
                                                              p.Value.ToStruct(DateTime.MinValue).Year == year);
                Console.WriteLine("All datetime properties with the year value equal to the current year");
                foreach (var property in properties)
                {
                    Console.WriteLine("{0} = {1}", property.Name, property.Value);
                }
            }
        }
    }
Extracting Metadata Properties By Regular Expression
    // Regex requires: using System.Text.RegularExpressions;
    foreach (string file in Directory.GetFiles(Constants.InputPath))
    {
        using (Metadata metadata = new Metadata(file))
        {
            if (metadata.FileFormat != FileFormat.Unknown && !metadata.GetDocumentInfo().IsEncrypted)
            {
                // Fetch all properties whose names match the specified regex
                const string pattern = "^author|company|(.+date.*)$";
                Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
                var properties = metadata.FindProperties(p => regex.IsMatch(p.Name));
                Console.WriteLine("All properties whose names match the following regex: {0}", pattern);
                foreach (var property in properties)
                {
                    Console.WriteLine("{0} = {1}", property.Name, property.Value);
                }
            }
        }
    }
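A note on the pattern ^author|company|(.+date.*)$ used in the regex sample: in .NET regular expressions the alternation operator | has the lowest precedence, so ^ applies only to the first alternative and $ only to the last one. The pattern therefore matches any name that starts with "author", contains "company" anywhere, or contains "date" preceded by at least one character. The sketch below uses only System.Text.RegularExpressions, not GroupDocs.Metadata, and compares that pattern with a hypothetical stricter variant in which the alternatives are grouped. The class name, the Strict pattern and the sample property names are our own assumptions for illustration and are not taken from the library or from a real file.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexAnchoringDemo
{
    // Pattern from the sample: '^' binds only to "author",
    // '$' binds only to "(.+date.*)", and "company" is unanchored.
    public static readonly Regex Loose =
        new Regex("^author|company|(.+date.*)$", RegexOptions.IgnoreCase);

    // Hypothetical stricter variant (an assumption, not part of the sample):
    // grouping the alternatives makes both anchors apply to every branch.
    public static readonly Regex Strict =
        new Regex("^(?:author|company|.+date.*)$", RegexOptions.IgnoreCase);

    public static void Main()
    {
        // Illustrative property names only.
        string[] names =
        {
            "Author", "AuthorEmail", "Company",
            "HyperlinkCompanyBase", "LastSaveDate", "Date"
        };

        foreach (string name in names)
        {
            Console.WriteLine("{0,-22} loose={1,-5} strict={2}",
                name, Loose.IsMatch(name), Strict.IsMatch(name));
        }
    }
}
```

With these names, "AuthorEmail" and "HyperlinkCompanyBase" match the original pattern but not the grouped one, while "Date" matches neither because .+ requires at least one character before "date". Which behaviour is preferable depends on how broad the search is meant to be. The original pattern is the more permissive choice.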
More resources
GitHub examples
You can run the code above and see the feature in action in our GitHub examples.
Along with the full-featured .NET library, we provide simple but powerful free apps.
You are welcome to view and edit metadata of PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, images and more with our Free Online Document Metadata Viewing and Editing App.