To extract metadata from PDF documents, use the getMetadata method. This method extracts the following metadata:
Name                 Description
title                The title of the document.
subject              The subject of the document.
keywords             The keywords of the document.
author               The name of the document's author.
application          The name of the application that created the document.
application-version  The version number of the application that created the document.
created-time         The time the document was created.
last-saved-time      The time the document was last saved.
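Because each metadata item is a name/value pair, the names in the table above can serve as lookup keys. The sketch below is only an illustration of that idea: `Item` is a hypothetical stand-in for the library's `MetadataItem` (assumed here to expose `getName()` and `getValue()` returning strings), `MetadataLookup` is an illustrative helper, and the sample values are made up so that the code runs without the library.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the library's MetadataItem (a name/value pair).
final class Item {
    private final String name;
    private final String value;

    Item(String name, String value) {
        this.name = name;
        this.value = value;
    }

    String getName() { return name; }
    String getValue() { return value; }
}

// Illustrative helper, not part of the library.
final class MetadataLookup {
    // Copies the items into a map keyed by metadata name, preserving order.
    static Map<String, String> toMap(Iterable<Item> items) {
        Map<String, String> byName = new LinkedHashMap<>();
        for (Item item : items) {
            byName.put(item.getName(), item.getValue());
        }
        return byName;
    }

    public static void main(String[] args) {
        // Made-up sample values; real ones would come from parser.getMetadata().
        List<Item> items = Arrays.asList(
                new Item("title", "Quarterly report"),
                new Item("author", "Jane Doe"));

        Map<String, String> byName = toMap(items);
        System.out.println("title: " + byName.get("title"));
        // Names that are absent from the document fall back to a default.
        System.out.println("keywords: " + byName.getOrDefault("keywords", "(not set)"));
    }
}
```

A map is convenient when only a few known fields are needed; iterating over the items directly remains the simpler choice for printing everything.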
Here are the steps to extract metadata from a PDF document:
Instantiate a Parser object for the initial document;
Call the getMetadata method and obtain a collection of document metadata objects;
Iterate through the collection and get the metadata names and values.
Warning
The getMetadata method returns null if metadata extraction isn't supported for the document. For example, metadata extraction isn't supported for TXT files, so getMetadata returns null for a TXT file. If a PDF document has no metadata, getMetadata returns an empty collection.
The following example demonstrates how to extract metadata from a PDF document:
// Create an instance of the Parser class
try (Parser parser = new Parser(Constants.SamplePdf)) {
    // Extract metadata from the document
    Iterable<MetadataItem> metadata = parser.getMetadata();
    // Iterate over metadata items
    for (MetadataItem item : metadata) {
        // Print an item name and value
        System.out.println(String.format("%s: %s", item.getName(), item.getValue()));
    }
}
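Since getMetadata can return null for unsupported formats, code that handles arbitrary files may want to tell three outcomes apart: extraction not supported, no metadata present, and metadata found. The following sketch shows one possible way to do that. It is not part of the library: `MetaItem` is a hypothetical stand-in for `MetadataItem`, and `describe` is an illustrative helper that receives whatever a getMetadata-style call returned. The sample value is made up so that the code runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the library's MetadataItem (a name/value pair).
final class MetaItem {
    private final String name;
    private final String value;

    MetaItem(String name, String value) {
        this.name = name;
        this.value = value;
    }

    String getName() { return name; }
    String getValue() { return value; }
}

// Illustrative helper, not part of the library.
final class NullSafeMetadata {
    // Turns the result of a getMetadata-style call into printable lines.
    static List<String> describe(Iterable<MetaItem> metadata) {
        // null means extraction isn't supported for this format.
        if (metadata == null) {
            return Collections.singletonList("Metadata extraction isn't supported");
        }
        List<String> lines = new ArrayList<>();
        for (MetaItem item : metadata) {
            lines.add(String.format("%s: %s", item.getName(), item.getValue()));
        }
        // An empty collection means the document carries no metadata.
        if (lines.isEmpty()) {
            lines.add("The document has no metadata");
        }
        return lines;
    }

    public static void main(String[] args) {
        // null models an unsupported format such as TXT.
        System.out.println(describe(null));
        // An empty collection models a PDF without metadata.
        System.out.println(describe(Collections.<MetaItem>emptyList()));
        // Made-up value; a real one would come from parser.getMetadata().
        System.out.println(describe(Arrays.asList(new MetaItem("author", "Jane Doe"))));
    }
}
```

Checking for null before the loop matters because a for-each loop over a null Iterable throws a NullPointerException.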
More resources
GitHub examples
You can run the code above and see the feature in action in our GitHub examples.
Along with the full-featured Java library, we provide simple but powerful free apps.
You are welcome to parse documents and extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, and more with our free online Document Parser App.