To search for a keyword in PDF documents, use the search(String) method. This method returns a collection of SearchResult objects. For details, see Search Text.
Here are the steps to search for a keyword in a PDF document:
Instantiate Parser object for the initial document;
Call search(String) method and obtain the collection of SearchResult objects;
Iterate through the collection and get the position and text.
Warning
The search(String) method returns null if search isn't supported for the document. For example, text extraction isn't supported for Zip archives, so search(String) returns null for a Zip archive. For an empty PDF document, search(String) returns an empty collection.
The following example shows how to find a keyword in a PDF document:
// Create an instance of Parser class
try (Parser parser = new Parser(Constants.SamplePdf)) {
    // Search a keyword:
    Iterable<SearchResult> sr = parser.search("nunc");
    // Iterate over search results
    for (SearchResult s : sr) {
        // Print an index and found text:
        System.out.println(String.format("At %d: %s", s.getPosition(), s.getText()));
    }
}
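Because search(String) may return null for unsupported formats, it can be worth guarding the loop before iterating. The following is a minimal sketch of one way to do that. The class NullSafeSearch and its helpers orEmpty and count are hypothetical names introduced here for illustration and are not part of the library. The sketch uses plain strings as stand-ins for SearchResult objects so that it runs without the library on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;

class NullSafeSearch {
    // Hypothetical helper: returns the given results, or an empty Iterable
    // when the argument is null (as happens when search isn't supported).
    public static <T> Iterable<T> orEmpty(Iterable<T> results) {
        return results != null ? results : Collections.<T>emptyList();
    }

    // Hypothetical helper: counts the items, treating null as "no results".
    public static <T> int count(Iterable<T> results) {
        int n = 0;
        for (T ignored : orEmpty(results)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Stand-in for the null returned when search isn't supported
        Iterable<String> unsupported = null;
        // Stand-in for a collection of search results
        Iterable<String> found = Arrays.asList("nunc", "nunc");

        System.out.println(count(unsupported)); // prints 0
        System.out.println(count(found));       // prints 2
    }
}
```

With a real parser, the same idea would presumably be applied by wrapping the call, for example iterating over orEmpty(parser.search("nunc")) instead of the raw return value.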
The search(String, SearchOptions) method is used for advanced search in PDF documents, such as search with regular expressions or search by pages. The SearchOptions parameter customizes the search.
Here are the steps to search with a regular expression in a PDF document:
Instantiate Parser object for the initial document;
Instantiate SearchOptions object with the parameters for the search;
Call search(String, SearchOptions) method and obtain the collection of SearchResult objects;
Iterate through the collection and get the position and text.
The following example shows how to search with a regular expression in a PDF document:
// Create an instance of Parser class
try (Parser parser = new Parser(Constants.SamplePdf)) {
    // Search with a regular expression with case matching
    Iterable<SearchResult> sr = parser.search("(\\sut\\s)", new SearchOptions(true, false, true));
    // Iterate over search results
    for (SearchResult s : sr) {
        // Print an index and found text:
        System.out.println(String.format("At %d: %s", s.getPosition(), s.getText()));
    }
}
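To see what a pattern such as \sut\s (the word "ut" surrounded by whitespace) matches, and how case matching changes the result, it can help to try it on a plain string first. The sketch below uses the standard java.util.regex package as a stand-in. It is an assumption that the library's regular expression search behaves the same way, and the positions it reports for a real document may be computed differently. RegexSearchSketch and findAll are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSearchSketch {
    // Hypothetical helper: returns one "index:'text'" entry per match of regex in text.
    // matchCase = false switches java.util.regex to case-insensitive matching.
    public static List<String> findAll(String regex, String text, boolean matchCase) {
        int flags = matchCase ? 0 : Pattern.CASE_INSENSITIVE;
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.start() + ":'" + m.group() + "'");
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Lorem ut dolor, UT enim ad minim.";
        // In Java source, "\\s" is the regex \s (any whitespace character)
        System.out.println(findAll("(\\sut\\s)", text, true));  // prints [5:' ut ']
        System.out.println(findAll("(\\sut\\s)", text, false)); // prints [5:' ut ', 15:' UT ']
    }
}
```

Note that the match includes the surrounding whitespace, so the reported index points at the space before the word rather than at the word itself.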
More resources
GitHub examples
You can run the code above and see the feature in action in our GitHub examples.
Along with the full-featured library, we provide simple but powerful free apps.
You are welcome to parse documents and extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails and more with our free online Document Parser App.