The keyword parameter can contain plain text or a regular expression. The search method returns a collection of SearchResult objects, one for each occurrence of the keyword in the document text. This class has the following members:
getPosition() returns a zero-based index of the start position of the search result. Depending on the isSearchByPages property value, this index is counted from the start of the document or from the start of the page.
Check if collection isn’t null (search is supported for the document);
Iterate through the collection and get position and text.
The following example shows how to find a keyword in a document:
// Create an instance of Parser class
try (Parser parser = new Parser(Constants.SamplePdf)) {
    // Search a keyword:
    Iterable<SearchResult> sr = parser.search("lorem");
    // Check if search is supported
    if (sr == null) {
        System.out.println("Search isn't supported");
        return;
    }
    // Iterate over search results
    for (SearchResult s : sr) {
        // Print an index and found text:
        System.out.println(String.format("At %d: %s", s.getPosition(), s.getText()));
    }
}
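To make the meaning of a zero-based position concrete, here is a minimal, library-independent sketch in plain Java. It does not use GroupDocs.Parser; the class KeywordSearchSketch and its nested Hit type are hypothetical names, and the sketch assumes case-insensitive matching over an in-memory string.

```java
import java.util.ArrayList;
import java.util.List;

class KeywordSearchSketch {
    /** Hypothetical stand-in for a search hit: zero-based position plus the matched text. */
    static final class Hit {
        final int position;
        final String text;

        Hit(int position, String text) {
            this.position = position;
            this.text = text;
        }
    }

    /** Collects every non-overlapping, case-insensitive occurrence of keyword in text. */
    static List<Hit> find(String text, String keyword) {
        List<Hit> hits = new ArrayList<>();
        if (keyword.isEmpty()) {
            return hits; // nothing to search for
        }
        int i = 0;
        while (i + keyword.length() <= text.length()) {
            if (text.regionMatches(true, i, keyword, 0, keyword.length())) {
                // Keep the text exactly as it appears in the source string
                hits.add(new Hit(i, text.substring(i, i + keyword.length())));
                i += keyword.length();
            } else {
                i++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        for (Hit h : find("Lorem ipsum dolor. lorem again.", "lorem")) {
            System.out.println(String.format("At %d: %s", h.position, h.text));
        }
    }
}
```

The first character of the text has position 0, so a keyword that opens the text is reported at index 0.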
Search text by regular expression
The SearchOptions parameter is used to customize a search. This class has the following members:
Check if collection isn’t null (search is supported for the document);
Iterate through the collection and get position and text.
The following example shows how to search with a regular expression in a document:
// Create an instance of Parser class
try (Parser parser = new Parser(Constants.SamplePdf)) {
    // Search with a regular expression with case matching
    Iterable<SearchResult> sr = parser.search("[0-9]+", new SearchOptions(true, false, true));
    // Check if search is supported
    if (sr == null) {
        System.out.println("Search isn't supported");
        return;
    }
    // Iterate over search results
    for (SearchResult s : sr) {
        // Print an index and found text:
        System.out.println(String.format("At %d: %s", s.getPosition(), s.getText()));
    }
}
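The effect of options such as case matching, whole-word matching and regular-expression mode can be illustrated with the standard java.util.regex package. This is a sketch only: RegexSearchSketch and its three boolean flags are hypothetical and are not the SearchOptions API; they only show one plausible way such flags could map onto a Pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSearchSketch {
    /** Builds a Pattern from three hypothetical flags (not the library's parameters). */
    static Pattern compile(String keyword, boolean matchCase, boolean wholeWord, boolean useRegex) {
        // Plain-text keywords are quoted so that characters like '.' lose their regex meaning
        String expr = useRegex ? keyword : Pattern.quote(keyword);
        if (wholeWord) {
            expr = "\\b(?:" + expr + ")\\b";
        }
        int flags = matchCase ? 0 : (Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        return Pattern.compile(expr, flags);
    }

    /** Returns one "position:text" entry per match, with zero-based positions. */
    static List<String> find(String text, Pattern pattern) {
        List<String> out = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            out.add(m.start() + ":" + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Order 66 shipped 2024 items";
        for (String hit : find(text, compile("[0-9]+", true, false, true))) {
            System.out.println(hit);
        }
    }
}
```

Quoting the keyword in plain-text mode is the key design choice here: without it, a keyword such as "a.b" would also match "axb".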
Search text with highlights
Here are the steps to search a text with highlights:
Instantiate Parser object for the initial document;
Instantiate HighlightOptions object with the parameters for the highlight extraction;
Instantiate SearchOptions object with the parameters for the search;
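Call the search method and obtain the collection of SearchResult objects;
Check if collection isn’t null (search is supported for the document);
Iterate through the collection and get the found text with its left and right highlights.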
The following example shows how to search a text with the highlights:
// Create an instance of Parser class
try (Parser parser = new Parser(Constants.SamplePdf)) {
    HighlightOptions highlightOptions = new HighlightOptions(15);
    // Search a keyword:
    Iterable<SearchResult> sr = parser.search("lorem", new SearchOptions(true, false, false, highlightOptions));
    // Check if search is supported
    if (sr == null) {
        System.out.println("Search isn't supported");
        return;
    }
    // Iterate over search results
    for (SearchResult s : sr) {
        // Print the found text and highlights:
        System.out.println(String.format("%s%s%s", s.getLeftHighlightItem().getText(), s.getText(), s.getRightHighlightItem().getText()));
    }
}
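A highlight can be thought of as the text immediately to the left and right of a match. The following plain-Java sketch shows that idea under the assumption that a highlight is limited to a maximum number of characters; HighlightSketch and its maxLength parameter are hypothetical, and how the library itself bounds a highlight is not confirmed here.

```java
class HighlightSketch {
    /**
     * Returns {left, match, right} for the first case-insensitive occurrence
     * of keyword, or null when the keyword is not found.
     */
    static String[] firstWithContext(String text, String keyword, int maxLength) {
        for (int i = 0; i + keyword.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, keyword, 0, keyword.length())) {
                int end = i + keyword.length();
                // Clip the context window at the boundaries of the text
                String left = text.substring(Math.max(0, i - maxLength), i);
                String right = text.substring(end, Math.min(text.length(), end + maxLength));
                return new String[] { left, text.substring(i, end), right };
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] parts = firstWithContext("The quick brown fox jumps", "brown", 6);
        if (parts != null) {
            System.out.println(String.format("%s[%s]%s", parts[0], parts[1], parts[2]));
        }
    }
}
```

When the match sits near the start or end of the text, the corresponding highlight is simply shorter than maxLength.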
Search text with page numbers
Here are the steps to search a text with page numbers:
Instantiate Parser object for the initial document;
Instantiate SearchOptions object with the parameters for the search;
Call the search method and obtain the collection of SearchResult objects;
Check if collection isn’t null (search is supported for the document);
Iterate through the collection and get position, text and page number.
The following example shows how to search a text with page numbers:
// Create an instance of Parser class
try (Parser parser = new Parser(Constants.SamplePdf)) {
    // Search a keyword with page numbers
    Iterable<SearchResult> sr = parser.search("lorem", new SearchOptions(false, false, false, true));
    // Check if search is supported
    if (sr == null) {
        System.out.println("Search isn't supported");
        return;
    }
    // Iterate over search results
    for (SearchResult s : sr) {
        // Print an index, page number and found text:
        System.out.println(String.format("At %d (%d): %s", s.getPosition(), s.getPageIndex(), s.getText()));
    }
}
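The difference between a position counted from the document start and one counted from the page start can be sketched in plain Java. This example is not GroupDocs.Parser code: PagePositionSketch is a hypothetical class, pages are modeled as a list of strings, the document text is assumed to be the pages concatenated without separators, and matches that span two pages are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PagePositionSketch {
    /** Returns one int[]{pageIndex, positionInPage, positionInDocument} per occurrence. */
    static List<int[]> find(List<String> pages, String keyword) {
        List<int[]> hits = new ArrayList<>();
        if (keyword.isEmpty()) {
            return hits; // nothing to search for
        }
        int pageStart = 0; // offset of the current page within the whole document text
        for (int page = 0; page < pages.size(); page++) {
            String text = pages.get(page);
            int i = 0;
            while (i + keyword.length() <= text.length()) {
                if (text.regionMatches(true, i, keyword, 0, keyword.length())) {
                    hits.add(new int[] { page, i, pageStart + i });
                    i += keyword.length();
                } else {
                    i++;
                }
            }
            pageStart += text.length();
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("lorem ipsum", "dolor lorem");
        for (int[] h : find(pages, "lorem")) {
            System.out.println(String.format("Page %d, in page: %d, in document: %d", h[0], h[1], h[2]));
        }
    }
}
```

The two positions coincide on the first page and diverge on every later page by the total length of the preceding pages.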
More resources
GitHub examples
You may easily run the code above and see the feature in action in our GitHub examples: