The value of a field is available through the getPageArea method. Depending on the type of the value, it returns an instance of the PageTextArea or PageTableArea class:
// Get the field data
FieldData field = data.get(i);
// Check whether the field data contains text
if (field.getPageArea() instanceof PageTextArea) {
    // Print the field value
    System.out.println(((PageTextArea) field.getPageArea()).getText());
}
The PageTextArea class represents a text block on the page. The examples on this page read its value with the getText method.
A text area can be single or composite. A single text area contains text bounded by a rectangular area. A composite text area contains other text areas, and its text and table properties are calculated from its child text areas.
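The single/composite distinction can be pictured with a small stand-alone sketch. The TextAreaSketch class below is a hypothetical stand-in written for illustration only; it is not the library's PageTextArea, and joining child texts with a space is an assumption rather than documented behavior.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration; not the library's PageTextArea.
public class TextAreaSketch {
    private final String ownText;                 // set only for single areas
    private final List<TextAreaSketch> children;  // non-empty only for composite areas

    private TextAreaSketch(String ownText, List<TextAreaSketch> children) {
        this.ownText = ownText;
        this.children = children;
    }

    // A single area holds its own text.
    static TextAreaSketch single(String text) {
        return new TextAreaSketch(text, Collections.<TextAreaSketch>emptyList());
    }

    // A composite area holds child areas and no text of its own.
    static TextAreaSketch composite(TextAreaSketch... children) {
        return new TextAreaSketch(null, Arrays.asList(children));
    }

    boolean isComposite() {
        return !children.isEmpty();
    }

    // For a composite area the text is derived from the children
    // (joined with a space here, which is an assumption of this sketch).
    String getText() {
        if (!isComposite()) {
            return ownText;
        }
        StringBuilder sb = new StringBuilder();
        for (TextAreaSketch child : children) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(child.getText());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TextAreaSketch line = composite(single("Total:"), single("42.00"));
        System.out.println(line.getText()); // prints "Total: 42.00"
    }
}
```

Because the composite delegates to its children recursively, nesting composites inside composites works the same way.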
The PageTableArea class represents a table. The examples on this page use its getRowCount, getColumnCount, and getCell members.
The following example shows how to iterate over the extracted field data:
// Print all extracted data
for (int i = 0; i < data.getCount(); i++) {
    // Print the field name
    System.out.print(data.get(i).getName() + ": ");
    // As we have defined only text fields in the template,
    // we cast the PageArea property value to PageTextArea
    PageTextArea area = data.get(i).getPageArea() instanceof PageTextArea
            ? (PageTextArea) data.get(i).getPageArea()
            : null;
    System.out.println(area == null ? "Not a template field" : area.getText());
}
Get field by name
The following example shows how to get fields by name:
// Print prices
System.out.println("Prices:");
for (FieldData field : data.getFieldsByName("Price")) {
    PageTextArea area = field.getPageArea() instanceof PageTextArea
            ? (PageTextArea) field.getPageArea()
            : null;
    System.out.println(area == null ? "Not a template field" : area.getText());
}
This functionality lets you iterate over all data fields with the same name and select the most suitable one. For example, if more than one text value matches the regular expression of a field, you can iterate over the matches and pick the one you need.
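One possible way to pick the most suitable value is to apply a stricter pattern to the candidate texts. The sketch below works on plain strings, which stand in for the values returned by getText; the pickFirstMatch helper, the sample values, and the two-decimal amount pattern are all hypothetical and not part of the library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

public class CandidateSelection {
    // Hypothetical helper: returns the first candidate that fully matches the stricter pattern.
    static Optional<String> pickFirstMatch(List<String> candidates, Pattern strict) {
        for (String candidate : candidates) {
            if (candidate == null) {
                continue; // skip fields that had no text value
            }
            String trimmed = candidate.trim();
            if (strict.matcher(trimmed).matches()) {
                return Optional.of(trimmed);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Sample values standing in for the texts of several "Price" fields
        List<String> texts = Arrays.asList("Price list", " 19.99 ", "7");
        // Assumed rule for this sketch: an amount has digits, a dot, and exactly two decimals
        Pattern amount = Pattern.compile("\\d+\\.\\d{2}");
        System.out.println(pickFirstMatch(texts, amount).orElse("no match")); // prints "19.99"
    }
}
```

Returning an Optional keeps the "nothing suitable" case explicit, so the caller decides what to do when no candidate passes.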
Working with tables
The following example shows how to work with extracted tables:
// Parse the document by the template
DocumentData data = parser.parseByTemplate(template);
// Print all extracted data
for (int i = 0; i < data.getCount(); i++) {
    System.out.print(data.get(i).getName() + ": ");
    // Check if the field is a table
    PageTableArea area = data.get(i).getPageArea() instanceof PageTableArea
            ? (PageTableArea) data.get(i).getPageArea()
            : null;
    if (area == null) {
        continue;
    }
    // Iterate over the table rows
    for (int row = 0; row < area.getRowCount(); row++) {
        // Iterate over the table columns
        for (int column = 0; column < area.getColumnCount(); column++) {
            // Get the cell value
            PageTextArea cellValue = area.getCell(row, column).getPageArea() instanceof PageTextArea
                    ? (PageTextArea) area.getCell(row, column).getPageArea()
                    : null;
            // Print the separator between columns
            if (column > 0) {
                System.out.print("\t");
            }
            // Print the cell value
            System.out.print(cellValue == null ? "" : cellValue.getText());
        }
        // Print a new line
        System.out.println();
    }
}
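Tab-separated output becomes ambiguous when a cell itself contains a separator. As a possible follow-up step, the sketch below writes cell texts as CSV with basic quoting. It operates on a plain String[][] that stands in for the cell texts collected in the loop above; the TableToCsv class and its methods are hypothetical helpers, not part of the library.

```java
public class TableToCsv {
    // Quotes a value when it contains a comma, a double quote, or a line break;
    // embedded double quotes are doubled. A null cell becomes an empty field.
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        String escaped = value.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Joins each row with commas and ends every row with a line feed.
    static String toCsv(String[][] cells) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : cells) {
            for (int column = 0; column < row.length; column++) {
                if (column > 0) {
                    sb.append(',');
                }
                sb.append(escape(row[column]));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample data standing in for extracted cell texts
        String[][] cells = {
            {"Item", "Price"},
            {"Tea, green", null}
        };
        System.out.print(toCsv(cells));
    }
}
```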
More resources
GitHub examples
You may easily run the code above and see the feature in action in our GitHub examples: