Field data is stored in the PageArea property. Depending on the type of the value, it contains an instance of either the PageTextArea or the PageTableArea class:
// Get the field data
FieldData field = data[i];
// Check if the field data contains a text
if (field.PageArea is PageTextArea)
{
    // Print the field value
    Console.WriteLine((field.PageArea as PageTextArea).Text);
}
PageTextArea class represents a text block on the page. This class has the following members:
A text area can be single or composite. A single text area contains text bounded by a rectangular area. A composite text area contains other text areas; its text and table properties are calculated from its child text areas.
PageTableArea class represents a table. This class has the following members:
The following example shows how to iterate over the extracted field data:
for (int i = 0; i < data.Count; i++)
{
    Console.Write(data[i].Name + ": ");
    PageTextArea area = data[i].PageArea as PageTextArea;
    Console.WriteLine(area == null ? "Not a template field" : area.Text);
}
Get field by name
The following example shows how to get fields by name:
// Get all the fields with "Address" name
IList<FieldData> addressFields = data.GetFieldsByName("Address");
if (addressFields.Count == 0)
{
    Console.WriteLine("Address not found");
}
else
{
    Console.WriteLine("Address");
    // Iterate over the fields collection
    for (int i = 0; i < addressFields.Count; i++)
    {
        PageTextArea area = addressFields[i].PageArea as PageTextArea;
        Console.WriteLine(area == null ? "Not a template field" : area.Text);
        // If it's a related field:
        if (addressFields[i].LinkedField != null)
        {
            Console.Write("Linked to ");
            PageTextArea linkedArea = addressFields[i].LinkedField.PageArea as PageTextArea;
            Console.WriteLine(linkedArea == null ? "Not a template field" : linkedArea.Text);
        }
    }
}
This functionality makes it possible to iterate over all data fields with the same name and pick the most suitable one. For example, if more than one text value matches the regular expression, you can iterate over the matches and select the one that fits best.
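What counts as "most suitable" depends on the application. As a minimal sketch of one possible rule, the following program picks the longest non-blank candidate. The `FieldSelection.PickLongest` helper and the sample strings are hypothetical and are not part of the library. In real code the candidates would be the `Text` values of the fields returned by `GetFieldsByName`, read as in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FieldSelection
{
    // Hypothetical helper (not a library API): returns the longest
    // non-blank candidate, or null when there is no usable candidate.
    public static string PickLongest(IEnumerable<string> candidates)
    {
        return candidates
            .Where(text => !string.IsNullOrWhiteSpace(text))
            .OrderByDescending(text => text.Trim().Length)
            .FirstOrDefault();
    }
}

static class Program
{
    static void Main()
    {
        // Stand-in values for the Text of each field named "Address";
        // these strings are made up for illustration.
        var candidates = new List<string>
        {
            "N/A",
            "",
            "4 Privet Drive, Little Whinging"
        };

        string best = FieldSelection.PickLongest(candidates);
        Console.WriteLine(best ?? "Address not found");
    }
}
```

Other criteria, such as preferring the first match on the page or a value that passes a stricter validation, can be swapped in by changing the ordering inside the helper.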
Working with tables
The following example shows how to work with extracted tables. Fields that are not tables are skipped:
// Print all extracted data
for (int i = 0; i < data.Count; i++)
{
    Console.Write(data[i].Name + ": ");
    // Check if the field is a table
    PageTableArea area = data[i].PageArea as PageTableArea;
    if (area == null)
    {
        continue;
    }
    // Iterate via table rows
    for (int row = 0; row < area.RowCount; row++)
    {
        // Iterate via table columns
        for (int column = 0; column < area.ColumnCount; column++)
        {
            // Get the cell value
            PageTextArea cellValue = area[row, column].PageArea as PageTextArea;
            // Print the space between columns
            if (column > 0)
            {
                Console.Write("\t");
            }
            // Print the cell value
            Console.Write(cellValue == null ? "" : cellValue.Text);
        }
        // Print a new line
        Console.WriteLine();
    }
}
More resources
GitHub examples
You may easily run the code above and see the feature in action in our GitHub examples:
Along with the full-featured .NET library, we provide simple but powerful free apps.
You are welcome to parse documents and extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails, and more with our Free Online Document Parser App.