Check that data isn’t null (that is, form parsing is supported for the document);
Iterate over field data to obtain form data.
The following example shows a use case where a user fills in a PDF form and sends it, for example, by email. The software opens this PDF and extracts the preliminary record:
// Create an instance of Parser class
using (Parser parser = new Parser(Constants.SampleCarWashPdf))
{
    // Extract data from PDF document
    DocumentData data = parser.ParseForm();
    // Check if form extraction is supported
    if (data == null)
    {
        Console.WriteLine("Form extraction isn't supported.");
        return;
    }

    // Create the preliminary record object
    PreliminaryRecord rec = new PreliminaryRecord();
    rec.Name = GetFieldText(data, "Name");
    rec.Model = GetFieldText(data, "Model");
    rec.Time = GetFieldText(data, "Time");
    rec.Description = GetFieldText(data, "Description");

    // We can save the preliminary record object to the database,
    // send it as the web response or just print it to the console
    Console.WriteLine("Preliminary record");
    Console.WriteLine("Name: {0}", rec.Name);
    Console.WriteLine("Model: {0}", rec.Model);
    Console.WriteLine("Time: {0}", rec.Time);
    Console.WriteLine("Description: {0}", rec.Description);
}

private static string GetFieldText(DocumentData data, string fieldName)
{
    // Get the field from data collection
    FieldData fieldData = data.GetFieldsByName(fieldName).FirstOrDefault();
    // Check if the field data is not null (a field with the fieldName is contained in data collection)
    // and check if the field data contains the text
    return fieldData != null && fieldData.PageArea is PageTextArea
        ? (fieldData.PageArea as PageTextArea).Text
        : null;
}

/// <summary>
/// Simple POCO object to store the extracted data.
/// </summary>
public class PreliminaryRecord
{
    public string Name { get; set; }
    public string Model { get; set; }
    public string Time { get; set; }
    public string Description { get; set; }
}
Iterate over field data to obtain the document data.
The following example shows how to parse data from a PDF document using a user-generated template:
// Create an instance of Parser class
using (Parser parser = new Parser(Constants.SampleInvoicePdf))
{
    // Parse the document by the template
    DocumentData data = parser.ParseByTemplate(GetTemplate());
    // Check if parsing document by template is supported
    if (data == null)
    {
        Console.WriteLine("Parsing Document by Template isn't supported.");
        return;
    }

    // Print extracted fields
    for (int i = 0; i < data.Count; i++)
    {
        Console.Write(data[i].Name + ": ");
        PageTextArea area = data[i].PageArea as PageTextArea;
        Console.WriteLine(area == null ? "Not a template field" : area.Text);
    }
}

private static Template GetTemplate()
{
    // Create detector parameters for "Details" table
    TemplateTableParameters detailsTableParameters = new TemplateTableParameters(new Rectangle(new Point(35, 320), new Size(530, 55)), null);
    // Create detector parameters for "Summary" table
    TemplateTableParameters summaryTableParameters = new TemplateTableParameters(new Rectangle(new Point(330, 385), new Size(220, 65)), null);

    // Create a collection of template items
    TemplateItem[] templateItems = new TemplateItem[]
    {
        new TemplateField(new TemplateFixedPosition(new Rectangle(new Point(35, 135), new Size(100, 10))), "FromCompany"),
        new TemplateField(new TemplateFixedPosition(new Rectangle(new Point(35, 150), new Size(100, 35))), "FromAddress"),
        new TemplateField(new TemplateFixedPosition(new Rectangle(new Point(35, 190), new Size(150, 2))), "FromEmail"),
        new TemplateField(new TemplateFixedPosition(new Rectangle(new Point(35, 250), new Size(100, 2))), "ToCompany"),
        new TemplateField(new TemplateFixedPosition(new Rectangle(new Point(35, 260), new Size(100, 15))), "ToAddress"),
        new TemplateField(new TemplateFixedPosition(new Rectangle(new Point(35, 290), new Size(150, 2))), "ToEmail"),
        new TemplateField(new TemplateRegexPosition("Invoice Number"), "InvoiceNumber"),
        new TemplateField(new TemplateLinkedPosition("InvoiceNumber", new Size(200, 15), new TemplateLinkedPositionEdges(false, false, true, false)), "InvoiceNumberValue"),
        new TemplateField(new TemplateRegexPosition("Order Number"), "InvoiceOrder"),
        new TemplateField(new TemplateLinkedPosition("InvoiceOrder", new Size(200, 15), new TemplateLinkedPositionEdges(false, false, true, false)), "InvoiceOrderValue"),
        new TemplateField(new TemplateRegexPosition("Invoice Date"), "InvoiceDate"),
        new TemplateField(new TemplateLinkedPosition("InvoiceDate", new Size(200, 15), new TemplateLinkedPositionEdges(false, false, true, false)), "InvoiceDateValue"),
        new TemplateField(new TemplateRegexPosition("Due Date"), "DueDate"),
        new TemplateField(new TemplateLinkedPosition("DueDate", new Size(200, 15), new TemplateLinkedPositionEdges(false, false, true, false)), "DueDateValue"),
        new TemplateField(new TemplateRegexPosition("Total Due"), "TotalDue"),
        new TemplateField(new TemplateLinkedPosition("TotalDue", new Size(200, 15), new TemplateLinkedPositionEdges(false, false, true, false)), "TotalDueValue"),
        new TemplateTable(detailsTableParameters, "details", null),
        new TemplateTable(summaryTableParameters, "summary", null)
    };

    // Create a document template
    Template template = new Template(templateItems);
    return template;
}
More resources
GitHub examples
You may easily run the code above and see the feature in action in our GitHub examples:
Along with the full-featured .NET library, we provide simple but powerful free apps.
You are welcome to parse documents and extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails and more with our Free Online Document Parser App.