Templates let you extract structured values (fields, tables, barcodes) from recurring documents. The Python API mirrors the .NET template classes; see groupdocs.parser.templates for the full list.
Regex-based fields
Use TemplateRegexPosition to locate values by pattern and extract only the matched group.
The following sample file is used in this example: invoice.pdf
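To make "extract only the matched group" concrete, here is a minimal sketch using Python's standard re module. It does not use the GroupDocs API. The invoice text, the pattern, and the extract_group helper are hypothetical and only illustrate how a label can anchor a match while the capture group isolates the value.

```python
import re

# Hypothetical text standing in for what a parser would read from a document.
text = "Invoice Number: INV-3337\nDate: 2019-06-01\nTotal: 1,250.00"

# The label anchors the match; the parentheses capture only the value after it.
pattern = re.compile(r"Invoice Number:\s*(\S+)")


def extract_group(pattern, text):
    """Return the first captured group, or None when the pattern is absent."""
    match = pattern.search(text)
    return match.group(1) if match else None


invoice_number = extract_group(pattern, text)
print(invoice_number)
```

Returning None for a missing match keeps the "not found" case explicit, so the caller can decide how to handle it.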
Tables with detector parameters
Define table bounds and column separators with TemplateTableParameters when you need to extract line items:
from groupdocs.parser import Parser
from groupdocs.parser.data import Rectangle, Point, Size
from groupdocs.parser.templates import (
    Template,
    TemplateItem,
    TemplateTable,
    TemplateTableParameters,
)

table_area = Rectangle(Point(175.0, 350.0), Size(400.0, 200.0))
columns = [185.0, 370.0, 425.0, 485.0, 545.0]

table = TemplateTable(
    TemplateTableParameters(table_area, columns),
    "Details",
    0,  # restrict to the first page; omit to scan all pages
)

template = Template([table])

with Parser("./invoice.pdf") as parser:
    data = parser.parse_by_template(template)
    if data:
        details = data["Details"].page_area
        print(f"Rows extracted: {details.row_count}")
The following sample file is used in this example: invoice.pdf
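As a rough mental model of what column separators do, the sketch below assigns text fragments to columns by their x-coordinate. It is plain Python and does not show the library's detection algorithm. The column_index helper and the fragment positions are hypothetical, and whether the library treats the first separator as the table's left edge is not confirmed here.

```python
from bisect import bisect_right

# Separator x-coordinates, taken from the table example above.
separators = [185.0, 370.0, 425.0, 485.0, 545.0]


def column_index(x, separators):
    """Count how many separators lie at or to the left of x."""
    return bisect_right(separators, x)


# Hypothetical text fragments of one row: (left x-coordinate, text).
fragments = [(190.0, "Widget"), (380.0, "2"), (430.0, "9.99"), (490.0, "19.98")]

# Group the fragments of the row by the column they fall into.
row = {}
for x, text in fragments:
    row.setdefault(column_index(x, separators), []).append(text)

print(row)
```

If a value lands in the wrong column when you run a real template, comparing its x-coordinate against the separators in this way is a quick check of which boundary to move.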
Tips
Combine regex, fixed, and linked positions in one template to anchor values reliably.
Keep field names unique (case-insensitive).
Reuse TemplateTableLayout when the same table structure appears on multiple pages; see the API reference for layout helpers.
If a template item cannot be located, the corresponding field/table is empty; handle None in your code.
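The uniqueness tip can be checked before a template is built. The following is a minimal sketch in plain Python. The find_name_clashes helper and the sample names are hypothetical and not part of the GroupDocs API.

```python
from collections import defaultdict


def find_name_clashes(names):
    """Return groups of names that are equal when compared case-insensitively."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]


# "Details" and "details" differ only by case, so they are reported together.
clashes = find_name_clashes(["Details", "Total", "details", "Date"])
print(clashes)
```

Running a check like this over the item names ahead of time surfaces a clash early instead of at parse time.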