Extract data from HTML documents

GroupDocs.Parser provides the functionality to extract data from HTML documents and other markup formats.

The following table provides the list of supported formats:

Format    Description
HTML      Hypertext Markup Language File
XHTML     Extensible Hypertext Markup Language File
MHTML     MIME HTML File
MD        Markdown
XML       XML File

More resources

GitHub examples

You can run the code for this feature and see it in action in our GitHub examples.

Free online document parser App

Along with the full-featured .NET library, we provide simple but powerful free apps.

You are welcome to parse documents and extract data from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails and more with our Free Online Document Parser App.

The following examples demonstrate how to extract data from HTML documents:
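
As a minimal sketch, plain-text extraction from an HTML file could look like the code below. It assumes the GroupDocs.Parser for .NET package is referenced by the project and that a file named sample.html exists in the working directory; the file name is a hypothetical placeholder, not something taken from this page. The sketch uses the Parser class and its GetText method, and treats a null reader as "text extraction is not supported for this document".

```csharp
using System;
using System.IO;
using GroupDocs.Parser;

class Program
{
    static void Main()
    {
        // Hypothetical input path; replace it with a real HTML, XHTML, MHTML, MD or XML file.
        const string filePath = "sample.html";

        // Parser is disposable, so it is wrapped in a using block to release the file.
        using (Parser parser = new Parser(filePath))
        {
            // GetText returns a TextReader over the document's plain text,
            // or null when text extraction is not supported for the document.
            using (TextReader reader = parser.GetText())
            {
                if (reader == null)
                {
                    Console.WriteLine("Text extraction isn't supported for this document.");
                    return;
                }

                // Read the whole extracted text and print it to the console.
                Console.WriteLine(reader.ReadToEnd());
            }
        }
    }
}
```

The same pattern should apply to the other formats in the table above, since the Parser instance is created from a file path regardless of the markup format.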