GroupDocs.Parser for .NET 20.3 Release Notes
This page contains release notes for GroupDocs.Parser for .NET 20.3
Major Features
There are the following improvements in this release:
- Improved the support of text structure extraction
- Improved table of contents extraction API
Full List of Issues Covering all Changes in this Release
Key | Summary | Category |
---|---|---|
PARSERNET-1432 | Improve the support of text structure extraction | Improvement |
PARSERNET-1431 | Improve table of contents extraction API | Improvement |
Public API and Backward Incompatible Changes
Improve table of contents extraction API
Description
This feature improves API of text extraction by table of contents items.
Public API changes
GroupDocs.Parser.Data.TocItem public class was updated with changes as follows:
- Added ExtractText method
- GetText method was marked as obsolete
Usage
The following example how to extract a text by the an item of table of contents:
// Create an instance of Parser class
using (Parser parser = new Parser(Constants.SampleDocxWithToc))
{
// Get table of contents
IEnumerable<TocItem> tocItems = parser.GetToc();
// Check if toc extraction is supported
if (tocItems == null)
{
Console.WriteLine("Table of contents extraction isn't supported");
}
// Iterate over items
foreach (TocItem tocItem in tocItems)
{
// Print the text of the chapter
using (TextReader reader = tocItem.ExtractText())
{
Console.WriteLine("----");
Console.WriteLine(reader.ReadToEnd());
}
}
}
Improve the support of text structure extraction
Description
This feature adds text extraction from shapes, word art objects and text boxes for Microsoft Office formats. Also added hyperlink extraction for spreadsheets and presentations.
The structure of XML representation of a document was changed. For details, see Extract text structure.
Public API changes
There are no changes in public API
Usage
The following example shows how to extract hyperlinks from the document:
// Create an instance of Parser class
using (Parser parser = new Parser(filePath))
{
// Extract text structure to the XML reader
using (XmlReader reader = parser.GetStructure())
{
// Check if text structure extraction is supported
if (reader == null)
{
Console.WriteLine("Text structure extraction isn't supported.");
return;
}
// Process the XML document
// Read the XML document to search hyperlinks
while (reader.Read())
{
// Check if this is a start element with "hyperlink" name
if (reader.NodeType == XmlNodeType.Element && reader.IsStartElement() && reader.Name.ToLowerInvariant() == "hyperlink")
{
// Extract "link" attribute
string value = reader.GetAttribute("link");
if (value != null)
{
Console.WriteLine(value);
}
}
}
}
}