GroupDocs.Parser lets you extract images from PDF, emails, ebooks, Microsoft Office formats: Word (DOC, DOCX), PowerPoint (PPT, PPTX), Excel (XLS, XLSX), LibreOffice formats, and many others (see the full list in the supported document formats article).
GroupDocs.Parser supports both simple and complex image extraction scenarios (see the advanced help section for more).
This article shows how to extract images from any supported format without additional settings.
Extract images from documents
To extract images from a document, call the GetImages method:
IEnumerable<PageImageArea> GetImages();
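The method returns a collection of PageImageArea objects. Each object describes one image: the page it is on (Page), its rectangular area on the page (Rectangle) and its format (FileType). It can also save the image to a file in a different format.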
Here are the steps to extract images from the whole document:
Instantiate a Parser object for the document;
Call the GetImages method and obtain the collection of image objects;
Check that the collection isn't null (a null result means that image extraction isn't supported for the document);
Iterate through the collection and read the size, type and content of each image.
The following example shows how to extract all images from the whole document:
// Create an instance of Parser class
using (Parser parser = new Parser(filePath))
{
    // Extract images
    IEnumerable<PageImageArea> images = parser.GetImages();
    // Check if images extraction is supported
    if (images == null)
    {
        Console.WriteLine("Images extraction isn't supported");
        return;
    }
    // Iterate over images
    foreach (PageImageArea image in images)
    {
        // Print a page index, rectangle and image type:
        Console.WriteLine(string.Format("Page: {0}, R: {1}, Type: {2}", image.Page.Index, image.Rectangle, image.FileType));
    }
}
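The same loop can also write each image to disk. The following is a minimal sketch rather than a definitive implementation. It assumes that PageImageArea exposes a Save(string, ImageOptions) overload and that ImageOptions and ImageFormat live in the GroupDocs.Parser.Options namespace; check both against the API reference for your version. The input path "sample.pdf", the output folder "extracted-images" and the file naming scheme are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

class SaveImagesExample
{
    static void Main(string[] args)
    {
        // Hypothetical input path and output folder; replace with your own.
        string filePath = args.Length > 0 ? args[0] : "sample.pdf";
        string outputDir = "extracted-images";
        Directory.CreateDirectory(outputDir);

        // Create an instance of Parser class
        using (Parser parser = new Parser(filePath))
        {
            // Extract images
            IEnumerable<PageImageArea> images = parser.GetImages();

            // A null result means images extraction isn't supported
            if (images == null)
            {
                Console.WriteLine("Images extraction isn't supported");
                return;
            }

            // Assumed API: request PNG as the output format for every image
            ImageOptions options = new ImageOptions(ImageFormat.Png);

            int imageNumber = 0;
            foreach (PageImageArea image in images)
            {
                // Build a file name such as extracted-images/image-0.png
                string outputPath = Path.Combine(outputDir, string.Format("image-{0}.png", imageNumber));

                // Assumed API: save the image converted to the requested format
                image.Save(outputPath, options);
                imageNumber++;
            }

            Console.WriteLine(string.Format("Saved {0} image(s) to {1}", imageNumber, outputDir));
        }
    }
}
```

Converting every image to one format keeps the output folder uniform regardless of how the images were stored in the source document.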
More resources
Advanced usage topics
To learn more about the image extraction feature, see the advanced help section.
GitHub examples
You may easily run the code above and see the feature in action in our GitHub examples: