Extract images from document

GroupDocs.Parser extracts images from documents through the GetImages method:

IEnumerable<PageImageArea> GetImages();

The method returns a collection of PageImageArea objects:

Member | Description
Page | The page that contains the image.
Rectangle | The rectangular area on the page that contains the image.
FileType | The format of the image.
Rotation | The rotation angle of the image.
Stream GetImageStream() | Returns the image stream.
Stream GetImageStream(ImageOptions) | Returns the image stream in a different format.
Save(string) | Saves the image to the file.
Save(string, ImageOptions) | Saves the image to the file in a different format.
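
As an illustration of the members listed above, the following minimal sketch copies each extracted image to a file in its original format through GetImageStream(). The input path "sample.pdf", the "image-N" file naming and the class name are hypothetical, and the sketch assumes that FileType exposes an Extension property that includes the leading dot.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;

class SaveImagesFromStreams
{
    static void Main(string[] args)
    {
        // Hypothetical input path; pass a real document as the first argument.
        string filePath = args.Length > 0 ? args[0] : "sample.pdf";

        using (Parser parser = new Parser(filePath))
        {
            IEnumerable<PageImageArea> images = parser.GetImages();
            // A null result means image extraction isn't supported for this document
            if (images == null)
            {
                Console.WriteLine("Images extraction isn't supported");
                return;
            }

            int imageNumber = 0;
            foreach (PageImageArea image in images)
            {
                // Assumption: FileType.Extension includes the leading dot (for example ".png")
                string outputPath = "image-" + imageNumber + image.FileType.Extension;

                // Copy the image content to the output file without converting it
                using (Stream imageStream = image.GetImageStream())
                using (Stream destination = File.Create(outputPath))
                {
                    imageStream.CopyTo(destination);
                }

                Console.WriteLine("Saved {0} (rotation: {1})", outputPath, image.Rotation);
                imageNumber++;
            }
        }
    }
}
```

Working with the stream is useful when the image should go somewhere other than a file on disk, for example into memory or a database; for plain files, Save(string) is the shorter route.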

The ImageOptions class defines the format to which the image is converted. The following image formats are supported:

  • Bmp
  • Gif
  • Jpeg
  • Png
  • WebP
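
To illustrate format conversion, the sketch below saves every extracted image as PNG by passing an ImageOptions instance to Save(string, ImageOptions). It assumes that ImageOptions takes the target format as an ImageFormat value in its constructor and that both types live in the GroupDocs.Parser.Options namespace; the input path, output file names and class name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

class SaveImagesAsPng
{
    static void Main(string[] args)
    {
        // Hypothetical input path; pass a real document as the first argument.
        string filePath = args.Length > 0 ? args[0] : "sample.pdf";

        using (Parser parser = new Parser(filePath))
        {
            IEnumerable<PageImageArea> images = parser.GetImages();
            // A null result means image extraction isn't supported for this document
            if (images == null)
            {
                Console.WriteLine("Images extraction isn't supported");
                return;
            }

            // Assumption: the constructor accepts the target image format
            ImageOptions options = new ImageOptions(ImageFormat.Png);

            int imageNumber = 0;
            foreach (PageImageArea image in images)
            {
                // Convert the image to PNG while saving it
                string outputPath = "image-" + imageNumber + ".png";
                image.Save(outputPath, options);
                Console.WriteLine("Saved {0} (original type: {1})", outputPath, image.FileType);
                imageNumber++;
            }
        }
    }
}
```

The same options object can be passed to GetImageStream(ImageOptions) when the converted image is needed as a stream rather than a file.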

Here are the steps to extract images from the whole document:

  • Instantiate a Parser object for the source document;
  • Call the GetImages method to obtain a collection of PageImageArea objects;
  • Check that the collection isn't null (a null result means image extraction isn't supported for the document);
  • Iterate through the collection and read each image's size, type and contents.

The following example shows how to extract all images from the whole document:

// Namespaces used by this example
using System;
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;

// Create an instance of Parser class (filePath is the path to the source document)
using (Parser parser = new Parser(filePath))
{
    // Extract images
    IEnumerable<PageImageArea> images = parser.GetImages();
    // Check if images extraction is supported
    if (images == null)
    {
        Console.WriteLine("Images extraction isn't supported");
        return;
    }
    // Iterate over images
    foreach (PageImageArea image in images)
    {
        // Print a page index, rectangle and image type:
        Console.WriteLine("Page: {0}, R: {1}, Type: {2}", image.Page.Index, image.Rectangle, image.FileType);
    }
}

More resources

GitHub examples

You can run the code above and see the feature in action in our GitHub examples.

Free online image extractor App

Along with the full-featured .NET library, we provide simple but powerful free apps.

You are welcome to extract images from PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, emails and more with our free online GroupDocs Parser App.