Get Document Information

GroupDocs.Viewer exposes two inspection methods on the Viewer class:

  • get_file_info() — returns a FileInfo result with the detected file type and name. Cheap; no rendering required.
  • get_view_info(options) — returns a ViewInfo result that includes the page count, per-page dimensions, and format-specific metadata. Takes a ViewInfoOptions created with one of the for_html_view() / for_png_view() / for_jpg_view() / for_pdf_view() factory methods.

For several format families, get_view_info returns a format-specific subclass of ViewInfo that exposes extra properties:

Format family            | Result type               | Extra properties
PDF                      | PdfViewInfo               | printing_allowed
Archive (ZIP / RAR / 7Z) | ArchiveViewInfo           | folders
CAD (DWG / DXF / DGN)    | CadViewInfo               | layers, layouts
Outlook (PST / OST)      | OutlookViewInfo           | folders
Project (MPP / MPT)      | ProjectManagementViewInfo | start_date, end_date

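To access the extra properties, cast the ViewInfo result to the appropriate subclass. The examples on this page use typing.cast, which only informs static type checkers: it does not convert or validate the object at runtime.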
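Because `typing.cast` performs no runtime check, a guard may be useful when the input format is not known in advance. The sketch below uses stand-in classes, not the real `groupdocs.viewer.results` types, and assumes that the library's result objects can be tested with `isinstance`, which is not confirmed here. `printing_allowed_or_none` is a hypothetical helper name.

```python
from typing import cast


class ViewInfo:
    """Stand-in for the base result type (illustration only)."""


class PdfViewInfo(ViewInfo):
    """Stand-in for the PDF-specific result type (illustration only)."""

    printing_allowed = True


def printing_allowed_or_none(info: ViewInfo):
    """Return printing_allowed for PDF results and None for anything else."""
    # isinstance inspects the actual runtime type before the PDF-only
    # property is touched.
    if isinstance(info, PdfViewInfo):
        return info.printing_allowed
    return None


plain = ViewInfo()
# cast() neither converts nor validates: it returns the same object.
assert cast(PdfViewInfo, plain) is plain

print(printing_allowed_or_none(PdfViewInfo()))  # True
print(printing_allowed_or_none(plain))          # None
```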

Example 1: Get File Type and Pages Count

The following code snippet shows how to get the file type and the page count for any supported document:

from groupdocs.viewer import Viewer
from groupdocs.viewer.options import ViewInfoOptions

def get_file_type_and_pages_count():
    # Load PDF document
    with Viewer("sample.pdf") as viewer:
        info = viewer.get_view_info(ViewInfoOptions.for_html_view())

        print("Document type:", info.file_type)
        print("Pages count:", len(info.pages))

if __name__ == "__main__":
    get_file_type_and_pages_count()

sample.pdf is the sample file used in this example. The code prints the following output:

Document type: Portable Document Format File (.pdf)
Pages count: 2

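The per-page dimensions mentioned earlier can be read from the same `pages` collection. This is a minimal sketch under an assumption: that each page object exposes `number`, `width`, and `height` attributes, which this page does not confirm. `page_sizes` is a hypothetical helper, and the stand-in object uses placeholder values, not output from sample.pdf.

```python
from types import SimpleNamespace


def page_sizes(info):
    """Return a (number, width, height) tuple for every page in info.pages."""
    # Assumption: each page exposes number, width and height attributes.
    return [(page.number, page.width, page.height) for page in info.pages]


# Stand-in result with placeholder values (illustration only). Real code
# would pass the object returned by viewer.get_view_info(...).
demo_info = SimpleNamespace(
    pages=[
        SimpleNamespace(number=1, width=612, height=792),
        SimpleNamespace(number=2, width=612, height=792),
    ]
)

for number, width, height in page_sizes(demo_info):
    print(f"Page {number}: {width}x{height}")
```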

Example 2: Get PDF-Specific Information

For PDF documents, the returned PdfViewInfo also reports whether printing is allowed:

from typing import cast
from groupdocs.viewer import Viewer
from groupdocs.viewer.options import ViewInfoOptions
from groupdocs.viewer.results import PdfViewInfo

def get_pdf_info():
    with Viewer("sample.pdf") as viewer:
        info = viewer.get_view_info(ViewInfoOptions.for_html_view())
        pdf_info = cast(PdfViewInfo, info)

        print("File type:", pdf_info.file_type)
        print("Pages count:", len(pdf_info.pages))
        print("Printing allowed:", pdf_info.printing_allowed)

if __name__ == "__main__":
    get_pdf_info()

sample.pdf is the sample file used in this example. The code prints the following output:

File type: Portable Document Format File (.pdf)
Pages count: 2
Printing allowed: True


Example 3: Get Archive Folder Structure

For archives (ZIP / RAR / 7Z), ArchiveViewInfo.folders returns the top-level folder list:

from typing import cast
from groupdocs.viewer import Viewer
from groupdocs.viewer.options import ViewInfoOptions
from groupdocs.viewer.results import ArchiveViewInfo

def get_archive_info():
    with Viewer("documents.zip") as viewer:
        info = viewer.get_view_info(ViewInfoOptions.for_html_view())
        archive_info = cast(ArchiveViewInfo, info)

        print("File type:", archive_info.file_type)
        print("Pages count:", len(archive_info.pages))
        print("Folders:")
        for folder in archive_info.folders:
            print(f"  {folder}")

if __name__ == "__main__":
    get_archive_info()

documents.zip is the sample file used in this example. The code prints the following output:

File type: Zipped File (.zip)
Pages count: 1
Folders:
  /first_folder/
  /second_folder/


Example 4: Get CAD Layers and Layouts

For CAD drawings (DWG / DXF / DGN), CadViewInfo exposes the drawing’s layers and layouts:

from typing import cast
from groupdocs.viewer import Viewer
from groupdocs.viewer.options import ViewInfoOptions
from groupdocs.viewer.results import CadViewInfo

def get_cad_info():
    with Viewer("sample.dwg") as viewer:
        info = viewer.get_view_info(ViewInfoOptions.for_html_view())
        cad_info = cast(CadViewInfo, info)

        print("File type:", cad_info.file_type)
        print("Pages count:", len(cad_info.pages))
        print("Layers:")
        for layer in cad_info.layers:
            print(f"  {layer.name} (visible={layer.visible})")
        print("Layouts:")
        for layout in cad_info.layouts:
            print(f"  {layout.name} ({layout.width}x{layout.height})")

if __name__ == "__main__":
    get_cad_info()

sample.dwg is the sample file used in this example. The code prints the following output:

File type: AutoCAD Drawing Database File (.dwg)
Pages count: 1
Layers:
  0 (visible=True)
  CIRCLE (visible=True)
  TRIANGLE (visible=True)
  QUADRANT (visible=True)
Layouts:
  Model (26x23)

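The layer list can be filtered before further processing, for example to keep only visible layers. A minimal sketch, relying only on the `name` and `visible` attributes used in the example above. `visible_layer_names` is a hypothetical helper, and the stand-in data (including the hidden layer) is invented for illustration.

```python
from types import SimpleNamespace


def visible_layer_names(cad_info):
    """Return the names of all layers whose visible flag is set."""
    return [layer.name for layer in cad_info.layers if layer.visible]


# Stand-in for a CadViewInfo result (illustration only). The hidden layer
# is made up to show the filter; it is not part of sample.dwg.
demo_cad_info = SimpleNamespace(
    layers=[
        SimpleNamespace(name="0", visible=True),
        SimpleNamespace(name="CIRCLE", visible=True),
        SimpleNamespace(name="HIDDEN_EXAMPLE", visible=False),
    ]
)

print(visible_layer_names(demo_cad_info))  # ['0', 'CIRCLE']
```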

Example 5: Get Outlook Data File Folders

For Outlook data files (PST / OST), OutlookViewInfo.folders exposes the mailbox folder list:

from typing import cast
from groupdocs.viewer import Viewer
from groupdocs.viewer.options import ViewInfoOptions
from groupdocs.viewer.results import OutlookViewInfo

def get_outlook_info():
    with Viewer("sample.pst") as viewer:
        info = viewer.get_view_info(ViewInfoOptions.for_html_view())
        outlook_info = cast(OutlookViewInfo, info)

        print("File type:", outlook_info.file_type)
        print("Pages count:", len(outlook_info.pages))
        print("Folders:")
        for folder in outlook_info.folders:
            print(f"  {folder}")

if __name__ == "__main__":
    get_outlook_info()

sample.pst is the sample file used in this example. The code prints the following output:

File type: Outlook Personal Information Store File (.pst)
Pages count: 1
Folders:
  Inbox
  Deleted Items
  Outbox
  Sent Items
  Calendar
  Contacts
  Drafts
[TRUNCATED]


Example 6: Get Microsoft Project Dates

For Microsoft Project files (MPP / MPT), ProjectManagementViewInfo exposes the project’s start_date and end_date:

from typing import cast
from groupdocs.viewer import Viewer
from groupdocs.viewer.options import ViewInfoOptions
from groupdocs.viewer.results import ProjectManagementViewInfo

def get_project_info():
    with Viewer("sample.mpp") as viewer:
        info = viewer.get_view_info(ViewInfoOptions.for_html_view())
        project_info = cast(ProjectManagementViewInfo, info)

        print("File type:", project_info.file_type)
        print("Pages count:", len(project_info.pages))
        print("Start date:", project_info.start_date)
        print("End date:", project_info.end_date)

if __name__ == "__main__":
    get_project_info()

sample.mpp is the sample file used in this example. The code prints the following output:

File type: Microsoft Project File (.mpp)
Pages count: 1
Start date: 2008-06-01 00:00:00+00:00
End date: 2008-09-03 00:00:00+00:00

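The two dates can be combined to derive the project duration. The sketch below assumes that `start_date` and `end_date` are timezone-aware `datetime` objects, which the printed format suggests but this page does not state. `project_duration_days` is a hypothetical helper, and the values are copied from the sample output.

```python
from datetime import datetime, timezone


def project_duration_days(start_date: datetime, end_date: datetime) -> int:
    """Return the number of whole days between start_date and end_date."""
    return (end_date - start_date).days


# Values taken from the sample output for sample.mpp.
start = datetime(2008, 6, 1, tzinfo=timezone.utc)
end = datetime(2008, 9, 3, tzinfo=timezone.utc)

print("Duration (days):", project_duration_days(start, end))  # 94
```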
