Get Document Information
GroupDocs.Comparison for Python via .NET can return the following information about a document without performing a comparison:
- `file_type`: the document file type (PDF, Word, Excel, PowerPoint, image, etc.).
- `page_count`: the number of pages.
- `size`: the file size in bytes.
- `pages_info`: per-page information.
```python
from groupdocs.comparison import Comparer


def get_document_info():
    # Open the document by path; no target is added because nothing is compared
    with Comparer("./source.docx") as comparer:
        # Read metadata for the source document
        info = comparer.source.get_document_info()

        print(f"File type: {info.file_type.file_format}")
        print(f"Number of pages: {info.page_count}")
        print(f"Document size: {info.size} bytes")
        print("\nDocument info extracted successfully.")


if __name__ == "__main__":
    get_document_info()
```
`source.docx` is the sample document used in this example.
The example prints:

```text
File type: Microsoft Word Document
Number of pages: 1
Document size: 26611 bytes

Document info extracted successfully.
```
```python
from groupdocs.comparison import Comparer


def get_document_info_from_stream():
    # Open the file in binary mode and pass the stream instead of a path
    with open("./source.docx", "rb") as source_stream:
        with Comparer(source_stream) as comparer:
            # Read metadata for the source document
            info = comparer.source.get_document_info()

            print(f"File type: {info.file_type.file_format}")
            print(f"Number of pages: {info.page_count}")
            print(f"Document size: {info.size} bytes")


if __name__ == "__main__":
    get_document_info_from_stream()
```
This example reads the same `source.docx` sample document, this time from a stream.
The example prints:

```text
File type: Microsoft Word Document
Number of pages: 1
Document size: 26611 bytes
```
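When the values are needed for logging or a JSON response rather than for printing, the same attributes can be collected into a plain dictionary. The following is a minimal sketch, not part of the GroupDocs API: `summarize_document_info` and `format_size` are hypothetical helpers, and the code assumes only the `file_type.file_format`, `page_count`, and `size` attributes used in the examples above. A `SimpleNamespace` stands in for the real info object so the snippet runs without the library installed.

```python
from types import SimpleNamespace


def format_size(num_bytes: int) -> str:
    """Render a byte count in binary units (hypothetical helper)."""
    size = float(num_bytes)
    for unit in ("bytes", "KiB", "MiB", "GiB"):
        # Stop at the first unit that fits, or at the largest unit we know
        if size < 1024 or unit == "GiB":
            break
        size /= 1024
    return f"{int(size)} bytes" if unit == "bytes" else f"{size:.1f} {unit}"


def summarize_document_info(info) -> dict:
    """Copy the attributes shown in the examples into a plain dict.

    `info` is assumed to expose file_type.file_format, page_count and size,
    as the object returned by get_document_info() does in the examples.
    """
    return {
        "file_format": str(info.file_type.file_format),
        "page_count": int(info.page_count),
        "size_bytes": int(info.size),
        "size_human": format_size(int(info.size)),
    }


# Stand-in object that mirrors the sample output above (not a real API object)
sample_info = SimpleNamespace(
    file_type=SimpleNamespace(file_format="Microsoft Word Document"),
    page_count=1,
    size=26611,
)

summary = summarize_document_info(sample_info)
print(summary)
```

In real use, the object returned by `comparer.source.get_document_info()` would be passed in place of `sample_info`, inside the `with Comparer(...)` block.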
`Comparer.source.get_document_info()` is one of three ways to read information about a document without running a full comparison:
| What you need | Use |
|---|---|
| File type, size, page count of a specific document | This page |
| The list of every supported format at runtime | Get supported file formats |
| Visual thumbnails of selected pages | Generate document pages preview |
Pair `get_document_info()` with the supported-formats list to validate inputs early in a pipeline (for example, rejecting anything outside an allowlist), and use page previews to show end users what a document looks like before committing to a full comparison.
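The early-validation idea can be sketched as a small gate that runs before any comparison is started. Everything here is illustrative: `check_document`, the allowlist contents, and the page and size limits are hypothetical, and the only assumption about the library is that the info object exposes `file_type.file_format`, `page_count`, and `size` as in the examples on this page. A `SimpleNamespace` stands in for the real info object so the snippet runs on its own.

```python
from types import SimpleNamespace

# Hypothetical policy values; the format strings must match whatever
# file_type.file_format reports for the formats you want to accept.
ALLOWED_FORMATS = {"Microsoft Word Document"}
MAX_PAGES = 200
MAX_BYTES = 10 * 1024 * 1024  # 10 MiB


def check_document(info, allowed_formats=ALLOWED_FORMATS,
                   max_pages=MAX_PAGES, max_bytes=MAX_BYTES) -> list:
    """Return a list of rejection reasons; an empty list means 'accept'."""
    problems = []
    file_format = str(info.file_type.file_format)
    if file_format not in allowed_formats:
        problems.append(f"format not allowed: {file_format}")
    if info.page_count > max_pages:
        problems.append(f"too many pages: {info.page_count} > {max_pages}")
    if info.size > max_bytes:
        problems.append(f"file too large: {info.size} > {max_bytes} bytes")
    return problems


# Stand-ins for objects returned by get_document_info() (not real API objects)
accepted = SimpleNamespace(
    file_type=SimpleNamespace(file_format="Microsoft Word Document"),
    page_count=1,
    size=26611,
)
rejected = SimpleNamespace(
    file_type=SimpleNamespace(file_format="Some Other Format"),  # made-up name
    page_count=500,
    size=26611,
)

print(check_document(accepted))
print(check_document(rejected))
```

Returning every reason rather than stopping at the first one lets a pipeline report all problems with an upload in a single response.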
- Get supported file formats — enumerate supported types at runtime.
- Generate document pages preview — render page thumbnails.
- Specify file type for comparison manually — when the inferred type is wrong.