Load file from local disk

When a document is stored on the local disk, you can load it by passing its absolute or relative path to the Parser class constructor.

The following code snippet shows how to load a document from a local disk:

Load document by file path

from groupdocs.parser import Parser

# Specify the file path (absolute or relative)
file_path = "sample.docx"

# Create an instance of Parser class with the file path
with Parser(file_path) as parser:
    # Extract text from the document
    text_reader = parser.get_text()
    
    if text_reader is not None:
        # Print the extracted text
        print(text_reader)
    else:
        print("Text extraction isn't supported for this format")

The following sample file is used in this example: sample.docx

Load document with absolute path

import os

from groupdocs.parser import Parser

# Build an absolute file path from a relative one
file_path = os.path.abspath("sample.pdf")

# Create an instance of Parser class
with Parser(file_path) as parser:
    # Get document info
    doc_info = parser.get_document_info()
    
    print(f"File type: {doc_info.file_type.file_format}")
    print(f"Page count: {doc_info.page_count}")
    print(f"File size: {doc_info.size} bytes")

The following sample file is used in this example: sample.pdf
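
A relative path such as "sample.pdf" is resolved against the current working directory of the process, which may differ from the directory that contains your script. The sketch below shows one possible way to turn a relative path into an absolute one before handing it to Parser. The helper name resolve_document_path and its base_dir parameter are illustrative assumptions and are not part of GroupDocs.Parser.

```python
from pathlib import Path


def resolve_document_path(path, base_dir=None):
    """Return an absolute path string for `path`.

    Hypothetical helper, not part of GroupDocs.Parser. Relative paths are
    resolved against `base_dir` if given, otherwise against the current
    working directory. Absolute paths are only normalized.
    """
    candidate = Path(path)
    if not candidate.is_absolute():
        base = Path(base_dir) if base_dir is not None else Path.cwd()
        candidate = base / candidate
    # resolve() normalizes ".." segments and symlinks; by default it does
    # not require the file to exist.
    return str(candidate.resolve())
```

The returned string can be passed to the constructor in the same way as file_path in the examples above. Passing the script's own directory as base_dir makes the lookup independent of where the script is started from.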

Batch processing multiple files

from groupdocs.parser import Parser
import os

def process_files_from_directory(directory_path):
    """Process all supported documents in a directory"""
    
    # Get all files in the directory
    for filename in os.listdir(directory_path):
        file_path = os.path.join(directory_path, filename)
        
        # Skip if not a file
        if not os.path.isfile(file_path):
            continue
        
        try:
            print(f"\nProcessing: {filename}")
            
            # Create parser instance
            with Parser(file_path) as parser:
                # Get document info
                doc_info = parser.get_document_info()
                print(f"  Type: {doc_info.file_type.file_format}")
                print(f"  Pages: {doc_info.page_count}")
                
                # Extract text
                text_reader = parser.get_text()
                if text_reader:
                    text = text_reader
                    print(f"  Characters: {len(text)}")
                    
        except Exception as e:
            print(f"  Error: {e}")
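
The loop above hands every file in the directory to Parser and relies on the exception handler to skip anything it cannot open. As a possible refinement, the sketch below filters candidates by file extension first. The function name iter_document_paths and the extension set are assumptions chosen for illustration; they do not reflect the library's actual list of supported formats.

```python
import os

# Illustrative extension set only; adjust it to the formats you actually process.
CANDIDATE_EXTENSIONS = {".pdf", ".docx", ".xlsx"}


def iter_document_paths(directory_path, extensions=CANDIDATE_EXTENSIONS):
    """Yield paths of regular files in directory_path whose extension matches.

    Hypothetical helper, not part of GroupDocs.Parser.
    """
    # sorted() gives a stable processing order; os.listdir order is arbitrary
    for filename in sorted(os.listdir(directory_path)):
        file_path = os.path.join(directory_path, filename)
        # Skip subdirectories and other non-file entries
        if not os.path.isfile(file_path):
            continue
        # Compare extensions case-insensitively, so "REPORT.PDF" also matches
        if os.path.splitext(filename)[1].lower() in extensions:
            yield file_path
```

Each yielded path can then be opened with Parser inside a try block, as in process_files_from_directory. Keeping the try block is still advisable, because a matching extension does not guarantee that the file can be parsed.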

Error handling

It’s good practice to handle potential errors when loading files:

from groupdocs.parser import Parser
import os

def safe_load_document(file_path):
    """Safely load a document with error handling"""
    
    # Check if file exists
    if not os.path.exists(file_path):
        print(f"Error: File not found - {file_path}")
        return None
    
    try:
        # Create parser instance
        parser = Parser(file_path)
        print(f"Document loaded successfully: {file_path}")
        return parser
        
    except Exception as e:
        print(f"Error loading document: {e}")
        return None

# Try to load a document
parser = safe_load_document("sample.pdf")

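if parser:
    # The with block releases the parser's resources when it exits
    with parser:
        try:
            # Extract data
            text_reader = parser.get_text()
            if text_reader:
                print(text_reader)
        except Exception as e:
            print(f"Error extracting text: {e}")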

The following sample file is used in this example: sample.pdf
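
Note that the os.path.exists check in safe_load_document also passes for directories. The sketch below shows one way to make the pre-check stricter before constructing a Parser. The function name describe_path_problem is hypothetical and the messages are illustrative.

```python
import os


def describe_path_problem(file_path):
    """Return a message explaining why file_path cannot be loaded, or None.

    Hypothetical helper, not part of GroupDocs.Parser.
    """
    if not os.path.exists(file_path):
        return f"File not found - {file_path}"
    # exists() is also True for directories, so check for a regular file
    if not os.path.isfile(file_path):
        return f"Not a regular file - {file_path}"
    # Check that the current process has read permission
    if not os.access(file_path, os.R_OK):
        return f"File is not readable - {file_path}"
    return None
```

A check like this only improves the error message. The file can still change or disappear between the check and the constructor call, so the try block around Parser remains necessary.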

More resources

GitHub examples

You may find more code examples in our GitHub repository.

Free online document parser

Along with the full-featured library, we provide a free online document parser app. You are welcome to extract data from PDF, DOCX, XLSX, and more with our Free Online Document Parser App.