How to Run Examples

Run examples using PyPI

To get started, make sure that Python (version 3.5 or higher) is installed.
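The version requirement can also be checked from Python itself. The snippet below is a minimal sketch: the helper name `python_version_ok` and the `MIN_VERSION` constant are illustrative and not part of GroupDocs.Parser.

```python
import sys

MIN_VERSION = (3, 5)  # minimum version stated on this page


def python_version_ok(version_info=None, minimum=MIN_VERSION):
    """Return True if the given (or current) interpreter meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the (major, minor) part of the version
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if python_version_ok():
        print("Python version is supported:", sys.version.split()[0])
    else:
        print("Python {}.{} or higher is required".format(*MIN_VERSION))
```

Running `python --version` in a terminal gives the same information without a script.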

Clone repository with examples:

git clone https://github.com/groupdocs-parser/GroupDocs.Parser-for-Python-via-.NET.git

Navigate to the project folder:

cd ./GroupDocs.Parser-for-Python-via-.NET

Install the necessary packages:
```
pip install groupdocs-parser-net
```
Run the examples:
```
python run_examples.py
```

To see which examples are available, open run_examples.py in a text editor. Uncomment the examples you want to run, then run python run_examples.py again.
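For orientation, a runner script of this kind is often just a list of function calls, some of them commented out. The sketch below is hypothetical: the function names and layout are assumptions for illustration, and the real run_examples.py in the repository may be organized differently.

```python
# Hypothetical runner layout; the actual run_examples.py may differ.

def extract_text_example():
    print("running: extract text")


def extract_metadata_example():
    print("running: extract metadata")


def extract_images_example():
    print("running: extract images")


def run_all():
    """Run every enabled example and return the names of those that ran."""
    enabled = [
        extract_text_example,
        # extract_metadata_example,  # uncomment to enable
        # extract_images_example,    # uncomment to enable
    ]
    for example in enabled:
        example()
    return [example.__name__ for example in enabled]


if __name__ == "__main__":
    run_all()
```

With this layout, enabling an example means removing the leading `#` from its line and running the script again.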

Build project from scratch

If you prefer to create a project from scratch, follow these steps:

Step 1: Install GroupDocs.Parser

Install GroupDocs.Parser for Python via .NET using pip:

pip install groupdocs-parser-net
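After installing, it can be useful to confirm that the package is visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name `is_module_available` is illustrative, and `groupdocs.parser` is the module name imported by the examples on this page.

```python
import importlib.util


def is_module_available(name):
    """Return True if the module `name` can be located in this environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the module has no spec
        return False


if __name__ == "__main__":
    print("groupdocs.parser available:", is_module_available("groupdocs.parser"))
```

If this prints `False`, the package was most likely installed into a different Python environment than the one running the script.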

Step 2: Create a Python Script

Create a new Python file (e.g., example.py) and add the following code:

Python

from groupdocs.parser import Parser

def extract_text_from_document():
    # Create an instance of Parser class
    with Parser("./sample.docx") as parser:
        # Extract text from the document
        text_reader = parser.get_text()
        
        if text_reader is not None:
            # Print the extracted text
            extracted_text = text_reader
            print(extracted_text)
        else:
            print("Text extraction isn't supported for this format")

if __name__ == "__main__":
    extract_text_from_document()

The following sample file is used in this example: sample.docx

Step 3: Run the Script

Execute your Python script:

python example.py

The extracted text will appear in the console.
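If the script is run from a folder that does not contain sample.docx, the parser cannot open the file. One way to get a clearer error is to check the path first. This is a sketch under assumptions: the wrapper name `extract_text` is illustrative, and it uses only the `Parser` context manager and `get_text()` call shown in Step 2.

```python
import os


def extract_text(path):
    """Return the result of get_text() for the document at `path`."""
    if not os.path.isfile(path):
        raise FileNotFoundError("Input document not found: " + path)

    # Imported here so the path check runs before the package is needed
    from groupdocs.parser import Parser

    with Parser(path) as parser:
        # As in the Step 2 example, None means text extraction is unsupported
        return parser.get_text()


if __name__ == "__main__":
    try:
        print(extract_text("./sample.docx"))
    except FileNotFoundError as error:
        print(error)
```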

Common Examples

Extract Text from PDF

Python

from groupdocs.parser import Parser

def extract_text_from_pdf():
    with Parser("./sample.pdf") as parser:
        text_reader = parser.get_text()
        if text_reader:
            print(text_reader)

if __name__ == "__main__":
    extract_text_from_pdf()

The following sample file is used in this example: sample.pdf

Extract Metadata

Python

from groupdocs.parser import Parser

def extract_metadata_example():
    with Parser("./sample.docx") as parser:
        metadata = parser.get_metadata()
        if metadata:
            for item in metadata:
                print(f"{item.name}: {item.value}")

if __name__ == "__main__":
    extract_metadata_example()

The following sample file is used in this example: sample.docx

Extract Images

Python

from groupdocs.parser import Parser

def extract_images_example():
    with Parser("./sample.pdf") as parser:
        images = parser.get_images()
        if images:
            for i, image in enumerate(images):
                with open(f"image_{i}.png", "wb") as file:
                    image_stream = image.get_image_stream()
                    file.write(image_stream.read())

if __name__ == "__main__":
    extract_images_example()

The following sample file is used in this example: sample.pdf

Contribute

If you would like to add or improve an example, we encourage you to contribute to the project. All examples in this repository are open source and can be freely used in your own applications.

To contribute, fork the repository, edit the source code, and create a pull request. We will review the changes and include them in the repository if we find them helpful.
