# GroupDocs.Editor for Python via .NET - Complete Documentation

> Native Python library that loads Word, Excel, PowerPoint, PDF, email, eBook, and text documents, converts them to editable HTML/CSS, and saves them back (or to another format) on Windows, Linux, and macOS. No Microsoft Office or OpenOffice required.

---

## Adding class name to input controls

Path: /editor/python-net/adding-class-name-to-input-controls/

Almost all formats within the WordProcessing format family, such as DOC(X/M) and ODT, support input controls of different kinds. WordProcessing documents can contain buttons, text boxes, check boxes, combo boxes, input fields, drop-down lists, radio buttons, date/time pickers and much more, which are internally represented as Structured Document Tag (SDT) entities or Fields ("Insert > Quick Parts > Document Property/Field"). **[GroupDocs.Editor](https://products.groupdocs.com/editor/python-net)** supports all of these entities and preserves them while converting the document to an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance. When an HTML document is then generated from the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) for editing in a WYSIWYG HTML editor, these input controls are translated into the most appropriate HTML structures and elements. If the input document contains not only input controls but also user-entered data, this data is preserved as well and is present in the output HTML document.

In some use cases the end user does not need to edit the entire document content, but only to edit and/or gather the data entered into the input controls. In that case the input controls must be identifiable in some way, so that client-side code can fetch them and distinguish them from all other HTML elements.
For this purpose, GroupDocs.Editor can set a unique, user-provided CSS class name on all such input controls in the HTML markup. The [WordProcessingEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) class contains an [`input_controls_class_name`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions/) property for this. By default it is `None`, so class names are not applied to the HTML elements. However, if the user sets a valid class name, all input control elements (such as INPUT, BUTTON, SELECT) receive a `class` HTML attribute with the specified class name. For example:

```python
from groupdocs.editor.options import WordProcessingEditOptions

edit_options = WordProcessingEditOptions()
edit_options.input_controls_class_name = "editable-field"
```

Once the `class` attribute with the specified class name is applied to all HTML elements that represent input controls, client code can work with them, for example by traversing the HTML DOM and gathering and/or manipulating the data.

## Complete example

The code example below demonstrates editing the sample DOCX document twice: the first time with default `WordProcessingEditOptions`, where no custom class name is specified, and the second time with a custom value in the `input_controls_class_name` property. The example then reports how many times the custom class name occurs in each version of the HTML: none without it, and several once it is applied.
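Counting substring occurrences is enough for a quick check, but code that actually gathers the controls usually needs the elements themselves. The sketch below is one possible way to do that with Python's standard `html.parser` module; `InputControlCollector` and `collect_input_controls` are hypothetical helper names, not part of GroupDocs.Editor, and the HTML string is assumed to come from a call such as `get_embedded_html()`.

```python
from html.parser import HTMLParser


class InputControlCollector(HTMLParser):
    """Collects every element whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.controls = []  # list of (tag, attributes) tuples

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # The class attribute may hold several space-separated names (or be absent)
        classes = (attributes.get("class") or "").split()
        if self.class_name in classes:
            self.controls.append((tag, attributes))


def collect_input_controls(html, class_name):
    """Return (tag, attributes) pairs for all elements marked with class_name."""
    collector = InputControlCollector(class_name)
    collector.feed(html)
    collector.close()
    return collector.controls
```

Matching against the split class list rather than the raw attribute string avoids false positives when the class name also happens to appear in text content or inside another, longer class name.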
{{< tabs "code-example-adding-class-name-to-input-controls">}} {{< tab "add_class_name_to_input_controls.py" >}} ```python import os from groupdocs.editor import Editor, License from groupdocs.editor.options import WordProcessingEditOptions def add_class_name_to_input_controls(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Default edit options, where no custom class name is applied options_without_class_name = WordProcessingEditOptions() # Custom edit options, where a class name is applied to input controls options_with_class_name = WordProcessingEditOptions() options_with_class_name.input_controls_class_name = "editable-field" with Editor("./sample-document.docx") as editor: # Edit the document twice with both option sets html_without_class_name = editor.edit(options_without_class_name).get_embedded_html() html_with_class_name = editor.edit(options_with_class_name).get_embedded_html() # The custom class name is applied to every input-control element in the HTML print("Occurrences of 'editable-field' without custom class name:", html_without_class_name.count("editable-field")) print("Occurrences of 'editable-field' with custom class name:", html_with_class_name.count("editable-field")) if __name__ == "__main__": add_class_name_to_input_controls() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/adding-class-name-to-input-controls/sample-document.docx) to download it. 
{{< /tab-text >}} {{< /tab >}} {{< tab "add-class-name-to-input-controls.txt" >}} ```text Occurrences of 'editable-field' without custom class name: 0 Occurrences of 'editable-field' with custom class name: 7 ``` [Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/adding-class-name-to-input-controls/add_class_name_to_input_controls/add-class-name-to-input-controls.txt) {{< /tab >}} {{< /tabs >}} When the `class` attribute with the specified class name is applied to all HTML elements that represent input controls, the client code is able to work with them — for example, by traversing the HTML DOM and gathering or manipulating the data. --- ## Features Overview Path: /editor/python-net/features-overview/ GroupDocs.Editor for Python via .NET allows you to edit files and documents across a wide range of [supported document types]({{< ref "editor/python-net/getting-started/supported-document-formats.md" >}}). Below is a short list of possible actions: ## Document editing The main feature of GroupDocs.Editor is an ability to edit most popular document formats using front-end WYSIWYG editors without any additional applications. No Open Office or MS Office is required to edit Word Processing documents, Spreadsheets or Presentations. You can just load a document via GroupDocs.Editor into any WYSIWYG editor, edit the document the way you want, and save it back to the original document format. ## Editing options and output customizations GroupDocs.Editor provides a set of options to customize the editing process depending on the document type: * Word Processing documents - ability to edit a document in flow or paged mode; consider language information for multi-language document editing; manage font extraction to provide the same document editing and appearance behaviour in different environments. * Spreadsheets - supports multi-tabbed spreadsheet editing by allowing you to specify the index of the currently edited worksheet. 
* Comma-Separated Values and Tab-Separated Values - options to specify the separator; flexible numeric and date conversion; memory usage optimization for large files. * XML files - fix incorrect document structure; URI and e-mail address recognition; highlight and formatting options, etc. ## Document information extraction GroupDocs.Editor provides an ability to extract basic information about an edited document: * Document type; * Document size; * Pages count; * etc. --- ## Get HTML markup in different forms Path: /editor/python-net/get-html-markup-in-different-forms/ > This demonstration shows how to open an input document, convert it to an intermediate [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), and get HTML markup in different forms depending on client requirements. ## Preparations When an input document is loaded into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class and opened for editing by transforming it to the intermediate [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class, it is possible to generate and get HTML markup in different forms. First of all the user needs to load the document into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class and open it for editing, which is demonstrated in the code below. 
```python from groupdocs.editor import Editor from groupdocs.editor.options import WordProcessingLoadOptions load_options = WordProcessingLoadOptions() editor = Editor("document.docx", load_options) # passing path and load options to the constructor document = editor.edit() # opening the document for editing ``` The piece of code above prepares a ready-to-use instance of the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class, that contains the original document in its own intermediate format and is able to generate HTML markup in different forms. ## Getting the whole HTML content The most default and standard method for generating HTML markup is the parameterless [`get_content()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getcontent) method: ```python html_content = document.get_content() ``` If the document has external resources (stylesheets, fonts, images), they are referenced via different HTML elements: stylesheets are specified through `LINK` elements, while images — through `IMG`. When using the [`get_content()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getcontent) method, such external resources will be referenced by external links. For example: ```html ``` ## Getting HTML BODY content A lot of HTML WYSIWYG editors are not able to process the whole HTML document, with a `HEAD` section and so on. They are only able to process the inner content of the HTML->BODY element. 
In order to obtain such part of the HTML markup, the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class contains the [`get_body_content()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getbodycontent) method: ```python body_content = document.get_body_content() ``` It is also possible to pass an external images template, which is added to every URL in the `src` attribute of every `IMG` tag found inside the HTML->BODY markup: ```python external_images_template = "http://www.mywebsite.com/images/id=" prefixed_body_content = document.get_body_content(external_images_template) ``` ## Getting the stylesheet content The [`get_css_content()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getcsscontent) method returns the CSS stylesheet(s) of the document. It can be called without arguments, or with prefixes for external images and fonts referenced from the stylesheets: ```python css_content = document.get_css_content() ``` ## Getting base64-encoded content Sometimes it is necessary to obtain all the content of the whole document with all used resources in a single string. GroupDocs.Editor allows to do this with the [`get_embedded_html()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getembeddedhtml) method: ```python embedded_html_content = document.get_embedded_html() ``` In such a string all stylesheets will be placed into `STYLE` elements in the HTML->HEAD section, all images in `IMG` elements will be serialized with base64 encoding and placed directly in the `src` attributes. All fonts and images, which are used in the stylesheets, will also be serialized and stored in the appropriate locations. Such a string is fully autonomous and self-sufficient. 
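When the markup is destined for a client-side editor, the BODY content and the stylesheets are often shipped together. The sketch below shows one possible way to bundle them into JSON. `build_editor_payload` is a hypothetical helper, not part of GroupDocs.Editor; it only calls `get_body_content()` and `get_css_content()` as described above, and it assumes that `get_css_content()` returns an iterable of stylesheet strings.

```python
import json


def build_editor_payload(document, images_prefix=None):
    """Bundle BODY markup and stylesheets into a JSON string for a client-side editor.

    `document` is expected to be an EditableDocument (or any object exposing
    get_body_content() and get_css_content()).
    """
    if images_prefix is None:
        body = document.get_body_content()
    else:
        # The prefix is added to every image URL inside the BODY markup
        body = document.get_body_content(images_prefix)

    # Assumption: get_css_content() yields one string per stylesheet
    stylesheets = [str(sheet) for sheet in document.get_css_content()]
    return json.dumps({"body": body, "css": stylesheets})
```

Keeping the stylesheets separate from the BODY markup lets the client decide how to apply them, which suits editors that accept only the inner BODY content.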
## Complete code example The example below loads a document, opens it for editing, and prints the lengths of the HTML markup obtained in different forms. {{< tabs "code-example-get-html-markup-in-different-forms">}} {{< tab "get_html_markup_in_different_forms.py" >}} ```python import os from groupdocs.editor import Editor, License def get_html_markup_in_different_forms(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) with Editor("./sample-document.docx") as editor: document = editor.edit() # Generate the HTML markup in different forms and inspect their sizes print("Whole content length:", len(document.get_content())) print("Body content length:", len(document.get_body_content())) print("Embedded (base64) content length:", len(document.get_embedded_html())) # get_css_content() returns the stylesheet(s) of the document css = document.get_css_content() print("CSS stylesheets count:", len(css)) document.dispose() if __name__ == "__main__": get_html_markup_in_different_forms() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/editabledocument/get-html-markup-in-different-forms/sample-document.docx) to download it. 
{{< /tab-text >}}
{{< /tab >}}
{{< tab "get-html-markup-in-different-forms.txt" >}}
```text
Whole content length: 31654
Body content length: 31461
Embedded (base64) content length: 95550
CSS stylesheets count: 1
```

[Download full output](/editor/python-net/_output_files/developer-guide/editabledocument/get-html-markup-in-different-forms/get_html_markup_in_different_forms/get-html-markup-in-different-forms.txt)
{{< /tab >}}
{{< /tabs >}}

---

## Inserting edited worksheet into existing spreadsheet

Path: /editor/python-net/inserting-edited-worksheet-into-existing-spreadsheet/

Since the Spreadsheet module was first released, the default spreadsheet editing pipeline has been the following:

1. Load the spreadsheet, as a file or a stream, into the constructor of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class.
2. Select the worksheet to edit by specifying its index in the [worksheet_index](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheeteditoptions/worksheetindex) property of the [SpreadsheetEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheeteditoptions) class.
3. Open the spreadsheet for editing by calling the [editor.edit()](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method with the [SpreadsheetEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheeteditoptions) instance that selects the worksheet, and obtain an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance from it.
4. Emit the HTML and CSS markup that represents the content of the edited worksheet from the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, using methods such as [get_content()](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getcontent).
5. Pass this HTML and CSS markup to the WYSIWYG editor, which runs in the browser on the client side.
6. The end user edits the document content in the WYSIWYG editor.
7. The edited content, in the form of HTML and CSS markup, is passed back to the server side.
8. Create a new instance of the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class by passing the edited content, received on the server side, to the [from_file](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/fromfile), [from_markup](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkup), or [from_markup_and_resource_folder](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkupandresourcefolder/) static method (depending on the form of the content).
9. Create an instance of the [SpreadsheetSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) class with the format of the output spreadsheet file.
10. Call the [editor.save](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method, passing it the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), the output stream or file path for the document, and the [SpreadsheetSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) instance.

This generates a new spreadsheet that contains a single worksheet: the one that was edited on the client side and whose content was passed into this method via the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class.

The last, 10th step can be altered: the edited worksheet can be inserted into the original spreadsheet that was loaded in the 1st step. The [SpreadsheetSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) class contains two properties for this: the integer [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) and the boolean flag [insert_as_new_worksheet](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/insertasnewworksheet/). Both have conventional default values: `worksheet_number` is `0` and `insert_as_new_worksheet` is `False`.

```python
save_options.worksheet_number = 0
save_options.insert_as_new_worksheet = False
```

By default, when these properties are not touched, or at least [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) keeps its default value `0`, GroupDocs.Editor generates a new single-worksheet spreadsheet, as before. However, if the [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) property contains a non-zero number and a valid spreadsheet is loaded into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class (normally the original spreadsheet that was edited, but it can be any spreadsheet, even one unrelated to the original), then the edited worksheet is **inserted** into that spreadsheet.
## worksheet_number property

The [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) property, if it is not zero, defines the exact position in the given spreadsheet at which the edited worksheet is inserted. The boolean [insert_as_new_worksheet](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/insertasnewworksheet/) flag determines how the worksheet is inserted: it either replaces the existing worksheet at the specified position (`False`, the default), or it is injected between the existing worksheets without overwriting them (`True`), which increases the total number of worksheets in the spreadsheet by one.

Because the default `0` value of [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) is reserved, actual worksheet numbering starts from `1`. This differs from the widely used 0-based indexing, but it makes sense because the value is not an index of a worksheet but its number, the same as MS Excel uses, for instance. For example, in a spreadsheet that consists of 5 worksheets, the 1st one has worksheet number `1` and the 5th has `5`. If the specified worksheet number exceeds the total number of worksheets in the spreadsheet, it is automatically adjusted to the last one: if the user specifies `6`, `7`, or even a very large value for the same 5-worksheet spreadsheet, it is internally set to `5`, the number of the last worksheet.

Along with positive worksheet numbers, the [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) property also supports negative numbering, which counts from the end of the spreadsheet. In this case, `-1` is treated as the last worksheet, `-2` as the last but one, and so on. As with positive numbering, a number whose magnitude exceeds the number of worksheets is adjusted to the closest valid one, which for negative numbers is the first worksheet. The code below illustrates this numbering system:

```python
from groupdocs.editor.options import SpreadsheetSaveOptions
from groupdocs.editor.formats import SpreadsheetFormats

save_options = SpreadsheetSaveOptions(SpreadsheetFormats.XLSX)
# let's say we have a spreadsheet with 5 worksheets
save_options.worksheet_number = 0  # default value: the given spreadsheet is ignored and a new one is created

# positive numbering
save_options.worksheet_number = 1  # first worksheet
save_options.worksheet_number = 2  # second worksheet
save_options.worksheet_number = 3  # third worksheet
save_options.worksheet_number = 4  # fourth worksheet
save_options.worksheet_number = 5  # fifth worksheet
save_options.worksheet_number = 6  # fifth worksheet, because value '6' exceeds the worksheet count '5' and is adjusted to the closest

# negative numbering
save_options.worksheet_number = -1  # fifth worksheet, which is first from the end (last)
save_options.worksheet_number = -2  # fourth worksheet, which is second from the end (last but one)
save_options.worksheet_number = -3  # third worksheet, which is third from the end
save_options.worksheet_number = -4  # second worksheet, which is fourth from the end
save_options.worksheet_number = -5  # first worksheet, which is fifth from the end
save_options.worksheet_number = -6  # first worksheet, because value '-6' exceeds the worksheet count '5' and is adjusted to the closest
```

## insert_as_new_worksheet property

The [insert_as_new_worksheet](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/insertasnewworksheet/) property complements the previously described [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) property and is ignored when [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) is set to `0`. If [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) points to a specific worksheet, [insert_as_new_worksheet](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/insertasnewworksheet/) determines how to treat this worksheet number and how to insert the worksheet:

* If the [insert_as_new_worksheet](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/insertasnewworksheet/) property has its default `False` value, the existing worksheet in the given spreadsheet is completely erased, and the content of the new worksheet (held in the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance) is put in its place. As a result, the spreadsheet keeps the same number of worksheets, but one of them (specified by [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber)) is replaced with the new one.
* If the [insert_as_new_worksheet](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/insertasnewworksheet/) property has the `True` value, the edited worksheet, obtained from the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, is injected among the existing worksheets in the given spreadsheet, so its number of worksheets is incremented by one.
New worksheet is inserted at position, specified by [worksheet_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber), and all subsequent worksheets (following or preceding) will be shifted to the end or to the beginning accordingly, depending on positive or negative numbering. Source code below shows, how worksheet number is treated when [insert_as_new_worksheet](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/insertasnewworksheet/) property is enabled: ```python save_options = SpreadsheetSaveOptions(SpreadsheetFormats.XLSX) # let's say we have a spreadsheet with 5 worksheets save_options.worksheet_number = 0 # default value, given spreadsheet will be ignored, as well as insert_as_new_worksheet save_options.insert_as_new_worksheet = True # enabling the adding of worksheet instead of replacing # positive numbering save_options.worksheet_number = 1 # new worksheet is injected as first, while all following (including 'old' 1st) are shifting to the end save_options.worksheet_number = 2 # new worksheet is injected as second, while 2nd, 3rd, 4th and 5th are shifting to the end save_options.worksheet_number = 3 # new worksheet is injected as third, while 3rd, 4th and 5th are shifting to the end save_options.worksheet_number = 4 # new worksheet is injected as fourth, while 4th and 5th are shifting to the end save_options.worksheet_number = 5 # new worksheet is injected as fifth, while 5th is shifting to the end and becomes 6th save_options.worksheet_number = 6 # new worksheet is injected as sixth, it already becomes the latest, none of existing worksheets are shifting to the end save_options.worksheet_number = 7 # same as previous # negative numbering save_options.worksheet_number = -1 # new worksheet is injected as first from end (it becomes sixth if starting from beginning), none of existing worksheets are shifting to the end 
save_options.worksheet_number = -2 # new worksheet is injected as second from end (it becomes fifth if starting from beginning), following single worksheet is shifting to the end save_options.worksheet_number = -3 # new worksheet is injected as third from end (it becomes fourth if starting from beginning), two following worksheets are shifting to the end save_options.worksheet_number = -4 # new worksheet is injected as fourth from end (it becomes third if starting from beginning), three following worksheets are shifting to the end save_options.worksheet_number = -5 # new worksheet is injected as fifth from end (it becomes second if starting from beginning), four following worksheets are shifting to the end save_options.worksheet_number = -6 # new worksheet is injected as sixth from end (it becomes first if starting from beginning), five following worksheets are shifting to the end save_options.worksheet_number = -7 # same as previous ``` The complete example below loads a spreadsheet, edits its first worksheet, and saves the result by inserting the edited worksheet into the original spreadsheet as a new worksheet, keeping the original worksheets intact. 
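Before the full example, the numbering rules from both listings above can be condensed into a few lines of plain Python. This is only a model of the documented behaviour, meant for reasoning about where a worksheet ends up; it is not the library's implementation, and `resolve_worksheet_position` is a hypothetical helper that does not exist in the GroupDocs.Editor API.

```python
def resolve_worksheet_position(worksheet_number, worksheet_count, insert_as_new_worksheet=False):
    """Model of the documented numbering rules.

    Returns the 1-based position (counted from the beginning) that the edited
    worksheet occupies in the output spreadsheet, or None when worksheet_number
    is 0 and a new single-worksheet spreadsheet is generated instead.
    """
    if worksheet_count < 1:
        raise ValueError("the target spreadsheet must contain at least one worksheet")
    if worksheet_number == 0:
        return None

    # When inserting, the output has one more worksheet, so one more slot is valid
    last_position = worksheet_count + 1 if insert_as_new_worksheet else worksheet_count

    if worksheet_number > 0:
        # Positive numbers count from the beginning and are clamped to the last slot
        return min(worksheet_number, last_position)

    # Negative numbers count from the end and are clamped to the first slot
    return max(last_position + 1 + worksheet_number, 1)


# A 5-worksheet spreadsheet, as in the listings above
print(resolve_worksheet_position(6, 5))          # 5: replaces the last worksheet
print(resolve_worksheet_position(-6, 5))         # 1: replaces the first worksheet
print(resolve_worksheet_position(-2, 5, True))   # 5: injected as second from the end
```

The only difference between the two modes in this model is the number of valid slots: replacing can target one of the existing 5 worksheets, while inserting can target any of the 6 positions in the enlarged spreadsheet.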
{{< tabs "code-example-inserting-edited-worksheet-into-existing-spreadsheet">}}
{{< tab "inserting_edited_worksheet_into_existing_spreadsheet.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.options import SpreadsheetLoadOptions, SpreadsheetEditOptions, SpreadsheetSaveOptions
from groupdocs.editor.formats import SpreadsheetFormats

def inserting_edited_worksheet_into_existing_spreadsheet():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load input spreadsheet to the Editor and specify loading options
    with Editor("./sample-spreadsheet.xlsx", SpreadsheetLoadOptions()) as editor:
        # Prepare edit options and set the 1st worksheet to edit
        edit_options = SpreadsheetEditOptions()
        edit_options.worksheet_index = 0  # index is 0-based, so this is the 1st worksheet

        # Generate EditableDocument with the original content of the worksheet
        worksheet_opened_for_edit = editor.edit(edit_options)

        # Get the HTML-markup from the EditableDocument with the original content
        original_html = worksheet_opened_for_edit.get_embedded_html()

        # Emulate HTML content editing in a WYSIWYG-editor in the browser or somewhere else
        # (illustrative markup: a paragraph is added just before the closing BODY tag)
        edited_html = original_html.replace("</body>", "<p>Edited content</p></body>")

        # Generate EditableDocument with the edited content
        worksheet_after_edit = EditableDocument.from_markup(edited_html)

        # Prepare save options that insert the edited worksheet into the original spreadsheet
        save_options = SpreadsheetSaveOptions(SpreadsheetFormats.XLSX)
        save_options.worksheet_number = 1  # 1-based; insert at the 1st position
        save_options.insert_as_new_worksheet = True  # keep the edited worksheet alongside the original ones

        # Save the spreadsheet with the inserted edited worksheet
        editor.save(worksheet_after_edit, "./edited-spreadsheet.xlsx", save_options)

        worksheet_opened_for_edit.dispose()
        worksheet_after_edit.dispose()

    print("Saved spreadsheet with the inserted edited worksheet to edited-spreadsheet.xlsx")

if __name__ == "__main__":
    inserting_edited_worksheet_into_existing_spreadsheet()
```
{{< /tab >}}
{{< tab "sample-spreadsheet.xlsx" >}}
{{< tab-text >}}
`sample-spreadsheet.xlsx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-excel/inserting-edited-worksheet-into-existing-spreadsheet/sample-spreadsheet.xlsx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edited-spreadsheet.xlsx" >}}
```text
Binary file (XLSX, 66 KB)
```

[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-excel/inserting-edited-worksheet-into-existing-spreadsheet/inserting_edited_worksheet_into_existing_spreadsheet/edited-spreadsheet.xlsx)
{{< /tab >}}
{{< /tabs >}}

### Additional notes

It is worth mentioning that the described feature doesn't modify the original spreadsheet document that was loaded into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class through its constructor. When saving a spreadsheet and inserting the edited worksheet into an existing spreadsheet, GroupDocs.Editor creates a full and exact copy of the original document, and only then adds the edited worksheet to the copy or replaces one of its worksheets with it.
So the original document is not touched in any case. From this point it is clear that the GroupDocs.Editor cannot insert edited worksheet into existing spreadsheet document, if this document is not available. For example, original spreadsheet document was loaded into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class, then opened for edit and stored somewhere for consequent editing. Then, in order to create an output spreadsheet from edited document, a new [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) instance was created from another document. In such case, if new loaded document is not a spreadsheet, the feature with inserting a worksheet will not work (because there is no source, into which the worksheet can be inserted). Also this means that for such scenario the "output" source spreadsheet may not be the same document as the "original" source spreadsheet. For example, it is absolutely legal and working scenario, when user initially loads a spreadsheet named "A" into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class, edits (let's say) 2nd worksheet from it, then creates a new instance of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class and loads another spreadsheet named "B" into it, and finally creates an output document from "B", where edited worksheet is injected on 5th position. [SpreadsheetSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) class contains properties for protecting a worksheet from editing: [password](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/password) and [worksheet_protection](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetprotection). 
When these options are applied together with inserting an edited worksheet into an existing spreadsheet, the worksheet protection is applied only to the edited worksheet that is being inserted; all other worksheets are left untouched.

```python
from groupdocs.editor.formats import SpreadsheetFormats
from groupdocs.editor.options import SpreadsheetSaveOptions

save_options = SpreadsheetSaveOptions(SpreadsheetFormats.XLSX)
save_options.worksheet_number = 1
save_options.insert_as_new_worksheet = True
save_options.password = "new password"  # the output document will be encrypted with this password
# worksheet_protection is assumed to be a worksheet protection object prepared earlier (not shown here)
save_options.worksheet_protection = worksheet_protection  # write-protection applied only to the inserted edited worksheet
```

---

## Product Overview

Path: /editor/python-net/product-overview/

GroupDocs.Editor for Python via .NET is a powerful and lightweight library which allows you to edit the most popular document formats using third-party front-end WYSIWYG HTML editors without any additional software. GroupDocs.Editor supports most of the popular document formats such as PDF, DOCX, XLSX, PPTX, XPS and others. By using GroupDocs.Editor for Python via .NET you can edit files of various formats, and no third-party applications are required.

## How GroupDocs.Editor Works?

GroupDocs.Editor is a Python library (distributed as a self-contained wheel) designed to let you load documents of different formats, open them for editing, edit them, and save the edited version to some format, which may be exactly the same as the input, or may differ. GroupDocs.Editor is GUI-less: it has no graphical interface, only a public API, so it is used within an end-user environment rather than as a standalone application.

Although GroupDocs.Editor can be used anywhere Python runs, it is intended to be used in web applications. The most common usage scenario implies that GroupDocs.Editor is used in a web application on the server side.
Server-based code invokes GroupDocs.Editor methods, passes input documents and parameters, and obtains results that are transmitted to the client side into a WYSIWYG HTML editor, like CKEditor or TinyMCE. When the user has edited the document in the browser and sends the edited content back to the server, the server-based code again invokes GroupDocs.Editor, passes the edited content, and obtains the edited document in the desired format.

## Benefits of using GroupDocs.Editor

Using GroupDocs.Editor for Python via .NET in your project gives you the following benefits:

- Rich set of file editing features;
- Platform independence;
- Independence from third-party applications;
- Performance and scalability;
- Simple public API.

### Rich set of file editing features

GroupDocs.Editor for Python via .NET main features are the following:

- Translate a document to HTML/CSS markup with resources, compatible with HTML WYSIWYG editors;
- Save edited HTML/CSS back to the source document format;
- Export an edited document to PDF format;
- Plenty of additional options to customize the editing process: edit password-protected documents, extract document fonts, export document language information (useful for spell-checkers), specify document encoding or write-protection during saving, and more;
- Document information extraction: page count, size, encrypted flag, and so on.

### Platform Independence

GroupDocs.Editor for Python via .NET covers most of the popular development environments and deployment platforms. Its API can be used to develop applications for a wide range of operating systems, such as Windows, Linux, and macOS. Read ["System Requirements"]({{< ref "editor/python-net/getting-started/system-requirements" >}}) for more details.

The package is a self-contained wheel that works across Python 3.5 – 3.14 on Windows x64/x86, Linux x64, and macOS x64/ARM64.
### Independence from Other Applications

GroupDocs.Editor does not require third-party applications, for example Microsoft Office, to be installed on the machine in order to work. All GroupDocs components are completely independent. This makes GroupDocs.Editor a great alternative to automation in terms of security, stability, scalability/speed, price, and features for working with documents and related tasks.

### Performance and Scalability

We do care about performance. GroupDocs.Editor is designed to process thousands of files while utilizing as few resources as possible. We do performance testing to make sure there are no performance degradations from version to version.

GroupDocs.Editor is a single self-contained wheel that can be deployed with any Python application by simply installing it via `pip`. You do not need to worry about any other services or modules.

### Simple Public API

GroupDocs.Editor for Python via .NET public API was designed to be simple and intuitive. The methods do what you would expect from them and nothing more.

## Pricing and Policies

Please visit the ["Licensing and Subscription"]({{< ref "editor/python-net/getting-started/licensing-and-subscription.md" >}}) page for information on licenses and review the ["Pricing Information"](https://purchase.groupdocs.com/pricing/editor/family) page for details on pricing.

## Technical Support

We provide free and paid support for all of our users, including evaluation. For more information on GroupDocs.Editor technical support, please check the ["Technical Support"]({{< ref "editor/python-net/technical-support" >}}) page.

---

## Document protection

Path: /editor/python-net/document-protection/

The [`password`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions/) property, if set, enables document protection from opening by encrypting it with the specified password.
However, almost all WordProcessing formats also support protecting a document from writing, which is completely different from protecting it from opening. Write protection, like document encryption, relies on a password as a form of key, but it additionally supports different levels of protection: some allow only read-only mode, others allow editing form fields, and so on.

[**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) allows applying document protection via the `protection` property of the [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class. By default this property has a `None` value, which means that GroupDocs.Editor will not apply protection to the document.

The `protection` property has the `WordProcessingProtection` type. This type is a class that, in turn, consists of two properties: a password and a protection type.

The password property sets the password for the document protection. By default it is `None`, so protection is not applied. To apply document protection, the user *needs* to assign some string to this property; otherwise protection will not be applied regardless of the value of the protection type property.

The protection type property determines the level of protection. By default it has a `NoProtection` value, which means that no protection will be applied. Other values are:

1. `AllowOnlyRevisions`: the user can only add revision marks to the document.
2. `AllowOnlyComments`: the user can only modify comments in the document.
3. `AllowOnlyFormFields`: the user can only enter data in the form fields in the document.
4. `ReadOnly`: no changes are allowed to the document.

Note that the password and the protection type are bound to each other. If the user sets a valid non-`None`, non-empty password, but the protection type property has a `NoProtection` value, the protection will not be applied.
And vice versa, if the protection type property has an `AllowOnlyFormFields` value, for example, but the password is `None` or an empty string, then the protection will not be applied either.

The snippet below illustrates how the `protection` property is assigned on the [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) instance before saving (the `WordProcessingProtection` and `WordProcessingProtectionType` types come from the `groupdocs.editor.options` module):

```python
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import (
    WordProcessingSaveOptions,
    WordProcessingProtection,
    WordProcessingProtectionType,
)

save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)

# Make the resultant document read-only and protect this restriction with a password
save_options.protection = WordProcessingProtection(
    WordProcessingProtectionType.READ_ONLY, "write_password")
```

## Complete example

Document protection (write protection) is separate from document encryption (protection from opening). The runnable example below shows the safe and most common scenario: loading the sample document, editing it, and saving the result encrypted with the [`password`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions/) property so that the output document can be opened only with this password.
{{< tabs "code-example-document-protection">}}
{{< tab "protect_document.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions


def protect_document():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        # Open the document for editing
        original = editor.edit()

        # Edit the content (here done programmatically)
        modified_content = original.get_embedded_html().replace(
            "Title of the document", "Title of the protected document")
        modified = EditableDocument.from_markup(modified_content)

        # Encrypt the output document with a password (protection from opening)
        save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)
        save_options.password = "p@ss"
        editor.save(modified, "./protected-document.docx", save_options)
        print("Saved a password-protected document")

        original.dispose()
        modified.dispose()


if __name__ == "__main__":
    protect_document()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/document-protection/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "protected-document.docx" >}}
```text
Binary file (DOCX, 53 KB)
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/document-protection/protect_document/protected-document.docx)
{{< /tab >}}
{{< /tabs >}}

---

## Generating worksheets (tabs) preview for spreadsheet

Path: /editor/python-net/generating-worksheets-preview-for-spreadsheet/

GroupDocs.Editor for Python via .NET allows generating a preview for any worksheet (a.k.a. _tab_) in the spreadsheet (a.k.a.
_workbook_) document in SVG format. With this feature end users can view and inspect the content of the spreadsheet without actually sending it for editing. The generated worksheet preview cannot be edited using GroupDocs.Editor itself, but it can be saved and then viewed in any desktop or online image viewer as well as in the browser (any modern browser supports viewing the SVG format).

This feature works regardless of the licensing mode of GroupDocs.Editor: it works the same in both trial and licensed mode, and there are no trial limitations for this feature. While generating worksheet previews, GroupDocs.Editor doesn't write off consumed bytes or credits.

Excel spreadsheets may have so-called _hidden worksheets_; GroupDocs.Editor generates an SVG preview for them too.

To generate worksheet previews for a particular spreadsheet document, the user must perform the following steps:

- Load the desired spreadsheet file into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/) class.
- Call the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo/) method and specify the password of the loaded spreadsheet if this spreadsheet is protected with a password.
- On the obtained [`SpreadsheetDocumentInfo`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/spreadsheetdocumentinfo/) object, invoke the `generate_preview(worksheet_index)` method and specify the zero-based _index_ (not to be confused with worksheet _numbers_, which are 1-based) of the desired worksheet. If the specified index is less than 0 or points beyond the last worksheet of the given spreadsheet, an exception will be thrown.
- The `generate_preview(worksheet_index)` method returns a worksheet preview as an SVG vector image that is encapsulated in the `SvgImage` class.
This class has all the necessary methods and properties to obtain the content of an SVG image in any desired form, save it to disk, write it to a stream, and so on.

The snippet below illustrates opening an unprotected spreadsheet file, obtaining the number of worksheets inside this spreadsheet, and then generating a preview for every worksheet in a loop. These previews are then saved to disk.

```python
import os
from groupdocs.editor import Editor

# Path to the input spreadsheet and the folder for the generated previews
input_path = "./sample-spreadsheet.xlsx"
output_folder = "./previews"
os.makedirs(output_folder, exist_ok=True)  # make sure the output folder exists

# Load spreadsheet file to the Editor constructor
with Editor(input_path) as editor:
    # Get document info for this file
    info_spreadsheet = editor.get_document_info()

    # Get the number of all worksheets
    worksheets_count = info_spreadsheet.page_count

    # Iterate through all worksheets and generate the preview on every iteration
    for worksheet_index in range(worksheets_count):
        # Generate one preview as an SVG image by worksheet index
        one_svg_preview = info_spreadsheet.generate_preview(worksheet_index)

        # Save the SVG preview to a file
        one_svg_preview.save(os.path.join(output_folder, one_svg_preview.filename_with_extension))
```

In essence, the worksheet preview feature is a method of the existing `SpreadsheetDocumentInfo` object that takes a worksheet index and returns an instance of the `SvgImage` class. If the end user needs a preview of the worksheet in a raster format rather than a vector one, the `SvgImage` class also provides a method to convert the SVG content to the PNG format.

The complete example below loads a spreadsheet, opens its first worksheet for editing, and prints the length of the generated HTML content. This roundtrip confirms that the spreadsheet is read correctly before any preview is generated.
{{< tabs "code-example-generating-worksheet-preview-for-spreadsheet">}}
{{< tab "generating_worksheet_preview_for_spreadsheet.py" >}}
```python
import os
from groupdocs.editor import Editor, License
from groupdocs.editor.options import SpreadsheetEditOptions


def generating_worksheet_preview_for_spreadsheet():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load the spreadsheet file to the Editor constructor
    with Editor("./sample-spreadsheet.xlsx") as editor:
        # Prepare edit options and select the 1st worksheet
        edit_options = SpreadsheetEditOptions()
        edit_options.worksheet_index = 0  # index is 0-based, so this is the 1st worksheet

        # Open the worksheet for editing
        worksheet = editor.edit(edit_options)

        # Obtain the HTML content of the worksheet
        content = worksheet.get_content()
        print("Worksheet HTML content length:", len(content))

        worksheet.dispose()


if __name__ == "__main__":
    generating_worksheet_preview_for_spreadsheet()
```
{{< /tab >}}
{{< tab "sample-spreadsheet.xlsx" >}}
{{< tab-text >}}
`sample-spreadsheet.xlsx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-excel/generating-worksheet-preview-for-spreadsheet/sample-spreadsheet.xlsx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "generating-worksheet-preview-spreadsheet.txt" >}}
```text
Worksheet HTML content length: 42948
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-excel/generating-worksheet-preview-for-spreadsheet/generating_worksheet_preview_for_spreadsheet/generating-worksheet-preview-spreadsheet.txt)
{{< /tab >}}
{{< /tabs >}}

---

## Inserting edited slide into existing presentation

Path: /editor/python-net/inserting-edited-slide-into-existing-presentation/

By default the full presentation editing pipeline (cycle) is as follows: 1.
Load the presentation, in the form of a file or stream, into the constructor of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class.
2. Select a specific slide to edit and specify its index in the [slide_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationeditoptions/slidenumber) property of the [PresentationEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationeditoptions) class.
3. Open the presentation for editing by calling the [editor.edit()](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method, passing the [PresentationEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationeditoptions) instance with the selected slide into it, and obtain an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance from this method.
4. Emit the HTML and CSS markup that represents the content of the edited document from the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, using methods like [get_content()](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getcontent).
5. Pass this HTML and CSS markup into the WYSIWYG editor, which is located on the client side and runs in the browser.
6. The end user edits the document content in the WYSIWYG editor.
7. The edited document content, in the form of HTML and CSS markup, is passed back to the server side.
8.
Then this content is passed into the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class by using the [from_file](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/fromfile), [from_markup](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkup), or [from_markup_and_resource_folder](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkupandresourcefolder/) static methods.
9. The created [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, which holds the edited document content, is passed into the [editor.save](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method.
10. The [editor.save](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method generates a new presentation that contains only a single slide: the slide that was edited on the client side and whose content was passed via the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class into this method.

The 10th step of the described pipeline can be altered: the edited slide can be inserted into the original presentation, which was loaded in the 1st step. The [PresentationSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions) class contains two properties: the integer [slide_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumber) and the boolean flag [insert_as_new_slide](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/insertasnewslide).
Both of them have "usual" default values: [slide_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumber) has `0` and [insert_as_new_slide](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/insertasnewslide) is set to `False`.

```python
save_options.slide_number = 0
save_options.insert_as_new_slide = False
```

By default, when these properties are not touched, or at least [slide_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumber) has its default value `0`, GroupDocs.Editor generates a new single-slide presentation, as before. However, if the [slide_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumber) property contains a number other than `0`, and a valid presentation is loaded into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class (it is expected to be the original presentation that was edited, but it can actually be any presentation, even one that has no relation to the original), then the edited slide will be **inserted** into the given presentation.

## slide_number property

The [slide_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumber) property, if it is not zero, defines *where*, at *which exact position* in the given presentation, the new edited slide should be inserted. The `insert_as_new_slide` property, which is a boolean flag, determines *how* this slide should be inserted: whether it should *replace* the existing slide located at the specified position (`False`, default value), or be *injected between* existing slides without overwriting them, thus increasing the total number of slides in the given presentation by one.
Because the default `0` value of the `slide_number` property is reserved, actual slide numbering starts from `1`. This differs from the widely used 0-based indexing, but it makes sense, because the value is not an *index* of a slide but rather the *number* of a slide, the same as MS PowerPoint uses, for instance. This means that, for example, in a presentation that consists of 5 slides, the 1st one has slide number `1`, and the 5th has `5`. If the user specifies a slide number that exceeds the total number of slides in the presentation, this number is automatically adjusted to the last slide. For example, if for the same 5-slide presentation the user specifies `6`, `7`, or even a very large value, it is internally set to `5`, the number of the last slide.

Along with positive slide numbers, the `slide_number` property also supports negative numbering, which counts from the end of the presentation. In this case, `-1` is treated as the last slide, `-2` as the last but one, and so on. Again, as with positive numbering, when the number exceeds the number of slides, it is adjusted to the closest valid one: the first slide for negative numbers.
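The numbering rule described above can be modelled in a few lines of plain Python. This is only an illustrative sketch of the documented behavior for the replace mode, not the library's implementation; the helper name `resolve_slide_number` is hypothetical and is not part of the GroupDocs.Editor API.

```python
from typing import Optional


def resolve_slide_number(slide_number: int, total_slides: int) -> Optional[int]:
    """Hypothetical helper: map a slide_number value to the 1-based number
    of the slide it refers to, following the rule described in the text.

    Returns None for 0, the reserved default meaning that the loaded
    presentation is ignored.
    """
    if total_slides < 1:
        raise ValueError("the presentation must contain at least one slide")
    if slide_number == 0:
        return None
    if slide_number > 0:
        # Positive numbers count from the beginning; too-large values
        # are adjusted to the last slide.
        return min(slide_number, total_slides)
    # Negative numbers count from the end (-1 is the last slide);
    # values that reach past the beginning are adjusted to the first slide.
    return max(total_slides + slide_number + 1, 1)


if __name__ == "__main__":
    for value in (0, 1, 5, 6, -1, -5, -6):
        print(value, "->", resolve_slide_number(value, 5))
```

For a 5-slide presentation this maps `6` to slide 5 and `-6` to slide 1, matching the adjustment rule above.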
The source code below illustrates this numbering system:

```python
from groupdocs.editor.formats import PresentationFormats
from groupdocs.editor.options import PresentationSaveOptions

save_options = PresentationSaveOptions(PresentationFormats.PPTX)

# let's say we have a presentation with 5 slides
save_options.slide_number = 0  # default value, given presentation will be ignored

# positive numbering
save_options.slide_number = 1  # first slide
save_options.slide_number = 2  # second slide
save_options.slide_number = 3  # third slide
save_options.slide_number = 4  # fourth slide
save_options.slide_number = 5  # fifth slide
save_options.slide_number = 6  # fifth slide, because value '6' exceeds the slides amount '5' and thus is adjusted to the closest

# negative numbering
save_options.slide_number = -1  # fifth slide, which is first from end (last)
save_options.slide_number = -2  # fourth slide, which is second from end (last but one)
save_options.slide_number = -3  # third slide, which is third from end
save_options.slide_number = -4  # second slide, which is fourth from end
save_options.slide_number = -5  # first slide, which is fifth from end
save_options.slide_number = -6  # first slide, because value '-6' exceeds the slides amount '5' and thus is adjusted to the closest
```

## insert_as_new_slide property

The [insert_as_new_slide](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/insertasnewslide) property complements the previously described [slide_number](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumber) property and is ignored when `slide_number` is set to `0`.
If `slide_number` points to a slide with a specific number, [insert_as_new_slide](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/insertasnewslide) determines how to treat this slide number and how to insert the slide:

* If the [insert_as_new_slide](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/insertasnewslide) property has the default `False` value, the existing slide in the given presentation is completely erased, and the content of the new slide (which is held in the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance) is put in its place. As a result, the presentation keeps the same number of slides, but one of its slides (specified by `slide_number`) is replaced with the new one.
* If the [insert_as_new_slide](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/insertasnewslide) property has the `True` value, the edited slide, obtained from the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, is injected among the existing slides in the given presentation, so its number of slides is incremented by one. The new slide is inserted at the position specified by `slide_number`, and all subsequent slides (following or preceding) are shifted to the end or to the beginning accordingly, depending on positive or negative numbering.
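To make the difference between the two modes concrete, the sketch below models a presentation as a plain Python list of slide labels. It is a simplified illustration of the behavior described above, not the library's code; the function `apply_edited_slide` and its parameters are hypothetical names introduced only for this example.

```python
from typing import List


def apply_edited_slide(slides: List[str], edited: str, slide_number: int,
                       insert_as_new_slide: bool = False) -> List[str]:
    """Hypothetical model of saving an edited slide into a presentation."""
    if slide_number == 0:
        # Default: the loaded presentation is ignored and a new
        # single-slide presentation is produced.
        return [edited]
    if not slides:
        raise ValueError("the presentation must contain at least one slide")

    result = list(slides)  # the original list is never modified
    total = len(result)

    if not insert_as_new_slide:
        # Replace mode: valid targets are slides 1..total.
        if slide_number > 0:
            position = min(slide_number, total)
        else:
            position = max(total + slide_number + 1, 1)
        result[position - 1] = edited
        return result

    # Insert mode: the new slide may land at positions 1..total+1.
    if slide_number > 0:
        position = min(slide_number, total + 1)
    else:
        position = max(total + slide_number + 2, 1)
    result.insert(position - 1, edited)
    return result


if __name__ == "__main__":
    deck = ["s1", "s2", "s3", "s4", "s5"]
    print(apply_edited_slide(deck, "E", 2))         # replace the 2nd slide
    print(apply_edited_slide(deck, "E", 2, True))   # insert as the 2nd slide
    print(apply_edited_slide(deck, "E", -1, True))  # insert as the last slide
```

In replace mode the slide count stays the same, while in insert mode it grows by one and out-of-range numbers are adjusted to the nearest end of the presentation.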
The source code below shows how the slide number is treated when the `insert_as_new_slide` property is enabled:

```python
from groupdocs.editor.formats import PresentationFormats
from groupdocs.editor.options import PresentationSaveOptions

save_options = PresentationSaveOptions(PresentationFormats.PPTX)

# let's say we have a presentation with 5 slides
save_options.slide_number = 0  # default value, given presentation will be ignored, as well as insert_as_new_slide
save_options.insert_as_new_slide = True  # enabling the adding of slide instead of replacing

# positive numbering
save_options.slide_number = 1  # new slide is injected as first, while all following (including 'old' 1st) are shifting to the end
save_options.slide_number = 2  # new slide is injected as second, while 2nd, 3rd, 4th and 5th are shifting to the end
save_options.slide_number = 3  # new slide is injected as third, while 3rd, 4th and 5th are shifting to the end
save_options.slide_number = 4  # new slide is injected as fourth, while 4th and 5th are shifting to the end
save_options.slide_number = 5  # new slide is injected as fifth, while 5th is shifting to the end and becomes 6th
save_options.slide_number = 6  # new slide is injected as sixth, it already becomes the latest, none of existing slides are shifting to the end
save_options.slide_number = 7  # same as previous

# negative numbering
save_options.slide_number = -1  # new slide is injected as first from end (it becomes sixth if starting from beginning), none of existing slides are shifting to the end
save_options.slide_number = -2  # new slide is injected as second from end (it becomes fifth if starting from beginning), following single slide is shifting to the end
save_options.slide_number = -3  # new slide is injected as third from end (it becomes fourth if starting from beginning), two following slides are shifting to the end
save_options.slide_number = -4  # new slide is injected as fourth from end (it becomes third if starting from beginning), three following slides are shifting to the end
save_options.slide_number = -5  # new slide is injected as fifth from end (it becomes second if starting from beginning), four following slides are shifting to the end
save_options.slide_number = -6  # new slide is injected as sixth from end (it becomes first if starting from beginning), five following slides are shifting to the end
save_options.slide_number = -7  # same as previous
```

The complete example below loads a presentation, edits its first slide, and saves the result by inserting the edited slide into the original presentation as a new slide, keeping the original slides intact.

{{< tabs "code-example-inserting-edited-slide-into-existing-presentation">}}
{{< tab "inserting_edited_slide_into_existing_presentation.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.options import PresentationLoadOptions, PresentationEditOptions, PresentationSaveOptions
from groupdocs.editor.formats import PresentationFormats


def inserting_edited_slide_into_existing_presentation():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load input presentation to the Editor and specify loading options
    with Editor("./sample-presentation.pptx", PresentationLoadOptions()) as editor:
        # Prepare edit options and set the 1st slide to edit
        edit_options = PresentationEditOptions()
        edit_options.slide_number = 0  # index is 0-based, so this is the 1st slide

        # Generate EditableDocument with the original content of the slide
        slide_opened_for_edit = editor.edit(edit_options)

        # Get the HTML-markup from the EditableDocument with the original content
        original_html = slide_opened_for_edit.get_embedded_html()

        # Emulate HTML content editing in a WYSIWYG-editor in the browser or somewhere else
        # (assumed markup: append a paragraph right before the closing </body> tag)
        edited_html = original_html.replace("</body>", "<p>Edited content</p></body>")

        # Generate EditableDocument with the edited content
        slide_after_edit = EditableDocument.from_markup(edited_html)

        # Prepare save options that insert the edited slide into the original presentation
        save_options = PresentationSaveOptions(PresentationFormats.PPTX)
        save_options.slide_number = 1  # 1-based; insert at the 1st position
        save_options.insert_as_new_slide = True  # keep the edited slide alongside the original ones

        # Save the presentation with the inserted edited slide
        editor.save(slide_after_edit, "./edited-presentation.pptx", save_options)

        slide_opened_for_edit.dispose()
        slide_after_edit.dispose()
        print("Saved presentation with the inserted edited slide to edited-presentation.pptx")


if __name__ == "__main__":
    inserting_edited_slide_into_existing_presentation()
```
{{< /tab >}}
{{< tab "sample-presentation.pptx" >}}
{{< tab-text >}}
`sample-presentation.pptx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-powerpoint/inserting-edited-slide-into-existing-presentation/sample-presentation.pptx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edited-presentation.pptx" >}}
```text
Binary file (PPTX, 232 KB)
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-powerpoint/inserting-edited-slide-into-existing-presentation/inserting_edited_slide_into_existing_presentation/edited-presentation.pptx)
{{< /tab >}}
{{< /tabs >}}

### Additional notes

It is worth mentioning that the described feature doesn't modify the original presentation document, which was loaded into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class through its constructor. When saving a presentation and inserting the edited slide into an existing presentation, GroupDocs.Editor creates a full and exact copy of the original document, and only then adds the edited slide or replaces an existing slide with it.
So the original document is not touched in any case.

---

## Introduction

Path: /editor/python-net/introduction/

> This article explains the most common and fundamental principles of GroupDocs.Editor: how it works, what its purpose is, and how it should be used.

[**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) is a GUI-less class library, which means that it has only a programmatic interface (API). To edit a document, the user must therefore use GroupDocs.Editor together with some 3rd-party editor application whose GUI lets the end-user edit the document content. GroupDocs.Editor does not depend on which editor software is used. Because GroupDocs.Editor is aimed at web development, its only requirement is that the 3rd-party editor be compatible with HTML documents.

In order to edit a document with GroupDocs.Editor, the user must perform several sequential steps: load the document into GroupDocs.Editor (using optional load options), open the document for editing (with optional edit options), generate HTML markup with resources (using different options and settings), and pass this markup to the 3rd-party WYSIWYG HTML-editor. The end-user then edits the document content; when they finish editing and submit the edited document, the modified markup should be transferred back to GroupDocs.Editor and converted to the output document of the desired format. From the GroupDocs.Editor perspective, this pipeline can be divided into three main stages, which are described below.

## Loading document into the GroupDocs.Editor

At the *[loading document]({{< ref "editor/python-net/developer-guide/load-document.md" >}})* stage the user should create an instance of the [`Editor` class](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) and pass an input document (through a file path or binary stream) along with document load options.
Load options are not required: GroupDocs.Editor can automatically detect the document format and select the most appropriate default options for the given format. However, it is recommended to specify them explicitly, and they are required when loading password-protected documents.

```python
from groupdocs.editor import Editor

# Passing a path to the constructor; default WordProcessingLoadOptions will be applied automatically
editor = Editor("document.docx")
```

After this stage the document is ready to be opened and edited.

## Opening a document for editing

Because GroupDocs.Editor is a GUI-less library, a document cannot be edited directly within it. To make the document editable in a WYSIWYG HTML-editor, GroupDocs.Editor needs to generate an HTML version of it, because such editors work with HTML/CSS markup. Once an instance of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class has been created at the 1st stage, the user should open the document for editing by calling the [`edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class. This method returns an instance of the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class. This class can be described as a converted version of the input document, stored in an internal intermediate format that is compatible with all formats GroupDocs.Editor supports. With [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) the user can obtain the HTML markup of the input document with different options, obtain its stylesheets, images, and fonts, save an HTML document to disk, and so on.
The HTML markup emitted by [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) is then expected to be passed to the client-side WYSIWYG HTML-editor, where the end-user can actually edit the document. As with loading, the [`edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method accepts optional edit options that control how exactly the document will be opened for editing.

```python
from groupdocs.editor.options import WordProcessingEditOptions

edit_options = WordProcessingEditOptions()
edit_options.enable_language_information = True
ready_to_edit = editor.edit(edit_options)
```

After this stage the document is ready to be passed to the WYSIWYG HTML-editor and its content can be edited by the end-user.

## Saving a document

*[Saving a document]({{< ref "editor/python-net/developer-guide/save-document.md" >}})* is the final stage. It occurs once the document content has been edited in the WYSIWYG HTML-editor (or any other software, this makes no difference for GroupDocs.Editor) and should be saved back as a document of some format (like DOCX, PDF, or XLSX, for example). At this stage the user should create a new instance of the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class from the HTML markup and resources of the edited version of the original document, as obtained from the end-user. The [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class contains several class methods that create its instances from HTML documents, which may be presented in different forms.
When an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance is ready, it can be saved as an ordinary document using the [`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class.

```python
from groupdocs.editor import EditableDocument
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

after_edit = EditableDocument.from_markup("HTML content of the document...")
save_options = WordProcessingSaveOptions(WordProcessingFormats.RTF)
editor.save(after_edit, "document.rtf", save_options)
```

Unlike load options and edit options, save options are mandatory, because GroupDocs.Editor needs to know the exact document format for saving.

## Detecting document type

Sometimes it is necessary to *[detect a document type and extract its metadata]({{< ref "editor/python-net/developer-guide/extracting-document-metainfo.md" >}})* before sending it for editing. For such scenarios GroupDocs.Editor can detect the document type and extract the most important metainfo, which depends on the document type:

1. Whether the document is encrypted or not;
2. Exact document format;
3. Document size;
4. Number of pages (tabs);
5. Text encoding, if the document is textual.

In order to detect the document type and gather its metainfo, the user should load the desired document into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class and then call the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method.

## Describing options

At every stage the user can adjust (tune) the processing with different options: 1. Load options for *loading* a document. 2. Edit options for opening a document for *editing*. 3.
Save options for *saving* an edited document. Some of these options may be optional in specific cases, some are mandatory. For example, it is possible to load a document into the `Editor` class without load options — in such a case GroupDocs.Editor will try to detect the document format automatically and apply the most appropriate default options for the detected document format. ### Describing family formats All document formats, which GroupDocs.Editor supports, are grouped into family formats. Each family format has a lot of common features, so there are no options for each format — only for the family format. The relation between formats, family formats, import/export formats and options is illustrated in the table below. | Family format | Supported formats | Load | Save | Load options | Edit options | Save options | Metadata | | --- | --- | --- | --- | --- | --- | --- | --- | | WordProcessing | DOC, DOCX, DOCM, DOT, DOTX, DOTM, RTF, WordprocessingML Flat XML, ODT, OTT, Word 2003 XML | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | [WordProcessingLoadOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingloadoptions) | [WordProcessingEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) | [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) | [WordProcessingDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/wordprocessingdocumentinfo) | | Spreadsheet | XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, XLAM, SpreadsheetML XML, ODS, FODS, SXC, DIF | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | [SpreadsheetLoadOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetloadoptions) | 
[SpreadsheetEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheeteditoptions) | [SpreadsheetSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) | [SpreadsheetDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/spreadsheetdocumentinfo) | | DSV | CSV, TSV, semicolon-separated, whitespace-separated, arbitrary separator | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | N/A | [DelimitedTextEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/delimitedtexteditoptions) | [DelimitedTextSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/delimitedtextsaveoptions) | N/A | | Presentation | PPT, PPTX, PPTM, PPS, PPSX, PPSM, POT, POTX, POTM, ODP, OTP | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | [PresentationLoadOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationloadoptions) | [PresentationEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationeditoptions) | [PresentationSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions) | [PresentationDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/presentationdocumentinfo) | | XML | Any XML document | ![(tick)](/editor/python-net/images/check.png) | ![(error)](/editor/python-net/images/error.png) | N/A | [XmlEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/xmleditoptions) | N/A | [TextualDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/textualdocumentinfo) | | TXT | Any text document | ![(tick)](/editor/python-net/images/check.png) | 
![(tick)](/editor/python-net/images/check.png) | N/A | [TextEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/texteditoptions) | [TextSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/textsaveoptions) | [TextualDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/textualdocumentinfo) | | Fixed-layout format | PDF | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | [PdfLoadOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfloadoptions) | [PdfEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfeditoptions) | [PdfSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfsaveoptions) | [FixedLayoutDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/fixedlayoutdocumentinfo) | | Fixed-layout format | XPS (including OpenXPS) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | N/A | N/A | [XpsSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/xpssaveoptions) | [FixedLayoutDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/fixedlayoutdocumentinfo) | | e-Book | Mobi, AZW3, ePub | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | N/A | [EbookEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebookeditoptions) | [EbookSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebooksaveoptions) | [EbookDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/ebookdocumentinfo) | | Email | EML, EMLX, TNEF, MSG, HTML, MHTML, ICS, VCF, PST, MBOX, OFT | ![(tick)](/editor/python-net/images/check.png) | 
![(tick)](/editor/python-net/images/check.png) | N/A | [EmailEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emaileditoptions) | [EmailSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emailsaveoptions) | [EmailDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/emaildocumentinfo) |

### Additional materials

Detailed information about every stage of document processing, along with source code examples and explanations of the options, can be found in the following articles:

1. [Load document]({{< ref "editor/python-net/developer-guide/load-document.md" >}})
2. [Save document]({{< ref "editor/python-net/developer-guide/save-document.md" >}})

Detailed reviews of the supported family formats and the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class, along with source code examples, can be found in the following articles:

* [Create and edit new WordProcessing document]({{< ref "editor/python-net/developer-guide/create-document.md" >}})
* [Extracting document metainfo]({{< ref "editor/python-net/developer-guide/extracting-document-metainfo.md" >}})
* [Working with formats]({{< ref "editor/python-net/developer-guide/working-with-formats.md" >}})
* [Working with HTML resources]({{< ref "editor/python-net/developer-guide/working-with-html-resources.md" >}})
* [How to edit Mobi file]({{< ref "editor/python-net/developer-guide/working-with-mobi-documents.md" >}})

---

## Save HTML to folder

Path: /editor/python-net/save-html-to-folder/

> This demonstration shows how to open an input document, convert it to an intermediate [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), and save it to disk as an HTML file with a resource folder.

Almost all HTML WYSIWYG client-side editors are able to open an HTML document from disk (from a path).
[**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) can open any supported document, convert it to HTML, and save it to disk, which is useful for subsequently editing it in a WYSIWYG editor.

When the document is opened for editing with [`editor.edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit), the resulting [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) can be written to disk with its [`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/save) method. The method accepts the path to the output HTML file and the path to a folder where the resources (images, stylesheets, fonts) will be saved.

```python
from groupdocs.editor import Editor
from groupdocs.editor.options import WordProcessingLoadOptions

load_options = WordProcessingLoadOptions()
editor = Editor("document.docx", load_options)  # passing path and load options to the constructor
document = editor.edit()

# Save HTML markup together with resources into a folder
document.save("document.html", "document_resources")
```

In this example an input WordProcessing (DOCX) document is loaded into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class with the load options specific to this document family: [`WordProcessingLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingloadoptions). Then the document is converted to an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) using the [`edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method. In the last line the content is saved to the HTML file on disk, and all resources are placed into the specified folder.
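
After such a save it can be handy to check what actually landed on disk. The sketch below is only an illustration: it uses the Python standard library alone and does not call GroupDocs.Editor. The helper name `list_saved_output` is hypothetical, and the file and folder names are assumed to match the snippet above.

```python
from pathlib import Path


def list_saved_output(html_path, resources_dir):
    """Return the HTML file size in bytes and the relative paths of the resource files.

    Hypothetical helper for illustration; it is not part of GroupDocs.Editor.
    """
    html_file = Path(html_path)
    resources = Path(resources_dir)
    if not html_file.is_file():
        raise FileNotFoundError(f"HTML file not found: {html_file}")

    # Assumption: the resource folder may be absent, so a missing folder
    # is reported as an empty list rather than as an error.
    resource_files = []
    if resources.is_dir():
        resource_files = sorted(
            p.relative_to(resources).as_posix()
            for p in resources.rglob("*")
            if p.is_file()
        )
    return html_file.stat().st_size, resource_files


if __name__ == "__main__":
    # Assumed names, matching the snippet above
    try:
        size, files = list_saved_output("document.html", "document_resources")
    except FileNotFoundError as error:
        print(error)
    else:
        print(f"document.html: {size} bytes, {len(files)} resource file(s)")
        for name in files:
            print("  " + name)
```

Returning plain values instead of printing inside the helper keeps it easy to reuse, for example to decide which resource files need to be served to the client-side editor.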
## Complete code example

The example below loads a document, opens it for editing, and saves the HTML markup together with all of its resources into a folder.

{{< tabs "code-example-save-html-to-folder">}}
{{< tab "save_html_to_folder.py" >}}
```python
import os
from groupdocs.editor import Editor, License

def save_html_to_folder():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        editable = editor.edit()

        # Write the HTML markup and every resource into a folder
        editable.save("output.html", "output_resources")
        editable.dispose()

    print("Saved HTML to output.html with resources in output_resources")

if __name__ == "__main__":
    save_html_to_folder()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/editabledocument/save-html-to-folder/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "output.html" >}}
```text
Sample Document ver.1

Title of the document

Subtitle #1
```
{{< /tab >}}
{{< /tabs >}}

---

## Supported Document Formats

Path: /editor/python-net/supported-document-formats/

The following table indicates the file formats that GroupDocs.Editor for Python via .NET can edit. You can use the input below to filter supported formats by extension.

{{< table-filter placeholder="Start typing to find file format" forumUrl="https://forum.groupdocs.com/c/editor/20">}}

## WordProcessing family formats

| Format | Description | Create | Import | Export | Auto Detection | | --- | --- | --- | --- | --- | --- | | [DOC](https://docs.fileformat.com/word-processing/doc/) | MS Word 97-2007 Binary File Format | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [DOCX](https://docs.fileformat.com/word-processing/docx/) | Office Open XML WordProcessingML Macro-Free Document | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [DOCM](https://docs.fileformat.com/word-processing/docm/) | Office Open XML WordProcessingML Macro-Enabled Document | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [DOT](https://docs.fileformat.com/word-processing/dot/) | MS Word 97-2007 Template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [DOTX](https://docs.fileformat.com/word-processing/dotx/) | Office Open XML WordprocessingML Macro-Free Template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
![(tick)](/editor/python-net/images/check.png) | | [DOTM](https://docs.fileformat.com/word-processing/dotm/) | Office Open XML WordprocessingML Macro-Enabled Template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | FlatOPC | Office Open XML WordprocessingML stored in a flat XML file instead of a ZIP package | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [ODT](https://docs.fileformat.com/word-processing/odt/) | Open Document Format Text Document | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [OTT](https://docs.fileformat.com/word-processing/ott/) | Open Document Format Text Document Template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [RTF](https://docs.fileformat.com/word-processing/rtf/) | Rich Text Format | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | WordML | Microsoft Office Word 2003 XML Format — WordProcessingML or WordML | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ## Spreadsheet family formats | Format | Description | Create | Import | Export | Auto Detection | | --- | --- | --- | --- | --- | --- | | [XLS](https://docs.fileformat.com/spreadsheet/xls/) | 
Excel 97-2003 Binary File Format | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [XLT](https://docs.fileformat.com/spreadsheet/xlt/) | Excel 97-2003 Template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [XLSX](https://docs.fileformat.com/spreadsheet/xlsx/) | Office Open XML Workbook Macro-Free | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [XLSM](https://docs.fileformat.com/spreadsheet/xlsm/) | Office Open XML Workbook Macro-Enabled | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [XLTX](https://docs.fileformat.com/spreadsheet/xltx/) | Office Open XML Template Macro-Free | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [XLTM](https://docs.fileformat.com/spreadsheet/xltm/) | Office Open XML Template Macro-Enabled | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [XLSB](https://docs.fileformat.com/spreadsheet/xlsb/) | Excel Binary Workbook | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | 
[XLAM](https://docs.fileformat.com/spreadsheet/xlam/) | Excel Add-in | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | SpreadsheetML | Microsoft Office Excel 2002 and Excel 2003 XML Format | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [ODS](https://docs.fileformat.com/spreadsheet/ods/) | OpenDocument Spreadsheet | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [FODS](https://docs.fileformat.com/spreadsheet/fods/) | Flat OpenDocument Spreadsheet — stored as a single uncompressed XML document | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [SXC](https://docs.fileformat.com/spreadsheet/sxc/) | StarOffice or OpenOffice.org Calc XML Spreadsheet | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [DIF](https://docs.fileformat.com/spreadsheet/dif/) | Data Interchange Format | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | DSV | Delimiter Separated Values document (arbitrary delimiter) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(error)](/editor/python-net/images/error.png) | ![(error)](/editor/python-net/images/error.png) | 
| [CSV](https://docs.fileformat.com/spreadsheet/csv/) | Comma Separated Values document | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [TSV](https://docs.fileformat.com/spreadsheet/tsv/) | Tab Separated Values document | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ## Presentation family formats | Format | Description | Create | Import | Export | Auto Detection | | --- | --- | --- | --- | --- | --- | | [PPT](https://wiki.fileformat.com/presentation/ppt/) | Microsoft PowerPoint 95 Presentation | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [PPT](https://wiki.fileformat.com/presentation/ppt/) | Microsoft PowerPoint 97-2003 Presentation | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [PPTX](https://wiki.fileformat.com/presentation/pptx/) | Microsoft Office Open XML PresentationML Macro-Free Document | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [PPTM](https://wiki.fileformat.com/presentation/pptm/) | Microsoft Office Open XML PresentationML Macro-Enabled Document | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | | [PPS](https://wiki.fileformat.com/presentation/pps/) | 
Microsoft PowerPoint 97-2003 SlideShow | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [PPSX](https://wiki.fileformat.com/presentation/ppsx/) | Microsoft Office Open XML PresentationML Macro-Free SlideShow | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [PPSM](https://wiki.fileformat.com/presentation/ppsm/) | Microsoft Office Open XML PresentationML Macro-Enabled SlideShow | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [POT](https://wiki.fileformat.com/presentation/pot/) | Microsoft PowerPoint 97-2003 Presentation Template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [POTX](https://wiki.fileformat.com/presentation/potx/) | Microsoft Office Open XML PresentationML Macro-Free Template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [POTM](https://wiki.fileformat.com/presentation/potm/) | Microsoft Office Open XML PresentationML Macro-Enabled Template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [ODP](https://wiki.fileformat.com/presentation/odp/) | OpenDocument Presentation | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [OTP](https://wiki.fileformat.com/presentation/otp/) | OpenDocument Presentation template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| FODP | Flat XML ODF Presentation | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) |

## Fixed-layout family formats

| Format | Description | Create | Import | Export | Auto Detection |
| --- | --- | --- | --- | --- | --- |
| [PDF](https://docs.fileformat.com/pdf/) | Portable Document Format | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [XPS](https://docs.fileformat.com/page-description-language/xps/) | XML Paper Specification | ![(error)](/editor/python-net/images/error.png) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |

## Email family formats

| Format | Description | Create | Import | Export | Auto Detection |
| --- | --- | --- | --- | --- | --- |
| [EML](https://docs.fileformat.com/email/eml/) | RFC-822 Internet Message Format Standard | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [EMLX](https://docs.fileformat.com/email/emlx/) | Apple Mail App format | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [MSG](https://docs.fileformat.com/email/msg/) | Microsoft Outlook and Exchange email format | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [MBOX](https://docs.fileformat.com/email/mbox/) | Container for collection of electronic mail messages | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [TNEF](https://docs.fileformat.com/email/tnef/) | Transport Neutral Encapsulation Format | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [MHT](https://docs.fileformat.com/web/mht/) | MIME encapsulation of aggregate HTML documents | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [PST](https://docs.fileformat.com/email/pst/) | Personal Storage Table | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [OFT](https://docs.fileformat.com/email/oft/) | Outlook MSG file format for message template | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [OST](https://docs.fileformat.com/email/ost/) | Offline Storage Table | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [VCF](https://docs.fileformat.com/email/vcf/) | Virtual Card Format | ![(error)](/editor/python-net/images/error.png) | ![(error)](/editor/python-net/images/error.png) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) |
| [ICS](https://docs.fileformat.com/email/ics/) | Internet Calendaring and Scheduling Core Object Specification (iCalendar) | ![(error)](/editor/python-net/images/error.png) | ![(error)](/editor/python-net/images/error.png) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) |

## eBook family formats

| Format | Description | Create | Import | Export | Auto Detection |
| --- | --- | --- | --- | --- | --- |
| [ePub](https://docs.fileformat.com/ebook/epub/) | Electronic Publication | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [MOBI](https://docs.fileformat.com/ebook/mobi/) | MobiPocket | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [AZW3](https://docs.fileformat.com/ebook/azw3/) | AZW3, also known as Kindle Format 8 (KF8) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |

## Markup formats

| Format | Description | Create | Import | Export | Auto Detection |
| --- | --- | --- | --- | --- | --- |
| [HTML](https://docs.fileformat.com/web/html/) | HyperText Markup Language | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) |
| [MHTML](https://docs.fileformat.com/web/mhtml/) | MIME Encapsulation of Aggregate HTML | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |
| [CHM](https://docs.fileformat.com/web/chm/) | Microsoft Compiled HTML Help | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) |
| [XML](https://docs.fileformat.com/web/xml/) | eXtensible Markup Language | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) |
| [JSON](https://docs.fileformat.com/web/json/) | JavaScript Object Notation | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) |
| [MD](https://docs.fileformat.com/word-processing/md/) | Markdown | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |

## Other formats

| Format | Description | Create | Import | Export | Auto Detection |
| --- | --- | --- | --- | --- | --- |
| [TXT](https://docs.fileformat.com/word-processing/txt/) | Plain Text | ![(error)](/editor/python-net/images/error.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) | ![(tick)](/editor/python-net/images/check.png) |

---

## Deleting slides from presentation

Path: /editor/python-net/deleting-slides-from-presentation/

There are many
[presentation](https://docs.fileformat.com/presentation/) document formats: [PPT](https://docs.fileformat.com/presentation/ppt/), [PPTX](https://docs.fileformat.com/presentation/pptx/), [ODP](https://docs.fileformat.com/presentation/odp/) and many more. They all have one thing in common: the *slides*. Each presentation document has one or more slides, and these slides may be treated as containers of different content: text, geometric shapes, raster and vector images, animations, video, audio, comments, embedded objects and much more.

From its beginning, GroupDocs.Editor has been able to edit presentations, but only on a per-slide basis: a user can edit the content of only one slide at a time. Initially, once the content of the slide was edited, it could only be saved as a new one-slide presentation. The pipeline was as follows: the user loads a presentation into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class, selects the slide to edit with the help of [`PresentationEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationeditoptions), obtains an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), edits the HTML content of the slide, passes this content back to an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), and finally generates a new presentation file with one slide using the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method.

Later this pipeline was expanded: it became possible to insert an edited slide into the original presentation in two ways:

1. by replacing the original slide, which was initially selected for editing, with its edited version,
2. by letting the edited slide "co-exist" with its original, unedited version.

This mechanism is explained in detail in a [separate article](https://docs.groupdocs.com/editor/python-net/inserting-edited-slide-into-existing-presentation/).

It is possible not only to edit and replace slides in the resultant presentation, but also to delete them from it. The [`PresentationSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions) class contains a [`slide_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumberstodelete/) property, which is a list of integers, where each integer is the number of one slide that should be removed. Slide numbers here are 1-based, so the 1st slide has number `1`, not `0`.

With the slide removal feature, GroupDocs.Editor performs the following algorithm while saving the presentation:

1. The user edits the content of the slide in the WYSIWYG editor, passes the edited content to an instance of the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class [using one of its static methods](https://docs.groupdocs.com/editor/python-net/create-editabledocument-from-file-or-markup/), creates and adjusts the [`PresentationSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions), and passes the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, the [`PresentationSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions) instance, and an output stream or file path to the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method.
2. If the [`slide_number`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumber) property of the [`PresentationSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions) instance has its default value `0`, GroupDocs.Editor generates a new presentation with one slide in it. The [`slide_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumberstodelete/) property is ignored regardless of its value.
3. If the [`slide_number`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumber) property of the [`PresentationSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions) instance has a non-zero value, GroupDocs.Editor treats it as a command to insert the edited slide into the original presentation.
4. Depending on the [`insert_as_new_slide`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/insertasnewslide) property, GroupDocs.Editor either replaces the old slide, specified by the value of the [`slide_number`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumber) property, with the new one taken from the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, or places the new slide among the existing slides without replacing any of them.
5. Finally, when all slide insertions and rearrangements are finished, GroupDocs.Editor reads the value of the [`slide_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumberstodelete/) property, iterates over all numbers and removes the specified slides from the presentation.
6. The final presentation, with updated and deleted slides, is written to the output stream or file.

The example below shows a full roundtrip of an input PPTX file: a presentation is loaded, its first slide is converted to an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), then HTML markup is emitted from the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, modified, and the edited HTML markup is converted back to another [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance. This edited slide is then inserted into the original presentation, while the [`slide_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions/slidenumberstodelete/) property is set to `[1]`, so that one slide is removed during saving.
{{< tabs "code-example-deleting-slides-from-presentation">}}
{{< tab "deleting_slides_from_presentation.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.options import PresentationLoadOptions, PresentationEditOptions, PresentationSaveOptions
from groupdocs.editor.formats import PresentationFormats

def deleting_slides_from_presentation():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load input presentation to the Editor and specify loading options
    with Editor("./sample-presentation.pptx", PresentationLoadOptions()) as editor:
        # Prepare edit options and set the 1st slide to edit
        edit_options = PresentationEditOptions()
        edit_options.show_hidden_slides = True
        edit_options.slide_number = 0  # index is 0-based, so this is the 1st slide

        # Generate EditableDocument with the original content of the slide
        slide_opened_for_edit = editor.edit(edit_options)

        # Get the HTML-markup from the EditableDocument with the original content
        original_html = slide_opened_for_edit.get_embedded_html()

        # Emulate HTML content editing in a WYSIWYG-editor in the browser or somewhere else
        # (assumption: a paragraph is appended just before the closing </body> tag)
        edited_html = original_html.replace("</body>", "<p>Edited content</p></body>")

        # Generate EditableDocument with the edited content
        slide_after_edit = EditableDocument.from_markup(edited_html)

        # Prepare save options that delete a slide during saving
        save_options = PresentationSaveOptions(PresentationFormats.PPTX)
        # A non-zero slide_number is required so the presentation is rebuilt and deletions are applied
        save_options.slide_number = 1  # 1-based; insert the edited slide at the 1st position
        save_options.insert_as_new_slide = True  # keep the edited slide alongside the original ones
        save_options.slide_numbers_to_delete = [1]  # delete the 1st original slide (1-based)

        # Save the presentation with the deleted slide
        editor.save(slide_after_edit, "./edited-presentation.pptx", save_options)

        slide_opened_for_edit.dispose()
        slide_after_edit.dispose()

    print("Saved presentation with a deleted slide to edited-presentation.pptx")

if __name__ == "__main__":
    deleting_slides_from_presentation()
```
{{< /tab >}}
{{< tab "sample-presentation.pptx" >}}
{{< tab-text >}}
`sample-presentation.pptx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-powerpoint/deleting-slides-from-presentation/sample-presentation.pptx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edited-presentation.pptx" >}}
```text
Binary file (PPTX, 144 KB)
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-powerpoint/deleting-slides-from-presentation/deleting_slides_from_presentation/edited-presentation.pptx)
{{< /tab >}}
{{< /tabs >}}

---

## Deleting worksheets from spreadsheet

Path: /editor/python-net/deleting-worksheets-from-spreadsheet/

A **spreadsheet**, also known as a **workbook**, is a [family of document formats](https://docs.fileformat.com/spreadsheet/) designed to work with tabular data.
[XLS](https://docs.fileformat.com/spreadsheet/xls/), [XLSX](https://docs.fileformat.com/spreadsheet/xlsx/), [ODS](https://docs.fileformat.com/spreadsheet/ods/) and [CSV](https://docs.fileformat.com/spreadsheet/csv/) are the most common examples of such formats, while Microsoft Excel, LibreOffice Calc and Apache OpenOffice Calc are examples of table processors, that is, programs that allow the creation and editing of such documents.

GroupDocs.Editor can edit existing spreadsheet documents, create new spreadsheets from scratch, and also delete worksheets from the edited spreadsheet during saving.

Keep in mind that not every spreadsheet format has worksheets. For example, text-based separator-delimited formats like [CSV](https://docs.fileformat.com/spreadsheet/csv/) and [TSV](https://docs.fileformat.com/spreadsheet/tsv/) are essentially text files; they have no worksheets, so a worksheet cannot be removed from them. But almost all of the binary spreadsheet formats like XLS, XLSX and ODS do have them.

In GroupDocs.Editor the user edits one worksheet at a time: the spreadsheet [is loaded](https://docs.groupdocs.com/editor/python-net/load-document/) into the constructor of the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class, then the worksheet is specified by its number in the [`SpreadsheetEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheeteditoptions), and then the document is converted to its editable form, represented by the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class.

However, once the content of the worksheet has been edited by the user and passed back to the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class, there are two options:

1. Users can save the edited content of the worksheet as a new spreadsheet with this single worksheet inside. This is the default behaviour, present in GroupDocs.Editor from its beginning.
2. Users can save the edited worksheet by inserting it into the original spreadsheet. This is described in a [separate article](https://docs.groupdocs.com/editor/python-net/inserting-edited-worksheet-into-existing-spreadsheet/).

This second option also has two sub-options:

1. The edited worksheet can *replace* the original worksheet in the input spreadsheet. For instance, in a loaded spreadsheet with 3 worksheets the user has chosen the 2nd one for editing; this worksheet was edited and then inserted back into the input spreadsheet, replacing the original 2nd worksheet with the edited one.
2. The edited worksheet can be inserted into the input spreadsheet to stay *together* with the original one. For instance, in a loaded spreadsheet with 3 worksheets the user has chosen the 2nd one for editing; this worksheet was edited and then inserted back into the input spreadsheet, so now the spreadsheet contains 4 worksheets: the first, the third, and two versions of the second (original and edited).

When the 2nd option (inserting the edited worksheet into the original spreadsheet) is used, users can also delete particular worksheets from it. The [`SpreadsheetSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) class has a [`worksheet_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumberstodelete/) property of `list[int]` type. By default it is `None`, which means that no worksheets should be removed. However, when it holds one or more valid worksheet numbers, the worksheets with these numbers are removed when the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method is called.

Note that the worksheet numbers in the [`worksheet_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumberstodelete/) property are 1-based. For instance, to remove the first and fourth worksheets, the [`worksheet_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumberstodelete/) property should be set to `[1, 4]`.

With the worksheet removal feature, GroupDocs.Editor performs the following algorithm while saving the spreadsheet:

1. The user edits the content of the worksheet in the WYSIWYG editor, passes the edited content to an instance of the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class [using one of its static methods](https://docs.groupdocs.com/editor/python-net/create-editabledocument-from-file-or-markup/), creates and adjusts the [`SpreadsheetSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions), and passes the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, the [`SpreadsheetSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) instance, and an output stream or file path to the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method.
2. If the [`worksheet_number`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) property of the [`SpreadsheetSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) instance has its default value `0`, GroupDocs.Editor generates a new spreadsheet with one worksheet. The [`worksheet_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumberstodelete/) property is ignored regardless of its value.
3. If the [`worksheet_number`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) property of the [`SpreadsheetSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) instance has a non-zero value, GroupDocs.Editor treats it as a command to insert the edited worksheet into the original spreadsheet.
4. Depending on the [`insert_as_new_worksheet`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/insertasnewworksheet/) property, GroupDocs.Editor either replaces the old worksheet, specified by the value of the [`worksheet_number`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumber) property, with the new one taken from the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, or places the new worksheet among the existing worksheets without replacing any of them.
5. Finally, when all worksheet insertions and rearrangements are finished, GroupDocs.Editor reads the value of the [`worksheet_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumberstodelete/) property, iterates over all numbers and removes the specified worksheets from the spreadsheet.
6. The final spreadsheet, with updated and deleted worksheets, is written to the output stream or file.
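
The order of these steps can be modelled on a plain Python list of worksheet names. The sketch below is not the library's implementation: `simulate_save` is a hypothetical helper that touches no GroupDocs.Editor API, and it assumes that a new worksheet is inserted at the position given by `worksheet_number`, that the numbers to delete refer to positions after the insertion, and that out-of-range numbers are skipped. None of these three details is confirmed by this article.

```python
def simulate_save(original, edited, worksheet_number=0,
                  insert_as_new_worksheet=False,
                  worksheet_numbers_to_delete=None):
    """Illustrative model of the save steps on a list of worksheet names."""
    # Step 2: the default value 0 yields a new one-worksheet spreadsheet,
    # and the deletion list is ignored.
    if worksheet_number == 0:
        return [edited]

    sheets = list(original)
    index = worksheet_number - 1  # worksheet numbers are 1-based

    # Steps 3-4: insert the edited worksheet or replace the original one.
    if insert_as_new_worksheet:
        sheets.insert(index, edited)  # assumption: inserted at this position
    else:
        sheets[index] = edited

    # Step 5: deletions run only after insertion/replacement is finished.
    # Assumption: numbers refer to positions at this moment; deleting from
    # the highest number down keeps the remaining positions valid.
    for number in sorted(set(worksheet_numbers_to_delete or []), reverse=True):
        if 1 <= number <= len(sheets):
            del sheets[number - 1]

    # Step 6: the resulting list stands in for the written spreadsheet.
    return sheets


print(simulate_save(["A", "B", "C"], "B-edited", worksheet_number=2,
                    insert_as_new_worksheet=True))
```

Under these assumptions, the model makes one property of the algorithm easy to see: with `worksheet_number` left at `0`, the deletion list has no effect at all.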
The example below shows a full roundtrip of an input XLSX file: a spreadsheet is loaded, its first worksheet is converted to an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), then HTML markup is emitted from the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, modified, and the edited HTML markup is converted back to another [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance. This edited worksheet is then inserted into the original spreadsheet, while the [`worksheet_numbers_to_delete`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions/worksheetnumberstodelete/) property is set to `[1]`, so that one worksheet is removed during saving.

{{< tabs "code-example-deleting-worksheets-from-spreadsheet">}}
{{< tab "deleting_worksheets_from_spreadsheet.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.options import SpreadsheetLoadOptions, SpreadsheetEditOptions, SpreadsheetSaveOptions
from groupdocs.editor.formats import SpreadsheetFormats

def deleting_worksheets_from_spreadsheet():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load input spreadsheet to the Editor and specify loading options
    with Editor("./sample-spreadsheet.xlsx", SpreadsheetLoadOptions()) as editor:
        # Prepare edit options and set the 1st worksheet to edit
        edit_options = SpreadsheetEditOptions()
        edit_options.worksheet_index = 0  # index is 0-based, so this is the 1st worksheet

        # Generate EditableDocument with the original content of the 1st worksheet
        worksheet_opened_for_edit = editor.edit(edit_options)

        # Get the HTML-markup from the EditableDocument with the original content
        original_html = worksheet_opened_for_edit.get_embedded_html()

        # Emulate HTML content editing in a WYSIWYG-editor in the browser or somewhere else
        # (assumption: a paragraph is appended just before the closing </body> tag)
        edited_html = original_html.replace("</body>", "<p>Edited content</p></body>")

        # Generate EditableDocument with the edited content
        worksheet_after_edit = EditableDocument.from_markup(edited_html)

        # Prepare save options that delete a worksheet during saving
        save_options = SpreadsheetSaveOptions(SpreadsheetFormats.XLSX)
        # A non-zero worksheet_number is required so the spreadsheet is rebuilt and deletions are applied
        save_options.worksheet_number = 1  # 1-based; insert the edited worksheet at the 1st position
        save_options.insert_as_new_worksheet = True  # keep the edited worksheet alongside the original ones
        save_options.worksheet_numbers_to_delete = [1]  # delete the 1st original worksheet (1-based)

        # Save the spreadsheet with the deleted worksheet
        editor.save(worksheet_after_edit, "./edited-spreadsheet.xlsx", save_options)

        worksheet_opened_for_edit.dispose()
        worksheet_after_edit.dispose()

    print("Saved spreadsheet with a deleted worksheet to edited-spreadsheet.xlsx")

if __name__ == "__main__":
    deleting_worksheets_from_spreadsheet()
```
{{< /tab >}}
{{< tab "sample-spreadsheet.xlsx" >}}
{{< tab-text >}}
`sample-spreadsheet.xlsx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-excel/deleting-worksheets-from-spreadsheet/sample-spreadsheet.xlsx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edited-spreadsheet.xlsx" >}}
```text
Binary file (XLSX, 35 KB)
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-excel/deleting-worksheets-from-spreadsheet/deleting_worksheets_from_spreadsheet/edited-spreadsheet.xlsx)
{{< /tab >}}
{{< /tabs >}}

---

## Enabling language information

Path: /editor/python-net/enabling-language-information/

Documents of all WordProcessing formats can contain text in different languages. But, unlike plain text documents (TXT), WordProcessing documents also contain metadata about the specific language (locale) of every piece of text.
[**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) can extract and export this language information. For this purpose the [WordProcessingEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) class contains the [`enable_language_information`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions/) public boolean property:

```python
from groupdocs.editor.options import WordProcessingEditOptions

edit_options = WordProcessingEditOptions()
edit_options.enable_language_information = True
```

By default its value is `False`, which means that language metadata is not extracted. When this option is enabled, GroupDocs.Editor extracts locale information for every piece of textual content and preserves it in the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance when the document is edited. Then, when the user has obtained the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance and generates HTML markup to pass to a WYSIWYG HTML editor so that the document can be edited in the browser, this language information is represented as `lang` HTML attributes with appropriate values inside the SPAN HTML elements.

Enabling language information is useful when a document contains text parts in different languages; if the document has text in a single language, this option does not make much sense and is therefore disabled by default. However, when a document is multilingual, enabling language information can help in two scenarios:

* It eases spell checking for client-side JavaScript spell-checkers that work in the browser. However, this depends heavily on the specific spell-checker, as not all spell-checkers are able to read values from `lang` attributes or even use language information at all.
* It improves the quality of the output WordProcessing document in roundtrip scenarios. When a document is converted to an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance with the [`enable_language_information`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions/) option enabled, HTML markup is generated and edited in some HTML editor, and a new instance of the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class is created from the edited markup, the language metadata in the `lang` attributes is still preserved. When the edited [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) is converted back to an output document of some WordProcessing format like DOCX or RTF, the textual content inside it is associated with the correct locale.

## Complete example

The example below loads the sample document, edits it with `enable_language_information` turned on, and prints the length of the generated HTML markup that carries the language metadata.
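
Before running it, it may help to see how the resulting `lang` attributes could be read back from such markup. The following is a minimal standard-library sketch, not part of GroupDocs.Editor: `LangCollector` and `count_span_languages` are hypothetical names, and the sample markup is invented for illustration. Real markup produced by the library will be more elaborate.

```python
from collections import Counter
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Counts the values of 'lang' attributes found on SPAN elements."""

    def __init__(self):
        super().__init__()
        self.languages = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "span":
            for name, value in attrs:
                if name == "lang" and value:
                    self.languages[value] += 1


def count_span_languages(html):
    """Return a dict mapping each locale to the number of SPANs carrying it."""
    collector = LangCollector()
    collector.feed(html)
    collector.close()
    return dict(collector.languages)


# Invented sample markup; not actual GroupDocs.Editor output.
sample = (
    '<p><span lang="en-US">Hello</span> '
    '<span lang="fr-FR">Bonjour</span> '
    '<span lang="en-US">world</span></p>'
)
print(count_span_languages(sample))
```

A helper like this could be pointed at the string returned by `get_content()` to check whether a document is actually multilingual before wiring up a spell-checker.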
{{< tabs "code-example-enabling-language-information">}}
{{< tab "enable_language_information.py" >}}
```python
import os
from groupdocs.editor import Editor, License
from groupdocs.editor.options import WordProcessingEditOptions

def enable_language_information():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Enable extraction of language (locale) information
    edit_options = WordProcessingEditOptions()
    edit_options.enable_language_information = True

    with Editor("./sample-document.docx") as editor:
        editable = editor.edit(edit_options)
        html = editable.get_content()
        print("HTML markup with language information, length:", len(html))
        editable.dispose()

if __name__ == "__main__":
    enable_language_information()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/enabling-language-information/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "enable-language-information.txt" >}}
```text
HTML markup with language information, length: 35801
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/enabling-language-information/enable_language_information/enable-language-information.txt)
{{< /tab >}}
{{< /tabs >}}

---

## Showcases

Path: /editor/python-net/showcases/

{{< alert style="info" >}}Want to try GroupDocs.Editor for Python via .NET by yourself? Please check the Python code examples, how-to articles, and free online demonstrations provided below to learn more about the document editing features.{{< /alert >}}

## GitHub Examples

To get started with GroupDocs.Editor for Python via .NET, you can explore a variety of runnable code examples available on GitHub.
The repository provides a rich collection of resources that showcase the capabilities of the library and demonstrate how to implement document editing features. You can access these resources at the following GitHub repository:

[GroupDocs.Editor for Python via .NET GitHub Repository](https://github.com/groupdocs-editor/GroupDocs.Editor-for-Python-via-.NET)

There you'll find code examples that cover different scenarios, helping you understand the library's usage and integration possibilities. See [How to Run Examples]({{< ref "editor/python-net/getting-started/how-to-run-examples.md" >}}) to run them locally.

## Online Demo

To experience the full potential of GroupDocs.Editor, you can take advantage of the free online document editor app. This online demo allows you to edit various document formats, including DOCX, PDF, XLSX, PPTX, and more. It's a great way to explore the capabilities of GroupDocs.Editor without any installation.

Visit the following link to access the Free Online Document Editor App:

[Free Online Document Editor App](https://products.groupdocs.app/editor)

By using the online demo, you can test and evaluate the features of GroupDocs.Editor in a real-world editing environment. It's a convenient way to assess the library's capabilities and determine its suitability for your document editing needs.

---

## System Requirements

Path: /editor/python-net/system-requirements/

## Overview

GroupDocs.Editor for Python via .NET does not require MS Office, Open Office, or any other external software or third-party tool to be installed. The package is a self-contained wheel that bundles everything it needs, so the only prerequisites are a supported version of Python and the operating-system packages listed below. Just follow one of the ways described in [Development Environment, Installation and Configuration]({{< ref "editor/python-net/getting-started/installation.md" >}}).
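
The two prerequisites named above (a supported Python version and a supported platform) can be checked before installation. The following is only a minimal sketch using the standard library: the helper names are hypothetical and not part of GroupDocs.Editor, the version range and platform list are copied from this page, and the architecture aliases are an assumption about what `platform.machine()` reports on common systems.

```python
import platform
import sys

# Assumption: the range and platforms below mirror the lists on this page.
MIN_PYTHON = (3, 5)
MAX_PYTHON = (3, 14)
SUPPORTED_PLATFORMS = {
    ("Windows", "x64"), ("Windows", "x86"),
    ("Linux", "x64"),
    ("Darwin", "x64"), ("Darwin", "arm64"),
}

# Assumption: typical values returned by platform.machine(), normalized.
_ARCH_ALIASES = {
    "amd64": "x64", "x86_64": "x64",
    "x86": "x86", "i386": "x86", "i686": "x86",
    "arm64": "arm64", "aarch64": "arm64",
}


def is_supported_python(version):
    """Return True if (major, minor) lies within the documented range."""
    return MIN_PYTHON <= tuple(version[:2]) <= MAX_PYTHON


def is_supported_platform(system, machine):
    """Return True if the OS/architecture pair is in the documented list."""
    arch = _ARCH_ALIASES.get(machine.lower())
    return (system, arch) in SUPPORTED_PLATFORMS


if __name__ == "__main__":
    print("Python supported:", is_supported_python(sys.version_info))
    print("Platform supported:",
          is_supported_platform(platform.system(), platform.machine()))
```

Note that `platform.machine()` describes the hardware rather than the interpreter, so a 32-bit Python on 64-bit Windows is still reported as x64 by this sketch.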
## Supported Python Versions

GroupDocs.Editor for Python via .NET supports the following Python versions:

* Python 3.5
* Python 3.6
* Python 3.7
* Python 3.8
* Python 3.9
* Python 3.10
* Python 3.11
* Python 3.12
* Python 3.13
* Python 3.14

## Supported Operating Systems

The package is distributed as a self-contained wheel that runs on the following platforms:

### Windows

* Windows x64
* Windows x86

No additional dependencies are required on Windows.

### Linux

* Linux x64

On Linux you need to install a few system packages for graphics and font rendering:

```bash
apt install libgdiplus libfontconfig1 ttf-mscorefonts-installer
```

### macOS

* macOS x64 (Intel)
* macOS ARM64 (Apple Silicon)

On macOS install the graphics library via Homebrew:

```bash
brew install mono-libgdiplus
```

## No MS Office Required

Unlike many document-processing tools, GroupDocs.Editor for Python via .NET does not depend on Microsoft Office, Open Office, or any other application being installed on the machine. All loading, editing, and saving of Word Processing documents, Spreadsheets, Presentations, PDFs, emails, eBooks, and text/markup formats is performed entirely by the bundled engine.

---

## Working with resources

Path: /editor/python-net/working-with-resources/

> This demonstration shows and explains different operations with resources, including retrieving them in different scenarios.

## Introduction

Almost all documents of any type have resources. These are first of all images; some document formats also hold fonts. Even a plain text document (TXT), when converted to HTML for editing, gets one stylesheet, which is treated as a resource. WordProcessing documents of some formats, usually Office Open XML, can also contain embedded audio files.

GroupDocs.Editor lets you work with resources during the editing phase, after the document has been loaded into the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class and opened for editing, that is, once the [Editor.edit()](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method has produced an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance. GroupDocs.Editor classifies all resources into several groups:

* Images: raster (PNG, BMP, JPEG, GIF, ICON) and vector (SVG and WMF).
* Fonts: TTF, EOT, WOFF, WOFF2.
* Textual resources: CSS stylesheets.
* Audio files: MP3.

## Preparations

Let's prepare an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance by loading and editing an input WordProcessing document, as usual:

```python
from groupdocs.editor import Editor
from groupdocs.editor.options import WordProcessingLoadOptions

editor = Editor("document.docx", WordProcessingLoadOptions())
before_edit = editor.edit()  # create an EditableDocument instance
```

## Obtaining resources

Now that the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance is ready, resources can be obtained from it, and [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) provides several ways to do this. First of all, resources can be retrieved by their type. [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) exposes an iterable collection for every resource type:

* `images`: all images, raster and vector.
* `fonts`: all fonts.
* `css`: the CSS stylesheets, where each item represents one stylesheet.
* `audio`: the MP3 audio files.

Secondly, all resources can be obtained at once through a single property, `all_resources`. It returns the concatenation of the previous collections. All these collections can be iterated with `for` loops and measured with `len()`:

```python
images = before_edit.images
fonts = before_edit.fonts
stylesheets = before_edit.css
audio_files = before_edit.audio
all_together = before_edit.all_resources

for one_image in images:
    print("image resource:", one_image)
```

## CSS resources

There is also a dedicated way to obtain the stylesheets. The reason is that stylesheets can themselves reference external resources through URLs, for example images, fonts, and other stylesheets, and such links may need to be adjusted. For this purpose [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) contains the [`get_css_content()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getcsscontent) method. Without arguments it returns the stylesheets as-is; it can also accept prefixes for external images and fonts referenced from the stylesheets:

```python
stylesheets_without_prefixes = before_edit.get_css_content()

external_images_prefix = "http://www.mywebsite.com/images/id="
external_fonts_prefix = "http://www.mywebsite.com/fonts/id="
stylesheets_with_prefixes = before_edit.get_css_content(external_images_prefix, external_fonts_prefix)
```

## Complete code example

The example below loads a document, opens it for editing, and reports how many resources of each kind were extracted.
{{< tabs "code-example-working-with-resources">}}
{{< tab "working_with_resources.py" >}}
```python
import os
from groupdocs.editor import Editor, License

def working_with_resources():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        before_edit = editor.edit()

        # Inspect the extracted resources by their type
        print("Images:", len(before_edit.images))
        print("Stylesheets:", len(before_edit.css))
        print("Fonts:", len(before_edit.fonts))
        print("All resources:", len(before_edit.all_resources))

        # Enumerate the stylesheets of the document
        for one_stylesheet in before_edit.css:
            print("stylesheet resource:", one_stylesheet)

        before_edit.dispose()

if __name__ == "__main__":
    working_with_resources()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/editabledocument/working-with-resources/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "working-with-resources.txt" >}}
```text
Images: 4
Stylesheets: 1
Fonts: 0
All resources: 5
stylesheet resource: GroupDocs.Editor.HtmlCss.Resources.Textual.CssText
```
[Download full output](/editor/python-net/_output_files/developer-guide/editabledocument/working-with-resources/working_with_resources/working-with-resources.txt)
{{< /tab >}}
{{< /tabs >}}

---

## Create EditableDocument from file or markup

Path: /editor/python-net/create-editabledocument-from-file-or-markup/

> This demonstration shows how to create an instance of the EditableDocument class from HTML files on disk or from HTML markup with resources.

## Introduction

When working with [**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) in the usual way, by loading, opening, editing and saving documents, instances of the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class are produced by the [Editor.edit()](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method and accepted by the [Editor.save()](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method. However, in some cases an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance has to be created from existing HTML markup with optional resources. For example, a document was loaded into the `Editor` class and opened for editing, and the resulting [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) was saved to disk as an HTML file with its connected resources. This HTML document was then passed to a WYSIWYG editor, edited, and saved back to disk as modified HTML. Alternatively, the raw output from the WYSIWYG editor was stored in a string variable. To save it to a final format like DOCX or XLSX, the user needs to pass the document to the [Editor.save()](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method in the form of an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance. This means that the user has to create the instance of the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class manually.

[EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) contains three public class methods for creating its instances: [`from_file`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/fromfile), [`from_markup`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkup) and [`from_markup_and_resource_folder`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkupandresourcefolder). This article reviews all of them.

## Opening from file

Suppose we have an HTML file with edited document content saved on disk. There is also a folder with the resources (images, fonts, stylesheets) used by this document. Let's say the HTML document is named "document.html" and the resource folder is located next to it and named "document_resources". Most importantly, the HTML markup in "document.html" has correct links to the files in the "document_resources" folder. In that case creating the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance is simplest: the [`from_file`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/fromfile) class method accepts the path to the HTML file and the path to the resource folder:

```python
from groupdocs.editor import EditableDocument

# HTML file plus the folder where its resources are stored
document = EditableDocument.from_file("document.html", "document_resources")
```

If the HTML file contains correct links, GroupDocs.Editor will scan the links and find the resources automatically. If the HTML file contains a link to a resource that is not present in the resource folder, that resource is omitted.
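
Since a link to a missing resource is omitted, it can be useful to check the links before calling `from_file`. The following is a minimal sketch that uses only the Python standard library. It is not how GroupDocs.Editor resolves links internally, and the helper name `find_missing_resources` is hypothetical. It assumes that resources are referenced through `src` or `href` attributes with relative paths that start with the resource folder name.

```python
import os
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collects the values of src/href attributes found in HTML markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.links.append(value)


def find_missing_resources(html_markup, html_dir, resource_folder_name):
    """Return links into the resource folder whose files do not exist on disk."""
    collector = _LinkCollector()
    collector.feed(html_markup)
    prefix = resource_folder_name.rstrip("/") + "/"
    missing = []
    for link in collector.links:
        # Only relative links that point into the resource folder are checked
        if link.startswith(prefix) and not os.path.isfile(os.path.join(html_dir, link)):
            missing.append(link)
    return missing


if __name__ == "__main__":
    markup = (
        '<html><head><link rel="stylesheet" href="document_resources/style.css">'
        '</head><body><img src="document_resources/logo.png"></body></html>'
    )
    print(find_missing_resources(markup, ".", "document_resources"))
```

An empty list means every referenced file was found next to the HTML document; any returned link points at a resource that would not be picked up.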
## Opening from markup and prepared resources

In some cases the edited HTML document is not present as a file. It may be stored in a database, obtained from remote storage, or come from somewhere else. Quite often the whole document content, with HTML and CSS markup and all the resources, is packed inside a single string (resources are packed into the HTML markup using the [data URI scheme](https://en.wikipedia.org/wiki/Data_URI_scheme) with base64 encoding). In such cases, when there are no external resources, the [`from_markup`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkup) class method is the most convenient. It accepts a single string with the raw HTML markup:

```python
from groupdocs.editor import EditableDocument

input_html_markup = "Edited document..."
document = EditableDocument.from_markup(input_html_markup)
```

## Opening from markup and resource folder

In some cases there is HTML markup of an edited document, but the resources are represented as a set of files located in a specific folder. For example, the original document was converted to an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) and then saved to the disk as an HTML document with an *.html file and a folder with resources. Then it was loaded into the WYSIWYG editor, edited by the end-user, and the edited content was obtained back from the front-end. So now there is edited HTML markup available as a string, and a resource folder. In such a case the [`from_markup_and_resource_folder`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkupandresourcefolder) class method should be used. It accepts two mandatory parameters: the HTML markup string and a valid path to an existing resource folder.

```python
from groupdocs.editor import EditableDocument

input_html_markup = "Edited document..."
document = EditableDocument.from_markup_and_resource_folder(input_html_markup, "document_resources")
```

This method scans the resource folder, reads all found *.css files, and applies the parsed stylesheets to the document content. This is the main distinction between this method and the previously described [`from_file`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/fromfile) one: this method applies all stylesheets in the folder to the document, while [`from_file`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/fromfile) applies only those stylesheets that are explicitly referenced from the HTML markup. If any stylesheet references external images and/or fonts, GroupDocs.Editor will try to find these resources in the same resource folder too.

## Complete code example

The example below builds a small HTML string, wraps it into an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance with [`from_markup`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkup), and saves it as a DOCX document using an [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) opened on a sample document.

{{< tabs "code-example-create-editabledocument-from-file-or-markup">}}
{{< tab "create_editabledocument_from_file_or_markup.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.options import WordProcessingSaveOptions
from groupdocs.editor.formats import WordProcessingFormats

def create_editabledocument_from_file_or_markup():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Build an HTML markup string (for example, obtained from a WYSIWYG editor).
    # The tags here are a minimal illustration; any well-formed markup works.
    html = "<html><head><title>Edited document</title></head><body><p>Hello from markup</p></body></html>"

    # Create an EditableDocument instance directly from the markup
    editable = EditableDocument.from_markup(html)

    # Save the created EditableDocument as a DOCX document
    with Editor("./sample-document.docx") as editor:
        save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)
        editor.save(editable, "./output.docx", save_options)

    print("EditableDocument created from markup and saved to ./output.docx")

if __name__ == "__main__":
    create_editabledocument_from_file_or_markup()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/editabledocument/create-editabledocument-from-file-or-markup/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "output.docx" >}}
```text
Binary file (DOCX, 7 KB)
```
[Download full output](/editor/python-net/_output_files/developer-guide/editabledocument/create-editabledocument-from-file-or-markup/create_editabledocument_from_file_or_markup/output.docx)
{{< /tab >}}
{{< /tabs >}}

---

## Edit and Update Form Fields

Path: /editor/python-net/edit-and-update-form-fields/

This article demonstrates how to edit form fields in a Word document using GroupDocs.Editor for Python via .NET. It guides you through loading a document, inspecting its form fields, and updating them.

## Step-by-Step Guide

1. **Load the document into the Editor instance**

   Open the document as a binary stream and pass it to the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class together with [`WordProcessingLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingloadoptions). If the document is password-protected, specify the password through the load options.
   ```python
   from groupdocs.editor import Editor
   from groupdocs.editor.options import WordProcessingLoadOptions

   with open("form-fields.docx", "rb") as stream:
       with Editor(stream, WordProcessingLoadOptions()) as editor:
           # Further code will be placed here
           pass
   ```

2. **Retrieve the FormFieldManager**

   Obtain the [`FormFieldManager`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/formfieldmanager/) instance from the [`form_field_manager`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/) property of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class and read the `form_field_collection`.

   ```python
   with open("form-fields.docx", "rb") as stream:
       with Editor(stream, WordProcessingLoadOptions()) as editor:
           field_manager = editor.form_field_manager
           collection = field_manager.form_field_collection
           print("Form fields count:", len(collection))
   ```

3. **Update a form field**

   A specific form field can be modified and the change applied back to the document with the `update_form_filed(...)` method. The snippet below is illustrative: retrieve the form field of interest from the collection, modify its properties, and pass the collection to `update_form_filed(...)`. Adjust the per-type access details to the exact members exposed by your build.

   ```python
   # Illustrative: retrieve, modify, and update a text form field
   text_field = collection.get_form_field("Text1")
   text_field.locale_id = 1029
   text_field.value = "new Value"
   field_manager.update_form_filed(collection)
   ```

4. **Save the updated document**

   After updating the form fields, save the document with [`editor.save(...)`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) using the desired save options.

   ```python
   from groupdocs.editor.options import WordProcessingSaveOptions
   from groupdocs.editor.formats import WordProcessingFormats

   save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)
   editable = editor.edit()
   editor.save(editable, "output.docx", save_options)
   ```

## Complete code example

Below is the complete runnable example. It opens the document, retrieves the form field manager, and reports the number of form fields it contains. The mutation steps shown above are illustrative; the runnable example performs only safe, documented calls.

{{< tabs "code-example-edit-and-update-form-fields">}}
{{< tab "edit_and_update_form_fields.py" >}}
```python
import os
from groupdocs.editor import Editor, License
from groupdocs.editor.options import WordProcessingLoadOptions

def edit_and_update_form_fields():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Open the document as a stream and load it with WordProcessingLoadOptions
    with open("./form-fields.docx", "rb") as stream:
        with Editor(stream, WordProcessingLoadOptions()) as editor:
            # Read the FormFieldManager instance
            field_manager = editor.form_field_manager

            # Read the form field collection
            collection = field_manager.form_field_collection
            print("Has invalid form fields:", field_manager.has_invalid_form_fields())
            print("Form fields count:", len(collection))

if __name__ == "__main__":
    edit_and_update_form_fields()
```
{{< /tab >}}
{{< tab "form-fields.docx" >}}
{{< tab-text >}}
`form-fields.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/form-field-management/edit-and-update-form-fields/form-fields.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edit-and-update-form-fields.txt" >}}
```text
Has invalid form fields: True
Form fields count: 31
```
[Download full output](/editor/python-net/_output_files/developer-guide/form-field-management/edit-and-update-form-fields/edit_and_update_form_fields/edit-and-update-form-fields.txt)
{{< /tab >}}
{{< /tabs >}}

---

## Fixing Invalid Form Fields

Path: /editor/python-net/fixing-invalid-form-fields/

This article demonstrates how to fix invalid form fields in a Word document using GroupDocs.Editor for Python via .NET. It guides you through loading a document, identifying invalid form fields, and fixing them.

## Step-by-Step Guide

1. **Load the document into the Editor instance**

   Open the document as a binary stream and pass it to the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class together with [`WordProcessingLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingloadoptions). If the document is password-protected, specify the password through the load options.

   ```python
   from groupdocs.editor import Editor
   from groupdocs.editor.options import WordProcessingLoadOptions

   with open("form-fields.docx", "rb") as stream:
       with Editor(stream, WordProcessingLoadOptions()) as editor:
           # Further code will be placed here
           pass
   ```

2. **Retrieve the FormFieldManager**

   Obtain the [`FormFieldManager`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/formfieldmanager/) instance from the [`form_field_manager`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/) property of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class.

   ```python
   with open("form-fields.docx", "rb") as stream:
       with Editor(stream, WordProcessingLoadOptions()) as editor:
           field_manager = editor.form_field_manager
   ```

3. **Detect invalid form fields**

   Check whether the document contains invalid form fields with `has_invalid_form_fields()`, and obtain their names with `get_invalid_form_field_names()`.

   ```python
   has_invalid = field_manager.has_invalid_form_fields()
   print("FormFieldCollection contains invalid items:", has_invalid)

   invalid_names = list(field_manager.get_invalid_form_field_names())
   print("Invalid form field names:", invalid_names)
   ```

4. **Fix the invalid form fields**

   The invalid form fields can be repaired with `fix_invalid_form_field_names(...)`. The snippet below is illustrative: generate a unique replacement name for each invalid field and pass them to the method. Adjust the per-item access details to the exact members exposed by your build.

   ```python
   # Illustrative: assign unique fixed names and repair the invalid fields
   import uuid

   invalid_form_fields = field_manager.get_invalid_form_field_names()
   for invalid_item in invalid_form_fields:
       invalid_item.fixed_name = "{0}_{1}".format(invalid_item.name, uuid.uuid4())
   field_manager.fix_invalid_form_field_names(invalid_form_fields)
   ```

## Complete code example

Below is the complete runnable example. It opens the document, retrieves the form field manager, and reports whether it has invalid form fields together with their names. The repair steps shown above are illustrative; the runnable example performs only safe, documented calls.
{{< tabs "code-example-fixing-invalid-form-fields">}}
{{< tab "fixing_invalid_form_fields.py" >}}
```python
import os
from groupdocs.editor import Editor, License
from groupdocs.editor.options import WordProcessingLoadOptions

def fixing_invalid_form_fields():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Open the document as a stream and load it with WordProcessingLoadOptions
    with open("./form-fields.docx", "rb") as stream:
        with Editor(stream, WordProcessingLoadOptions()) as editor:
            # Read the FormFieldManager instance
            field_manager = editor.form_field_manager

            # Detect invalid form fields
            print("Has invalid form fields:", field_manager.has_invalid_form_fields())
            invalid_form_fields = list(field_manager.get_invalid_form_field_names())
            print("Invalid form fields detected:", len(invalid_form_fields))

if __name__ == "__main__":
    fixing_invalid_form_fields()
```
{{< /tab >}}
{{< tab "form-fields.docx" >}}
{{< tab-text >}}
`form-fields.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/form-field-management/fixing-invalid-form-fields/form-fields.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "fixing-invalid-form-fields.txt" >}}
```text
Has invalid form fields: True
Invalid form fields detected: 21
```
[Download full output](/editor/python-net/_output_files/developer-guide/form-field-management/fixing-invalid-form-fields/fixing_invalid_form_fields/fixing-invalid-form-fields.txt)
{{< /tab >}}
{{< /tabs >}}

---

## Float and paginal modes

Path: /editor/python-net/float-and-paginal-modes/

## How to enable different document edit modes?

The WordProcessing module of [**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net), which is responsible for converting all WordProcessing formats to [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instances and back (from [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) to a WordProcessing format), has two modes: *float* and *paginal* (also known as *paged*). Float is the default. The mode is controlled by two properties that share the same name and type:

```python
enable_pagination  # bool
```

First, this property is present in the [`WordProcessingEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) class, where it selects the mode for the forward (WordProcessing to [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument)) conversion. Second, it is present in the [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class, where it selects the mode for the backward ([EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) to WordProcessing) conversion.

The main rule of thumb for the end-user is to keep the same mode throughout the full document roundtrip. In other words, if the input document was converted to the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) (followed by emitting HTML from the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance) in the *float* mode, then the resulting [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance (obtained from the document edited on the client side) should be converted back to the WordProcessing format in the *float* mode as well. The same rule applies to the *paginal* mode.

The main distinction between the two modes lies in how the input WordProcessing document is represented. Internally, WordProcessing formats have no pages or page division; their content is a continuous, pageless flow. However, they do contain page-related information such as headers, footers, page watermarks, page-numbering formatting and so on. When a WordProcessing document is opened in a desktop or browser-based word processor such as MS Word, OpenOffice or Google Docs, the word processor splits the document into pages on the fly and recalculates them every time the user edits the content. For example, when the user removes a line from the beginning of a huge document, the empty space collapses and all subsequent content moves upward to fill the gap where the removed line was located. The same real-time calculation applies when content is added: newly inserted content pushes the existing content downward, creating new pages if necessary.

The problem is that most widely used client-side, browser-based JavaScript WYSIWYG HTML editors, such as TinyMCE or CKEditor, do not support pages, page separation, or the calculations described above at all. They support only documents with a pageless structure.

That's why GroupDocs.Editor supports two modes:

* In the *float* mode the WordProcessing document is represented in pageless form. Content is not divided into pages, and there are no page-related entities like headers, footers, watermarks and page numbers. This mode suits almost all widespread WYSIWYG HTML editors, which is why it is the default.
* In the *paginal* mode the WordProcessing document is represented as a set of pages, with content divided into pages in the same way as MS Word does. All page-related entities like headers, footers, watermarks and page numbers are present. This mode must be turned on manually (as described above) and fits scenarios where the end-user can process such content appropriately, for example in custom-built HTML editing software.

## Complete example

The example below loads the sample document and edits it in the *paginal* mode by setting `enable_pagination` to `True`. To switch to the default *float* mode, set this property to `False` (or simply leave the default).
{{< tabs "code-example-float-and-paginal-modes">}}
{{< tab "edit_in_paginal_mode.py" >}}
```python
import os
from groupdocs.editor import Editor, License
from groupdocs.editor.options import WordProcessingEditOptions

def edit_in_paginal_mode():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Enable the paginal (paged) edit mode; set False for the default float mode
    edit_options = WordProcessingEditOptions()
    edit_options.enable_pagination = True

    with Editor("./sample-document.docx") as editor:
        editable = editor.edit(edit_options)
        html = editable.get_content()
        print("Edited in paginal mode, HTML markup length:", len(html))
        editable.dispose()

if __name__ == "__main__":
    edit_in_paginal_mode()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/float-and-paginal-modes/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edit-in-paginal-mode.txt" >}}
```text
Edited in paginal mode, HTML markup length: 32503
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/float-and-paginal-modes/edit_in_paginal_mode/edit-in-paginal-mode.txt)
{{< /tab >}}
{{< /tabs >}}

## Paginal edit mode in PDF

Along with the family of WordProcessing formats, GroupDocs.Editor supports PDF as one of the output formats. In other words, an input WordProcessing document may be opened and edited, and then saved not only to a WordProcessing format but also to PDF. The [PdfSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfsaveoptions) class contains the same `enable_pagination` boolean flag, which, like its WordProcessing counterpart, defaults to `False`, meaning *float* mode, while a `True` value means *paginal* mode.
If the input WordProcessing document was converted to [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) in *paginal* mode, the output PDF should be generated in *paginal* mode too, with the `enable_pagination` flag enabled.

---

## Generating slides preview for presentation

Path: /editor/python-net/generating-slides-preview-for-presentation/

GroupDocs.Editor for Python via .NET can generate a preview of any slide in a presentation document in SVG format. This lets the end-user view and inspect the content of the presentation without actually opening it for editing. The generated slide preview cannot be edited, but it can be viewed in any desktop or online image viewer as well as in a browser (all modern browsers support SVG).

Slide previews can be generated regardless of the licensing mode of GroupDocs.Editor: the feature works the same in trial and licensed modes, and there are no trial limitations for it. Generating slide previews does not charge any consumed bytes or credits.

To generate a slide preview, use the `generate_preview(slide_index)` method of the [`PresentationDocumentInfo`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/presentationdocumentinfo/) object. The slide is specified by its zero-based _index_ (not to be confused with slide _numbers_, which are 1-based). If the specified index is less than 0 or falls outside the range of existing slides in the presentation, an exception is thrown. Slide previews can be generated for both password-protected (encrypted) and unprotected presentations; for a protected presentation the end-user must specify a valid password in the [`get_document_info`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo/) method.
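
Because slide numbers shown to users are 1-based while `generate_preview` expects a zero-based index, it can help to convert and validate the value before calling the library. The sketch below is one way to do that in plain Python; `slide_number_to_index` is a hypothetical helper introduced here for illustration and is not part of GroupDocs.Editor.

```python
def slide_number_to_index(slide_number: int, slides_count: int) -> int:
    """Convert a 1-based slide number to a 0-based slide index.

    Hypothetical helper (not a GroupDocs.Editor API): it rejects values
    that would point outside the presentation before the library is called.
    """
    if slides_count < 0:
        raise ValueError("slides_count must not be negative")

    index = slide_number - 1
    if not 0 <= index < slides_count:
        raise IndexError(
            f"slide number {slide_number} is outside 1..{slides_count}"
        )
    return index


if __name__ == "__main__":
    # For a 3-slide presentation, slide numbers 1..3 map to indices 0..2
    print([slide_number_to_index(n, 3) for n in (1, 2, 3)])
```

The returned index can then be passed to `generate_preview`, with the slide count taken from the document info object.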
The `generate_preview(slide_index)` method returns the slide preview as an SVG vector image, encapsulated in the `SvgImage` class. This class has the methods and properties needed to obtain the content of the SVG image in any desired form, save it to disk, and so on.

The snippet below opens an unprotected presentation file, generates a preview for every slide, and saves the previews to disk.

```python
import os
from groupdocs.editor import Editor

# Path to the presentation file and the folder for the generated previews
input_path = "./sample-presentation.pptx"
output_folder = "./previews"

# Make sure the output folder exists before saving into it
os.makedirs(output_folder, exist_ok=True)

# Load the file to the Editor constructor
with Editor(input_path) as editor:
    # Get document info for this file
    info_slides = editor.get_document_info()

    # Get the number of all slides
    slides_count = info_slides.page_count

    # Iterate through all slides and generate the preview on every iteration
    for i in range(slides_count):
        # Generate one preview as an SVG image by slide index
        one_svg_preview = info_slides.generate_preview(i)

        # Save it to a file
        one_svg_preview.save(os.path.join(output_folder, one_svg_preview.filename_with_extension))
```

In essence, the slide preview feature is a method of the existing `PresentationDocumentInfo` object that takes a slide index and returns an instance of the `SvgImage` class. If the end-user needs a preview in a raster format rather than a vector one, the `SvgImage` class also provides a method to convert the SVG content to PNG.

The complete example below loads a presentation, opens its first slide for editing, and prints the length of the generated HTML content. This roundtrip confirms that the presentation is read correctly before any preview is generated.
{{< tabs "code-example-generating-slides-preview-for-presentation">}} {{< tab "generating_slides_preview_for_presentation.py" >}} ```python import os from groupdocs.editor import Editor, License from groupdocs.editor.options import PresentationEditOptions def generating_slides_preview_for_presentation(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Load the presentation file to the Editor constructor with Editor("./sample-presentation.pptx") as editor: # Prepare edit options and select the 1st slide edit_options = PresentationEditOptions() edit_options.slide_number = 0 # index is 0-based, so this is the 1st slide # Open the slide for editing slide = editor.edit(edit_options) # Obtain the HTML content of the slide content = slide.get_content() print("Slide HTML content length:", len(content)) slide.dispose() if __name__ == "__main__": generating_slides_preview_for_presentation() ``` {{< /tab >}} {{< tab "sample-presentation.pptx" >}} {{< tab-text >}} `sample-presentation.pptx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-powerpoint/generating-slides-preview-for-presentation/sample-presentation.pptx) to download it. {{< /tab-text >}} {{< /tab >}} {{< tab "generating-slides-preview-presentation.txt" >}} ```text Slide HTML content length: 8422 ``` [Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-powerpoint/generating-slides-preview-for-presentation/generating_slides_preview_for_presentation/generating-slides-preview-presentation.txt) {{< /tab >}} {{< /tabs >}} --- ## Installation Path: /editor/python-net/installation/ GroupDocs.Editor for Python via .NET is distributed as a pre-built wheel on [PyPI](https://pypi.org/project/groupdocs-editor-net/). 
The PyPI index hosts a separate wheel for each supported platform, and `pip` picks the correct one automatically. Each wheel is self-contained (~120 MB): it bundles the embedded .NET runtime and every native dependency, so no MS Office, OpenOffice, or separate .NET install is required. Before installing, confirm your environment matches the supported platforms and Python versions listed in the [System Requirements]({{< ref "editor/python-net/getting-started/system-requirements.md" >}}) topic. ## Install Package from PyPI Open a terminal and run the install command for your platform: {{< tabs "install-pypi">}} {{< tab "Windows" >}} ```ps py -m pip install groupdocs-editor-net ``` {{< /tab >}} {{< tab "Linux" >}} ```bash python3 -m pip install groupdocs-editor-net ``` {{< /tab >}} {{< tab "macOS" >}} ```bash python3 -m pip install groupdocs-editor-net ``` {{< /tab >}} {{< /tabs >}} After running the command you should see output similar to: ```bash Collecting groupdocs-editor-net Downloading groupdocs_editor_net-26.5-py3-none-win_amd64.whl.metadata (6.0 kB) Downloading groupdocs_editor_net-26.5-py3-none-win_amd64.whl (104.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 104.0/104.0 MB 2.8 MB/s eta 0:00:00 Installing collected packages: groupdocs-editor-net Successfully installed groupdocs-editor-net-26.5 ``` The wheel file name will include a platform suffix that matches your operating system — for example `manylinux1_x86_64` on Ubuntu/Debian, `macosx_11_0_arm64` on Apple Silicon, or `win_amd64` on 64-bit Windows. 
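
When wheels are downloaded manually (for example, for an offline build), it may be useful to check which platform suffix applies to the current machine. The sketch below uses only the standard `platform` module; the `WHEEL_SUFFIXES` table and the `expected_wheel_suffix` function are hypothetical names introduced for illustration, and the suffix values are copied from the wheel file names listed in this guide. It does not reproduce how `pip` selects wheels and is only a rough guide.

```python
import platform

# Platform suffixes as they appear in the wheel file names listed in this guide.
# Keys are (platform.system(), lower-cased platform.machine()).
WHEEL_SUFFIXES = {
    ("Windows", "amd64"): "win_amd64",
    ("Windows", "x86_64"): "win_amd64",
    ("Linux", "x86_64"): "manylinux1_x86_64",
    ("Darwin", "arm64"): "macosx_11_0_arm64",
    ("Darwin", "x86_64"): "macosx_10_14_x86_64",
}


def expected_wheel_suffix(system=None, machine=None):
    """Return the wheel suffix for a platform, or None if it is not listed.

    Hypothetical helper: when arguments are omitted, the current
    interpreter's platform is inspected.
    """
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()
    return WHEEL_SUFFIXES.get((system, machine))


if __name__ == "__main__":
    suffix = expected_wheel_suffix()
    if suffix is None:
        print("No matching wheel suffix is listed for this platform.")
    else:
        print("Expected wheel suffix:", suffix)
```

A `None` result only means the platform is not in the table above; the System Requirements topic remains the authoritative list of supported platforms.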
## Add the Package to `requirements.txt` For reproducible environments, pin the package version in your `requirements.txt`: ```txt groupdocs-editor-net==26.5 ``` Then install all dependencies in one step: ```bash pip install -r requirements.txt ``` ## Install from a Pre-Downloaded Wheel If your build environment cannot reach PyPI, download the appropriate wheel from the [GroupDocs Releases website](https://releases.groupdocs.com/editor/python-net/) and install it locally. The following wheels are published for each release: - **Windows 64-bit**: file name ends with `win_amd64.whl` - **Linux x64 (glibc)**: file name ends with `manylinux1_x86_64.whl` - **macOS Apple Silicon**: file name ends with `macosx_11_0_arm64.whl` - **macOS Intel**: file name ends with `macosx_10_14_x86_64.whl` Place the downloaded wheel into your project folder, then install it: {{< tabs "install-wheel">}} {{< tab "Windows (64-bit)" >}} ```ps py -m pip install groupdocs_editor_net-26.5-py3-none-win_amd64.whl ``` {{< /tab >}} {{< tab "Linux (glibc)" >}} ```bash python3 -m pip install groupdocs_editor_net-26.5-py3-none-manylinux1_x86_64.whl ``` {{< /tab >}} {{< tab "macOS (Apple Silicon)" >}} ```bash python3 -m pip install groupdocs_editor_net-26.5-py3-none-macosx_11_0_arm64.whl ``` {{< /tab >}} {{< tab "macOS (Intel)" >}} ```bash python3 -m pip install groupdocs_editor_net-26.5-py3-none-macosx_10_14_x86_64.whl ``` {{< /tab >}} {{< /tabs >}} Expected output: ```bash Processing groupdocs_editor_net-26.5-py3-none-*.whl Installing collected packages: groupdocs-editor-net Successfully installed groupdocs-editor-net-26.5 ``` ## Platform Prerequisites On Windows no extra steps are required. 
On Linux and macOS, install the native libraries the rendering engine depends on: {{< tabs "platform-prereqs">}} {{< tab "Linux" >}} ```bash apt install libgdiplus libfontconfig1 ttf-mscorefonts-installer ``` {{< /tab >}} {{< tab "macOS" >}} ```bash brew install mono-libgdiplus ``` {{< /tab >}} {{< /tabs >}} ## Next Steps - Follow the [Quick Start Guide]({{< ref "editor/python-net/getting-started/quick-start-guide" >}}) to run your first edit. - Clone the [examples repository](https://github.com/groupdocs-editor/GroupDocs.Editor-for-Python-via-.NET) and read [Running Examples]({{< ref "editor/python-net/getting-started/how-to-run-examples.md" >}}) to try every documented scenario locally. - If you work with AI agents or LLMs, see [Agents and LLMs]({{< ref "editor/python-net/agents-and-llm-integration" >}}) for MCP and `AGENTS.md` integration details. --- ## Load document Path: /editor/python-net/load-document/ This guide explains how to load a document from a local disk or file stream for editing using the GroupDocs.Editor for Python via .NET API. ## Introduction In this article, you will learn how to load an input document into [**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) and apply load options. ## Loading Documents To load an input document, which should be accessible either as a binary stream or through a valid file path, create an instance of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class using one of its constructor overloads. Below are examples of loading documents from a file path and from a stream. 
```python from groupdocs.editor import Editor # Load document from a file path editor = Editor("document.docx") # Load document from a binary stream with open("document.docx", "rb") as stream: editor = Editor(stream) ``` When using the constructor overloads shown above, GroupDocs.Editor automatically detects the format of the input document and applies the most suitable default loading options. However, it is recommended to specify the correct loading options explicitly by using constructor overloads that accept two parameters. Here is how you can do this: ```python from groupdocs.editor import Editor from groupdocs.editor.options import WordProcessingLoadOptions, SpreadsheetLoadOptions # Load document from a file path with load options word_load_options = WordProcessingLoadOptions() editor = Editor("document.docx", word_load_options) # Load document from a stream with load options spreadsheet_load_options = SpreadsheetLoadOptions() with open("spreadsheet.xlsx", "rb") as stream: editor = Editor(stream, spreadsheet_load_options) ``` The following complete example loads a document from a local file path with load options and prints its basic metadata: {{< tabs "code-example-load-document">}} {{< tab "load_document.py" >}} ```python import os from groupdocs.editor import Editor, License from groupdocs.editor.options import WordProcessingLoadOptions def load_document(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Prepare load options for the WordProcessing family load_options = WordProcessingLoadOptions() # Load the document from a local file path with load options with Editor("./sample-document.docx", load_options) as editor: info = editor.get_document_info() print("Loaded:", info.format.name, "-", info.page_count, "page(s)") if __name__ == "__main__": load_document() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` 
is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/load-document/sample-document.docx) to download it. {{< /tab-text >}} {{< /tab >}} {{< tab "load-document.txt" >}} ```text Loaded: Office Open XML WordProcessingML Macro-Free Document (DOCX) - 3 page(s) ``` [Download full output](/editor/python-net/_output_files/developer-guide/load-document/load_document/load-document.txt) {{< /tab >}} {{< /tabs >}} ## Load Options Please note that not all document formats have associated classes for load options. Only the WordProcessing, Spreadsheet, Presentation families, and a distinct PDF format have specific load options classes. Other formats, such as DSV, TXT, or XML, do not have load options. | Format Family | Example Formats | Load Options Class | |---------------------|----------------------------|--------------------------------------------------------------------------------------------------------------------------------| | WordProcessing | DOC, DOCX, DOCM, DOT, ODT | [`WordProcessingLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingloadoptions) | | Spreadsheet | XLS, XLSX, XLSM, XLSB | [`SpreadsheetLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetloadoptions) | | Presentation | PPT, PPTX, PPS, POT | [`PresentationLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationloadoptions) | | Fixed-layout format | PDF | [`PdfLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfloadoptions) | ## Handling Password-Protected Documents Using load options is essential when working with password-protected documents. Any document can be loaded into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) instance, even if it is password-protected. 
However, if the password is not handled correctly, an exception will be thrown during the editing process. Here is how GroupDocs.Editor handles passwords: 1. If the document is not password-protected, any specified password will be ignored. 2. If the document is password-protected but no password is specified, a [`PasswordRequiredException`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/passwordrequiredexception) will be thrown during editing. 3. If the document is password-protected and an incorrect password is provided, an [`IncorrectPasswordException`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/incorrectpasswordexception) will be thrown during editing. The example below demonstrates how to specify a password for opening a password-protected WordProcessing document: ```python from groupdocs.editor import Editor from groupdocs.editor.options import WordProcessingLoadOptions word_load_options = WordProcessingLoadOptions() word_load_options.password = "correct_password" editor = Editor("protected-document.docx", word_load_options) ``` {{< alert style="info" >}}The same approach applies to Spreadsheet, Presentation, and PDF documents as well.{{< /alert >}} --- ## Working with Form Fields Path: /editor/python-net/working-with-form-fields/ This article demonstrates how to load and read form fields in a Word document using GroupDocs.Editor for Python via .NET. We will go through the process of opening a document, retrieving the form field manager, and inspecting the form fields it contains. ## Step-by-Step Guide 1. **Load the document into the Editor instance** Open the document as a binary stream and pass it to the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class together with [`WordProcessingLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingloadoptions). 
If the document is password-protected, specify the password through the load options. ```python from groupdocs.editor import Editor from groupdocs.editor.options import WordProcessingLoadOptions with open("form-fields.docx", "rb") as stream: with Editor(stream, WordProcessingLoadOptions()) as editor: # Further code will be placed here pass ``` 2. **Retrieve the FormFieldManager** Obtain the [`FormFieldManager`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/formfieldmanager/) instance from the [`form_field_manager`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/) property of the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class. Use it to read the `form_field_collection` and to check whether the document contains invalid form fields. ```python with open("form-fields.docx", "rb") as stream: with Editor(stream, WordProcessingLoadOptions()) as editor: field_manager = editor.form_field_manager collection = field_manager.form_field_collection print("Has invalid form fields:", field_manager.has_invalid_form_fields()) print("Form fields count:", len(collection)) ``` 3. **Process the form fields** The `form_field_collection` can be iterated to inspect each form field. Every field exposes common properties such as `name` and `type`. The snippet below illustrates how to enumerate the collection and react to a field's type. Treat the per-type access details as illustrative — adjust them to the exact members exposed by your build. ```python # Illustrative: iterate the collection and inspect each form field for form_field in collection: print("name:", form_field.name, "type:", form_field.type) ``` ## Complete code example Below is the complete runnable example. It opens the document, retrieves the form field manager, and reports whether the document has invalid form fields together with the number of form fields it contains. 
{{< tabs "code-example-working-with-form-fields">}} {{< tab "working_with_form_fields.py" >}} ```python import os from groupdocs.editor import Editor, License from groupdocs.editor.options import WordProcessingLoadOptions def working_with_form_fields(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Open the document as a stream and load it with WordProcessingLoadOptions with open("./form-fields.docx", "rb") as stream: with Editor(stream, WordProcessingLoadOptions()) as editor: # Read the FormFieldManager instance field_manager = editor.form_field_manager # Read the form field collection collection = field_manager.form_field_collection print("Has invalid form fields:", field_manager.has_invalid_form_fields()) print("Form fields count:", len(collection)) # Iterate the collection and inspect each form field for form_field in collection: print("name:", form_field.name, "type:", form_field.type) if __name__ == "__main__": working_with_form_fields() ``` {{< /tab >}} {{< tab "form-fields.docx" >}} {{< tab-text >}} `form-fields.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/form-field-management/working-with-form-fields/form-fields.docx) to download it. {{< /tab-text >}} {{< /tab >}} {{< tab "working-with-form-fields.txt" >}} ```text Has invalid form fields: True Form fields count: 31 name: Text1 type: 0 name: Check1 type: 5 name: Check2 type: 5 name: Check3 type: 5 name: Check4 type: 5 name: Check5 type: 5 name: Check6 type: 5 name: Text7 type: 0 [TRUNCATED] ``` [Download full output](/editor/python-net/_output_files/developer-guide/form-field-management/working-with-form-fields/working_with_form_fields/working-with-form-fields.txt) {{< /tab >}} {{< /tabs >}} ## Conclusion This guide demonstrates how to work with form fields in Word documents using GroupDocs.Editor for Python via .NET. 
By following these steps, you can load a document, retrieve the form field manager, and inspect the form fields it contains. This functionality is essential for applications that need to extract and manipulate form field data from documents.

---

## Font extraction options

Path: /editor/python-net/font-extraction-options/

## Introduction

WordProcessing documents usually contain text content, and every piece of text is rendered with some font. A document may use system fonts (installed in the operating system) as well as custom fonts that are not installed in the system. Many WordProcessing formats, such as DOCX and ODT, can also store fonts inside the document itself; such fonts, stored as binary resources inside WordProcessing files, are called *embedded*. Embedding is the only way for the end-user to use a specific font that is not installed in the system. However, any font can be embedded into a document, so the same font may be installed in the system and embedded in the WordProcessing document at the same time.

In MS Windows versions before Windows 10 (and MS Windows Server 2016), installed fonts are stored in a single location: the system folder "`%windir%\fonts`", whose fonts are available to all users. Starting from MS Windows 10, every user also has their own local font storage, "`%userprofile%\AppData\Local\Microsoft\Windows\Fonts`", alongside the common folder, which still exists.

The statement "the WordProcessing document uses some font" can be understood in two ways, because a document can use fonts differently. From the "broad" point of view, every font that is referenced in the WordProcessing document in any way, whether it is applied to text content or is part of some style, is used by the document.
In contrast, from the "narrow" point of view, only a font that is applied to some text content is used by the document. For example, suppose a user creates a WordProcessing document, writes some text, creates a style named "*Style1*" with a specific font "*Font1*", and applies "*Style1*" to the text. In this situation "*Font1*" is used by the document from both points of view. Later the user edits the document and removes all the text that uses "*Style1*" but, importantly, does not remove the style itself: "*Style1*" still exists and holds a reference to "*Font1*". Now, from the narrow point of view, the document does not use "*Font1*", because no piece of text uses this font, even though "*Font1*" is still part of "*Style1*", which is part of the document.

## How to extract fonts

GroupDocs.Editor can extract fonts from a WordProcessing document and represent them as resources in the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance; when this instance is used to generate HTML markup, all font resources are present and correctly linked to the HTML content. GroupDocs.Editor can extract embedded fonts from the WordProcessing document as well as installed fonts from the system. Two public properties control font handling, both located in the [WordProcessingEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) class:

```python
font_extraction         # FontExtractionOptions enum
extract_only_used_font  # bool
```

`FontExtractionOptions` is a public enum, located in the `groupdocs.editor.options` module.
By default, when an instance of the [WordProcessingEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) class is created, this enum has a default value `NotExtract`, which means do not extract any fonts: neither from the document nor from the system. Other values are the following: * `ExtractAllEmbedded`. In this case GroupDocs.Editor extracts all the fonts embedded in the input WordProcessing document regardless of the fact whether some of them are installed in the system or not. In other words, the converter finds and extracts all 100% of font resources which are embedded into the input WordProcessing document, but it doesn't determine whether they are system or custom. Because of this fact the converter doesn't touch the Windows Registry or system folders at all. * `ExtractEmbeddedWithoutSystem`. Unlike the previous option, in this case GroupDocs.Editor not only simply extracts all embedded fonts, but also checks every font whether it is system or not. If some embedded font is system-installed, it will be ignored and will not be present in the font resources inside the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument). In order to detect which font is system or not, the converter tries to obtain a list of all system fonts by using the Windows Registry and system folders, and then compares this list with a set of embedded fonts. As a result, only a subset of those embedded fonts which were not found in the system will be returned. * `ExtractAll`. When this option is selected, GroupDocs.Editor tries to extract absolutely all fonts which are used in the input document, regardless of their nature: embedded or system. In order to do this, GroupDocs.Editor analyzes an input WordProcessing document and finds all fonts which are used there (from the "broad" point of view). 
If some of these used fonts are embedded in the document, GroupDocs.Editor extracts and uses them. If the document contains no embedded fonts at all, or a collection of embedded fonts is present but doesn't cover all used fonts in the document, GroupDocs.Editor tries to extract these font resources from the system by using the Windows Registry and system folders. This option is perfectly useful in case clients want to make sure that the input document will have a perfect representation for every end-user regardless of what fonts are installed on the client machine or not. The second property, a boolean flag named `extract_only_used_font`, by default has a `False` value, which means that all fonts used in the WordProcessing document from the "broad" point of view will be processed. But when it is enabled by setting a `True` value, GroupDocs.Editor processes only those fonts which are used in the WordProcessing document from the "narrow" point of view, i.e. only those fonts which are applied to some textual content in the document. These two public options work in conjunction. In particular: * If the `font_extraction` enum has a `NotExtract` value, then the value of the `extract_only_used_font` property will be ignored. * If the `font_extraction` enum has an `ExtractAllEmbedded` value, and the `extract_only_used_font` flag is set to `True`, GroupDocs.Editor will not extract 100% of all embedded fonts, but only a subset of those embedded which are used in the WordProcessing document from the "narrow" point of view. * If the `font_extraction` enum has an `ExtractEmbeddedWithoutSystem` value, and the `extract_only_used_font` flag is set to `True`, GroupDocs.Editor returns a subset of the fonts returned by this option when `extract_only_used_font` is `False` — only those fonts will be returned which are simultaneously embedded in the WordProcessing document, not installed in the system, and used in the document from the "narrow" point of view. 
* If the `font_extraction` enum has an `ExtractAll` value, and the `extract_only_used_font` flag is set to `True`, GroupDocs.Editor first of all forms a list of fonts which are used in the document from the "narrow" point of view, and then only these fonts are returned from the embedded or (if not found in embedded) from system-installed font storages. The `font_extraction` enum value is assigned as illustrated below (the `FontExtractionOptions` enum comes from the `groupdocs.editor.options` module): ```python from groupdocs.editor.options import WordProcessingEditOptions, FontExtractionOptions edit_options = WordProcessingEditOptions() edit_options.font_extraction = FontExtractionOptions.EXTRACT_ALL ``` ## Complete example The runnable example below loads the sample document and edits it with `font_extraction` set to `ExtractAll` and the `extract_only_used_font` flag enabled, then prints the number of extracted font resources. {{< tabs "code-example-font-extraction-options">}} {{< tab "extract_used_fonts.py" >}} ```python import os from groupdocs.editor import Editor, License from groupdocs.editor.options import WordProcessingEditOptions, FontExtractionOptions def extract_used_fonts(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Extract the fonts used in the document; extract_only_used_font keeps only # those applied to textual content (the "narrow" point of view) edit_options = WordProcessingEditOptions() edit_options.font_extraction = FontExtractionOptions.EXTRACT_ALL edit_options.extract_only_used_font = True with Editor("./sample-document.docx") as editor: editable = editor.edit(edit_options) fonts_count = len(list(editable.fonts)) print("Extracted font resources:", fonts_count) editable.dispose() if __name__ == "__main__": extract_used_fonts() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample 
file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/font-extraction-options/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "extract-used-fonts.txt" >}}
```text
Extracted font resources: 6
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/font-extraction-options/extract_used_fonts/extract-used-fonts.txt)
{{< /tab >}}
{{< /tabs >}}

---

## Licensing and Subscription

Path: /editor/python-net/licensing-and-subscription/

Sometimes the fastest way to evaluate a product is to dive straight into the code. To make this easier, GroupDocs.Editor offers a Free Trial and a 30-day Temporary License for evaluation, in addition to several purchase plans.

{{< alert style="info" >}}
Note that there are a number of general policies and practices that guide you on how to evaluate, properly license, and purchase our products. You can find them in the ["Purchase Policies and FAQ"](https://purchase.groupdocs.com/policies) section.
{{< /alert >}}

## Free Trial or Temporary License

You can try GroupDocs.Editor without buying a license.

### Free Trial

The evaluation version is the same as the purchased one: it simply becomes licensed when you set the license. You can set the license in a number of ways, which are described in the next sections of this article.

The evaluation version comes with limitations:

- Only the first 2 pages are processed.
- Trial badges are placed in the document at the top of each page.

### Temporary License

If you wish to test GroupDocs.Editor without the limitations of the trial version, you can request a 30-day Temporary License. For more details, see the ["Get a Temporary License"](https://purchase.groupdocs.com/temporary-license) page.
## How to set a license The license file contains details such as the product name, the number of developers it is licensed to, the subscription expiry date, and so on. It contains a digital signature, so don't modify the file. Even inadvertent addition of an extra line break into the file will invalidate it. You need to set a license before utilizing the GroupDocs.Editor API if you want to avoid its evaluation limitations. The license can be loaded from a file or a stream object. The easiest way to set a license is to put the license file in your working directory and specify the file name, as shown in the examples below. You can also let GroupDocs.Editor apply a license automatically by pointing the `GROUPDOCS_LIC_PATH` environment variable at your license file. ### Setting License from File The code below explains how to set the product license from a file. {{< tabs "code-example-set-license-from-file">}} {{< tab "set_license_from_file.py" >}} ```python import os from groupdocs.editor import License def set_license_from_file(): # Path to the license file license_path = os.path.abspath("./GroupDocs.Editor.lic") # Set the license only if the file exists if os.path.exists(license_path): License().set_license(license_path) print("License set successfully.") else: print("License file not found. Running in evaluation mode.") if __name__ == "__main__": set_license_from_file() ``` {{< /tab >}} {{< /tabs >}} ### Setting License from Stream The following example shows how to load a license from a stream (a binary file object). 
{{< tabs "code-example-set-license-from-stream">}} {{< tab "set_license_from_stream.py" >}} ```python import os from groupdocs.editor import License def set_license_from_stream(): # Path to the license file license_path = os.path.abspath("./GroupDocs.Editor.lic") # Set the license from a stream only if the file exists if os.path.exists(license_path): with open(license_path, "rb") as license_stream: License().set_license(license_stream) print("License set successfully.") else: print("License file not found. Running in evaluation mode.") if __name__ == "__main__": set_license_from_stream() ``` {{< /tab >}} {{< /tabs >}} {{< alert style="info" >}}Calling [License](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/license).[set_license](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/license/) multiple times is not harmful but simply wastes processor time. Call `License().set_license(...)` once in your application's startup code, before using any GroupDocs.Editor classes. {{< /alert >}} ### Setting Metered License {{< alert style="info" >}}You can also set a [Metered](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/metered) license as an alternative to a license file. It is a licensing mechanism that is used alongside the existing licensing method. It is useful for customers who want to be billed based on the usage of the API features. For more details, please refer to the [Metered Licensing FAQ](https://purchase.groupdocs.com/faqs/licensing/metered) section.{{< /alert >}} Here are the simple steps to use the `Metered` class. 1. Create an instance of the [Metered](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/metered) class. 2. Pass the public and private keys to the `set_metered_key` method. 3. Do processing (perform the task). 4. Call the `get_consumption_quantity` method of the `Metered` class. 5. It will return the amount/quantity of API requests that you have consumed so far. 6. 
Call the `get_consumption_credit` method of the `Metered` class. 7. It will return the credit that you have consumed so far. The following sample code demonstrates how to use the [Metered](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/metered) class. {{< tabs "code-example-set-metered-license">}} {{< tab "set_metered_license.py" >}} ```python from groupdocs.editor import Metered def set_metered_license(): public_key = "" # Your public license key private_key = "" # Your private license key # Set the metered key only if both keys are provided if public_key and private_key: metered = Metered() metered.set_metered_key(public_key, private_key) # Get the amount (MB) consumed amount_consumed = Metered.get_consumption_quantity() print("Amount (MB) consumed:", amount_consumed) # Get the count of credits consumed credits_consumed = Metered.get_consumption_credit() print("Credits consumed:", credits_consumed) else: print("Metered keys not set. Running in evaluation mode.") if __name__ == "__main__": set_metered_license() ``` {{< /tab >}} {{< /tabs >}} --- ## Quick Start Guide Path: /editor/python-net/quick-start-guide/ This guide provides a quick overview of how to set up and start using GroupDocs.Editor for Python via .NET. The library loads a document, converts it to editable HTML/CSS, lets you edit that markup, and saves it back to the original format — or to a different one. ## Prerequisites To proceed, make sure you have: 1. **Configured** environment as described in the [System Requirements]({{< ref "editor/python-net/getting-started/system-requirements.md" >}}) topic. 2. **Optionally** you may [Get a Temporary License](https://purchase.groupdocs.com/temporary-license/) to test all the product features. ## Set Up Your Development Environment For best practices, use a virtual environment to manage dependencies in Python applications. 
Learn more about virtual environment at [Create and Use Virtual Environments](https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/#create-and-use-virtual-environments) documentation topic. ### Create and Activate a Virtual Environment Create a virtual environment: {{< tabs "example1">}} {{< tab "Windows" >}} ```ps py -m venv .venv ``` {{< /tab >}} {{< tab "Linux" >}} ```bash python3 -m venv .venv ``` {{< /tab >}} {{< tab "macOS" >}} ```bash python3 -m venv .venv ``` {{< /tab >}} {{< /tabs >}} Activate a virtual environment: {{< tabs "example2">}} {{< tab "Windows" >}} ```ps .venv\Scripts\activate ``` {{< /tab >}} {{< tab "Linux" >}} ```bash source .venv/bin/activate ``` {{< /tab >}} {{< tab "macOS" >}} ```bash source .venv/bin/activate ``` {{< /tab >}} {{< /tabs >}} ### Install `groupdocs-editor-net` Package After activating the virtual environment, run the following command in your terminal to install the latest version of the package: {{< tabs "example3">}} {{< tab "Windows" >}} ```ps py -m pip install groupdocs-editor-net ``` {{< /tab >}} {{< tab "Linux" >}} ```bash python3 -m pip install groupdocs-editor-net ``` {{< /tab >}} {{< tab "macOS" >}} ```bash python3 -m pip install groupdocs-editor-net ``` {{< /tab >}} {{< /tabs >}} Ensure the package is installed successfully. You should see the message ```bash Successfully installed groupdocs-editor-net-* ``` ## Example 1: Edit a Word document To quickly test the library, let's load a DOCX file, edit its HTML, and save it back to DOCX. 
{{< tabs "demo_app_edit_word_document">}} {{< tab "edit_word_document.py" >}} ```python import os from groupdocs.editor import Editor, EditableDocument, License from groupdocs.editor.formats import WordProcessingFormats from groupdocs.editor.options import WordProcessingSaveOptions def edit_word_document(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Load the document with Editor("./sample-document.docx") as editor: # Convert the document to editable HTML editable = editor.edit() html = editable.get_embedded_html() # Edit the HTML markup (rename the document title) edited_html = html.replace("Title of the document", "Title of the edited document") # Build an editable document from the modified markup and save it back to DOCX after_edit = EditableDocument.from_markup(edited_html) editor.save(after_edit, "./edited-document.docx", WordProcessingSaveOptions(WordProcessingFormats.DOCX)) if __name__ == "__main__": edit_word_document() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/getting-started/quick-start-guide/sample-document.docx) to download it. 
{{< /tab-text >}} {{< /tab >}} {{< tab "edited-document.docx" >}} ```text Binary file (DOCX, 49 KB) ``` [Download full output](/editor/python-net/_output_files/getting-started/quick-start-guide/edit_word_document/edited-document.docx) {{< /tab >}} {{< /tabs >}} Your folder tree should look similar to the following directory structure: ```Directory 📂 demo-app ├──edit_word_document.py ├──sample-document.docx └──GroupDocs.Editor.lic (Optionally) ``` ### Run the App {{< tabs "run_the_app_edit_word_document">}} {{< tab "Windows" >}} ```ps py edit_word_document.py ``` {{< /tab >}} {{< tab "Linux" >}} ```bash python3 edit_word_document.py ``` {{< /tab >}} {{< tab "macOS" >}} ```bash python3 edit_word_document.py ``` {{< /tab >}} {{< /tabs >}} After running the app you can deactivate virtual environment by executing `deactivate` or closing your shell. ### Explanation - `Editor("./sample-document.docx")`: Loads the document into the editor. - `editor.edit()`: Converts the document to an `EditableDocument` and `get_embedded_html()` returns a self-contained HTML string. - `EditableDocument.from_markup(edited_html)`: Wraps the modified markup back into an editable document. - `editor.save(..., WordProcessingSaveOptions(WordProcessingFormats.DOCX))`: Saves the edited document back to DOCX. ## Example 2: Convert a document to another format In this example we convert a DOCX file to PDF. GroupDocs.Editor converts through its HTML intermediate — saving an `EditableDocument` with a different `*SaveOptions` produces a different output format. 
{{< tabs "demo_app_convert_word_to_pdf">}} {{< tab "convert_word_to_pdf.py" >}} ```python import os from groupdocs.editor import Editor, License from groupdocs.editor.options import PdfSaveOptions def convert_word_to_pdf(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Load the document with Editor("./sample-document.docx") as editor: # Convert the document to editable HTML editable = editor.edit() # Save the editable document as PDF editor.save(editable, "./sample-document.pdf", PdfSaveOptions()) if __name__ == "__main__": convert_word_to_pdf() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/getting-started/quick-start-guide/sample-document.docx) to download it. {{< /tab-text >}} {{< /tab >}} {{< tab "sample-document.pdf" >}} ```text Binary file (PDF, 155 KB) ``` [Download full output](/editor/python-net/_output_files/getting-started/quick-start-guide/convert_word_to_pdf/sample-document.pdf) {{< /tab >}} {{< /tabs >}} Your folder tree should look similar to the following directory structure: ```Directory 📂 demo-app ├──convert_word_to_pdf.py ├──sample-document.docx └──GroupDocs.Editor.lic (Optionally) ``` ### Run the App {{< tabs "run_the_app_convert_word_to_pdf">}} {{< tab "Windows" >}} ```ps py convert_word_to_pdf.py ``` {{< /tab >}} {{< tab "Linux" >}} ```bash python3 convert_word_to_pdf.py ``` {{< /tab >}} {{< tab "macOS" >}} ```bash python3 convert_word_to_pdf.py ``` {{< /tab >}} {{< /tabs >}} ### Explanation - `editor.edit()`: Converts the source document to an `EditableDocument`. - `PdfSaveOptions()`: Selects PDF as the output format. - `editor.save(editable, "./sample-document.pdf", PdfSaveOptions())`: Writes the document out as PDF through the HTML intermediate. 
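Because the output format follows from the save options passed to `editor.save()`, the choice can be driven by the output file name. The helper below is a minimal sketch of that idea, not part of GroupDocs.Editor: `pick_save_options` and its `factories` mapping are hypothetical names introduced here for illustration, and the helper only looks up a caller-supplied factory by file extension.

```python
import os


def pick_save_options(output_path, factories):
    """Return save options for output_path based on its file extension.

    `factories` maps a lowercase extension (for example ".pdf") to a
    zero-argument callable that builds the matching save-options object.
    Both the helper and the mapping are hypothetical, for illustration only.
    """
    extension = os.path.splitext(output_path)[1].lower()
    try:
        factory = factories[extension]
    except KeyError:
        raise ValueError(f"No save options registered for '{extension}'") from None
    return factory()
```

With the library installed, the mapping could be built from the classes used in the examples above, for instance `{".pdf": PdfSaveOptions, ".docx": lambda: WordProcessingSaveOptions(WordProcessingFormats.DOCX)}`, and the returned object passed as the third argument of `editor.save()`.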
## Example 3: Read document information Sometimes you only need a document's metadata. `get_document_info()` returns format, page count, size, and encryption status without a full edit pass. {{< tabs "demo_app_get_document_info">}} {{< tab "get_document_info.py" >}} ```python import os from groupdocs.editor import Editor, License def get_document_info(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Load the document and read its metadata with Editor("./sample-document.docx") as editor: info = editor.get_document_info() print("Format:", info.format.name) print("Extension:", info.format.extension) print("Pages:", info.page_count) print("Size, bytes:", info.size) print("Encrypted:", info.is_encrypted) if __name__ == "__main__": get_document_info() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/getting-started/quick-start-guide/sample-document.docx) to download it. 
{{< /tab-text >}} {{< /tab >}} {{< tab "get-document-info.txt" >}} ```text Format: Office Open XML WordProcessingML Macro-Free Document (DOCX) Extension: docx Pages: 3 Size, bytes: 49455 Encrypted: False ``` [Download full output](/editor/python-net/_output_files/getting-started/quick-start-guide/get_document_info/get-document-info.txt) {{< /tab >}} {{< /tabs >}} Your folder tree should look similar to the following directory structure: ```Directory 📂 demo-app ├──get_document_info.py ├──sample-document.docx └──GroupDocs.Editor.lic (Optionally) ``` ### Run the App {{< tabs "run_the_app_get_document_info">}} {{< tab "Windows" >}} ```ps py get_document_info.py ``` {{< /tab >}} {{< tab "Linux" >}} ```bash python3 get_document_info.py ``` {{< /tab >}} {{< tab "macOS" >}} ```bash python3 get_document_info.py ``` {{< /tab >}} {{< /tabs >}} ### Explanation - `editor.get_document_info()`: Returns a lightweight view of the document's metadata. - The view supports `snake_case` attribute access (`info.page_count`, `info.format.name`) as well as dict-style access for the underlying PascalCase keys (`info["PageCount"]`). ## Next Steps After completing the basics, explore additional resources to enhance your usage: - [Supported Document Formats]({{< ref "editor/python-net/getting-started/supported-document-formats.md" >}}): Review the full list of supported file types. - [Licensing and Subscription]({{< ref "editor/python-net/getting-started/licensing-and-subscription.md" >}}): Check details on licensing and evaluation. - [Developer Guide]({{< ref "editor/python-net/developer-guide" >}}): Dive into loading, editing, and saving documents of every supported family. - [Technical Support]({{< ref "editor/python-net/technical-support" >}}): Contact support for assistance if you encounter issues. 
--- ## Saving EditableDocument to stream Path: /editor/python-net/saving-editabledocument-to-stream/ > This article shows and explains advanced techniques and approaches while working with EditableDocument, including saving its HTML markup and the accompanying resources. ## Introduction The [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class is one of the most important among all types in GroupDocs.Editor. It is focused on producing HTML markup and saving it together with its resources, and it exposes several ways to do so. This article covers them. ## Saving the HTML markup to a file The simplest way to persist an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) is its [`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/save) method. When invoked with only the HTML file path, it writes the HTML markup, and the accompanying resources are placed next to it automatically. ```python from groupdocs.editor import Editor with Editor("document.docx") as editor: editable = editor.edit() # Write the HTML markup to a file editable.save("document.html") ``` ## Obtaining the HTML markup as a string When the markup is needed in memory rather than on disk — for example, to push it into a stream, a database, or an HTTP response — it can be obtained as a string with [`get_content()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getcontent), [`get_body_content()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getbodycontent), or the fully self-contained [`get_embedded_html()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getembeddedhtml). 
The resulting string can then be written into any binary stream, such as an `io.BytesIO`: ```python import io from groupdocs.editor import Editor with Editor("document.docx") as editor: editable = editor.edit() # All resources are embedded inside a single self-contained string embedded_html = editable.get_embedded_html() # Write the markup into an in-memory binary stream stream = io.BytesIO() stream.write(embedded_html.encode("utf-8")) stream.seek(0) ``` When [`get_embedded_html()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/getembeddedhtml) is used, the produced string is fully autonomous: stylesheets are embedded into the markup and images are serialized with base64 encoding directly into the `src` attributes, so no external resources are required. ## Complete code example The example below loads a document, opens it for editing, and saves its HTML markup to a file. {{< tabs "code-example-saving-editabledocument-to-stream">}} {{< tab "saving_editabledocument_to_stream.py" >}} ```python import os from groupdocs.editor import Editor, License def saving_editabledocument_to_stream(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) with Editor("./sample-document.docx") as editor: editable = editor.edit() # Save the HTML markup to a file editable.save("output.html") # The same markup is also available in memory as a string embedded_html = editable.get_embedded_html() print("Embedded HTML length:", len(embedded_html)) editable.dispose() if __name__ == "__main__": saving_editabledocument_to_stream() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/editabledocument/saving-editabledocument-to-stream/sample-document.docx) to download it. 
{{< /tab-text >}}
{{< /tab >}}
{{< tab "output.html" >}}
```text
Sample Document ver.1

Title of the document

Subtitle #1
```
{{< /tab >}}
{{< /tabs >}}

---

## Font embedding options

Path: /editor/python-net/font-embedding-options/

## Introduction

GroupDocs.Editor for Python via .NET can, in addition to font extraction, embed fonts into the output WordProcessing document. This feature is similar to the Microsoft Word option that embeds fonts into a document when it is saved after editing.

The _font extraction_ mechanism extracts font resources from the input WordProcessing document into the intermediate [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument). Font embedding is its counterpart: it transfers fonts from the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) into the output WordProcessing document by embedding them.

After obtaining the document content edited in the WYSIWYG editor and creating a new [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance from it, the user invokes the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor).[`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method, which generates an output WordProcessing document from the specified [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) and [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) instance. The [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class is where font embedding, which is disabled by default, can be enabled and tuned.
When font embedding is enabled, GroupDocs.Editor analyzes the content of the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) and builds a list of all fonts used in the document body, for example in paragraphs, labels, and so on. GroupDocs.Editor then tries to find each of these fonts among the font resources of the input [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) (the `fonts` collection). If the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) contains all font resources used in the document body, they are embedded into the output WordProcessing document. If some fonts used in the document content have no corresponding font resources in the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), GroupDocs.Editor tries to find them in the operating system on which it is running, extracts them, and embeds them into the output document.

While editing the document in the WYSIWYG editor, the end user may delete some text, for example a paragraph, that used a specific font. After the deletion the font is no longer used, but the corresponding font resource remains in the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument). During font embedding such dangling font resources are detected and ignored, so they are not passed into the output document.

When working with fonts, GroupDocs.Editor also has to deal with so-called _system fonts_. System fonts are the fonts that are most vital for the operating system; for example, they are used in Windows Explorer, the console, and the built-in system applications. When performing font embedding, the user can choose whether to embed system fonts into the output document. Including system fonts may be useful if the user is on an East Asian system and wants to create a document that is readable by others who do not have fonts for that language on their system. For example, a user on a Japanese system could choose to embed the fonts in a document so that the Japanese document would be readable on all systems.

## Usage

To support font embedding, the [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class contains a `font_embedding` property. This property is of the `FontEmbeddingOptions` enum type, which has three possible values. By default, when a new instance of the [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class is created, the `font_embedding` property is set to `NotEmbed`; in this case GroupDocs.Editor does not embed fonts at all. The `FontEmbeddingOptions` enum contains two values for embedding fonts, which differ in one respect only:

* `EmbedAll`. As described above, when this option is chosen, GroupDocs.Editor analyzes the document content for used fonts, and then looks for these fonts in the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) and, if required, in the operating system. This option resembles the "Embed fonts in the file" option with all sub-options turned off in Microsoft Word 2007 and higher.
* `EmbedWithoutSystem`. This option is the same as the previous one, except that it does not embed system fonts. This option resembles the "Embed fonts in the file" option with the "Do not embed common system fonts" sub-option enabled in Microsoft Word 2007 and higher.
The snippet below illustrates how the `font_embedding` property is assigned on the [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) instance (the `FontEmbeddingOptions` enum comes from the `groupdocs.editor.options` module): ```python from groupdocs.editor.formats import WordProcessingFormats from groupdocs.editor.options import WordProcessingSaveOptions, FontEmbeddingOptions save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX) # By default fonts are not embedded # save_options.font_embedding == FontEmbeddingOptions.NOT_EMBED # Embed all used fonts, including system fonts save_options.font_embedding = FontEmbeddingOptions.EMBED_ALL # Embed all used fonts, but except system fonts save_options.font_embedding = FontEmbeddingOptions.EMBED_WITHOUT_SYSTEM ``` ## Complete example The runnable example below loads the sample document, edits it, and saves the result to DOCX. To enable font embedding, assign the `font_embedding` property as shown in the snippet above. 
{{< tabs "code-example-font-embedding-options">}} {{< tab "embed_fonts.py" >}} ```python import os from groupdocs.editor import Editor, EditableDocument, License from groupdocs.editor.formats import WordProcessingFormats from groupdocs.editor.options import WordProcessingSaveOptions def embed_fonts(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) with Editor("./sample-document.docx") as editor: # Open the document for editing and produce the modified content original = editor.edit() modified = EditableDocument.from_markup(original.get_embedded_html()) # Prepare save options (font_embedding can be assigned here) save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX) editor.save(modified, "./fonts-output.docx", save_options) print("Saved the edited document to DOCX") original.dispose() modified.dispose() if __name__ == "__main__": embed_fonts() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/font-embedding-options/sample-document.docx) to download it. {{< /tab-text >}} {{< /tab >}} {{< tab "fonts-output.docx" >}} ```text Binary file (DOCX, 49 KB) ``` [Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/font-embedding-options/embed_fonts/fonts-output.docx) {{< /tab >}} {{< /tabs >}} --- ## How to Run Examples Path: /editor/python-net/how-to-run-examples/ {{< alert style="warning" >}}Before running an example make sure that GroupDocs.Editor for Python via .NET has been installed successfully.{{< /alert >}} We offer multiple solutions for how you can run GroupDocs.Editor examples, by building your own or by cloning our ready-to-run Python examples repository. 
Please choose one from the following list: ## Build a project from scratch * Create and activate a virtual environment, then install **GroupDocs.Editor for Python via .NET** following this [guide]({{< ref "editor/python-net/getting-started/installation.md" >}}). {{< tabs "create_venv">}} {{< tab "Windows" >}} ```ps py -m venv .venv .venv\Scripts\activate py -m pip install groupdocs-editor-net ``` {{< /tab >}} {{< tab "Linux" >}} ```bash python3 -m venv .venv source .venv/bin/activate python3 -m pip install groupdocs-editor-net ``` {{< /tab >}} {{< tab "macOS" >}} ```bash python3 -m venv .venv source .venv/bin/activate python3 -m pip install groupdocs-editor-net ``` {{< /tab >}} {{< /tabs >}} * Code your first application with **GroupDocs.Editor for Python via .NET** like this: {{< tabs "code-example-run_first_example">}} {{< tab "run_first_example.py" >}} ```python import os from groupdocs.editor import Editor, License from groupdocs.editor.formats import WordProcessingFormats from groupdocs.editor.options import WordProcessingEditOptions, WordProcessingSaveOptions def run_first_example(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Load the document with Editor("./sample-document.docx") as editor: # Obtain an editable document from the original DOCX document edit_options = WordProcessingEditOptions() editable_document = editor.edit(edit_options) # Pass the editable document to a WYSIWYG editor and edit there... # ... # Save the edited document back to a WordProcessing format - DOCX, for example save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX) editor.save(editable_document, "./edited-document.docx", save_options) if __name__ == "__main__": run_first_example() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample file used in this example. 
Click [here](/editor/python-net/_sample_files/getting-started/how-to-run-examples/sample-document.docx) to download it. {{< /tab-text >}} {{< /tab >}} {{< tab "edited-document.docx" >}} ```text Binary file (DOCX, 49 KB) ``` [Download full output](/editor/python-net/_output_files/getting-started/how-to-run-examples/run_first_example/edited-document.docx) {{< /tab >}} {{< /tabs >}} * Run your script. The edited document will appear in your working directory. {{< tabs "run_first_example_run">}} {{< tab "Windows" >}} ```ps py run_first_example.py ``` {{< /tab >}} {{< tab "Linux" >}} ```bash python3 run_first_example.py ``` {{< /tab >}} {{< tab "macOS" >}} ```bash python3 run_first_example.py ``` {{< /tab >}} {{< /tabs >}} ## Run the examples from the repository The complete examples package of **GroupDocs.Editor for Python via .NET** is hosted on [GitHub](https://github.com/groupdocs-editor/GroupDocs.Editor-for-Python-via-.NET). You can either download the ZIP file or clone the repository using your favourite git client. 
Clone the repository:

```bash
git clone https://github.com/groupdocs-editor/GroupDocs.Editor-for-Python-via-.NET
```

Create and activate a virtual environment, then install the dependencies listed in `requirements.txt`:

{{< tabs "install_requirements">}}
{{< tab "Windows" >}}
```ps
cd GroupDocs.Editor-for-Python-via-.NET
py -m venv .venv
.venv\Scripts\activate
py -m pip install -r requirements.txt
```
{{< /tab >}}
{{< tab "Linux" >}}
```bash
cd GroupDocs.Editor-for-Python-via-.NET
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
```
{{< /tab >}}
{{< tab "macOS" >}}
```bash
cd GroupDocs.Editor-for-Python-via-.NET
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements.txt
```
{{< /tab >}}
{{< /tabs >}}

Run all the examples at once:

```bash
python run_all_examples.py
```

Or run an individual example by passing its file name (replace `<example_name>` with the name of the example script):

```bash
python <example_name>.py
```

The repository contains all the sample documents and resources used in the examples, so the scripts run out of the box.

## Contribute

If you would like to add or improve an example, we encourage you to contribute to the project. All examples in this repository are open source and can be freely used in your own applications.

To contribute, fork the repository, edit the code, and create a pull request. We will review the changes and include them in the repository if they are found helpful.

---

## Create and edit new WordProcessing document

Path: /editor/python-net/create-document/

This guide shows how to build a new document from HTML markup and save it in a specific format using the `Editor` and `EditableDocument` classes from GroupDocs.Editor for Python via .NET.

It is possible to create a new document in all major document formats, including text documents (DOCX), workbooks (XLSX), presentations (PPTX), e-books (EPUB), and emails (EML).
See the full list of [supported document formats](https://docs.groupdocs.com/editor/python-net/supported-document-formats/), taking note of the "Create" column, which indicates whether it is possible to create a new document of a particular format.

## Steps to Create and Edit a Document

### 1. Build an editable document from HTML markup

The content of a new document is described as HTML markup. Wrap that markup into an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) using the [`from_markup`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkup) class method:

```python
from groupdocs.editor import EditableDocument

html = "<html><body><h1>New document</h1><p>Created with GroupDocs.Editor.</p></body></html>"
editable = EditableDocument.from_markup(html)
```

### 2. Modify the HTML content

If you obtained the markup from an existing source, you can modify it before building the editable document:

```python
updated_html = html.replace("New document", "My new document")
updated_doc = EditableDocument.from_markup(updated_html)
```

### 3. Save the final document

Choose the appropriate save options for the desired output format and save the editable document. Saving requires an [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) instance, whose [`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method writes the editable document to a file or stream. The snippet below prepares the save options; the `editor.save()` call itself appears in the complete example:

```python
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)
save_options.enable_pagination = True
```

## Complete code example

The example below builds an editable document from an HTML markup string, modifies its content, and saves the result to a DOCX file.

{{< tabs "code-example-create-document">}}
{{< tab "create_document.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

def create_document():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Describe the content of the new document as HTML markup
    html = "<html><body><h1>New document</h1><p>Created with GroupDocs.Editor.</p></body></html>"

    # Modify the HTML content if needed
    updated_html = html.replace("New document", "My new document")

    # Build an editable document from the modified markup
    editable = EditableDocument.from_markup(updated_html)

    # Prepare the save options for the WordProcessing family
    save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)
    save_options.enable_pagination = True

    # Load a sample document to obtain an Editor instance, then save the new document
    with Editor("./sample-document.docx") as editor:
        editor.save(editable, "./new-document.docx", save_options)

    editable.dispose()

if __name__ == "__main__":
    create_document()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/create-document/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "new-document.docx" >}}
```text
Binary file (DOCX, 8 KB)
```

[Download full output](/editor/python-net/_output_files/developer-guide/create-document/create_document/new-document.docx)
{{< /tab >}}
{{< /tabs >}}

## Result

The code builds a DOCX document from HTML markup, modifies its body, and saves the final version to disk. You can optionally save it to an in-memory stream or return it via an API.
## See also * [`Editor` class](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/) * [`EditableDocument.from_markup`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkup/) * [`editor.save`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save/) --- ## Locales for output document Path: /editor/python-net/locales-for-output-document/ The [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class contains 3 very similar properties that represent a culture (locale): ```python locale # left-to-right text locale_bi # right-to-left (bidirectional) text locale_far_east # East-Asian (Far-East) text ``` In most cases the output WordProcessing document, which is generated from the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance by the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method, contains valid locales for the textual content. But in some cases it is necessary to forcibly and explicitly set some specific locale for the output document. These three options provide such possibility. However, keep in mind that they are suitable only when the document should have a single locale. If the document is multi-language and contains, for example, English and Spanish text, setting the locale to English ("en-GB", for example) will mark the Spanish text as English too, so spell checking in MS Word, for example, will not work properly for such text. By default all these three locale-related properties are not specified, their values are `None`. In such case MS Word (or another program) will detect (or choose) the document locale according to its own settings or other factors. 
Additionally, if the document is multi-language, it is strongly encouraged to enable the [`enable_language_information`]({{< ref "editor/python-net/developer-guide/edit-document/edit-word/enabling-language-information.md" >}}) property in the [WordProcessingEditOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) class while editing the original document, and not to use these 3 properties. The [`locale`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions/) property is intended to set the locale for usual left-to-right text, which consists of letters. The [`locale_bi`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions/) property should be used when text is right-to-left (RTL), for example, [Arabic script](https://en.wikipedia.org/wiki/Arabic_script) or [Hebrew alphabet](https://en.wikipedia.org/wiki/Hebrew_alphabet). The [`locale_far_east`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions/) property should be used for East-Asian (Far-East) text, including [CJK characters](https://en.wikipedia.org/wiki/CJK_characters). The snippet below illustrates how the locale properties are assigned on the [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) instance before saving. 
The locale value is a culture object, obtained from the corresponding type in the underlying .NET globalization layer. The names `en_us_culture`, `arabic_culture`, and `japanese_culture` stand for such culture objects, which must be created before the snippet runs:

```python
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)

# en_us_culture, arabic_culture and japanese_culture are placeholders
# for culture objects created beforehand; they are not defined here

# Set the locale for the output document (culture object for "en-US")
save_options.locale = en_us_culture

# Set the locale for right-to-left text
save_options.locale_bi = arabic_culture

# Set the locale for East-Asian text
save_options.locale_far_east = japanese_culture
```

## Complete example

The runnable example below performs the safe core workflow: loading the sample document, editing it, and saving the result to DOCX. The locale properties can additionally be assigned on the save options as shown in the snippet above.

{{< tabs "code-example-locales-for-output-document">}}
{{< tab "set_output_locale.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

def set_output_locale():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        original = editor.edit()
        modified = EditableDocument.from_markup(original.get_embedded_html())

        # Prepare save options (locale properties can be assigned here)
        save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)
        editor.save(modified, "./localized-document.docx", save_options)
        print("Saved the edited document to DOCX")

        original.dispose()
        modified.dispose()

if __name__ == "__main__":
    set_output_locale()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example.
Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/locales-for-output-document/sample-document.docx) to download it. {{< /tab-text >}} {{< /tab >}} {{< tab "localized-document.docx" >}} ```text Binary file (DOCX, 49 KB) ``` [Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/locales-for-output-document/set_output_locale/localized-document.docx) {{< /tab >}} {{< /tabs >}} --- ## Save document Path: /editor/python-net/save-document/ > This article describes how to obtain edited document content from the client, process it and save it to the resultant document of some specified format. In a nutshell, the core process of document editing, where a user types some text across the pages of the document, inserts images, makes some edits, removes words or paragraphs, or moves some document parts from one location to another, is performed in some 3rd party software with GUI outside of GroupDocs.Editor. This software may be, for example, but not limited to: - a web-based WYSIWYG HTML-editor, that is usually a pure client-side application, written in JavaScript and running in the browser. This may be, for example, a TinyMCE or CKEditor; - a desktop application; - a mobile application, running on Android or iOS. The core requirement for this application is to be able to open, view and edit HTML content. GroupDocs.Editor on its side accepts various document formats (WordProcessing, Spreadsheet, Presentation, and many more) and prepares them for editing in external applications by converting them to the HTML markup and placing it in the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) container. 
So when calling the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor).[`edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method, the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) with the **original content** inside it is generated. Then the user obtains this **original content** from the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), pushes it to the external HTML-editor, makes edits, and when these edits are done, the user has the **modified content**. In the GroupDocs.Editor terminology, saving a document means obtaining the **modified content** and generating the document in the output format (WordProcessing, Spreadsheet, Presentation, and many more) from it. And this article explains how to do this. In short, saving a document implies the next three steps: 1. Obtain the **modified content** from somewhere (HTML-editor or any other storage or method) and create an instance of [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) from it. [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) actually serves as an object-oriented wrapper of the document content. 2. Select a desired output format, into which the **modified content** should be saved, and optional parameters. 3. Save the **modified content** to the document of the previously chosen format by specified file path or into a specified stream using the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor).[`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method. These three steps are explained in detail and with code samples below. ## Obtaining the modified content Different content editors provide the content in different forms. 
Some may return HTML markup as a string, while the external resources like stylesheet(s) and/or image(s) are located in some specific folder as files. Others may return both main content and resources as a collection of byte streams. Some HTML-editors may generate a single string, which already contains all HTML markup with resources baked into it using base64 encoding. There may be unlimited possibilities to do that, and thus the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class has different class methods (factories), which obtain the **modified content** of the HTML document in different forms on input and according to this generate the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance: 1. [`from_file`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/fromfile) is designed for opening HTML-documents from disk — it obtains a path to a `.html` file (that contains HTML-markup) and a path to the corresponding resource folder that contains different resources, like stylesheets, images, font files, audio files and so on. 2. [`from_markup`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkup) is designed for opening HTML-documents from memory — it obtains the HTML-markup as a string with resources baked into it. 3. [`from_markup_and_resource_folder`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkupandresourcefolder) is designed for opening HTML-documents from mixed storages — it obtains the HTML-markup as a string, but resources are obtained from a path to the corresponding resource folder, which, unlike the previous method, is mandatory (such a directory should exist). 
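
The "resources baked into it" form that `from_markup` takes can be illustrated with the standard library alone. The sketch below is not part of the GroupDocs.Editor API: the helper name `embed_image_as_data_uri` and the sample bytes are hypothetical, and the snippet only shows how an image can end up inside the markup string as a base64 `data:` URI, so that no resource folder is needed.

```python
import base64


def embed_image_as_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Return an <img> tag with the image inlined as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return '<img src="data:{};base64,{}" alt="inline image"/>'.format(mime_type, encoded)


# Placeholder bytes standing in for the content of a real image file
sample_bytes = b"\x89PNG\r\n\x1a\n"

# A single self-contained markup string: every resource lives inside it
markup = "<html><body><p>Edited text</p>{}</body></html>".format(
    embed_image_as_data_uri(sample_bytes)
)

# With the library installed, a string of this kind is what would be passed on:
# editable = EditableDocument.from_markup(markup)
```

When the editor instead keeps images and stylesheets as separate files, the folder-based factories described above are the better fit.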
## Create and adjust saving options When the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance with the **modified content** is created, it's time to save it to the resultant document of some defined format, and GroupDocs.Editor needs to know this format. Like with load and edit options, every family format has its own save options class. These classes are listed below. | Format family | Example formats | Save options class | Format class | | --- | --- | --- | --- | | WordProcessing | DOC, DOCX, DOCM, DOT, ODT | [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) | [WordProcessingFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/wordprocessingformats) | | Spreadsheet | XLS, XLSX, XLSM, XLSB | [SpreadsheetSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) | [SpreadsheetFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/spreadsheetformats) | | Delimiter-Separated Values (DSV) | CSV, TSV | [DelimitedTextSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/delimitedtextsaveoptions) | [SpreadsheetFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/spreadsheetformats/) | | Presentation | PPT, PPTX, PPS, POT | [PresentationSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions) | [PresentationFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/presentationformats) | | Plain Text documents | TXT | [TextSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/textsaveoptions) | [TextualFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/textualformats) | | Fixed-layout format | PDF | 
[PdfSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfsaveoptions) | [FixedLayoutFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/fixedlayoutformats) | | Fixed-layout format | XPS | [XpsSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/xpssaveoptions) | [FixedLayoutFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/fixedlayoutformats) | | Email | EML, EMLX, TNEF, MSG, HTML, MHTML, ICS, VCF, PST, MBOX, OFT | [EmailSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emailsaveoptions) | [EmailFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/emailformats) | | e-Books | ePub, Mobi, AZW3 | [EbookSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebooksaveoptions/) | [EBookFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/ebookformats) | So, let's say it is necessary to save the **modified content** to a document of DOCX format. DOCX is part of the WordProcessing family. So an instance of the [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class should be created, and the [`WordProcessingFormats.DOCX`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/wordprocessingformats/) value should be specified in its constructor. Like this: ```python from groupdocs.editor.formats import WordProcessingFormats from groupdocs.editor.options import WordProcessingSaveOptions save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX) ``` If it is also necessary to encode the resultant document with a password, it can be done using one line of code: ```python save_options.password = "some-password" ``` Of course, different formats have different options. 
For example, there is a `password` property in [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions), [SpreadsheetSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions), and [PresentationSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/presentationsaveoptions), but there is nothing similar in [XpsSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/xpssaveoptions), [EmailSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emailsaveoptions), [TextSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/textsaveoptions), and [EbookSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebooksaveoptions/), because these formats do not support password protection.

Note also that the format of the input document with the **original content** and the format of the resultant document with the **modified content** may differ. For example, the original document can be in some WordProcessing format, while the output document can be TXT or PDF. Or the original document can be a PDF, while the output is DOCX. The same transitions are allowed between Spreadsheets and DSV, in both directions. Transitions are not allowed, however, between format families that are fundamentally incompatible, such as WordProcessing and Spreadsheet.

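
The transitions named above can be written down as a small lookup table. This is only a sketch of what the text states, not the library's full compatibility matrix: the family labels are plain hypothetical strings rather than GroupDocs.Editor types, and any pair the text does not mention is reported as unknown instead of being guessed.

```python
# Only the transitions named in the text; NOT the library's full compatibility matrix.
# Family labels are hypothetical strings chosen for this illustration.
STATED_TRANSITIONS = {
    ("WordProcessing", "Text"): True,
    ("WordProcessing", "PDF"): True,
    ("PDF", "WordProcessing"): True,
    ("Spreadsheet", "DSV"): True,
    ("DSV", "Spreadsheet"): True,
    ("WordProcessing", "Spreadsheet"): False,
    ("Spreadsheet", "WordProcessing"): False,
}


def is_stated_transition(source_family, target_family):
    """Return True or False for pairs the text covers, None for pairs it does not mention."""
    if source_family == target_family:
        return True  # saving within the same family is the ordinary round trip
    return STATED_TRANSITIONS.get((source_family, target_family))


print(is_stated_transition("WordProcessing", "PDF"))
print(is_stated_transition("WordProcessing", "Spreadsheet"))
print(is_stated_transition("Presentation", "PDF"))
```

A `None` result means the pair should be checked against the supported formats list rather than assumed to work.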
## Saving modified content to the document

Finally, when the instance of the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class with modified content inside is created, and the format of the resultant document is defined, it is possible to generate this resultant document using the [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor).[`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method. This method has two main overloads. They differ only in the way the output document is specified: as a path, where the file should be created, or as a byte stream, into which the document content should be written. All other parameters are the same.

```python
editor.save(input_document, file_path, save_options)
editor.save(input_document, output_stream, save_options)
```

Here:

- The 1st parameter, `input_document`, is an [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance with the **modified content** inside, created in the first step.
- The 2nd parameter is either a stream, into which the resultant document of the defined format should be written, or a string that represents the file path where the resultant document should be stored.
- The 3rd parameter is an instance of some save options, created in the second step, which defines the format of the resultant document and additional adjustments.

In addition to these two, there is a simplified overload that analyzes the file extension of the `file_path` argument and infers the default saving options automatically:

```python
editor.save(input_document, file_path)
```

## Complete code example

Because the WYSIWYG HTML-editor is not a part of GroupDocs.Editor, it is hard to provide a lightweight code example with fully functional editing. So in this sample the content will be edited programmatically, using the string `replace` method.
The example loads a DOCX document, edits its content, and saves the **modified content** to several output formats — RTF, DOCM, and TXT. {{< tabs "code-example-save-document">}} {{< tab "save_document.py" >}} ```python import os from groupdocs.editor import Editor, EditableDocument, License from groupdocs.editor.formats import WordProcessingFormats from groupdocs.editor.options import WordProcessingSaveOptions, TextSaveOptions def save_document(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Create the Editor by loading an input document in DOCX format with Editor("./sample-document.docx") as editor: # Open the document for editing and obtain an EditableDocument original = editor.edit() # Get the original content as a string original_content = original.get_embedded_html() # Get the modified content by editing the original content modified_content = original_content.replace("Title of the document", "Title of the edited document") # Create an EditableDocument from the modified content modified = EditableDocument.from_markup(modified_content) # Save the modified content to several output formats editor.save(modified, "./edited-document.rtf", WordProcessingSaveOptions(WordProcessingFormats.RTF)) editor.save(modified, "./edited-document.docm", WordProcessingSaveOptions(WordProcessingFormats.DOCM)) editor.save(modified, "./edited-document.txt", TextSaveOptions()) # Release the EditableDocument instances original.dispose() modified.dispose() if __name__ == "__main__": save_document() ``` {{< /tab >}} {{< tab "sample-document.docx" >}} {{< tab-text >}} `sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/save-document/sample-document.docx) to download it. 
{{< /tab-text >}} {{< /tab >}} {{< tab "save-document-outputs.zip" >}} ```text edited-document.docm (49 KB) edited-document.rtf (1648 KB) edited-document.txt (5 KB) ``` [Download full output](/editor/python-net/_output_files/developer-guide/save-document/save_document/save-document-outputs.zip) {{< /tab >}} {{< /tabs >}} In this example, three different save options are created and three different [Editor](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor).[`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) calls are made for saving the modified content to the RTF, DOCM, and TXT formats. --- ## Agents and LLM Integration Path: /editor/python-net/agents-and-llm-integration/ ## AI agent and LLM friendly GroupDocs.Editor for Python via .NET is designed to work seamlessly with AI agents, LLMs, and automated code generation tools. The library ships machine-readable documentation in multiple formats — including an `AGENTS.md` file inside the pip package itself — so that AI assistants can discover and use the API without manual guidance. ## MCP server GroupDocs provides an [MCP (Model Context Protocol) server](https://docs.groupdocs.com/mcp) that enables LLMs to query the documentation on demand instead of loading it all at once. This saves tokens and lets your AI assistant fetch only the information it needs for the current task. 
To connect your AI tool to the MCP server, add the GroupDocs endpoint to your MCP configuration: {{< tabs "mcp-setup">}} {{< tab "Claude Code / Claude Desktop" >}} ```json // Claude Code: ~/.claude/settings.json (or project .mcp.json) // Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json { "mcpServers": { "groupdocs-docs": { "url": "https://docs.groupdocs.com/mcp" } } } ``` {{< /tab >}} {{< tab "GitHub Copilot" >}} ```json // .vscode/mcp.json in your project root { "servers": { "groupdocs-docs": { "url": "https://docs.groupdocs.com/mcp" } } } ``` {{< /tab >}} {{< tab "Cursor" >}} ```json // .cursor/mcp.json in your project root { "mcpServers": { "groupdocs-docs": { "url": "https://docs.groupdocs.com/mcp" } } } ``` {{< /tab >}} {{< tab "Generic MCP" >}} ```json // Any MCP-compatible client { "mcpServers": { "groupdocs-docs": { "url": "https://docs.groupdocs.com/mcp" } } } ``` {{< /tab >}} {{< /tabs >}} See [https://docs.groupdocs.com/mcp](https://docs.groupdocs.com/mcp) for full setup instructions and the list of available tools. ## AGENTS.md — built into the package The `groupdocs-editor-net` pip package includes an `AGENTS.md` file at `groupdocs/editor/AGENTS.md`. AI coding assistants that scan installed packages (such as Claude Code, Cursor, GitHub Copilot) can automatically discover the API surface, usage patterns, and troubleshooting tips. After installing the package, you can find it at: ```bash pip show -f groupdocs-editor-net | grep AGENTS ``` The full content of that file is reproduced in the [AGENTS.md reference](#agentsmd-reference) section below. 
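
The `grep` pipeline above is not available in a standard Windows shell. As a cross-platform alternative, the same lookup can be sketched in Python with `importlib.metadata` (Python 3.8 or later). The helper name `find_packaged_files` is hypothetical; the function simply lists the files recorded for an installed distribution and returns an empty list when the distribution is not installed.

```python
from importlib import metadata


def find_packaged_files(distribution_name, file_name):
    """List files named `file_name` recorded for an installed distribution (hypothetical helper)."""
    try:
        recorded = metadata.files(distribution_name)
    except metadata.PackageNotFoundError:
        return []  # the distribution is not installed in this environment
    # metadata.files() may return None when the installer recorded no file list
    return [str(path.locate()) for path in (recorded or []) if path.name == file_name]


# Print the location of every AGENTS.md recorded for the package, if it is installed
for location in find_packaged_files("groupdocs-editor-net", "AGENTS.md"):
    print(location)
```

If nothing is printed, either the package is not installed in the active environment or its installer did not record a file list.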
## Machine-readable documentation Every documentation page is available as a plain Markdown file that AI tools can fetch and process directly: | Resource | URL | |---|---| | Full documentation (single file) | `https://docs.groupdocs.com/editor/python-net/llms-full.txt` | | Full documentation (all products) | `https://docs.groupdocs.com/llms-full.txt` | | Individual page (any page) | Append `.md` to the page URL, e.g. `https://docs.groupdocs.com/editor/python-net.md` | | Quick start guide | `https://docs.groupdocs.com/editor/python-net/quick-start-guide.md` | ### How to use with AI tools Point your AI assistant to the full documentation file for comprehensive context: ``` Fetch https://docs.groupdocs.com/editor/python-net/llms-full.txt and use it as a reference for GroupDocs.Editor for Python via .NET API. ``` Or reference individual pages for focused tasks: ``` Read https://docs.groupdocs.com/editor/python-net/quick-start-guide.md and help me edit a DOCX and save it back in Python. ``` ## Why GroupDocs.Editor is a good building block for AI document pipelines LLMs and RAG systems work best with structured text, not binary office files. GroupDocs.Editor converts Word, Excel, PowerPoint, PDF, email, eBook, and text documents into clean, editable HTML/CSS — and writes the edited result back to the original format. That round-trip makes it a natural component in agent-driven document workflows: - **Extract structured content** — convert a document to HTML and pull headings, tables, and lists for downstream parsers or embeddings. - **Programmatic edits** — let an agent rewrite the HTML body (fix text, inject content, redact) and save it straight back to DOCX/XLSX/PPTX without losing formatting. - **Format conversion via HTML** — save an `EditableDocument` with a different `*SaveOptions` to convert (for example DOCX → PDF or DOCX → Markdown) for OCR, vision models, or token-efficient input. 
- **Inspect before processing** — read format, page count, size, and encryption status with `get_document_info()` so the agent can branch on metadata. A typical agent step looks like: ```python from groupdocs.editor import Editor, EditableDocument from groupdocs.editor.formats import WordProcessingFormats from groupdocs.editor.options import WordProcessingSaveOptions, MarkdownSaveOptions with Editor("inbox/incoming.docx") as editor: # Step 1: inspect — let the agent decide what to do based on metadata info = editor.get_document_info() print("format:", info.format.name, "pages:", info.page_count) # Step 2: convert to editable HTML, let the agent edit the markup editable = editor.edit() edited_html = editable.get_body_content().replace("DRAFT", "FINAL") # Step 3a: save the edited markup back to the original format editor.save(EditableDocument.from_markup(edited_html), "staging/incoming.docx", WordProcessingSaveOptions(WordProcessingFormats.DOCX)) # Step 3b: or export clean Markdown for a RAG pipeline editor.save(editable, "staging/incoming.md", MarkdownSaveOptions()) ``` For end-to-end examples — including loading, editing each document family, working with the `EditableDocument`, and saving/converting — see the [Developer Guide]({{< ref "editor/python-net/developer-guide" >}}). Every code block on those pages has a runnable counterpart in the [examples repository](https://github.com/groupdocs-editor/GroupDocs.Editor-for-Python-via-.NET). ## AGENTS.md reference The content below is the same `AGENTS.md` file that ships inside the `groupdocs-editor-net` package. Copy it into your project as `AGENTS.md` or point your AI assistant to this page. ````markdown # GroupDocs.Editor for Python via .NET -- AGENTS.md > Instructions for AI agents working with this package. 
Load a document, convert it to editable HTML/CSS, edit the markup, then save it back to the original format or convert to another -- Word, Excel, PowerPoint, PDF, email, eBook, and text/markup formats, all without MS Office or OpenOffice installed. ## Install ```bash pip install groupdocs-editor-net ``` **Python**: 3.5 - 3.14 | **Platforms**: Windows, Linux, macOS ## Resources | Resource | URL | |---|---| | Documentation | https://docs.groupdocs.com/editor/python-net/ | | LLM-optimized docs | https://docs.groupdocs.com/editor/python-net/llms-full.txt | | API reference | https://reference.groupdocs.com/editor/python-net/ | | Code examples | https://docs.groupdocs.com/editor/python-net/developer-guide/ | | Release notes | https://releases.groupdocs.com/editor/python-net/release-notes/ | | PyPI | https://pypi.org/project/groupdocs-editor-net/ | | Free support forum | https://forum.groupdocs.com/c/editor/ | | Temporary license | https://purchase.groupdocs.com/temporary-license | ## MCP Server If your environment has MCP configured, you can connect your AI tool to the GroupDocs documentation server for on-demand API lookups: ```json { "mcpServers": { "groupdocs-docs": { "url": "https://docs.groupdocs.com/mcp" } } } ``` Works with Claude Code (`~/.claude/settings.json`), Cursor (`.cursor/mcp.json`), VS Code Copilot (`.vscode/mcp.json`), and any MCP-compatible client. If MCP is unavailable, fall back to the LLM-optimized docs URL above and this file -- both are shipped inside the wheel. 
## Imports ```python from groupdocs.editor import ( License, Metered, Editor, EditableDocument, FormFieldManager, EncryptedException, IncorrectPasswordException, PasswordRequiredException, InvalidFormatException, ) from groupdocs.editor.formats import ( WordProcessingFormats, SpreadsheetFormats, PresentationFormats, FixedLayoutFormats, EBookFormats, EmailFormats, TextualFormats, FormatFamilies, ) from groupdocs.editor.options import ( # Load options WordProcessingLoadOptions, SpreadsheetLoadOptions, PresentationLoadOptions, PdfLoadOptions, # Edit options WordProcessingEditOptions, SpreadsheetEditOptions, PresentationEditOptions, PdfEditOptions, EbookEditOptions, EmailEditOptions, MarkdownEditOptions, TextEditOptions, XmlEditOptions, DelimitedTextEditOptions, # Save options WordProcessingSaveOptions, SpreadsheetSaveOptions, PresentationSaveOptions, PdfSaveOptions, HtmlSaveOptions, MhtmlSaveOptions, MarkdownSaveOptions, XpsSaveOptions, TextSaveOptions, EbookSaveOptions, EmailSaveOptions, DelimitedTextSaveOptions, ) from groupdocs.editor.metadata import ( IDocumentInfo, WordProcessingDocumentInfo, SpreadsheetDocumentInfo, PresentationDocumentInfo, FixedLayoutDocumentInfo, TextualDocumentInfo, EmailDocumentInfo, EbookDocumentInfo, MarkdownDocumentInfo, ) ``` ## Load + Edit + Save (the core workflow) `Editor` is the entry point. The flow is always: **open → `edit()` → manipulate HTML → `save()`**. Use it as a context manager so the native document handle is released. 
```python from groupdocs.editor import Editor, EditableDocument from groupdocs.editor.formats import WordProcessingFormats from groupdocs.editor.options import WordProcessingLoadOptions, WordProcessingSaveOptions with Editor("input.docx", WordProcessingLoadOptions()) as editor: editable = editor.edit() # -> EditableDocument html = editable.get_embedded_html() # full self-contained HTML edited_html = html.replace("Hello", "Goodbye") after_edit = EditableDocument.from_markup(edited_html) save_opts = WordProcessingSaveOptions(WordProcessingFormats.DOCX) editor.save(after_edit, "output.docx", save_opts) ``` **Editor constructor.** `Editor(file_path)`, `Editor(file_path, load_options)`, or `Editor(stream, load_options)`. The load-options type should match the input family (`WordProcessingLoadOptions` for DOCX, `SpreadsheetLoadOptions` for XLSX, etc.). Omitting load options lets the engine auto-detect. **`editor.save(input_document, file_path, save_options)`** writes the (possibly modified) `EditableDocument` to disk. The `save_options` type — not the file extension — decides the output format. 
## Per-format quick recipes

### Word Processing (DOC, DOCX, RTF, ODT, …)

```python
from groupdocs.editor.options import WordProcessingLoadOptions, WordProcessingEditOptions

with Editor("input.docx", WordProcessingLoadOptions()) as editor:
    eo = WordProcessingEditOptions()
    eo.enable_pagination = False
    eo.enable_language_information = True
    editable = editor.edit(eo)
    body = editable.get_body_content()   # <body>…</body> only
```

### Spreadsheet (XLS, XLSX, ODS, CSV, …) — edit one worksheet at a time

```python
from groupdocs.editor.options import SpreadsheetLoadOptions, SpreadsheetEditOptions

with Editor("book.xlsx", SpreadsheetLoadOptions()) as editor:
    eo = SpreadsheetEditOptions()
    eo.worksheet_index = 0               # 0-based
    eo.exclude_hidden_worksheets = True
    html = editor.edit(eo).get_content()
```

### Presentation (PPT, PPTX, ODP, …) — edit one slide at a time

```python
from groupdocs.editor.options import PresentationLoadOptions, PresentationEditOptions

with Editor("deck.pptx", PresentationLoadOptions()) as editor:
    eo = PresentationEditOptions()
    eo.slide_number = 0                  # 0-based
    eo.show_hidden_slides = True
    html = editor.edit(eo).get_content()
```

### PDF input → HTML

```python
from groupdocs.editor.options import PdfLoadOptions, PdfEditOptions

lo = PdfLoadOptions()
# lo.password = "..."   for encrypted PDFs
with Editor("input.pdf", lo) as editor:
    eo = PdfEditOptions()
    eo.enable_pagination = True
    html = editor.edit(eo).get_content()
```

### Email (EML, MSG, MBOX, …)

```python
from groupdocs.editor.options import EmailEditOptions, EmailSaveOptions, MailMessageOutput

with Editor("message.eml") as editor:
    eo = EmailEditOptions()
    eo.mail_message_output = MailMessageOutput.ALL   # body, subject, to/cc/bcc, attachments, …
    html = editor.edit(eo).get_content()
```

## Round-trip vs. convert

There is no separate "convert" call — saving an `EditableDocument` with a *different* save-options family converts via the HTML intermediate. Same input, different `save_options` ⇒ different output format.
```python
from groupdocs.editor import Editor, EditableDocument
from groupdocs.editor.options import (
    WordProcessingSaveOptions, PdfSaveOptions, MarkdownSaveOptions,
)
from groupdocs.editor.formats import WordProcessingFormats

with Editor("input.docx") as editor:
    editable = editor.edit()
    editor.save(editable, "same.docx", WordProcessingSaveOptions(WordProcessingFormats.DOCX))  # round-trip
    editor.save(editable, "out.pdf", PdfSaveOptions())                                         # DOCX -> PDF
    editor.save(editable, "out.md", MarkdownSaveOptions())                                     # DOCX -> Markdown
```

To feed *modified* markup back into a save call, wrap it with `EditableDocument.from_markup(html)` (or `from_markup_and_resource_folder(html, folder)` / `from_file(html_path, folder)` when the HTML references external images/fonts on disk).

## EditableDocument resources

An `EditableDocument` exposes its extracted assets as collections you can iterate (`for r in coll` / `len(coll)`):

| Property | Contents |
|---|---|
| `images` | embedded raster/vector images |
| `fonts` | extracted fonts |
| `css` | stylesheets |
| `audio` | audio resources (e.g. from presentations) |
| `all_resources` | everything above, combined |

```python
with Editor("input.docx") as editor:
    editable = editor.edit()
    print("images:", len(editable.images), "css:", len(editable.css))
    # write HTML + every resource (images/fonts/css) into a folder:
    editable.save("page.html", "page_resources")
```

## Document info without editing

`get_document_info()` returns a lightweight **`StructView`** — a `dict` subclass exposing both `snake_case` attribute access and the raw PascalCase dict keys. Fields: `format` (nested: `name`, `extension`, `mime`, `format_family`), `page_count`, `size`, `is_encrypted`.

```python
with Editor("input.docx") as editor:
    info = editor.get_document_info()   # password="..." for encrypted files
    print("pages:", info.page_count, "size:", info.size,
          "encrypted:", info.is_encrypted, "format:", info.format.name)
    # dict access still works for back-compat: info["PageCount"], info["Format"]["Name"]
```

## Licensing

```python
from groupdocs.editor import License

# From file
License().set_license("path/to/license.lic")

# From stream
with open("license.lic", "rb") as f:
    License().set_license(f)
```

Or auto-apply: `export GROUPDOCS_LIC_PATH="path/to/license.lic"`

**Evaluation vs licensed.** Without a license the library still runs, but output is restricted: PDF output carries an evaluation watermark, other formats show an equivalent evaluation mark, and there is a page/document-count cap. Set `GROUPDOCS_LIC_PATH` (or call `License().set_license(...)`) and re-run to clear it. A 30-day full license is free: https://purchase.groupdocs.com/temporary-license

## API Reference

### Editor

| Method | Returns | Description |
|---|---|---|
| `Editor(file_path / stream [, load_options])` | | Open by path or binary stream; optional `*LoadOptions` matching the input family. Use as a context manager. |
| `edit([edit_options])` | `EditableDocument` | Convert the document to editable HTML/CSS; optional `*EditOptions` (pagination, worksheet/slide selection, …). |
| `save(input_document, file_path, save_options)` | `None` | Write the `EditableDocument` out; the `*SaveOptions` type decides the output format. |
| `get_document_info([password])` | `StructView` | `dict` subclass with both `snake_case` attrs and PascalCase keys: `format` (nested), `page_count`, `size`, `is_encrypted`. No full edit pass needed. |
| `form_field_manager` | `FormFieldManager` | Read/update form fields (Word processing). |

### EditableDocument

| Method | Returns | Description |
|---|---|---|
| `get_content()` | `str` | Full HTML document. |
| `get_body_content([external_images_template])` | `str` | `<body>` inner markup only. |
| `get_css_content([img_prefix, font_prefix])` | `list` | CSS stylesheet(s) as strings. |
| `get_embedded_html()` | `str` | Self-contained HTML with images/CSS inlined. |
| `save(html_file_path[, resources_folder_path])` | `None` | Persist HTML (+ resources) to disk. |
| `dispose()` | `None` | Release native resources (handled by `with`). |
| `images` / `fonts` / `css` / `audio` / `all_resources` | collection | Extracted resources. |
| `from_markup(html)` *(classmethod)* | `EditableDocument` | Build an editable doc from modified HTML. |
| `from_markup_and_resource_folder(html, folder)` *(classmethod)* | `EditableDocument` | …with on-disk resources. |
| `from_file(html_path, folder)` *(classmethod)* | `EditableDocument` | …from an HTML file + resource folder. |

### License / Metered

`License().set_license(path_or_stream)` · `Metered().set_metered_key(public, private)` · `Metered.get_consumption_quantity()` · `Metered.get_consumption_credit()`

## Key Patterns

- **Properties**: use `snake_case` -- auto-mapped to .NET `PascalCase`
- **Context managers**: `with Editor(...) as e:` ensures the document handle is released; `EditableDocument` is disposable too
- **Options families**: pick the `*LoadOptions` / `*EditOptions` / `*SaveOptions` that matches the document family; `*SaveOptions` controls the *output* format
- **Modified markup**: round-trip edited HTML through `EditableDocument.from_markup(html)` / `from_file(...)` before `editor.save(...)`
- **Streams**: pass `open("file", "rb")` or `io.BytesIO(data)` where .NET expects a Stream; `BytesIO` is updated after `save(stream)`
- **Enums**: case-insensitive, lazy-loaded (e.g., `WordProcessingFormats.DOCX`, `MailMessageOutput.ALL`)
- **Collections**: `for r in editable.images` and `len(editable.css)` work on .NET collections
- **Callbacks**: Python functions work for handler interfaces whose methods return `None`.
  Returning a .NET `Stream` from a Python callback is **not** supported by the binding -- use the file-path / resource-folder `save` overloads instead.

## Platform Requirements

| Platform | Requirements |
|---|---|
| Windows | None |
| Linux | `apt install libgdiplus libfontconfig1 ttf-mscorefonts-installer` |
| macOS | `brew install mono-libgdiplus` |

## Troubleshooting

**Output is watermarked / a few pages only** -- you are running unlicensed (evaluation mode). Apply a license / set `GROUPDOCS_LIC_PATH`.

**`PasswordRequiredException` / `IncorrectPasswordException`** -- the document is encrypted. Set the password on the load options: `lo = WordProcessingLoadOptions(); lo.password = "..."; Editor(path, lo)` (or pass `password=` to `get_document_info`).

**`System.Drawing.Common is not supported`** -- install libgdiplus: `sudo apt install libgdiplus` (Linux) / `brew install mono-libgdiplus` (macOS)

**`Gdip` type initializer exception** -- outdated libgdiplus: `brew reinstall mono-libgdiplus` (macOS)

**Garbled text / missing fonts** -- install fonts: `sudo apt install ttf-mscorefonts-installer fontconfig && sudo fc-cache -f`

**`DllNotFoundException: libSkiaSharp`** -- a stale system copy conflicts with the bundled version. Rename it: `sudo mv /usr/local/lib/libSkiaSharp.dylib /usr/local/lib/libSkiaSharp.dylib.bak`

**`DOTNET_SYSTEM_GLOBALIZATION_INVARIANT` errors** -- do NOT set this. Install ICU: `sudo apt install libicu-dev`

**`TypeLoadException`** -- reinstall: `pip install --force-reinstall groupdocs-editor-net`

**Still stuck?** Post your question at https://forum.groupdocs.com/c/editor/ -- the development team responds directly.
````

## See also

- [Quick Start Guide]({{< ref "editor/python-net/getting-started/quick-start-guide" >}}) — your first edit in five minutes
- [Developer Guide]({{< ref "editor/python-net/developer-guide" >}}) — runnable examples for every API surface
- [API Reference](https://reference.groupdocs.com/editor/python-net) — full class and method documentation

---

## Memory optimization option

Path: /editor/python-net/memory-optimization-option/

By default [**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) favors speed: it completes each task as fast as possible, even when that requires a lot of memory. However, in some specific cases, when the document being processed is very large or the user's machine has very little free memory, an out-of-memory error may occur. To solve this problem, the [WordProcessingSaveOptions](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class contains the [`optimize_memory_usage`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions/) property:

```python
optimize_memory_usage  # bool
```

By default it is `False`, which means that memory optimization is disabled for the sake of the best possible performance. Setting it to `True` enables a different document generation mechanism, which can significantly decrease memory consumption while generating large documents, at the cost of a slower [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) call.

## Complete example

The example below loads the sample document, edits it, and saves the result to DOCX with the `optimize_memory_usage` flag enabled.
{{< tabs "code-example-memory-optimization-option">}}
{{< tab "optimize_memory_usage.py" >}}

```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions


def optimize_memory_usage():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        original = editor.edit()
        modified = EditableDocument.from_markup(original.get_embedded_html())

        # Enable memory optimization for the saving process
        save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)
        save_options.optimize_memory_usage = True

        editor.save(modified, "./optimized-document.docx", save_options)
        print("Saved the edited document with memory optimization enabled")

        original.dispose()
        modified.dispose()


if __name__ == "__main__":
    optimize_memory_usage()
```

{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/memory-optimization-option/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "optimized-document.docx" >}}

```text
Binary file (DOCX, 49 KB)
```

[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/memory-optimization-option/optimize_memory_usage/optimized-document.docx)
{{< /tab >}}
{{< /tabs >}}

---

## Export styles during document editing

Path: /editor/python-net/styles-export/

The [family of WordProcessing formats](https://docs.fileformat.com/word-processing/), commonly represented by the [Office Open XML](https://docs.fileformat.com/word-processing/docx/) formats and [Open Document Format (ODT)](https://docs.fileformat.com/word-processing/odt/) formats, has a concept of _styles_.
Here a _style_ is a defined set of formatting options that can be applied to a span of text, paragraphs, lists, or tables. Styles can be built-in (defined by the word processor program, MS Word, for example) or user-defined. Every style has a unique name and can be derived from another style. In MS Word, styles are usually located in the "Styles" area of the "Home" tab; users can select the desired text or paragraph of the document and then click the preferred style to apply its formatting.

Every style has the following characteristics and parameters:

1. It has a unique name.
2. It can be a character style (applied to textual content within a paragraph), a paragraph style (applied to paragraph(s)), a list style (applied to list items), or a table style (applied to the whole table).
3. It can be either built-in in MS Word or user-defined.
   * For built-in styles only: it can be a heading style or not.
4. It can be a Quick Style or not; if yes, this style is shown in the Quick Style gallery inside the MS Word UI.
5. It may be based on some other style or not.

GroupDocs.Editor supports styles on export. This feature is not an option; it is always turned on. GroupDocs.Editor exports the WordProcessing styles by transforming them into CSS rulesets and referencing these rulesets from the HTML markup.

Here is an example of CSS markup that contains an MS Word built-in style named "Heading 1":

```css
.Heading_1, .Quick-style, .BuiltIn-style, .Heading-style, .Paragraph-style {
    -aw-style-name: heading1;
    -gd-style-name: 'Heading 1';
    font-family: Cambria;
    font-size: 14pt;
    font-weight: bold;
    font-style: normal;
    color: rgb(54, 95, 145);
    text-align: left;
    margin-top: 24pt;
    margin-bottom: 0pt;
    line-height: 134.16667%;
}
```

The original style name is preserved in the custom property "-gd-style-name". Because it is a built-in style, it also has a style identifier in the custom property "-aw-style-name".
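
Because the original name travels with the ruleset, client or server code can recover it from the exported stylesheet. The following is only a minimal sketch under stated assumptions: `extract_style_names` is a hypothetical helper, not part of GroupDocs.Editor, and the regular expressions assume simple rulesets like the one shown here, with no nested braces or CSS comments. It maps the first selector of each ruleset to the value of `-gd-style-name`.

```python
import re

# Hypothetical helper (not a GroupDocs.Editor API).
# RULESET captures "selectors { declarations }" for flat rulesets only.
RULESET = re.compile(r"([^{}]+)\{([^{}]*)\}")
# GD_NAME captures the single-quoted value of the -gd-style-name property.
GD_NAME = re.compile(r"-gd-style-name\s*:\s*'([^']*)'")


def extract_style_names(css: str) -> dict:
    """Map each ruleset's first selector to its original style name."""
    names = {}
    for selectors, declarations in RULESET.findall(css):
        match = GD_NAME.search(declarations)
        if match:
            first_selector = selectors.split(",")[0].strip()
            names[first_selector] = match.group(1)
    return names


# Shortened copy of the "Heading 1" ruleset from this article.
sample_css = """
.Heading_1, .Quick-style, .BuiltIn-style, .Heading-style, .Paragraph-style {
    -aw-style-name: heading1;
    -gd-style-name: 'Heading 1';
    font-family: Cambria;
}
"""

print(extract_style_names(sample_css))
```

In real use the input would be the strings returned by `get_css_content()`; a full CSS parser would be a safer choice than regular expressions for stylesheets that contain comments or at-rules.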
The class selectors inside the grouped selector of this ruleset also store the style's parameters:

- The first class selector is a unique style name, adjusted to meet the "CSS identifier" requirements.
- The second class selector shows whether it is a quick style.
- The third class selector shows whether it is a built-in style or a user style.
- The fourth class selector shows whether it is a built-in heading style.
- The fifth class selector defines the style type: Character, Paragraph, List, or Table.

Here is an example of CSS markup that contains a custom user-defined character style named "Подзаголовок Знак":

```css
.Подзаголовок_Знак, .User-style, .Character-style {
    -gd-style-name: 'Подзаголовок Знак';
    font-family: Cambria;
    font-size: 12pt;
    font-weight: normal;
    font-style: italic;
    color: rgb(79, 129, 189);
    letter-spacing: 0.75pt;
}
```

Although the HTML markup has been modified slightly and the CSS markup has been modified drastically, the feature was designed so that it does not distort the representation of the document in the browser. When two edited documents are opened side by side in a web browser, there is no visual difference between them.

## Complete example

The CSS markup, which contains all the exported WordProcessing styles, can be obtained from the [EditableDocument](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance using the [`get_css_content()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/) method. It returns a list of stylesheet strings. The example below loads the sample document, edits it, and prints the number of exported stylesheets and their total size.
{{< tabs "code-example-styles-export">}}
{{< tab "export_styles.py" >}}

```python
import os
from groupdocs.editor import Editor, License


def export_styles():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        # Open the document for editing
        editable = editor.edit()

        # Obtain the CSS content with all exported WordProcessing styles
        css = editable.get_css_content()  # list of stylesheet strings

        print("Number of exported stylesheets:", len(css))
        print("Total CSS length, characters:", sum(len(sheet) for sheet in css))

        editable.dispose()


if __name__ == "__main__":
    export_styles()
```

{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/styles-export/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "export-styles.txt" >}}

```text
Number of exported stylesheets: 1
Total CSS length, characters: 34134
```

[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/styles-export/export_styles/export-styles.txt)
{{< /tab >}}
{{< /tabs >}}

---

## Extracting document metainfo

Path: /editor/python-net/extracting-document-metainfo/

> This demonstration shows and explains the usage of the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method, which extracts meta info from the document.

## Introduction

In some situations it is necessary to obtain meta info from a document before actually editing it. For example, the user wants to edit the last tab of a multi-tabbed spreadsheet but does not know how many tabs the document contains. Or it is unclear to the user whether the document is password-protected.
For such situations [**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) provides a [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method, which returns detailed meta info (metadata) about the specified document.

## Using the method

To obtain the meta info from a document, first load the document into the `Editor` class, then call [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo). This method accepts one optional parameter: the password as a string. If the document is encrypted and the user knows the password, it can be specified here. Otherwise the password can be omitted. The code example below demonstrates the usage:

```python
from groupdocs.editor import Editor

with Editor("document.docx") as editor:
    info_without_password = editor.get_document_info()
    info_with_password = editor.get_document_info(password="password")
```

There are several scenarios, depending on whether the document is encrypted and whether the user specified a password:

1. If a password is specified, but the document is not password-protected, or the document format doesn't support encryption at all, the password will be ignored.
2. If the document is password-protected, but a password is not specified, the [`PasswordRequiredException`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/passwordrequiredexception) will be thrown while calling [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo).
3. If the document is password-protected, and a password is specified, but it is incorrect, the [`IncorrectPasswordException`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/incorrectpasswordexception) will be thrown while calling [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo).

## Explaining the resulting type

The [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method returns a lightweight view of the document metadata. It supports `snake_case` attribute access as well as dict-style access for the underlying PascalCase keys. It contains the following properties:

1. `page_count`. A positive number: the page count for WordProcessing, PDF and XPS documents, the tab (worksheet) count for Spreadsheets, the slide count for Presentations, and `1` for pageless documents like XML or TXT.
2. `size`. The document size in bytes.
3. `is_encrypted`. A boolean flag that indicates whether the document is encrypted with a password or not. If the document is of a type that doesn't support encryption at all, like CSV or XML, this property always returns `False`.
4. `format`. Returns info about the format itself.

Internally GroupDocs.Editor provides a dedicated metadata type for every format family, all of which expose the four properties above:

1. [WordProcessingDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/wordprocessingdocumentinfo) — common for all WordProcessing family formats.
2. [SpreadsheetDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/spreadsheetdocumentinfo) — common for all Spreadsheet family formats.
3. [PresentationDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/presentationdocumentinfo) — common for all Presentation family formats.
4. [TextualDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/textualdocumentinfo) — common for all textual types, including all DSV (like CSV and TSV), XML, HTML, and plain text.
5. [FixedLayoutDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/fixedlayoutdocumentinfo) — common for all documents with a fixed-layout format; this includes only PDF and XPS.
6. [EmailDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/emaildocumentinfo) — common for all Email family formats, like EML, MSG, VCF, PST, MBOX and others.
7. [EbookDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/ebookdocumentinfo) — common for all eBook family formats like MOBI and ePub.
8. [MarkdownDocumentInfo](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/markdowndocumentinfo) — a special type dedicated to the Markdown (MD) textual format.

One important thing to note: if [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) returns `None`, the specified document is not supported by GroupDocs.Editor and thus cannot be opened for editing or saved.

## Explaining the document format

The metadata view contains a `format` property. The format descriptor indicates one particular document format and stores the format name, extension, and MIME type. It provides the following properties:

1. `name` — the name of the format.
2. `extension` — the format extension.
3. `mime` — the MIME type of the particular format.
4. `format_family` — the format family the format belongs to.

The format descriptors are grouped by family:

1. [WordProcessingFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/wordprocessingformats) — holds all formats from the WordProcessing family.
2. [SpreadsheetFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/spreadsheetformats) — holds all formats from the Spreadsheet family.
3. [PresentationFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/presentationformats) — holds all formats from the Presentation family.
4. [TextualFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/textualformats) — holds all formats with a text-based nature.
5. [FixedLayoutFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/fixedlayoutformats) — holds all formats from the fixed-layout family. This includes only PDF and XPS.
6. [EBookFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/ebookformats) — holds all eBook (Electronic book) formats like Mobi and ePub.
7. [EmailFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/emailformats) — holds all email (electronic mail) formats like EML and MSG.

## Complete code example

The example below loads a document, reads its metadata without performing a full edit pass, and prints the most useful fields.
{{< tabs "code-example-extracting-document-metainfo">}}
{{< tab "extracting_document_metainfo.py" >}}

```python
import os
from groupdocs.editor import Editor, License


def extracting_document_metainfo():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load the document and read its metadata
    with Editor("./sample-document.docx") as editor:
        info = editor.get_document_info()
        print("Format:", info.format.name)
        print("Extension:", info.format.extension)
        print("MIME:", info.format.mime)
        print("Pages:", info.page_count)
        print("Size, bytes:", info.size)
        print("Encrypted:", info.is_encrypted)


if __name__ == "__main__":
    extracting_document_metainfo()
```

{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/extracting-document-metainfo/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "extracting-document-metainfo.txt" >}}

```text
Format: Office Open XML WordProcessingML Macro-Free Document (DOCX)
Extension: docx
MIME: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Pages: 3
Size, bytes: 49455
Encrypted: False
```

[Download full output](/editor/python-net/_output_files/developer-guide/extracting-document-metainfo/extracting_document_metainfo/extracting-document-metainfo.txt)
{{< /tab >}}
{{< /tabs >}}

---

## Technical Support

Path: /editor/python-net/technical-support/

GroupDocs provides unlimited free technical support for all of its products. Support is available to all users, including those evaluating the product. The support is provided at the [Free Support Forum](https://forum.groupdocs.com/) and the [Paid Support Helpdesk](https://helpdesk.groupdocs.com/).

{{< alert style="info" >}}
Please note that GroupDocs does not provide technical support over the phone.
Phone support is only available for sales and purchase questions.
{{< /alert >}}

## GroupDocs Free Support Forum

If you need help with GroupDocs.Editor, consider the following:

* Make sure you are using the latest GroupDocs.Editor version before reporting an issue. See the [GroupDocs.Editor PyPI page](https://pypi.org/project/groupdocs-editor-net/) to find out about the latest version.
* Have a look through the forums, this documentation, and the API Reference before reporting an issue — perhaps your question has already been answered.
* Post your question at the [GroupDocs.Editor Free Support Forum](https://forum.groupdocs.com/c/editor), and we'll assist you. Questions are answered directly by the GroupDocs.Editor development team.
* When expecting a reply on the forums, please allow for time zone differences.

## Paid Support Helpdesk

The paid support issues have higher priority compared to free support requests.

* Post your question at the [Paid Support Helpdesk](https://helpdesk.groupdocs.com/) to set a higher priority for the issue.

## Report an Issue or Feature Request

When posting your issue, question, or feature request with GroupDocs.Editor, follow these simple steps to make sure it is resolved in the most efficient way:

* Include the original document and possibly the code snippet that is causing the problem. If you need to attach a few files, zip them into one. It is safe to attach your documents to the GroupDocs forums because only you and the GroupDocs developers will have access to the attached files.
* Add information about the environment you are facing the issue in.
* Try to report one issue per thread. If you have another issue, question, or feature request, please report it in a separate thread.
---

## Enabling inline CSS styles

Path: /editor/python-net/inline-styles/

In GroupDocs.Editor, editing means that the source document is converted to an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, which can emit HTML and CSS markup to be placed in the client-side WYSIWYG editor. The end user then edits the document in the web browser, and finally the document is saved, which converts the document content from HTML/CSS back to the original (or some other) format.

By default the HTML version of a document, which is obtained from the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) class, has its CSS styles placed in an external stylesheet file. The HTML file in this case contains a reference to the external stylesheet.

GroupDocs.Editor can enable the "inline styles" mode for all documents that belong to the [family of WordProcessing formats](https://docs.fileformat.com/word-processing/). In this mode only [WordProcessing styles are exported to the external stylesheet]({{< ref "editor/python-net/developer-guide/edit-document/edit-word/styles-export.md" >}}), while all other styles (text and paragraph formatting, list and table settings) are exported directly to the HTML markup. They become _inline CSS styles_, which means that the styles are saved in the ["style" HTML attribute](https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/style) of every HTML element. As a result, an HTML document generated with inline styles enabled has larger HTML markup and smaller CSS markup than one generated with the option disabled.

The inline styles option is represented as a public boolean property in the [`WordProcessingEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) class.
By default it is disabled (`False`).

```python
from groupdocs.editor.options import WordProcessingEditOptions

edit_options = WordProcessingEditOptions()
edit_options.use_inline_styles = True
```

The example below shows opening and editing a single WordProcessing document twice: the first time with default options, and then with inline styles enabled.

{{< tabs "code-example-inline-styles">}}
{{< tab "use_inline_styles.py" >}}

```python
import os
from groupdocs.editor import Editor, License
from groupdocs.editor.options import WordProcessingEditOptions


def use_inline_styles():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Default edit options, where all styles are external
    options_external_styles = WordProcessingEditOptions()

    # Custom edit options, where most of the styles are inline
    options_inline_styles = WordProcessingEditOptions()
    options_inline_styles.use_inline_styles = True

    with Editor("./sample-document.docx") as editor:
        doc_external = editor.edit(options_external_styles)
        doc_inline = editor.edit(options_inline_styles)

        # Get only the HTML-markup for both variants
        html_external = doc_external.get_content()
        html_inline = doc_inline.get_content()

        print("External styles HTML length:", len(html_external))
        print("Inline styles HTML length:", len(html_inline))

        doc_external.dispose()
        doc_inline.dispose()


if __name__ == "__main__":
    use_inline_styles()
```

{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/inline-styles/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "use-inline-styles.txt" >}}
```text
External styles HTML length: 31654
Inline styles HTML length: 48162
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/inline-styles/use_inline_styles/use-inline-styles.txt)
{{< /tab >}}
{{< /tabs >}}

It is worth emphasizing that no matter where the CSS styles are located (inside the HTML markup or in the external file), the final representation of the HTML document in the web browser is identical. Also note that this feature exists only for the WordProcessing format family.

---

## Edit PDF

Path: /editor/python-net/edit-pdf/

> This example demonstrates the standard open-edit-save pipeline with PDF documents, using different options at every step.

## Introduction

[PDF (Portable Document Format)](https://docs.fileformat.com/pdf/) documents, developed by Adobe, are widely used all over the Internet and in document management systems. The PDF format has a crucial distinction from other formats such as [DOCX](https://docs.fileformat.com/word-processing/docx/), [TXT](https://docs.fileformat.com/word-processing/txt/), or [HTML](https://docs.fileformat.com/web/html/)/[CSS](https://docs.fileformat.com/web/css/): it is a so-called fixed-layout format. The main purpose of PDF is to be platform-independent and to store the exact representation of a document, so that wherever and whenever the document is opened, it provides per-character and even per-pixel fidelity. This means that a document, once created, is "baked" in terms of its representation and editability. While you can freely edit any DOCX document by adding, removing, or moving any part of its content, PDF documents stay "frozen". Internally, a PDF document consists of pages, where every page contains a set of glyphs (visual characters), each with coordinates that specify where it is located on the page.
In summary:

- Editing PDF documents like ordinary DOCX, TXT, or HTML is an extremely difficult and complex task.
- The quality of editing a PDF document may come very close to what is achievable with ordinary text documents, but it will never be 100%, especially when the input PDF has complex formatting and content.
- Due to the complexity of the PDF format and the process of making it editable, this operation requires a lot of processing time and memory.

## In two words

Editing PDF documents follows the same steps as editing any other document:

1. Load a PDF document into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class with [`PdfLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfloadoptions), specifying a password if needed.
2. Edit the document using the [`editor.edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method with [`PdfEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfeditoptions) and obtain an instance of [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument).
3. Send the document content to the client-side, edit it there with a WYSIWYG editor, and send the modified (edited) content back to the server-side.
4. Create an instance of [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) with the modified content and call [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) using [`PdfSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfsaveoptions).

## Loading

The [`PdfLoadOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfloadoptions) class is responsible for loading PDF files into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor).
It has only one property: `password`, of a string type. By default it is `None`, meaning that no password is specified. This property is vital when the input document is encrypted with a password. If the document is not encrypted, the property value is ignored whether it was specified or not.

When the input PDF is not password-protected, `PdfLoadOptions` is not necessary at all: GroupDocs.Editor automatically detects the PDF format and applies the default `PdfLoadOptions` by itself. However, specifying even a default `PdfLoadOptions` speeds up document processing, because GroupDocs.Editor then does not spend processing time on the automatic format detection routine.

```python
from groupdocs.editor import Editor
from groupdocs.editor.options import PdfLoadOptions

# Create the default PDF loading options
load_options = PdfLoadOptions()

# Set a password
load_options.password = "some_password"

# Load a PDF without PDF load options
editor1 = Editor("protected.pdf")

# Load a PDF with PDF load options
editor2 = Editor("protected.pdf", load_options)
```

## Editing

Like for other format families in GroupDocs.Editor, there is a special [`PdfEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfeditoptions) class for editing PDF documents. The most useful properties are:

1. The `skip_images` boolean flag. By default it has a `False` value: images are not skipped and are preserved. However, if you need only textual information from the document, you can set this flag to `True`.
2. The `enable_pagination` boolean flag. This flag sets the document conversion mode: **float** (`False`, the default) or **paginal** (`True`). When the float mode is selected, the document content is converted to a pageless (float) HTML document. When the paginal mode is selected, the pages of the document are preserved in the generated HTML document, like in a PDF viewer.
3.
   The `pages` property, which allows setting a page range that should be processed. By default all pages of the input document are processed.

If a default `PdfEditOptions` instance is acceptable for you, you may omit creating `PdfEditOptions` at all: just call the parameterless [`editor.edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) overload and GroupDocs.Editor will internally generate and apply the default `PdfEditOptions` for the input PDF document.

The runnable example below loads a PDF document, edits it with adjusted `PdfEditOptions`, and obtains the HTML content from the resultant [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument).

{{< tabs "code-example-edit-pdf">}}
{{< tab "edit_pdf.py" >}}
```python
import os
from groupdocs.editor import Editor, License
from groupdocs.editor.options import PdfLoadOptions, PdfEditOptions

def edit_pdf():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Prepare PDF load options (optional; lets the Editor skip automatic format detection)
    load_options = PdfLoadOptions()

    # Load an input PDF document into the Editor
    with Editor("./sample-document.pdf", load_options) as editor:
        # Create and adjust the PDF edit options
        edit_options = PdfEditOptions()
        edit_options.enable_pagination = True
        edit_options.skip_images = False

        # Edit the PDF and obtain an EditableDocument
        editable = editor.edit(edit_options)

        # Obtain the HTML content (in practice it is sent to the WYSIWYG-editor)
        content = editable.get_content()
        print("Generated HTML content length:", len(content))

        editable.dispose()

if __name__ == "__main__":
    edit_pdf()
```
{{< /tab >}}
{{< tab "sample-document.pdf" >}}
{{< tab-text >}}
`sample-document.pdf` is the sample file used in this example.
Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-pdf/sample-document.pdf) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edit-pdf.txt" >}}
```text
Generated HTML content length: 110427
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-pdf/edit_pdf/edit-pdf.txt)
{{< /tab >}}
{{< /tabs >}}

## Saving

Like for other document formats, there is a special class responsible for saving PDF documents: the [`PdfSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/pdfsaveoptions) class. It has the following properties:

1. `password`: allows you to protect the output PDF document with a specified password. By default it is `None`, so password protection is not applied.
2. `compliance`: allows setting the PDF standards compliance level for the output PDF.
3. `optimize_memory_usage`: a boolean flag that modifies the generation of the output PDF document so that the process takes less memory at the cost of longer processing time. By default it has a `False` value.
4. `font_embedding`: responsible for embedding font resources into the resultant PDF document.

Unlike `PdfLoadOptions` and `PdfEditOptions`, which are optional, `PdfSaveOptions` is mandatory even if all its values are default.
After editing, an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) is created from the modified content and is then passed, together with the save options, to the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method:

```python
from groupdocs.editor import Editor, EditableDocument
from groupdocs.editor.options import PdfEditOptions, PdfSaveOptions

with Editor("./sample-document.pdf") as editor:
    original = editor.edit(PdfEditOptions())

    # Send the content to the WYSIWYG-editor and obtain the edited content (omitted here)
    edited = EditableDocument.from_markup(original.get_embedded_html())

    save_options = PdfSaveOptions()
    save_options.password = "some_password"
    save_options.optimize_memory_usage = True

    editor.save(edited, "./edited-document.pdf", save_options)

    original.dispose()
    edited.dispose()
```

## Different output formats

Keep in mind that after editing an input PDF you do not have to save it in the PDF format: you are free to choose any compatible format, such as any WordProcessing format, the text format, or eBook formats.
```python
from groupdocs.editor import Editor, EditableDocument
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import PdfSaveOptions, WordProcessingSaveOptions, TextSaveOptions

with Editor("./sample-document.pdf") as editor:
    edited = editor.edit()

    # Save to PDF
    editor.save(edited, "./edited-document.pdf", PdfSaveOptions())

    # Save to DOCX
    editor.save(edited, "./edited-document.docx", WordProcessingSaveOptions(WordProcessingFormats.DOCX))

    # Save to TXT
    editor.save(edited, "./edited-document.txt", TextSaveOptions())

    edited.dispose()
```

## Obtaining PDF document info

The article [Extracting document metainfo]({{< ref "editor/python-net/developer-guide/extracting-document-metainfo.md" >}}) describes the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method, which allows you to detect the document format and extract its metadata without editing it. This mechanism also works with PDF documents.

When [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) is called for an [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) instance that is loaded with a PDF document, the method returns a metadata view corresponding to the `FixedLayoutDocumentInfo` type, a common type for all fixed-layout documents, PDF and XPS in particular. It exposes the `format`, `page_count`, `size`, and `is_encrypted` properties. If the input PDF is encrypted, its correct password should be specified in the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method.
```python
from groupdocs.editor import Editor

with Editor("./sample-document.pdf") as editor:
    info = editor.get_document_info()
    print("Format:", info.format.name)
    print("Page count:", info.page_count)
    print("Size:", info.size)
    print("Is encrypted:", info.is_encrypted)
```

---

## Output format and password

Path: /editor/python-net/output-format-and-password/

The [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class is responsible for tuning the saving process, in which an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance that contains already edited document content is converted to an output document of some WordProcessing format. In other words, the [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class specifies how exactly the edited HTML should be saved to the output document.

Unlike load and edit options, the [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class has one mandatory constructor parameter, `output_format`. The `output_format` parameter has a [`WordProcessingFormats`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/wordprocessingformats) type. In turn, this type contains all formats from the [family of WordProcessing formats](https://docs.fileformat.com/word-processing/) that are supported for saving documents. Each field of [`WordProcessingFormats`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/wordprocessingformats) represents one separate WordProcessing format: DOC, DOCX, RTF, ODT, etc.
So, with the `output_format` parameter the user selects the specific format of the output WordProcessing document that the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method should generate.

```python
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

save_options = WordProcessingSaveOptions(WordProcessingFormats.RTF)
```

This constructor parameter is mandatory because a [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) instance cannot be created without an output format; otherwise GroupDocs.Editor could not "know" which specific format the document should be saved in. However, the output format can be changed later, after creating an instance of the [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class, with the `output_format` property. This property is also readable, so the output format specified earlier in the constructor can be obtained from it.

```python
save_options.output_format = WordProcessingFormats.DOCX
```

Almost all WordProcessing formats, especially those with a binary nature, support file encryption with a password. If such a document is encrypted, a password must be specified to open it. The [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) class has the following property for setting a password:

```python
save_options.password = "p@ss"
```

By default the value of this property is `None`, which means that a password will not be applied. If the user specifies a string in this property, the output document will be encrypted and protected with this string as a password.
If the user has set a password in this property but later wants to remove it and leave the document unencrypted, they can set the value to `None` or an empty string; both values are interpreted as "do not set a password" when calling the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method.

## Complete example

The example below loads the sample document, edits it, and saves the result to the RTF format encrypted with a password.

{{< tabs "code-example-output-format-and-password">}}
{{< tab "set_output_format_and_password.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

def set_output_format_and_password():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        original = editor.edit()
        modified = EditableDocument.from_markup(original.get_embedded_html())

        # Select the output format and protect the document with a password
        save_options = WordProcessingSaveOptions(WordProcessingFormats.RTF)
        save_options.password = "p@ss"

        editor.save(modified, "./protected-output.rtf", save_options)
        print("Saved a password-protected RTF document")

        original.dispose()
        modified.dispose()

if __name__ == "__main__":
    set_output_format_and_password()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/output-format-and-password/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "protected-output.rtf" >}}
```text
Binary file (RTF, 1.6 MB)
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/output-format-and-password/set_output_format_and_password/protected-output.rtf)
{{< /tab >}}
{{< /tabs >}}

---

## Generating page preview for WordProcessing document

Path: /editor/python-net/generating-page-preview/

GroupDocs.Editor for Python via .NET can generate a preview for an arbitrary page in a loaded WordProcessing document, such as DOC, DOCX, DOCM, RTF, ODT, etc. The preview is generated in the SVG format, a vector graphics format that is supported by numerous image viewers and by any modern browser. Using this feature, users can view and inspect any page of the imported WordProcessing document without actually editing it. The preview cannot be edited by GroupDocs.Editor itself; it can only be obtained in SVG format and saved as usual, to a byte stream or a file.

This feature works regardless of the licensing mode of GroupDocs.Editor: it behaves the same in both trial and licensed modes, and there are no trial limitations for it. While generating a page preview, GroupDocs.Editor does not deduct any consumed bytes or credits.

With this feature, GroupDocs.Editor supports generating a preview for all three major office format families: WordProcessing, Spreadsheet and Presentation.

To generate a page preview for a particular WordProcessing document, the user must perform the following steps:

- Load the desired WordProcessing file into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/) class.
- Obtain the document info via the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/) method (specifying a password if the document is protected).
- Because the loaded file has a format from the WordProcessing family (like DOCX, RTF, etc.), the obtained document info exposes the page count and a method to generate a preview for a given page.
- Invoke the preview method and specify the zero-based _index_ of the desired page (not to be confused with page _numbers_, which are 1-based). If the specified index is less than 0, or is equal to or greater than the number of pages in the document, an out-of-range error will be raised.
- The preview method returns the page preview as an SVG vector image, which has all the necessary methods and properties to obtain the SVG content in any desired form, save it to disk or a stream, and so on.

The snippet below illustrates the page preview workflow. Because the page-preview API surface is illustrative here, treat this as a reference outline rather than a copy-paste runnable script:

```python
from groupdocs.editor import Editor

with Editor("./document.docx") as editor:
    # Obtain document info for the loaded WordProcessing file
    info = editor.get_document_info()

    # Get the number of all pages
    pages_count = info.page_count

    # Iterate through all pages and generate a preview on every iteration
    for page_index in range(pages_count):
        # Generate one preview as an SVG image by page index
        one_svg_preview = info.generate_preview(page_index)

        # Save the SVG preview to a file
        one_svg_preview.save("page-{0}.svg".format(page_index))
```

All WordProcessing formats are able to store raster images. When a page of a loaded document contains one or more raster images and an SVG preview is generated for that page, these raster images are embedded inside the SVG in base64 format using the [data URI scheme](https://en.wikipedia.org/wiki/Data_URI_scheme).

If the end-user needs a preview of the page in a raster format instead of a vector one, the returned SVG image object can convert its SVG content to the PNG format and save it.
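To make the embedding mechanism concrete, the sketch below builds a data URI from raw image bytes using only the Python standard library. It is not GroupDocs.Editor code: the helper name `to_data_uri` and the sample bytes are hypothetical, and the exact markup that GroupDocs.Editor writes into the SVG is not shown here.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:{0};base64,{1}".format(mime_type, encoded)


# The eight-byte PNG signature stands in for a real raster image here
png_signature = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_signature)
print(uri)
```

A URI of this form can be used as the `href` value of an SVG `image` element, which is what lets a single SVG file carry its raster images without external files.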
## Complete example

The runnable example below performs the safe core workflow: loading the sample document, opening it for editing, and obtaining the HTML content (the page-preview API itself is shown illustratively above).

{{< tabs "code-example-generating-page-preview">}}
{{< tab "generate_page_preview.py" >}}
```python
import os
from groupdocs.editor import Editor, License

def generate_page_preview():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        # Open the document for editing and obtain its HTML representation
        editable = editor.edit()
        html = editable.get_content()
        print("Loaded WordProcessing document, HTML content length:", len(html))

        editable.dispose()

if __name__ == "__main__":
    generate_page_preview()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-word/generating-page-preview/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "generate-page-preview.txt" >}}
```text
Loaded WordProcessing document, HTML content length: 31654
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-word/generating-page-preview/generate_page_preview/generate-page-preview.txt)
{{< /tab >}}
{{< /tabs >}}

---

## How to edit CSV file

Path: /editor/python-net/how-to-edit-csv-file/

> This demonstration explains all the necessary details and options for processing DSV (Delimiter-Separated Values) spreadsheets, such as CSV and others.

## Introduction

DSV (Delimiter-Separated Values) files are a specific form of text-based spreadsheets with delimiters (separators).
Due to their nature, [**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) processes this class of documents separately from usual binary spreadsheets. In contrast to usual spreadsheets, DSV documents, being plain text, have only a single tab (worksheet) and cannot be encrypted. Any non-empty string may be treated as a separator, so the user always needs to specify it explicitly. The most common types of DSV are:

1. CSV (Comma-Separated Values)
2. TSV (Tab-Separated Values)
3. Semicolon-separated values
4. Whitespace-separated values
5. ...and any others

GroupDocs.Editor supports DSV with any separator, which can be a single character or a set of characters (string).

## Loading a CSV file for edit

Unlike WordProcessing and Spreadsheet documents, DSV documents are loaded into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class without any loading options. They are simple text files by their nature, so there is nothing to adjust:

```python
from groupdocs.editor import Editor

editor = Editor("spreadsheet.csv")
```

## Edit a CSV file

To open any DSV document for editing, the user must use the [`DelimitedTextEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/delimitedtexteditoptions) class, whose single constructor has one mandatory parameter: a separator (delimiter) string, which must not be `None` or an empty string. There are also several optional properties:

- `convert_date_time_data` and `convert_numeric_data` are boolean flags that indicate how to treat digits within cells. GroupDocs.Editor can recognize such cell content and treat it as date-time values or numbers, respectively. By default this recognition is disabled, but the user can turn it on.
- `treat_consecutive_delimiters_as_one` is a boolean flag that determines how consecutive delimiters should be treated: as several (default, `False`) or as a single one (`True`).
- `optimize_memory_usage` is a boolean flag with a different purpose. By default, GroupDocs.Editor algorithms are tuned for maximum performance. However, in some rare cases the user may need to load a very large DSV file. Enabling this flag switches GroupDocs.Editor to other processing algorithms, which consume less memory at the cost of lower performance.

The runnable example below loads a CSV file, opens it for editing with a comma as the delimiter and numeric recognition enabled, and then saves the content to a TSV file (a DSV with a tab separator).

{{< tabs "code-example-how-to-edit-csv-file">}}
{{< tab "edit_csv.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.options import DelimitedTextEditOptions, DelimitedTextSaveOptions

def edit_csv():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load an input CSV file into the Editor
    with Editor("./cars.csv") as editor:
        # Create the DSV edit options with a comma separator
        edit_options = DelimitedTextEditOptions(",")
        edit_options.convert_numeric_data = True
        edit_options.treat_consecutive_delimiters_as_one = True

        # Edit the CSV document and obtain an EditableDocument
        editable = editor.edit(edit_options)

        # Rebuild a document from the HTML content; here it is passed through unchanged
        # (in practice the content is modified in a WYSIWYG-editor first)
        html = editable.get_content()
        edited = EditableDocument.from_markup(html)

        # Create DSV save options with a tab separator (TSV)
        save_options = DelimitedTextSaveOptions("\t")
        save_options.trim_leading_blank_row_and_column = True
        save_options.keep_separators_for_blank_row = False

        # Save the edited content to the TSV format
        editor.save(edited, "./edited-cars.tsv", save_options)

        editable.dispose()
        edited.dispose()

if __name__ == "__main__":
    edit_csv()
```
{{< /tab >}}
{{< tab "cars.csv" >}}
{{< tab-text >}}
`cars.csv` is the sample file used in
this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/how-to-edit-csv-file/cars.csv) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edited-cars.tsv" >}}
```text
1997 Ford E350 ac, abs, moon 3000
1999 Chevy Venture «Extended Edition» 4900
1996 Jeep Grand Cherokee MUST SELL! air, moon roof, loaded 4799
2014 SsangYong Kyron good car! 1998
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/how-to-edit-csv-file/edit_csv/edited-cars.tsv)
{{< /tab >}}
{{< /tabs >}}

## Save a CSV file after editing

After being edited, an input DSV can be saved back to a DSV (not necessarily with the same separator) or to any supported Spreadsheet document.

To save a document to the DSV format, the user must use the [`DelimitedTextSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/delimitedtextsaveoptions) class, which, like [`DelimitedTextEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/delimitedtexteditoptions), has one constructor with a mandatory string parameter: a separator (delimiter), which must not be `None` or an empty string. There are also other properties:

1. `encoding`: allows specifying the encoding of the generated DSV. By default, if not specified, it is UTF-8.
2. `trim_leading_blank_row_and_column`: a boolean flag that indicates whether leading blank rows and columns should be trimmed, as Microsoft Excel does.
3. `keep_separators_for_blank_row`: a boolean flag that indicates whether separators should be output for a blank row. The default value is `False`, which means the content for a blank row will be empty.

The edited content can also be saved to a Spreadsheet format such as XLSM.
For this, a [`SpreadsheetSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/spreadsheetsaveoptions) instance with the desired output format is used:

```python
from groupdocs.editor.formats import SpreadsheetFormats
from groupdocs.editor.options import SpreadsheetSaveOptions

# `editor` and `edited` are the objects created in the edit_csv.py example
xlsm_save_options = SpreadsheetSaveOptions(SpreadsheetFormats.XLSM)
editor.save(edited, "./edited-cars.xlsm", xlsm_save_options)
```

---

## How to edit XML file

Path: /editor/python-net/edit-xml/

> This example demonstrates opening, editing, and saving XML documents, using different options and adjustments.

## Introduction

GroupDocs.Editor supports importing documents in the [XML (eXtensible Markup Language)](https://docs.fileformat.com/web/xml/) format. This article describes the XML processing mechanism and the available editing options.

## Loading XML documents

Loading XML documents into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class works the same as for other formats. There are no dedicated load options for the XML format; it is enough to specify the file itself through a file path or a byte stream. When loading through a file path, the file extension does not matter, so you may freely load an XML file not only with the `*.xml` extension, but with any other extension such as `*.csproj` or `*.svg`; only a valid internal structure matters. Also note that HTML files cannot be treated as XML; only [XHTML](https://en.wikipedia.org/wiki/XHTML) can be treated as valid XML.
```python
from groupdocs.editor import Editor

# Load from a file path
editor_from_path = Editor("sample.xml")

# Load from a binary stream
with open("sample.xml", "rb") as stream:
    editor_from_stream = Editor(stream)
```

## Editing XML documents

Like for other format families in GroupDocs.Editor, there is a special [`XmlEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/xmleditoptions) class for editing XML documents. As always, it is not mandatory when editing a document, so the parameterless [`editor.edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) overload may be used: GroupDocs.Editor will automatically detect the format and apply the default options.

The [`XmlEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/xmleditoptions) class has a number of properties. The most useful and important ones are described below:

- `encoding`: allows setting the encoding that is applied while opening the input XML file (any XML is first of all a text file). By default all XML files are UTF-8, so the default value of this option is also UTF-8.
- `fix_incorrect_structure`: a boolean flag. GroupDocs.Editor can handle any XML document without error, even one that is corrupted, truncated, or has an invalid structure. If `fix_incorrect_structure` is enabled (`True`), GroupDocs.Editor scans the XML document and tries to fix its structure: it escapes prohibited characters, properly closes unclosed tags, opens unopened tags, fixes overlapping tags, and so on. By default it is disabled (`False`).
- `recognize_uris`: a boolean flag that enables the mechanism of recognizing and preparing URIs (web addresses). By default it is disabled (`False`). When enabled (`True`), GroupDocs.Editor scans the XML document for any valid URIs and represents them as external links in the resultant HTML using the A element.
- `recognize_emails` — a boolean flag, very similar to `recognize_uris`, but for email addresses. By default it is disabled. When enabled (`True`), all valid email addresses are represented with the [mailto](https://en.wikipedia.org/wiki/Mailto) scheme and the A element. - `trim_trailing_whitespaces` — a boolean flag that enables truncation of trailing whitespaces in text nodes. By default it is disabled (`False`) — trailing whitespaces are preserved. - `attribute_values_quote_type` — allows redefining the quote type used in attribute values in the resultant HTML (single quote or double quote). By default double quotes are used. The runnable example below loads an XML file, edits it with adjusted boolean options, and obtains the resulting HTML content. {{< tabs "code-example-edit-xml">}} {{< tab "edit_xml.py" >}} ```python import os from groupdocs.editor import Editor, License from groupdocs.editor.options import XmlEditOptions def edit_xml(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Load an input XML file into the Editor with Editor("./sample.xml") as editor: # Create and adjust the XML edit options edit_options = XmlEditOptions() edit_options.fix_incorrect_structure = True edit_options.recognize_uris = True edit_options.recognize_emails = True edit_options.trim_trailing_whitespaces = True # Edit the XML document and obtain an EditableDocument editable = editor.edit(edit_options) # Obtain the HTML content (in practice it is sent to the WYSIWYG-editor) content = editable.get_content() print("Generated HTML content length:", len(content)) editable.dispose() if __name__ == "__main__": edit_xml() ``` {{< /tab >}} {{< tab "sample.xml" >}} {{< tab-text >}} `sample.xml` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-xml/sample.xml) to download it. 
{{< /tab-text >}} {{< /tab >}} {{< tab "edit-xml.txt" >}} ```text Generated HTML content length: 9345 ``` [Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-xml/edit_xml/edit-xml.txt) {{< /tab >}} {{< /tabs >}} The resulting [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) may be passed to the WYSIWYG-editor or any other HTML editing software, or simply saved to disk with the [`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/save) method: ```python with Editor("./sample.xml") as editor: edited = editor.edit() edited.save("./edited.html") edited.dispose() ``` ## Advanced highlight and format options The [`XmlEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/xmleditoptions) class also has two compound properties, `highlight_options` and `format_options`, which are wrappers around the `XmlHighlightOptions` and `XmlFormatOptions` types respectively. An already created instance is set in each of these properties, and only the members of those instances are meant to be changed — a new instance cannot be assigned. `highlight_options` controls the fonts (name, size, color, weight, style, and decoration) used to represent XML tags, attribute names, attribute values, inner text, HTML comments, and CDATA sections in the resultant HTML. `format_options` controls how the XML hierarchy is laid out — whether each attribute goes on a new line, whether leaf text nodes go on a new line, and the size of the left indent per nesting level. These properties operate on complex CSS-related value types, so adjust their members carefully. 
The snippet below is a schematic, non-runnable illustration of accessing these compound properties: ```python from groupdocs.editor.options import XmlEditOptions edit_options = XmlEditOptions() # Access the already-created compound sub-options (do not assign a new instance) highlight_options = edit_options.highlight_options format_options = edit_options.format_options # Adjust members of the compound options using CSS-related value types # (see the API reference for the exact value types and their constructors) ``` ## Getting document metainfo The article [Extracting document metainfo]({{< ref "editor/python-net/developer-guide/extracting-document-metainfo.md" >}}) describes the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method, which allows you to detect the document format and extract its metadata without editing it. The XML format is supported as well. When [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) is called for an [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) instance loaded with an XML document, the method returns a metadata view corresponding to the `TextualDocumentInfo` type — a common type for all document formats of a textual nature, like HTML, XML, and TXT. It exposes the `format`, `page_count`, `size`, `is_encrypted`, and `encoding` properties. For XML documents `page_count` always returns 1 and `is_encrypted` always returns `False`. ```python from groupdocs.editor import Editor with Editor("./sample.xml") as editor: info = editor.get_document_info() print("Format:", info.format.name) print("Page count:", info.page_count) print("Size:", info.size) print("Is encrypted:", info.is_encrypted) ``` --- ## How to edit e-Book file Path: /editor/python-net/edit-ebook/ ## Introduction GroupDocs.Editor for Python via .NET supports 3 formats from the e-Book family: 1. 
[MOBI](https://docs.fileformat.com/ebook/mobi/) (MobiPocket), 2. [AZW3](https://docs.fileformat.com/ebook/azw3/), also known as Kindle Format 8 (KF8), 3. [ePub](https://docs.fileformat.com/ebook/epub/) (Electronic Publication). All three formats are fully supported on both import (load) and export (save). ## Load e-Book files for edit GroupDocs.Editor for Python via .NET does not contain loading options either for the whole e-Book formats family or for the specific e-Book formats — users should specify e-Books through a file path or a byte stream without any loading options at all. ```python from groupdocs.editor import Editor # Load from a file path editor_from_path = Editor("book.epub") # Load from a binary stream with open("book.epub", "rb") as stream: editor_from_stream = Editor(stream) ``` ## Edit e-Book files There is a common edit options class for the whole e-Book formats family — the [`EbookEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebookeditoptions) class. The content of this class resembles the content of the [`WordProcessingEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) class, because [`EbookEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebookeditoptions) contains a subset of options from it — `enable_pagination` and `enable_language_information` — and, as in [`WordProcessingEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions), they are disabled (`False`) by default. - `enable_pagination` — allows enabling or disabling pagination in the resultant HTML document. By default it is disabled (`False`). 
This option controls how exactly the content of the e-Book will be converted to the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) representation while edited — in the _float_ (`False`) or in the _paged_ (`True`) mode. - `enable_language_information` — allows exporting (`True`) or not exporting (`False`) the language information to the resultant HTML markup. By default it is disabled (`False`). This is useful when an e-Book contains text in different languages, and you want to preserve this language-specific metainformation while editing the document in the WYSIWYG-editor. Like for all supported document formats, the [`EbookEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebookeditoptions) are optional, and the user may call the parameterless [`edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method — in this case the default [`EbookEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebookeditoptions) is implicitly applied. ## Save e-Book files after edit Saving e-Books is performed like for all other formats. When the e-Book content was edited by the client in the WYSIWYG-editor and sent back to the server-side, it should be passed to the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument), and then this instance should be passed to the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method. For saving in one of the e-Book formats the [`EbookSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebooksaveoptions) class should be used. This class is common for all supported e-Book formats within the e-Book family: MOBI, AZW3, and ePub. 
It has one constructor with a mandatory parameter: the desired output format, which should be specified as one of the [`EBookFormats`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/ebookformats) values (`MOBI`, `AZW3`, or `EPUB`). Once the instance is created, this format can be obtained and changed using the `output_format` property.

The [`EbookSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebooksaveoptions) class also has two other properties:

- `split_heading_level` of integer type controls how (if at all) to split the content of the e-Book into packages in the resultant file. It does not affect how the file looks when opened in an e-Book reader; it only affects the internal structure of the e-Book file. The default value is `2`. Setting it to `0` disables splitting.
- `export_document_properties` of boolean type controls whether to export built-in and custom document properties inside the resultant e-Book file. The default `False` value disables exporting document properties, so the resultant document will be a bit smaller in size.

The runnable example below loads an ePub file, edits it with pagination and language information enabled, and saves the edited version back to the ePub format.
{{< tabs "code-example-edit-ebook">}}
{{< tab "edit_ebook.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.formats import EBookFormats
from groupdocs.editor.options import EbookEditOptions, EbookSaveOptions

def edit_ebook():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load an input ePub file into the Editor
    with Editor("./sample-ebook.epub") as editor:
        # Create and adjust the e-Book edit options
        edit_options = EbookEditOptions()
        edit_options.enable_pagination = True
        edit_options.enable_language_information = True

        # Edit the e-Book and obtain an EditableDocument
        editable = editor.edit(edit_options)

        # Edit the content programmatically (in practice this is done in a WYSIWYG-editor)
        html = editable.get_embedded_html()
        # Append a marker to the first paragraph
        # (the "</p>" literal is an assumed reconstruction; adjust it to the actual markup)
        edited = EditableDocument.from_markup(html.replace("</p>", " (edited)</p>", 1))

        # Create ePub save options and tune them
        save_options = EbookSaveOptions(EBookFormats.EPUB)
        save_options.export_document_properties = True
        save_options.split_heading_level = 3

        # Save the edited document back to the ePub format
        editor.save(edited, "./edited-ebook.epub", save_options)

        editable.dispose()
        edited.dispose()

if __name__ == "__main__":
    edit_ebook()
```
{{< /tab >}}
{{< tab "sample-ebook.epub" >}}
{{< tab-text >}}
`sample-ebook.epub` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-ebook/sample-ebook.epub) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edited-ebook.epub" >}}
```text
Binary file (EPUB, 29 KB)
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-ebook/edit_ebook/edited-ebook.epub)
{{< /tab >}}
{{< /tabs >}}

The same approach is used to save into the AZW3 or MOBI formats: create the [`EbookSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebooksaveoptions) instance with `EBookFormats.AZW3` or `EBookFormats.MOBI` in the constructor.

## Extracting metainfo from e-Book files

Like for all supported formats, GroupDocs.Editor for Python via .NET provides the ability to detect document metainfo for all supported e-Book formats by using the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method of the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class. When a valid e-Book is loaded into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) instance, [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) returns a metadata view corresponding to the `EbookDocumentInfo` type, which defines the `format`, `page_count`, `size`, and `is_encrypted` properties.
- The `format` property returns an [`EBookFormats`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/ebookformats) value, which for e-Books can be `MOBI`, `AZW3`, or `EPUB`. - The `page_count` property returns an **approximate** number of pages in the case of MOBI or AZW3, or the number of chapters in the case of ePub. The returned number should be treated carefully and approximately. - The `size` property returns the number of bytes of the e-Book file. - The `is_encrypted` property always returns `False`, because e-Books cannot be encrypted with a password. ```python from groupdocs.editor import Editor with Editor("./sample-ebook.epub") as editor: info = editor.get_document_info() print("Format:", info.format.name) print("Page count:", info.page_count) print("Size:", info.size) ``` --- ## How to edit Mobi file Path: /editor/python-net/how-to-edit-mobi-file/ ## Introduction Mobi format is an E-Book format, developed by the French company MobiPocket and based on XML. E-Books in this format can contain text with rich formatting, images, and different annotations like bookmarks, notes, highlights, corrections and so on. Mobi books can have DRM protection. GroupDocs.Editor for Python via .NET is able to open (load) Mobi documents for editing, edit them, and save (export) documents back to the Mobi format, so Mobi is fully supported: on import, export and auto-detection. The closely related [AZW3 format](https://docs.fileformat.com/ebook/azw3/), also known as Kindle Format 8 (KF8), which may be considered a successor to Mobi, is supported on both import and export as well. ## Loading a Mobi file for editing Despite being a distinct format, which doesn't belong to any of the existing format families, Mobi has no dedicated loading options. 
So for loading it into an instance of the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class, users should simply specify a path to the Mobi file or a stream with its content in the constructor: ```python from groupdocs.editor import Editor # Load from a file path editor_from_path = Editor("book.mobi") # Load from a binary stream with open("book.mobi", "rb") as stream: editor_from_stream = Editor(stream) ``` There are no loading options, because Mobi has nothing to tune up during loading — it cannot have password protection, and can be processed only in one way. ## Editing a Mobi file Because Mobi belongs to the e-Book format family, it uses common edit options for all e-Book formats — the [`EbookEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebookeditoptions). This class may be described as a truncated version of the [`WordProcessingEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) class, because [`EbookEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebookeditoptions) contains a subset of options from [`WordProcessingEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions) — `enable_pagination` and `enable_language_information` and, as in [`WordProcessingEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions), they are disabled (`False`) by default. They have exactly the same meaning as their "siblings" from [`WordProcessingEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingeditoptions). `enable_pagination` allows switching between float and paginal mode in the resultant HTML document. `enable_language_information` allows enabling the export of language information in HTML. 
This is very useful for books, which have parts of text written in different languages. An example of usage is below (let's assume that an [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) instance with a loaded Mobi document is already created): ```python from groupdocs.editor.options import EbookEditOptions edit_options = EbookEditOptions() edit_options.enable_pagination = True edit_options.enable_language_information = True opened = editor.edit(edit_options) # save it or pass it to the WYSIWYG-editor ``` ## Saving a Mobi file after editing For all the format families the saving procedure is the same — it is required to obtain the content of the edited document on the server-side, create an instance of the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) from it, and then pass it to the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method. Because the Mobi format belongs to the e-Book formats family, in order to save a document in the Mobi format an [`EbookSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebooksaveoptions) class is required. This class has a mandatory constructor with a single parameter — the desired [e-Book format](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/ebookformats/). For Mobi it should be `EBookFormats.MOBI`. All other parameters in the `EbookSaveOptions` class are in fact optional, but are nevertheless described below: - `split_heading_level` of integer type controls the internal structure of the generated Mobi file: whether its internal content is divided into packages and, if yes, then how. This parameter exists because the content of a Mobi file is stored in packages, usually one package per chapter. By default this parameter is `2` — the most optimal. 
- `export_document_properties` of boolean type also relates to the internal structure of the Mobi file — it decides whether to embed the built-in and custom document properties inside the resultant Mobi file (`True`) or not (`False`). By default it is `False` — do not embed. Concluding: in order to save the edited document in the Mobi format, the user should create an instance of the [`EbookSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/ebooksaveoptions) class with the `EBookFormats.MOBI` argument in the constructor parameter, and then pass this instance to the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method along with the other arguments. The code example below demonstrates the full round-trip: opening a Mobi document, editing it, and saving the edited version to the Mobi format. {{< tabs "code-example-working-with-mobi-documents">}} {{< tab "edit_mobi_document.py" >}} ```python import os from groupdocs.editor import Editor, EditableDocument, License from groupdocs.editor.formats import EBookFormats from groupdocs.editor.options import EbookEditOptions, EbookSaveOptions def edit_mobi_document(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Create an Editor instance and load an input Mobi file with Editor("./sample-ebook.mobi") as editor: # Create edit options for the e-Book family edit_options = EbookEditOptions() edit_options.enable_pagination = True edit_options.enable_language_information = True # Edit the Mobi document and obtain an EditableDocument editable = editor.edit(edit_options) # Edit the content programmatically (in practice this is done in a WYSIWYG-editor) html = editable.get_embedded_html() edited = EditableDocument.from_markup(html.replace("Title of the document", "Title of the edited document")) # Create Mobi save options and tune them 
save_options = EbookSaveOptions(EBookFormats.MOBI) save_options.export_document_properties = True # Save the edited document in the Mobi format editor.save(edited, "./edited-ebook.mobi", save_options) editable.dispose() edited.dispose() if __name__ == "__main__": edit_mobi_document() ``` {{< /tab >}} {{< tab "sample-ebook.mobi" >}} {{< tab-text >}} `sample-ebook.mobi` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/working-with-mobi-documents/sample-ebook.mobi) to download it. {{< /tab-text >}} {{< /tab >}} {{< tab "edited-ebook.mobi" >}} ```text Binary file (MOBI, 28 KB) ``` [Download full output](/editor/python-net/_output_files/developer-guide/working-with-mobi-documents/edit_mobi_document/edited-ebook.mobi) {{< /tab >}} {{< /tabs >}} ## Detecting a Mobi file As for documents of all supported types, Mobi documents can be detected using the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method of the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class. When a valid Mobi document was loaded into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) instance, [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) returns a metadata view corresponding to the [`EbookDocumentInfo`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.metadata/ebookdocumentinfo) type, which defines four properties: `format`, `page_count`, `size`, and `is_encrypted`. - The `format` property returns an [`EBookFormats`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/ebookformats) value, which for files with a `*.mobi` extension can be `MOBI` or `AZW3`. 
- The `page_count` property returns an **approximate** number of pages in the case of MOBI or AZW3, or the number of chapters in the case of ePub. For Mobi it is approximate, because the Mobi format internally is a set of HTML documents (chapters), which are not separated into pages and have no strict page dimensions. So, for returning a page count for a Mobi document, GroupDocs.Editor assumes a standard A4 page size in portrait orientation, splits the existing document content onto such "papers", and then calculates the count. The returned number should be treated very carefully and approximately. - The `size` property returns the number of bytes of the Mobi document. - The `is_encrypted` property always returns `False`, because Mobi documents cannot be encrypted with a password, like PDF or Office Open XML. --- ## Edit TXT Path: /editor/python-net/edit-txt/ > This demonstration shows how to load, open for editing, and save text documents, and explains the edit and save options and their purpose. ## How to edit a TXT file? Textual documents are simple Plain Text flat files (TXT) that contain no images, pages, paragraphs, lists, tables, and so on. However, users can create some primitive formatting like lists with leading markers, left indents with whitespaces, tables with pseudo-graphics, paragraphs with line breaks, and so on. [**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) can recognize some of these structures. Another feature that GroupDocs.Editor provides is the ability to save an edited TXT document not only back to TXT, but also to WordProcessing. ## Loading a text file for edit Unlike WordProcessing and Spreadsheet documents, text documents are loaded into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class without any loading options. 
They are simple text files by their nature, so there is nothing to adjust:

```python
from groupdocs.editor import Editor

editor = Editor("file.txt")
```

## Edit a text file

To open a text document for editing and obtain an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance, the [`TextEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/texteditoptions) class is used. This class has several options (properties), described below:

1. `encoding`: sets the character encoding of the input text document. By default, if not specified, it is UTF-8.
2. `recognize_lists`: a boolean flag that indicates how exactly numbered list items are recognized. If this option is set to `False` (default value), the list recognition algorithm detects list paragraphs when list numbers end with either a dot, a right bracket, or bullet symbols. If this option is set to `True`, whitespaces are also used as list number delimiters.
3. `leading_spaces`: a flag that indicates how consecutive leading spaces should be handled: convert to a left indent (default value), preserve as consecutive spaces, or trim.
4. `trailing_spaces`: a flag that indicates how consecutive trailing spaces should be handled: truncate (default value) or trim.
5. `enable_pagination`: a boolean flag that enables the paged view of the document. By their nature all text documents are pageless; however, GroupDocs.Editor can split them into pages, like MS Word does.
6. `direction`: a flag that specifies the direction of the text flow in the input plain text document. By default it is left-to-right.

The runnable example below demonstrates using this options class to open the input text document for editing, getting the HTML content, editing it programmatically, and then saving the edited content back to a TXT file.
{{< tabs "code-example-edit-txt">}} {{< tab "edit_txt.py" >}} ```python import os from groupdocs.editor import Editor, EditableDocument, License from groupdocs.editor.options import TextEditOptions, TextSaveOptions def edit_txt(): # Optionally set a license license_path = os.path.abspath("./GroupDocs.Editor.lic") if os.path.exists(license_path): License().set_license(license_path) # Load an input text file into the Editor with Editor("./sample.txt") as editor: # Create and adjust the text edit options edit_options = TextEditOptions() edit_options.enable_pagination = True edit_options.recognize_lists = True # Edit the text file and obtain an EditableDocument editable = editor.edit(edit_options) # Edit the content programmatically (in practice this is done in a WYSIWYG-editor) html = editable.get_content() edited = EditableDocument.from_markup(html.replace("This is a sample plain text file", "This is an edited sample plain text file")) # Create text save options and tune them save_options = TextSaveOptions() save_options.preserve_table_layout = True # Save the edited content back to the TXT format editor.save(edited, "./edited.txt", save_options) editable.dispose() edited.dispose() if __name__ == "__main__": edit_txt() ``` {{< /tab >}} {{< tab "sample.txt" >}} {{< tab-text >}} `sample.txt` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-txt/sample.txt) to download it. {{< /tab-text >}} {{< /tab >}} {{< tab "edited.txt" >}} ```text This is an edited sample plain text file   There is one empty line above this text. Two spaces. Three spaces. New line, which is preceded by 5 consecutive spaces. External link: https://rozetka.com.ua/final_pm_2d_black/p34087151/#tab=characteristics New line again, but at this time 1 tab char and 1 space char. One tab. Two tabs. Three tabs.     
[TRUNCATED] ``` [Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-txt/edit_txt/edited.txt) {{< /tab >}} {{< /tabs >}} ## Save a text file after edit After being edited, a text document can be saved back as TXT or as WordProcessing. For saving back to the TXT format the user must use the [`TextSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/textsaveoptions) class, which has the following properties: 1. `encoding` — the character encoding of the text document, which will be applied when saving it. By default, if not specified, it is UTF-8. 2. `add_bidi_marks` — a boolean flag that determines whether to add bi-directional marks before each BiDi run when saving in plain text format. 3. `preserve_table_layout` — a boolean flag that specifies whether GroupDocs.Editor should try to preserve the layout of tables when saving in plain text format. The default value is `False`. The edited content can also be saved to a WordProcessing format. For this, a [`WordProcessingSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/wordprocessingsaveoptions) instance with the desired output format is used: ```python from groupdocs.editor.formats import WordProcessingFormats from groupdocs.editor.options import WordProcessingSaveOptions word_save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCM) editor.save(edited, "./edited.docm", word_save_options) ``` As a result, after running these examples the user will have versions of the edited document in the TXT and DOCM formats. --- ## Edit Email Path: /editor/python-net/edit-email/ > This example demonstrates the standard open-edit-save pipeline with Email documents, using different options on every step. 
## Introduction

There is a group of [Email file formats](https://docs.fileformat.com/email/), which are usually intended for storing individual mail letters, contact data, personal information, calendars, and so on. There are plenty of them, because almost every email program uses its own set of such formats. This article explains how to edit Email files, whose editing differs from that of common text documents because of their nature.

## Loading

Email documents are loaded the same way as other formats. Like with text, XPS, and e-Book formats, and unlike PDF and Office formats, there are no loading options for Email documents: specifying a path to the file or a byte stream with the document content in the constructor of the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class is enough.

```python
from groupdocs.editor import Editor

# Load from a file path
editor_from_path = Editor("message.eml")

# Load from a binary stream
with open("message.eml", "rb") as stream:
    editor_from_stream = Editor(stream)
```

## Editing

Like for all other formats, there is a special [`EmailEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emaileditoptions) class for adjusting the editing of Email documents. An instance of this class can be passed to the [`editor.edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) method. It is also possible to use the parameterless overload of [`editor.edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit); in this case GroupDocs.Editor automatically applies the default [`EmailEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emaileditoptions).

The [`EmailEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emaileditoptions) class has only one member: a `mail_message_output` property of the [`MailMessageOutput`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/mailmessageoutput) type. `MailMessageOutput` is a flags enum that controls which parts of the mail message are delivered to the output [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) and then to the emitted HTML. The enum has both *atomic* values, each responsible for a single part of the mail message (`SUBJECT`, `FROM`, `TO`, `CC`, `BCC`, `DATE`, `BODY`, `ATTACHMENTS`), and *combined* values, such as `COMMON` and `ALL`.

Note that GroupDocs.Editor currently cannot process the content of attachments; it can only display a list of attachment names. When the `MailMessageOutput.ATTACHMENTS` flag is specified, the output therefore contains a list of attachment names, if any are present in the mail message.

The runnable example below loads an EML file, edits it with the `MailMessageOutput.ALL` value, and obtains the resulting HTML content.

{{< tabs "code-example-edit-email">}}
{{< tab "edit_email.py" >}}
```python
import os
from groupdocs.editor import Editor, License
from groupdocs.editor.options import EmailEditOptions, MailMessageOutput

def edit_email():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load an input Email document into the Editor
    with Editor("./sample-email.eml") as editor:
        # Create edit options that process all parts of the mail message
        edit_options = EmailEditOptions(MailMessageOutput.ALL)

        # Edit the Email document and obtain an EditableDocument
        editable = editor.edit(edit_options)

        # Obtain the HTML content (in practice it is sent to the WYSIWYG-editor)
        content = editable.get_content()
        print("Generated HTML content length:", len(content))

        editable.dispose()

if __name__ == "__main__":
    edit_email()
```
{{< /tab >}}
{{< tab "sample-email.eml" >}}
{{< tab-text >}}
`sample-email.eml` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-email/sample-email.eml) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edit-email.txt" >}}
```text
Generated HTML content length: 1255
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-email/edit_email/edit-email.txt)
{{< /tab >}}
{{< /tabs >}}

It is also possible to use any user-defined combination of atomic values.
For example, you can create a combination that processes only the metadata, but not the body or the attachments list:

```python
from groupdocs.editor.options import EmailEditOptions, MailMessageOutput

metadata_only = (MailMessageOutput.SUBJECT | MailMessageOutput.FROM |
                 MailMessageOutput.TO | MailMessageOutput.CC |
                 MailMessageOutput.BCC | MailMessageOutput.DATE)
edit_options = EmailEditOptions(metadata_only)
```

## Saving

Saving edited Email documents generally follows the same principles as for other document formats, with a few distinctions. As usual, after obtaining the edited HTML content from the client-side WYSIWYG-editor, an instance of [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) should be created and then passed to the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method.

For Email documents the save options class is [`EmailSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emailsaveoptions). Like [`EmailEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emaileditoptions), this class has only one member: a `mail_message_output` property of the [`MailMessageOutput`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/mailmessageoutput) type. The only distinction is the stage at which the property applies. In `EmailEditOptions` it controls which parts of the mail message are processed while generating the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) from the input file. In `EmailSaveOptions` it controls which parts are processed while generating the output Email file from the edited [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument).
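
Both option classes take a combination of `MailMessageOutput` flags. How such combinations behave can be sketched with Python's standard `enum.Flag`. The `MailPart` enum and the `included_parts` helper below are hypothetical stand-ins written only for illustration; they are not part of GroupDocs.Editor, and the real `MailMessageOutput` enum may differ in detail.

```python
from enum import Flag, auto


class MailPart(Flag):
    """Hypothetical stand-in for a flags enum; NOT the real MailMessageOutput."""
    SUBJECT = auto()
    FROM = auto()
    TO = auto()
    CC = auto()
    BCC = auto()
    DATE = auto()
    BODY = auto()
    ATTACHMENTS = auto()


def included_parts(mask):
    """Return the names of the atomic parts contained in a combined mask."""
    return [part.name for part in MailPart if part in mask]


# Combine atomic values with the bitwise OR operator
metadata_only = (MailPart.SUBJECT | MailPart.FROM | MailPart.TO |
                 MailPart.CC | MailPart.BCC | MailPart.DATE)

print(included_parts(metadata_only))
print(MailPart.BODY in metadata_only)
```

The membership test with `in` is what makes a combined value useful: each stage can check whether a given part was requested without caring how the mask was built.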

The [`EmailSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/emailsaveoptions) class does not allow specifying the format of the output document; it is automatically the same as the format of the original Email document loaded into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class. For example, if you loaded an MSG file, then [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) will also generate an MSG file, no matter which file extension you specify.

```python
from groupdocs.editor import Editor, EditableDocument
from groupdocs.editor.options import EmailEditOptions, EmailSaveOptions, MailMessageOutput

with Editor("./sample-email.eml") as editor:
    # Edit with all content
    original = editor.edit(EmailEditOptions(MailMessageOutput.ALL))

    # Send the content to the WYSIWYG-editor and obtain the edited content (omitted here)
    edited = EditableDocument.from_markup(original.get_embedded_html())

    # Save only the common parts of the mail message
    save_options = EmailSaveOptions(MailMessageOutput.COMMON)
    editor.save(edited, "./edited-email.eml", save_options)

    original.dispose()
    edited.dispose()
```

## Obtaining Email document info

The article [Extracting document metainfo]({{< ref "editor/python-net/developer-guide/extracting-document-metainfo.md" >}}) describes the [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) method, which allows you to detect the document format and extract its metadata without editing it. This mechanism also works with Email documents.
When [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) is called on an [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) instance that is loaded with an Email document, the method returns a metadata view corresponding to the `EmailDocumentInfo` type, which is common to all Email formats such as EML, EMLX, MSG, and so on. It exposes the `format`, `page_count`, `size`, and `is_encrypted` properties. For Email documents `page_count` always returns 1 (email documents have no paged view), and `is_encrypted` always returns `False` (email documents cannot be encrypted with a password).

```python
from groupdocs.editor import Editor

with Editor("./sample-email.eml") as editor:
    info = editor.get_document_info()
    print("Format:", info.format.name)
    print("Size:", info.size)
    print("Is encrypted:", info.is_encrypted)
```

---

## Working with formats

Path: /editor/python-net/working-with-formats/

> This article describes the classes and enums of [**GroupDocs.Editor**](https://products.groupdocs.com/editor/python-net) that represent all supported format families and individual formats.

GroupDocs.Editor supports many document formats, which are grouped into several format families:

1. WordProcessing formats, which include DOC, DOCX, DOCM, RTF, ODT etc.
2. Spreadsheet formats, which include XLS, XLT, XLSX, XLSM, XLTX, ODS etc.
3. Delimiter-Separated Values (DSV) formats, also known as delimited text, which are a text-based form of spreadsheets and include CSV, TSV, semicolon-delimited, whitespace-delimited etc.
4. Presentation formats, which include PPT, PPS, POT, PPTX, PPTM etc.
5. Text-based formats, which include TXT, HTML, XML etc.
6. Fixed-layout formats, which include PDF and XPS, where the representation is "baked" to be uniform on every platform.
7. eBook formats, which include Mobi, AZW3 and ePub.
8. Email formats, which include MSG, EML, PST and others.

To represent these format families, the [`groupdocs.editor.formats`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/) module provides a dedicated enum per family:

1. [WordProcessingFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/wordprocessingformats), which covers all WordProcessing formats.
2. [SpreadsheetFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/spreadsheetformats), which covers all Spreadsheet formats.
3. [PresentationFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/presentationformats), which covers all Presentation formats.
4. [TextualFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/textualformats), which covers all text-based formats, including plain text (TXT), markup formats (XML and HTML), and all Delimiter-Separated Values (DSV) formats.
5. [EBookFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/ebookformats), which covers all eBook formats, including MOBI, AZW3, and ePub.
6. [FixedLayoutFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/fixedlayoutformats), which covers only PDF and XPS (this includes OpenXPS).
7. [EmailFormats](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.formats/emailformats), which covers all electronic mail formats.

These enums are case-insensitive and lazy-loaded. Every enum exposes a set of members, each one representing a specific format within the given format family, for example `WordProcessingFormats.DOCX`, `SpreadsheetFormats.XLSX`, or `EBookFormats.MOBI`. A format member is the value you pass to the matching save options constructor to select the output format.
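
The relationship between the eight format families and the seven enums (DSV formats fall under `TextualFormats`) can be sketched as a plain lookup table. The `FAMILY_ENUM_BY_EXTENSION` mapping and the `family_enum_name` helper below are hypothetical and only restate the lists above; they are not part of GroupDocs.Editor, the extension lists are incomplete, and the library itself detects formats from the document content via `get_document_info()`.

```python
from pathlib import Path

# Extension lists copied from the format families named in this article.
# Illustrative and incomplete; this lookup is NOT part of GroupDocs.Editor.
FAMILY_ENUM_BY_EXTENSION = {
    **dict.fromkeys(["doc", "docx", "docm", "rtf", "odt"], "WordProcessingFormats"),
    **dict.fromkeys(["xls", "xlt", "xlsx", "xlsm", "xltx", "ods"], "SpreadsheetFormats"),
    **dict.fromkeys(["ppt", "pps", "pot", "pptx", "pptm"], "PresentationFormats"),
    **dict.fromkeys(["txt", "html", "xml", "csv", "tsv"], "TextualFormats"),
    **dict.fromkeys(["pdf", "xps"], "FixedLayoutFormats"),
    **dict.fromkeys(["mobi", "azw3", "epub"], "EBookFormats"),
    **dict.fromkeys(["msg", "eml", "pst"], "EmailFormats"),
}


def family_enum_name(file_name):
    """Return the name of the family enum for a file name, or None if unknown."""
    extension = Path(file_name).suffix.lower().lstrip(".")
    return FAMILY_ENUM_BY_EXTENSION.get(extension)


print(family_enum_name("report.DOCX"))
print(family_enum_name("table.csv"))
print(family_enum_name("archive.zip"))
```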

## Fetching a format

You import the family enum from the `groupdocs.editor.formats` module and access the desired format member by name. The member is then used wherever a target format is required, most notably in the save options constructor:

```python
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

# Fetch one format within the WordProcessing family
docm = WordProcessingFormats.DOCM

# Use it to select the output format
save_options = WordProcessingSaveOptions(docm)
```

The same approach applies to all other families:

```python
from groupdocs.editor.formats import (
    SpreadsheetFormats,
    PresentationFormats,
    EBookFormats,
    EmailFormats,
    FixedLayoutFormats,
    TextualFormats,
)
from groupdocs.editor.options import (
    SpreadsheetSaveOptions,
    PresentationSaveOptions,
    EbookSaveOptions,
)

xlsx_options = SpreadsheetSaveOptions(SpreadsheetFormats.XLSX)
pptx_options = PresentationSaveOptions(PresentationFormats.PPTX)
mobi_options = EbookSaveOptions(EBookFormats.MOBI)
```

## Reading format information

The format descriptor exposed by [`get_document_info()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/getdocumentinfo) provides text properties that reflect the format name, MIME code, and file extension:

```python
from groupdocs.editor import Editor

with Editor("document.docx") as editor:
    info = editor.get_document_info()
    print("Name:", info.format.name)
    print("Extension:", info.format.extension)
    print("MIME:", info.format.mime)
```

## Complete code example

The example below loads a document, reads the format descriptor that GroupDocs.Editor detected, and selects an output format from the WordProcessing family to convert the document.

{{< tabs "code-example-working-with-formats">}}
{{< tab "working_with_formats.py" >}}
```python
import os
from groupdocs.editor import Editor, License
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

def working_with_formats():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        # Read the detected format descriptor
        info = editor.get_document_info()
        print("Detected format:", info.format.name, "(", info.format.extension, ")")

        # Pick a target format from the WordProcessing family
        target_format = WordProcessingFormats.DOCM
        save_options = WordProcessingSaveOptions(target_format)

        # Convert the document to the selected format
        editable = editor.edit()
        editor.save(editable, "./converted-document.docm", save_options)

if __name__ == "__main__":
    working_with_formats()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/working-with-formats/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "converted-document.docm" >}}
```text
Binary file (DOCM, 49 KB)
```
[Download full output](/editor/python-net/_output_files/developer-guide/working-with-formats/working_with_formats/converted-document.docm)
{{< /tab >}}
{{< /tabs >}}

---

## Working with HTML resources

Path: /editor/python-net/working-with-html-resources/

> This article describes the capabilities of GroupDocs.Editor while working with HTML resources, which are an integral part of an HTML document.

## What HTML resources are

Almost all existing document formats (except plain TXT and some others) contain a concept of _resources_.
This usually includes images, specific fonts (which are not installed in the operating system), and so on, depending on the specific document format. In order to edit documents in a browser, GroupDocs.Editor must convert them to HTML and only then send them to the client-side WYSIWYG-editor. In this case GroupDocs.Editor must work with HTML resources, which may be divided into several groups: _images_, _stylesheets (CSS)_, _fonts_, and _audio_.

When a document is opened for editing with [`editor.edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit), the resulting [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) exposes its extracted assets as collections that you can iterate (with `for` loops) and measure (with `len()`):

| Property | Contents |
|---|---|
| `images` | embedded raster (JPEG, PNG, GIF) and vector (SVG, WMF/EMF) images |
| `fonts` | extracted fonts (WOFF, WOFF2, TTF, OTF, EOT) |
| `css` | stylesheets |
| `audio` | audio resources (for example, from presentations) |
| `all_resources` | everything above, combined |

## Iterating over resources

You can inspect, count, and enumerate the resources of an opened document directly from the [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) instance:

```python
from groupdocs.editor import Editor

with Editor("document.docx") as editor:
    editable = editor.edit()

    print("images:", len(editable.images))
    print("css:", len(editable.css))
    print("fonts:", len(editable.fonts))

    for image in editable.images:
        print("image resource:", image)
```

## Saving HTML together with its resources

The [`save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/save) method of [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) writes the HTML markup and every extracted resource (images, fonts, css) into a folder, so that the markup references its assets on disk:

```python
from groupdocs.editor import Editor

with Editor("document.docx") as editor:
    editable = editor.edit()

    # Write HTML plus all resources into a folder
    editable.save("page.html", "page_resources")
```

## Feeding modified markup with resources back

When a customer edits a document in a WYSIWYG-editor and obtains the edited HTML markup along with its resources, the markup must be wrapped back into an [`EditableDocument`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument) before it can be saved. Depending on how the resources are stored, you can use one of the class methods:

1. [`from_markup(html)`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkup): for HTML markup held in memory as a string, with resources baked into it.
2. [`from_markup_and_resource_folder(html, folder)`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/frommarkupandresourcefolder): for HTML markup as a string whose resources live in an existing folder on disk.
3. [`from_file(html_path, folder)`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editabledocument/fromfile): for an HTML file on disk plus its corresponding resource folder.

```python
from groupdocs.editor import EditableDocument

# Markup with baked-in resources
edited = EditableDocument.from_markup(edited_html)

# Markup string + on-disk resource folder
edited = EditableDocument.from_markup_and_resource_folder(edited_html, "page_resources")

# HTML file + resource folder
edited = EditableDocument.from_file("page.html", "page_resources")
```

## Complete code example

The example below loads a document, opens it for editing, reports how many resources of each kind were extracted, and saves the HTML together with all of its resources into a folder.

{{< tabs "code-example-working-with-html-resources">}}
{{< tab "working_with_html_resources.py" >}}
```python
import os
from groupdocs.editor import Editor, License

def working_with_html_resources():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        editable = editor.edit()

        # Inspect the extracted resources
        print("Images:", len(editable.images))
        print("Stylesheets:", len(editable.css))
        print("Fonts:", len(editable.fonts))
        print("All resources:", len(editable.all_resources))

        # Persist the HTML together with every resource into a folder
        editable.save("output.html", "output_resources")

        editable.dispose()

if __name__ == "__main__":
    working_with_html_resources()
```
{{< /tab >}}
{{< tab "sample-document.docx" >}}
{{< tab-text >}}
`sample-document.docx` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/working-with-html-resources/sample-document.docx) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "output.html" >}}
```text
Sample Document ver.1

Title of the document

Subtitle #1

[TRUNCATED]
```
{{< /tab >}}
{{< /tabs >}}

---

## Edit Markdown documents

Path: /editor/python-net/edit-markdown/

> This example demonstrates the standard open-edit-save pipeline with Markdown (MD) documents, using different options on every step.

## Introduction

[Markdown](https://docs.fileformat.com/word-processing/md/) is a lightweight markup language that has become popular lately. Markdown files with an `*.md` extension are plain text files that contain special syntax and support text formatting, tables, lists, images, and so on. There are several dialects of Markdown, including GFM, CommonMark, Multi-Markdown, and so on. GroupDocs.Editor for Python via .NET fully supports the Markdown format on both import and export, as well as its auto-detection.

GroupDocs.Editor supports the following Markdown features, which mostly follow the CommonMark specification: headings, blockquotes, code blocks, horizontal rules, bold emphasis, italic emphasis, strikethrough formatting, numbered and bulleted lists, tables, internal images (stored with base64 encoding), and external images.

## Loading

Markdown documents are loaded into the [`Editor`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor) class the same way as other formats. There are no dedicated load options for the Markdown format; it is enough to specify the file itself through a file path or a byte stream.

```python
from groupdocs.editor import Editor

# Load from a file path
editor_from_path = Editor("article.md")

# Load from a binary stream
with open("article.md", "rb") as stream:
    editor_from_stream = Editor(stream)
```

## Editing

There is a special class [`MarkdownEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/markdowneditoptions) for editing Markdown files.
As always, it is not mandatory when editing a document, so [`editor.edit()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/edit) may be called without arguments, in which case GroupDocs.Editor automatically detects the format and applies the default options. However, specifying custom [`MarkdownEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/markdowneditoptions) may be vital when an input Markdown file has external images. "External" means that the images are stored somewhere else, and the Markdown code contains _links_ to these images.

In order to point GroupDocs.Editor to all external images, the `image_load_callback` property of [`MarkdownEditOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/markdowneditoptions) should be set to an implementation of the image load callback interface, which provides the image data to GroupDocs.Editor when it encounters a link to an external image. This is an advanced scenario that requires a callback object; a schematic, non-runnable illustration is shown below.

```python
from groupdocs.editor.options import MarkdownEditOptions

# Assume image_loader is an object that implements the Markdown image load callback,
# returning binary data for each external image referenced from the Markdown file
edit_options = MarkdownEditOptions()
edit_options.image_load_callback = image_loader

opened = editor.edit(edit_options)
```

## Saving

GroupDocs.Editor also supports saving into the Markdown format. As for any other format, to save into Markdown create an instance of the [`MarkdownSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/markdownsaveoptions) class and pass it to the [`editor.save()`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor/editor/save) method.

If the document to be saved in the Markdown format has images, they can be handled in one of three ways:

1. Ignore images, so that they are absent from the output.
2. Save images inside the Markdown code, where they are stored in base64 encoding.
3. Save images as separate files in a specified folder, with references to these image files in the Markdown code.

For this the [`MarkdownSaveOptions`](https://reference.groupdocs.com/editor/python-net/groupdocs.editor.options/markdownsaveoptions) class has several properties. `export_images_as_base64` is a boolean flag, set to `False` by default. If set to `True`, the content of the images is injected into the output Markdown as base64. This flag has the highest priority: when it is `True`, the `images_folder` property is ignored. The `images_folder` property, in turn, takes effect when `export_images_as_base64` is `False`; if specified, it should contain a valid path to an existing folder where GroupDocs.Editor should save the images.

## Roundtrip

Because the Markdown format is supported on both import and export, it is possible to perform a roundtrip scenario: open a Markdown file for editing, edit it, and then save the edited version back to the Markdown format. The runnable example below demonstrates such a scenario with a Markdown file that has no external images, so no image load callback is needed.
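
Before the full example, the precedence between the two image properties described in the Saving section can be summarized as a small decision function. This is only a sketch of the documented rules, not the library's implementation; `resolve_image_mode` and its return values are hypothetical names, and the "ignored" fallback for the case where neither property is set is an inference from the three options listed above.

```python
from typing import Optional


def resolve_image_mode(export_images_as_base64: bool,
                       images_folder: Optional[str]) -> str:
    """Sketch of the documented precedence; not the library's implementation."""
    if export_images_as_base64:
        # Highest priority: images_folder is ignored in this case
        return "base64"
    if images_folder:
        # Images are written as separate files into the given folder
        return "folder"
    # Assumed fallback: with neither property set, images are left out
    return "ignored"


print(resolve_image_mode(True, "./images"))
print(resolve_image_mode(False, "./images"))
print(resolve_image_mode(False, None))
```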

{{< tabs "code-example-edit-markdown">}}
{{< tab "edit_markdown.py" >}}
```python
import os
from groupdocs.editor import Editor, EditableDocument, License
from groupdocs.editor.options import MarkdownEditOptions, MarkdownSaveOptions

def edit_markdown():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    # Load an input Markdown file into the Editor
    with Editor("./sample-article.markdown") as editor:
        # Create the Markdown edit options
        edit_options = MarkdownEditOptions()

        # Edit the Markdown document and obtain an EditableDocument
        editable = editor.edit(edit_options)

        # Edit the content programmatically (in practice this is done in a WYSIWYG-editor)
        html = editable.get_embedded_html()
        # NOTE: the HTML tag used in this replace() call was lost from the listing;
        # the closing "</h1>" tag below is an assumed stand-in
        edited = EditableDocument.from_markup(html.replace("</h1>", " (edited)</h1>", 1))

        # Create Markdown save options with images embedded as base64
        save_options = MarkdownSaveOptions()
        save_options.export_images_as_base64 = True

        # Save the edited document back to the Markdown format
        editor.save(edited, "./edited-article.md", save_options)

        editable.dispose()
        edited.dispose()

if __name__ == "__main__":
    edit_markdown()
```
{{< /tab >}}
{{< tab "sample-article.markdown" >}}
{{< tab-text >}}
`sample-article.markdown` is the sample file used in this example. Click [here](/editor/python-net/_sample_files/developer-guide/edit-document/edit-markdown/sample-article.markdown) to download it.
{{< /tab-text >}}
{{< /tab >}}
{{< tab "edited-article.md" >}}
```text
# Sample Article

This is a **sample Markdown document** used by the GroupDocs.Editor for Python via .NET examples.

## Introduction

GroupDocs.Editor converts Markdown to editable HTML and back, so you can edit content programmatically or in any WYSIWYG editor.

## Features

- Headings, paragraphs, and emphasis
- Ordered and unordered lists
- Links such as [GroupDocs](https://www.groupdocs.com)
- Inline code and fenced code blocks

## Example List

[TRUNCATED]
```
[Download full output](/editor/python-net/_output_files/developer-guide/edit-document/edit-markdown/edit_markdown/edited-article.md)
{{< /tab >}}
{{< /tabs >}}

In this example the same Markdown file is opened for editing and then saved back into the Markdown format with all images embedded as base64. The `export_images_as_base64` flag could be replaced with the `images_folder` property to store images as separate files instead.

---

## Migration Notes

Path: /editor/python-net/migration-notes/

## Overview

GroupDocs.Editor for Python via .NET is a thin Python binding over the GroupDocs.Editor for .NET engine. As a result, its object model, classes, and workflow mirror the .NET API one-to-one.
There is no separate legacy Python API to migrate away from: if you already know the .NET API (or are following the .NET examples), the Python usage will be immediately familiar.

## Key differences from the .NET API

When translating .NET code or documentation to Python, keep the following conventions in mind:

* **Naming**: properties and methods use `snake_case` in Python, which is automatically mapped to the .NET `PascalCase`. For example, `Editor.Edit()` becomes `editor.edit()`, `EditableDocument.GetEmbeddedHtml()` becomes `editable.get_embedded_html()`, and the `EnablePagination` property becomes `enable_pagination`.
* **Context managers**: use `with Editor(...) as editor:` so the native document handle is released automatically. `EditableDocument` is disposable too, and exposes a `dispose()` method.
* **Factories**: the static factory methods become Python class methods: `EditableDocument.from_markup(html)`, `EditableDocument.from_markup_and_resource_folder(html, folder)`, and `EditableDocument.from_file(html_path, folder)`.
* **Options families**: pick the load / edit / save options class that matches the document family. The save options type, not the file extension, controls the output format.
* **Enums**: format and option enums are case-insensitive and lazy-loaded, for example `WordProcessingFormats.DOCX` and `MailMessageOutput.ALL`.
* **Streams**: pass `open("file", "rb")` or `io.BytesIO(data)` where the .NET API expects a `Stream`.

## The core workflow

The fundamental load → edit → save pipeline is identical to .NET:

```python
from groupdocs.editor import Editor, EditableDocument
from groupdocs.editor.formats import WordProcessingFormats
from groupdocs.editor.options import WordProcessingSaveOptions

with Editor("document.docx") as editor:
    # Obtain the editable document from the original DOCX document
    editable = editor.edit()
    html_content = editable.get_embedded_html()

    # Pass html_content to a WYSIWYG editor and edit there...

    # Save the edited document to some WordProcessing format
    save_options = WordProcessingSaveOptions(WordProcessingFormats.DOCX)
    editor.save(editable, "edited.docx", save_options)
```

For more code examples and specific use cases, please refer to our [Developer Guide]({{< ref "editor/python-net/developer-guide/_index.md" >}}) documentation or the [GitHub](https://github.com/groupdocs-editor/GroupDocs.Editor-for-Python-via-.NET) samples and showcases.
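
As a quick aid for the naming convention listed under the key differences, the helper below converts a .NET-style `PascalCase` member name into the `snake_case` spelling used in Python. It is a naive, hypothetical sketch (`to_snake_case` is not part of GroupDocs.Editor), and the binding's actual mapping may treat acronyms or special cases differently, so always confirm names against the API reference.

```python
import re


def to_snake_case(pascal_name):
    """Convert a PascalCase member name to snake_case (naive, illustrative)."""
    # Insert an underscore before every capital letter except the first one
    return re.sub(r"(?<!^)(?=[A-Z])", "_", pascal_name).lower()


for name in ("Edit", "GetEmbeddedHtml", "EnablePagination"):
    print(name, "->", to_snake_case(name))
```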