This demonstration explains how to work with document resources, including the different ways of retrieving them.
Introduction
Almost every document, whatever its type, contains resources. These are primarily images; some document formats also hold fonts. Even a plain text document (TXT) produces one stylesheet when it is converted to HTML for editing, and that stylesheet is treated as a resource. WordProcessing documents in some formats, usually Office Open XML, can also contain embedded audio files. GroupDocs.Editor lets you work with resources during the editing phase, that is, after the document has been loaded into the Editor class and opened for editing, which yields the EditableDocument instance returned by the Editor.edit() method. GroupDocs.Editor classifies all resources into several groups:
Images, including: raster (PNG, BMP, JPEG, GIF, ICON) and vector (SVG and WMF).
Fonts, including: TTF, EOT, WOFF, WOFF2.
Textual resources: CSS stylesheets.
Audio files: MP3.
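The grouping above can be pictured as a lookup from file extension to resource group. The sketch below is purely illustrative: `RESOURCE_GROUPS` and `classify_resource` are hypothetical names that are not part of GroupDocs.Editor, and the file extensions (for example `.ico` for ICON, `.jpg` and `.jpeg` for JPEG) are assumed spellings of the formats listed.

```python
from pathlib import PurePath

# Hypothetical lookup table mirroring the four groups listed above.
# The extensions are assumptions; GroupDocs.Editor performs its own detection.
RESOURCE_GROUPS = {
    "image": {".png", ".bmp", ".jpg", ".jpeg", ".gif", ".ico", ".svg", ".wmf"},
    "font": {".ttf", ".eot", ".woff", ".woff2"},
    "text": {".css"},
    "audio": {".mp3"},
}


def classify_resource(filename):
    """Return the group name for a file name, or None if it is not recognized."""
    extension = PurePath(filename).suffix.lower()
    for group, extensions in RESOURCE_GROUPS.items():
        if extension in extensions:
            return group
    return None


if __name__ == "__main__":
    for name in ("logo.PNG", "body.woff2", "style.css", "intro.mp3", "notes.txt"):
        print(name, "->", classify_resource(name))
```

The lookup is case-insensitive on the extension, and anything outside the four groups maps to `None` rather than raising an error.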
Preparations
Let’s prepare an EditableDocument instance by loading and editing some input WordProcessing document, as always:
```python
from groupdocs.editor import Editor
from groupdocs.editor.options import WordProcessingLoadOptions

editor = Editor("document.docx", WordProcessingLoadOptions())
before_edit = editor.edit()  # create an EditableDocument instance
```
Obtaining resources
Now that the EditableDocument instance is ready, resources can be obtained from it in several ways.
First of all, resources can be retrieved by their type. EditableDocument exposes an iterable collection for every resource type:
images — all images, raster and vector.
fonts — all fonts.
css — the CSS stylesheets, where each item represents one stylesheet.
audio — the MP3 audio files.
Secondly, all resources can be obtained at once through a single property, all_resources. It returns everything listed above and is in fact a concatenation of the previous collections.
All of these collections can be iterated with for loops and measured with len().
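As a minimal sketch of iterating and measuring these collections, the helpers below work on any object that exposes the `images`, `fonts`, `css`, `audio`, and `all_resources` attributes described above. `count_resources` and `iterate_by_type` are hypothetical helpers, and the `SimpleNamespace` object is only a stand-in so that the snippet runs without a document; with the library installed, the `before_edit` instance would be passed instead.

```python
from types import SimpleNamespace

# Names of the per-type collections described above.
RESOURCE_ATTRIBUTES = ("images", "fonts", "css", "audio")


def count_resources(document):
    """Count resources per collection and in total (hypothetical helper)."""
    counts = {name: len(getattr(document, name)) for name in RESOURCE_ATTRIBUTES}
    counts["all_resources"] = len(document.all_resources)
    return counts


def iterate_by_type(document):
    """Yield (collection name, resource) pairs by looping over each collection."""
    for name in RESOURCE_ATTRIBUTES:
        for resource in getattr(document, name):
            yield name, resource


if __name__ == "__main__":
    # Stand-in for an EditableDocument; real collections hold resource objects.
    fake = SimpleNamespace(images=["img1", "img2"], fonts=["font1"], css=["style1"], audio=[])
    fake.all_resources = fake.images + fake.fonts + fake.css + fake.audio

    counts = count_resources(fake)
    print(counts)
    # all_resources is described as a concatenation, so the totals should match.
    print(counts["all_resources"] == sum(counts[name] for name in RESOURCE_ATTRIBUTES))
```

Comparing the length of `all_resources` with the sum of the per-type lengths is a cheap way to confirm that nothing was missed when resources are processed type by type.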
There is also a dedicated way to obtain stylesheets. Stylesheets can themselves reference external resources through URLs, for example images, fonts, and other stylesheets, and such links may need to be adjusted. For this purpose, EditableDocument provides the get_css_content() method. Without arguments it returns the stylesheets as-is; it can also accept prefixes for the external images and fonts referenced from the stylesheets.
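The two call styles can be sketched as follows. The text above does not show the exact signature, so passing the image prefix and then the font prefix as two positional strings is an assumption, as is the method returning a list of strings. `load_stylesheets` and `FakeDocument` are hypothetical names used so that the snippet runs on its own.

```python
def load_stylesheets(document, images_prefix=None, fonts_prefix=None):
    """Fetch stylesheet text, optionally with prefixed external links.

    Assumes document.get_css_content() accepts either no arguments or two
    positional prefixes (images first, then fonts).
    """
    if images_prefix is None and fonts_prefix is None:
        return list(document.get_css_content())
    return list(document.get_css_content(images_prefix or "", fonts_prefix or ""))


class FakeDocument:
    """Stand-in for EditableDocument so the sketch runs without the library."""

    def get_css_content(self, images_prefix="", fonts_prefix=""):
        return [
            f"body {{ background: url('{images_prefix}bg.png'); }}",
            f"@font-face {{ src: url('{fonts_prefix}main.woff2'); }}",
        ]


if __name__ == "__main__":
    doc = FakeDocument()
    # Stylesheets as-is, with no link adjustment.
    print(load_stylesheets(doc))
    # Stylesheets with prefixes applied to external image and font links.
    print(load_stylesheets(doc, "https://example.com/images/", "https://example.com/fonts/"))
```

Prefixes of this kind are typically useful when the images and fonts are served from a different location than the HTML page that embeds the stylesheets.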
The complete example below loads a document, opens it for editing, prints the number of images, stylesheets, fonts, and total resources that were extracted, and then enumerates the stylesheets.
```python
import os
from groupdocs.editor import Editor, License


def working_with_resources():
    # Optionally set a license
    license_path = os.path.abspath("./GroupDocs.Editor.lic")
    if os.path.exists(license_path):
        License().set_license(license_path)

    with Editor("./sample-document.docx") as editor:
        before_edit = editor.edit()

        # Inspect the extracted resources by their type
        print("Images:", len(before_edit.images))
        print("Stylesheets:", len(before_edit.css))
        print("Fonts:", len(before_edit.fonts))
        print("All resources:", len(before_edit.all_resources))

        # Enumerate the stylesheets of the document
        for one_stylesheet in before_edit.css:
            print("stylesheet resource:", one_stylesheet)

        before_edit.dispose()


if __name__ == "__main__":
    working_with_resources()
```
sample-document.docx is the sample file used in this example.