This article explains the fundamental principles of GroupDocs.Editor: what it is for, how it works, and how it should be used.
GroupDocs.Editor is a GUI-less Java class library, which means that it exposes only a programmatic interface (API). To edit a document, the user must therefore pair GroupDocs.Editor with a third-party editor application whose GUI lets the end-user edit the document content. GroupDocs.Editor does not care which editor software is used. Because the library is aimed at web development, its only requirement is that the third-party editor be able to work with HTML documents.
To edit a document with GroupDocs.Editor, the user must perform several sequential steps: load the document into GroupDocs.Editor (with optional load options), open the document for editing (with optional edit options), generate HTML markup with resources (using different options and settings), and pass this markup to the third-party WYSIWYG HTML editor. The end-user then edits the document content. When the end-user finishes editing and submits the document, the modified markup should be transferred back to GroupDocs.Editor and converted to an output document of the desired format.
From the GroupDocs.Editor perspective, this pipeline can be divided into three main stages, which are described below.
Loading document into the GroupDocs.Editor
At the loading stage, the user should create an instance of the Editor class and pass it an input document (as a file path or a byte stream) along with document load options. Load options are not required: GroupDocs.Editor can detect the document format automatically and select the most appropriate default options for that format. It is nevertheless recommended to specify them explicitly, and they are required when loading password-protected documents.
// Path to some document
String inputFilePath = "C:\\input_path\\document.docx";
// Passing the path to the constructor; default WordProcessingLoadOptions will be applied automatically
Editor editor = new Editor(inputFilePath);
After this stage, the document is ready to be opened and edited.
Opening a document for editing
Because GroupDocs.Editor is a GUI-less library, a document cannot be edited directly in it. To make the document editable in a WYSIWYG HTML editor, GroupDocs.Editor has to generate an HTML version of it, because such editors work only with HTML/CSS markup. Once an instance of the Editor class has been created at the first stage, the user should open the document for editing by calling the edit() method of the Editor class. This method returns an instance of the EditableDocument class, which can be described as a converted version of the input document, stored in an internal intermediate format that is compatible with all formats GroupDocs.Editor supports. With EditableDocument, the user can obtain the HTML markup of the input document with different options, get its stylesheets, images, and fonts, save the HTML document to disk, and so on. The HTML markup emitted by EditableDocument is then expected to be passed to the client-side WYSIWYG HTML editor, where the end-user can actually edit the document.
WordProcessingEditOptions editOptions = new WordProcessingEditOptions();
editOptions.setEnableLanguageInformation(true);
EditableDocument readyToEdit = editor.edit(editOptions);
After this stage, the document is ready to be passed to the WYSIWYG HTML editor, where its content can be edited by the end-user.
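How the markup reaches the WYSIWYG editor depends entirely on the editor chosen. As a minimal sketch, assuming an editor that attaches to a `<textarea>` element and a form that posts the edited content back to the server, the host page could be assembled as shown below. The class `EditorPage`, the element id `editor`, the field name `content`, and the `/save` endpoint are all hypothetical names introduced for illustration; none of them belong to GroupDocs.Editor.

```java
/** Hypothetical helper that wraps document markup into a host page for a WYSIWYG editor. */
class EditorPage {

    /** Escapes the characters that are significant inside HTML text content. */
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds a minimal page whose textarea carries the (escaped) document markup. */
    static String buildHostPage(String documentMarkup) {
        return "<!DOCTYPE html>\n"
                + "<html>\n"
                + "<head><meta charset=\"utf-8\"><title>Edit document</title></head>\n"
                + "<body>\n"
                // "/save" is an assumed endpoint that receives the edited markup
                + "<form method=\"post\" action=\"/save\">\n"
                + "<textarea id=\"editor\" name=\"content\">"
                + escapeHtml(documentMarkup)
                + "</textarea>\n"
                + "<button type=\"submit\">Save</button>\n"
                + "</form>\n"
                + "</body>\n"
                + "</html>\n";
    }

    public static void main(String[] args) {
        // In a real application the markup string would come from EditableDocument
        System.out.println(buildHostPage("<p>Hello &amp; welcome</p>"));
    }
}
```

The markup is escaped because it is placed inside a textarea as text; the browser unescapes it again when the form is submitted, so the server receives plain HTML that can be turned back into an EditableDocument at the saving stage.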
Saving a document
Saving a document is the final stage. It occurs when the document content has been edited in the WYSIWYG HTML editor (or in any other software, which makes no difference to GroupDocs.Editor) and should be saved back as a document of some format (DOCX, PDF, or XLSX, for example). At this stage, the user should create a new instance of the EditableDocument class from the HTML markup and resources of the edited version of the original document, as obtained from the end-user.
The EditableDocument class contains several static methods that create its instances from HTML documents, which may be presented in different forms. Once the EditableDocument instance is ready, it can be saved as an ordinary document using the save() method of the Editor class.
EditableDocument afterEdit = EditableDocument.fromMarkup("<body>HTML content of the document...</body>", null);
String outputFilePath = "C:\\output_path\\document.rtf";
WordProcessingSaveOptions saveOptions = new WordProcessingSaveOptions(WordProcessingFormats.Rtf);
editor.save(afterEdit, outputFilePath, saveOptions);
Unlike load options and edit options, save options are mandatory, because GroupDocs.Editor needs to know the exact format of the output document.
Detecting document type
Sometimes it is necessary to detect a document type and extract its metadata before sending the document for editing. For such scenarios, GroupDocs.Editor can detect the document type and extract the most essential metadata, which depends on the document type:
- Whether the document is encrypted or not;
- Exact document format;
- Document size;
- Number of pages (tabs);
- Text encoding, if document is textual.
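Metadata of this kind is typically used to decide what to do with a document before it is opened for editing. The sketch below models the listed items with a hypothetical holder class, `DocumentSummary`, and a simple decision rule. It is not a GroupDocs.Editor type, and the size limit and the `NextStep` values are assumptions made for illustration; the sketch also assumes that the first item in the list tells whether a password is needed to open the document.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical holder for the metadata items listed above; not a GroupDocs.Editor type. */
final class DocumentSummary {

    /** What the application does next with the document. */
    enum NextStep { ASK_FOR_PASSWORD, REJECT_TOO_LARGE, OPEN_FOR_EDITING }

    final String format;
    final long sizeInBytes;
    final int pageCount;
    final boolean passwordProtected;
    final Charset textEncoding; // null when the document is not textual

    DocumentSummary(String format, long sizeInBytes, int pageCount,
                    boolean passwordProtected, Charset textEncoding) {
        this.format = format;
        this.sizeInBytes = sizeInBytes;
        this.pageCount = pageCount;
        this.passwordProtected = passwordProtected;
        this.textEncoding = textEncoding;
    }

    /** Chooses the next step; the password check comes first because nothing else can proceed without it. */
    NextStep nextStep(long maxSizeInBytes) {
        if (passwordProtected) {
            return NextStep.ASK_FOR_PASSWORD;
        }
        if (sizeInBytes > maxSizeInBytes) {
            return NextStep.REJECT_TOO_LARGE;
        }
        return NextStep.OPEN_FOR_EDITING;
    }

    public static void main(String[] args) {
        DocumentSummary csv = new DocumentSummary("CSV", 4_096L, 1, false, StandardCharsets.UTF_8);
        System.out.println(csv.format + " -> " + csv.nextStep(10_000_000L));
    }
}
```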
At every stage, the user can tune the processing with dedicated options:
- ILoadOptions for loading document.
- IEditOptions for opening document for editing.
- ISaveOptions for saving edited document.
Some of these options are optional in specific cases, and some are mandatory. For example, it is possible to load a document into the Editor class without load options. In that case GroupDocs.Editor will try to detect the document format automatically and apply the most appropriate default options for the detected format.
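The relation between stages and options can be summarized in a few lines of code. The enum below is only a sketch: `ProcessingStage` and `canProceed` are hypothetical names, and the interface names are stored as plain strings, so nothing here calls GroupDocs.Editor. It encodes what this article states: load and edit options may be omitted, while save options are mandatory.

```java
/** Hypothetical summary of the three stages and their options; not part of GroupDocs.Editor. */
enum ProcessingStage {
    LOAD("ILoadOptions", false),
    EDIT("IEditOptions", false),
    SAVE("ISaveOptions", true);

    /** Name of the options interface used at this stage. */
    final String optionsInterface;
    /** Whether the stage cannot run without explicitly supplied options. */
    final boolean optionsMandatory;

    ProcessingStage(String optionsInterface, boolean optionsMandatory) {
        this.optionsInterface = optionsInterface;
        this.optionsMandatory = optionsMandatory;
    }

    /** Returns true when the stage may run with the given options (null means "not supplied"). */
    boolean canProceed(Object options) {
        return options != null || !optionsMandatory;
    }

    public static void main(String[] args) {
        for (ProcessingStage stage : values()) {
            System.out.println(stage + ": " + stage.optionsInterface
                    + (stage.optionsMandatory ? " (mandatory)" : " (optional)"));
        }
    }
}
```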
Describing family formats
All document formats that GroupDocs.Editor supports are grouped into family formats. The formats within a family have a lot of features in common, so options are defined per family format rather than per individual format. The relation between formats, family formats, import/export formats, and options is illustrated in the table below.
| Family format | Supported formats | Load | Save | Load options | Edit options | Save options | Metadata |
|---|---|---|---|---|---|---|---|
| WordProcessing | DOC, DOCX, DOCM, DOT, DOTX, DOTM, RTF, WordprocessingML Flat XML, ODT, OTT, Word 2003 XML | | | WordProcessingLoadOptions | WordProcessingEditOptions | WordProcessingSaveOptions | WordProcessingDocumentInfo |
| Spreadsheet | XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, XLAM, SpreadsheetML XML, ODS, FODS, SXC, DIF | | | SpreadsheetLoadOptions | SpreadsheetEditOptions | SpreadsheetSaveOptions | SpreadsheetDocumentInfo |
| DSV | CSV, TSV, semicolon-separated, whitespace-separated, arbitrary separator | | | N/A | DelimitedTextEditOptions | DelimitedTextSaveOptions | N/A |
| Presentation | PPT, PPTX, PPTM, PPS, PPSX, PPSM, POT, POTX, POTM, ODP, OTP | | | PresentationLoadOptions | PresentationEditOptions | PresentationSaveOptions | PresentationDocumentInfo |
| XML | Any XML document | | | N/A | XmlEditOptions | N/A | TextualDocumentInfo |
| TXT | Any text document | | | N/A | TextEditOptions | TextSaveOptions | TextualDocumentInfo |
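Since options are chosen per family format, an application usually has to map an incoming file to its family first. The following sketch does this by file extension, using the formats from the table. `FamilyFormats` and its methods are hypothetical helpers, and the extension lists are an assumption derived from the format names. A lookup by extension is only a heuristic: for example, WordprocessingML and SpreadsheetML files also use the `.xml` extension, and this sketch reports them as plain XML.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup from file extension to family format, based on the table above. */
class FamilyFormats {

    private static final Map<String, String> FAMILY_BY_EXTENSION = new LinkedHashMap<>();

    private static void register(String family, String... extensions) {
        for (String extension : extensions) {
            FAMILY_BY_EXTENSION.put(extension, family);
        }
    }

    static {
        register("WordProcessing", "doc", "docx", "docm", "dot", "dotx", "dotm", "rtf", "odt", "ott");
        register("Spreadsheet", "xls", "xlt", "xlsx", "xlsm", "xlsb", "xltx", "xltm", "xlam",
                "ods", "fods", "sxc", "dif");
        register("DSV", "csv", "tsv");
        register("Presentation", "ppt", "pptx", "pptm", "pps", "ppsx", "ppsm",
                "pot", "potx", "potm", "odp", "otp");
        register("XML", "xml"); // heuristic: Word/Spreadsheet XML flavors share this extension
        register("TXT", "txt");
    }

    /** Returns the family format for a file name, or empty when the extension is missing or unknown. */
    static Optional<String> familyOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(FAMILY_BY_EXTENSION.get(extension));
    }

    /** Returns the name of the edit options class listed in the table for the given family. */
    static String editOptionsFor(String family) {
        switch (family) {
            case "WordProcessing": return "WordProcessingEditOptions";
            case "Spreadsheet": return "SpreadsheetEditOptions";
            case "DSV": return "DelimitedTextEditOptions";
            case "Presentation": return "PresentationEditOptions";
            case "XML": return "XmlEditOptions";
            case "TXT": return "TextEditOptions";
            default: throw new IllegalArgumentException("Unknown family format: " + family);
        }
    }

    public static void main(String[] args) {
        String fileName = "report.docx";
        familyOf(fileName).ifPresent(family ->
                System.out.println(fileName + " -> " + family + " / " + editOptionsFor(family)));
    }
}
```

When reliable detection matters more than speed, the document type detection described earlier in this article is the safer source of the format than the file name.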
Detailed information about every stage of document processing, along with source code examples and explanations of the options, can be found in the following articles:
A complete description of the EditableDocument class, including its capabilities, members, and purpose, along with source code examples, is located in the following articles:
- Working with EditableDocument
- Get HTML markup in different forms
- Save HTML to folder
- Working with resources
- Create EditableDocument from file or markup
A detailed review of all supported family formats, with explanations of their load/edit/save options and illustrative source code, can be found in the following articles: