Introduction

This article explains the fundamental principles of GroupDocs.Editor: what its purpose is, how it works, and how it should be used.

GroupDocs.Editor is a GUI-less .NET class library, which means that it exposes only a programmatic interface (API). To edit a document, the user must therefore use GroupDocs.Editor together with a 3rd-party editor application, whose GUI lets the end-user edit the document content. GroupDocs.Editor does not care which editor software is used. However, because GroupDocs.Editor is aimed at web development, it has one requirement: the 3rd-party editor must be compatible with HTML documents.

To edit a document with GroupDocs.Editor, the user must perform several sequential steps: load the document into GroupDocs.Editor (with optional load options), open the document for editing (with optional edit options), generate HTML markup with resources (using different options and settings), and pass this markup to the 3rd-party WYSIWYG HTML-editor. The end-user then edits the document content. When the end-user finishes editing and submits the edited document, the modified markup should be transferred back to GroupDocs.Editor and converted to an output document of the desired format.

From the GroupDocs.Editor perspective, this pipeline can be divided into three main stages, which are described below.

Loading a document into GroupDocs.Editor

At the loading stage the user should create an instance of the Editor class and pass it an input document (as a file path or a byte stream) along with document load options. Load options are not required: GroupDocs.Editor can automatically detect the document format and select the most appropriate default options for that format. However, it is recommended to specify them explicitly, and they are required when loading password-protected documents.

string inputFilePath = "C:\\input_path\\document.docx"; //path to some document
Editor editor = new Editor(inputFilePath); //passing path to the constructor, default WordProcessingLoadOptions will be applied automatically
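
Since load options are required for password-protected documents, the following is a minimal sketch of loading one with explicit options. The file path and password are hypothetical, and the sketch assumes a release in which the Editor constructor accepts the load options object directly; some older releases expect a delegate that returns the options instead.

```csharp
using GroupDocs.Editor;
using GroupDocs.Editor.Options;

class LoadWithOptionsExample
{
    static void Main()
    {
        // Hypothetical path and password, used only for illustration.
        string inputFilePath = "C:\\input_path\\protected.docx";

        // Explicit load options for the WordProcessing family.
        WordProcessingLoadOptions loadOptions = new WordProcessingLoadOptions();
        loadOptions.Password = "some_password";

        // Assumption: this overload takes the options object directly.
        // Older releases may require a delegate instead, for example:
        //   new Editor(inputFilePath, delegate { return loadOptions; })
        using (Editor editor = new Editor(inputFilePath, loadOptions))
        {
            // The document is now loaded and can be opened with Edit().
        }
    }
}
```

Wrapping the Editor instance in a using block releases the underlying document resources when processing is finished.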

After this stage, the document is ready to be opened and edited.

Opening a document for editing

Because GroupDocs.Editor is a GUI-less library, a document cannot be edited directly in it. To make the document editable in a WYSIWYG HTML-editor, GroupDocs.Editor needs to generate an HTML version of the document, because such editors work only with HTML/CSS markup. Once an instance of the Editor class has been created at the first stage, the user should open the document for editing by calling the Edit() method of the Editor class. This method returns an instance of the EditableDocument class. This class can be described as a converted version of the input document, stored in an internal intermediate format that is compatible with all formats that GroupDocs.Editor supports. With EditableDocument the user can obtain the HTML markup of the input document with different options, obtain stylesheets, images, and fonts, save the HTML document to disk, and so on. The HTML markup emitted by EditableDocument is then expected to be passed to the client-side WYSIWYG HTML-editor, where the end-user can actually edit the document.

As with loading, the Edit() method accepts an optional IEditOptions implementation, which controls exactly how the document is opened for editing.

WordProcessingEditOptions editOptions = new WordProcessingEditOptions();
editOptions.EnableLanguageInformation = true;

EditableDocument readyToEdit = editor.Edit(editOptions);
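
The sketch below illustrates how the markup and resources might be pulled out of an EditableDocument before they are handed to an HTML editor. The file paths are hypothetical; the member names used here (GetContent, GetEmbeddedHtml, Images, Css, Fonts, Save) should be checked against the API reference of the installed version.

```csharp
using System;
using GroupDocs.Editor;
using GroupDocs.Editor.Options;

class ObtainMarkupExample
{
    static void Main()
    {
        // Hypothetical input path, used only for illustration.
        string inputFilePath = "C:\\input_path\\document.docx";

        using (Editor editor = new Editor(inputFilePath))
        using (EditableDocument readyToEdit = editor.Edit(new WordProcessingEditOptions()))
        {
            // HTML markup that references external resources (images, CSS, fonts).
            string htmlMarkup = readyToEdit.GetContent();

            // Single self-contained HTML string with all resources embedded.
            string embeddedHtml = readyToEdit.GetEmbeddedHtml();

            // Resources are exposed as separate collections.
            Console.WriteLine("Images: " + readyToEdit.Images.Count);
            Console.WriteLine("Stylesheets: " + readyToEdit.Css.Count);
            Console.WriteLine("Fonts: " + readyToEdit.Fonts.Count);

            // Alternatively, write the HTML document to disk (hypothetical path).
            readyToEdit.Save("C:\\output_path\\document.html");
        }
    }
}
```

Which form to use depends on the HTML editor: an embedded string is easier to send in a single response, while separate resources keep the markup smaller.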

After this stage, the document is ready to be passed to the WYSIWYG HTML-editor, where its content can be edited by the end-user.

Saving a document

Saving a document is the final stage. It occurs when the document content has been edited in the WYSIWYG HTML-editor (or in any other software, which makes no difference to GroupDocs.Editor) and should be saved back as a document of some format (DOCX, PDF, or XLSX, for example). At this stage the user should create a new instance of the EditableDocument class from the HTML markup and resources of the edited version of the original document, as obtained from the end-user. The EditableDocument class contains several static methods that create its instances from HTML documents presented in different forms. Once the EditableDocument instance is ready, it can be saved as an ordinary document using the Save() method of the Editor class.

// Create an EditableDocument from edited markup; null means there are no external resources
EditableDocument afterEdit = EditableDocument.FromMarkup("<body>HTML content of the document...</body>", null);
string outputFilePath = "C:\\output_path\\document.rtf";
// Save options define the output format (RTF in this case)
WordProcessingSaveOptions saveOptions = new WordProcessingSaveOptions(WordProcessingFormats.Rtf);
editor.Save(afterEdit, outputFilePath, saveOptions);

Unlike load options and edit options, save options are mandatory, because GroupDocs.Editor needs to know the exact output document format.
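
As an illustration of another static factory method, the sketch below builds an EditableDocument from an HTML file on disk and saves the result to a stream. All paths are hypothetical, and the FromFile signature (HTML file path plus resource folder path) is assumed to match the installed version.

```csharp
using System.IO;
using GroupDocs.Editor;
using GroupDocs.Editor.Formats;
using GroupDocs.Editor.Options;

class SaveFromHtmlFileExample
{
    static void Main()
    {
        // Hypothetical paths, used only for illustration.
        string originalPath = "C:\\input_path\\document.docx";
        string editedHtmlPath = "C:\\edited\\document.html";
        string resourceFolderPath = "C:\\edited\\document_files";
        string outputPath = "C:\\output_path\\document.docx";

        using (Editor editor = new Editor(originalPath))
        // Build the edited document from an HTML file and its resource folder.
        using (EditableDocument afterEdit = EditableDocument.FromFile(editedHtmlPath, resourceFolderPath))
        using (FileStream output = File.Create(outputPath))
        {
            // Save options are mandatory: they define the output format.
            WordProcessingSaveOptions saveOptions = new WordProcessingSaveOptions(WordProcessingFormats.Docx);
            editor.Save(afterEdit, output, saveOptions);
        }
    }
}
```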

Detecting document type

Sometimes it is necessary to detect a document type and extract its metadata before sending the document for editing. For such scenarios GroupDocs.Editor can detect the document type and extract the most essential meta info, depending on the document type:

  1. Whether the document is encrypted (password-protected) or not;
  2. Exact document format;
  3. Document size;
  4. Number of pages (tabs);
  5. Text encoding, if document is textual.

To detect the document type and gather its meta info, the user should load the desired document into the Editor class and then call the GetDocumentInfo() method.
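
A minimal sketch of such a call is shown below. The path is hypothetical, and the property names of the returned object (Format, PageCount, Size, IsEncrypted) are assumed to match the installed version of the library.

```csharp
using System;
using GroupDocs.Editor;
using GroupDocs.Editor.Metadata;

class DocumentInfoExample
{
    static void Main()
    {
        // Hypothetical input path, used only for illustration.
        string inputFilePath = "C:\\input_path\\document.docx";

        using (Editor editor = new Editor(inputFilePath))
        {
            // The argument is the password; null is passed for an unprotected document.
            IDocumentInfo info = editor.GetDocumentInfo(null);

            Console.WriteLine("Format: " + info.Format.Name);
            Console.WriteLine("Extension: " + info.Format.Extension);
            Console.WriteLine("Pages (tabs): " + info.PageCount);
            Console.WriteLine("Size in bytes: " + info.Size);
            Console.WriteLine("Encrypted: " + info.IsEncrypted);

            // Textual documents (TXT, XML, DSV) additionally report a text encoding.
            if (info is TextualDocumentInfo textualInfo)
            {
                Console.WriteLine("Encoding: " + textualInfo.Encoding.WebName);
            }
        }
    }
}
```

The concrete type of the returned object corresponds to the Metadata column of the family formats table, so it can be cast to that type when family-specific details are needed.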

Describing options

At every stage the user can tune the processing with the corresponding options:

  1. ILoadOptions for loading document.
  2. IEditOptions for opening document for editing.
  3. ISaveOptions for saving edited document.

Some of these options are optional in specific cases, and some are mandatory. For example, it is possible to load a document into the Editor class without load options. In this case GroupDocs.Editor will try to detect the document format automatically and apply the most appropriate default options for the detected format.

Describing family formats

All document formats that GroupDocs.Editor supports are grouped into family formats. The formats within a family have a lot of common features, so options are defined per family format rather than per individual format. The relation between formats, family formats, import/export capabilities, and options is shown in the table below.

| Family format | Supported formats | Load | Save | Load options | Edit options | Save options | Metadata |
|---|---|---|---|---|---|---|---|
| WordProcessing | DOC, DOCX, DOCM, DOT, DOTX, DOTM, RTF, WordprocessingML Flat XML, ODT, OTT, Word 2003 XML | Yes | Yes | WordProcessingLoadOptions | WordProcessingEditOptions | WordProcessingSaveOptions | WordProcessingDocumentInfo |
| Spreadsheet | XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, XLAM, SpreadsheetML XML, ODS, FODS, SXC, DIF | Yes | Yes | SpreadsheetLoadOptions | SpreadsheetEditOptions | SpreadsheetSaveOptions | SpreadsheetDocumentInfo |
| DSV | CSV, TSV, semicolon-separated, whitespace-separated, arbitrary separator | Yes | Yes | N/A | DelimitedTextEditOptions | DelimitedTextSaveOptions | N/A |
| Presentation | PPT, PPTX, PPTM, PPS, PPSX, PPSM, POT, POTX, POTM, ODP, OTP | Yes | Yes | PresentationLoadOptions | PresentationEditOptions | PresentationSaveOptions | PresentationDocumentInfo |
| XML | Any XML document | Yes | No | N/A | XmlEditOptions | N/A | TextualDocumentInfo |
| TXT | Any text document | Yes | Yes | N/A | TextEditOptions | TextSaveOptions | TextualDocumentInfo |
| Fixed-layout format | PDF | Yes | Yes | PdfLoadOptions | PdfEditOptions | PdfSaveOptions | FixedLayoutDocumentInfo |
| Fixed-layout format | XPS (including OpenXPS) | Yes | Yes | N/A | XpsEditOptions | XpsSaveOptions | FixedLayoutDocumentInfo |
| e-Book | Mobi, AZW3, ePub | Yes | Yes | N/A | EbookEditOptions | Azw3SaveOptions / EpubSaveOptions | EbookDocumentInfo |
| Email | EML, EMLX, TNEF, MSG, HTML, MHTML, ICS, VCF, PST, MBOX, OFT | Yes | Yes | N/A | EmailEditOptions | EmailSaveOptions | EmailDocumentInfo |
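
To illustrate how the option classes follow the family format, here is a hedged sketch of the same load, edit, and save pipeline for the Spreadsheet family. The paths are hypothetical, and the WorksheetIndex property is assumed to select the worksheet to open by its zero-based index.

```csharp
using GroupDocs.Editor;
using GroupDocs.Editor.Formats;
using GroupDocs.Editor.Options;

class SpreadsheetFamilyExample
{
    static void Main()
    {
        // Hypothetical paths, used only for illustration.
        string inputFilePath = "C:\\input_path\\workbook.xlsx";
        string outputFilePath = "C:\\output_path\\worksheet.xlsx";

        // No load options are passed, so defaults for the detected format apply.
        using (Editor editor = new Editor(inputFilePath))
        {
            // Edit options of the Spreadsheet family.
            SpreadsheetEditOptions editOptions = new SpreadsheetEditOptions();
            editOptions.WorksheetIndex = 0; // assumed: zero-based index of the worksheet to open

            using (EditableDocument worksheet = editor.Edit(editOptions))
            {
                // In a real scenario the markup would be edited by the end-user here;
                // this sketch saves the opened worksheet back unchanged.
                SpreadsheetSaveOptions saveOptions = new SpreadsheetSaveOptions(SpreadsheetFormats.Xlsx);
                editor.Save(worksheet, outputFilePath, saveOptions);
            }
        }
    }
}
```

The structure is the same as in the WordProcessing examples; only the option classes and the format enumeration change with the family.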

Additional materials

Detailed information about every stage of document processing, along with source code examples and explanations of the options, can be found in the following articles:

  1. Load document
  2. Edit document
  3. Save document

A complete description of the EditableDocument class, including its purpose, members, and capabilities, along with source code examples, is located in the following articles:

A detailed review of all supported family formats, together with explanations of their load/edit/save options and source code illustrations, can be found in the following articles: