How to edit e-Book files

Introduction

As of version 22.9, GroupDocs.Editor for .NET supports 3 formats from the e-Book family:

  1. MOBI (MobiPocket),
  2. AZW3, also known as Kindle Format 8 (KF8),
  3. ePub (Electronic Publication).

As of version 22.9, the AZW3 and ePub formats are supported on both import (load) and export (save), while MOBI is supported only on import (MOBI export was scheduled for the next release).

Starting from version 23.4, the MOBI format is also fully supported on export.

Load e-Book files for edit

GroupDocs.Editor for .NET has no loading options for e-Books, neither for the e-Book family as a whole nor for any specific e-Book format. Users simply specify the e-Book through a file path or a byte stream, without any loading options.

The code example below loads 3 e-Books in different formats into 3 different instances of the Editor class, each from a different source (a file path, a file stream, and an in-memory stream):

string mobiPath = "path/to/A-Room-with-a-View-morrison.mobi";
string azw3Path = "path/to/Around the World in 28 Languages.azw3";
string epubPath = "path/to/Alices Adventures in Wonderland.epub";

GroupDocs.Editor.Editor editorMobi = new Editor(mobiPath);

FileStream azw3Stream = File.OpenRead(azw3Path);
GroupDocs.Editor.Editor editorAzw3 = new Editor(delegate() { return azw3Stream; });

byte[] epubBytes = File.ReadAllBytes(epubPath);
MemoryStream epubStream = new MemoryStream(epubBytes);
GroupDocs.Editor.Editor editorEpub = new Editor(delegate () { return epubStream; });


// ...
// Don't forget to dispose Editors when work is done
editorMobi.Dispose();
editorAzw3.Dispose();
editorEpub.Dispose();
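
The explicit Dispose() calls above can also be replaced with a C# using statement, which releases the Editor even if an exception is thrown while working with it. The following is only a sketch: the class and method names (EbookLoadingSketch, WithEditor) are hypothetical, and it assumes that the Editor class implements IDisposable, as the Dispose() calls suggest.

```csharp
using System;
using GroupDocs.Editor;

// Hypothetical helper class, not part of GroupDocs.Editor
public static class EbookLoadingSketch
{
    // Opens the e-Book at the given path, runs the supplied work,
    // and disposes the Editor afterwards, even if the work throws.
    public static void WithEditor(string ebookPath, Action<Editor> work)
    {
        using (Editor editor = new Editor(ebookPath))
        {
            work(editor);
        }
    }
}
```

A call could then look like EbookLoadingSketch.WithEditor("path/to/Alices Adventures in Wonderland.epub", editor => { /* edit and save here */ });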

Edit e-Book files

There is a single edit options class for the whole e-Book family: EbookEditOptions. Its content resembles the WordProcessingEditOptions class, because EbookEditOptions contains a subset of the options from WordProcessingEditOptions: EnablePagination and EnableLanguageInformation. As in WordProcessingEditOptions, both are disabled (false) by default.

  • EnablePagination enables or disables pagination in the resultant HTML document. It is disabled (false) by default. This option controls how the content of the e-Book is converted to the EditableDocument representation for editing: in the float (false) or in the paged (true) mode. Ultimately, it affects the structure and representation of the HTML/CSS document that the end user edits in the WYSIWYG editor.
  • EnableLanguageInformation controls whether the language information is exported (true) or not exported (false) to the resultant HTML markup. It is disabled (false) by default. This is useful when an e-Book contains text in different languages and you want to preserve this language-specific metainformation while editing the document in the WYSIWYG editor.

As with all supported document formats and options, to edit a document the user should first load it into the Editor class (this step is described in the section above) and then call the Edit(IEditOptions) method. As with all supported document formats, EbookEditOptions is optional: the user may call the parameterless Edit() method, in which case the default EbookEditOptions are applied implicitly.

The code example below loads a single ePub file into an Editor instance and edits it twice with two different edit options (default and custom), producing two different EditableDocument instances from the single input ePub file. Two different pieces of HTML markup are then generated from these two EditableDocument instances.

string epubPath = "path/to/Alices Adventures in Wonderland.epub";

GroupDocs.Editor.Editor editorEpub = new Editor(epubPath);

Options.EbookEditOptions defaultEditOptions = new Options.EbookEditOptions();

Options.EbookEditOptions customEditOptions = new Options.EbookEditOptions();
customEditOptions.EnablePagination = true;
customEditOptions.EnableLanguageInformation = true;

EditableDocument defaultEdited = editorEpub.Edit(defaultEditOptions);
EditableDocument customEdited = editorEpub.Edit(customEditOptions);

string embeddedHtmlDefaultEdited = defaultEdited.GetEmbeddedHtml();
string embeddedHtmlCustomEdited = customEdited.GetEmbeddedHtml();

// ...
// Don't forget to dispose Editor and EditableDocuments when work is done
defaultEdited.Dispose();
customEdited.Dispose();
editorEpub.Dispose();

Save e-Book files after edit

Note
This section describes saving e-Book files in GroupDocs.Editor version 23.4 and newer. For older versions, this description is not valid and the source code does not work.

Starting from version 23.4, GroupDocs.Editor for .NET can save e-Books in all 3 supported formats: MOBI, AZW3, and ePub. Saving e-Books works the same way as for all other formats. Once the e-Book content has been edited by the client in the WYSIWYG editor and sent back to the server side, it should be passed to an EditableDocument, and this instance should then be passed to the GroupDocs.Editor.Editor.Save() method.

The Editor.Save() method accepts an instance of the ISaveOptions interface; for saving to any of the e-Book formats, the EbookSaveOptions class should be used. This class is common for all supported formats within the e-Book family: MOBI, AZW3, and ePub. It has one constructor with a mandatory parameter: the desired output format in which the resultant document will be stored. This format should be specified as one of the Formats.EBookFormats values: Mobi, Azw3, or Epub. Once the instance has been created, the format can be read and changed through the OutputFormat property.

The EbookSaveOptions class also has two other properties: SplitHeadingLevel and ExportDocumentProperties.

  • SplitHeadingLevel, of the System.Int32 type, controls how (if at all) the content of the e-Book is split into packages in the resultant file. It doesn’t affect how the file looks when opened in an e-Book reader; rather, it concerns the internal structure of the e-Book file. If you are not concerned with the internal structure of the e-Book file, you may leave this property at its default value (2). Setting it to 0 disables splitting, so all content of the e-Book is incorporated into a single package inside the resultant file.
  • ExportDocumentProperties, of the System.Boolean type, controls whether built-in and custom document properties are exported into the resultant e-Book file. If you have no plans to convert the resultant e-Book to some other format, you may leave it as is: the default false value disables exporting of the document properties, so the resultant document will be slightly smaller.
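
As a small illustration of the properties described above, the sketch below creates save options for one format, retargets them through OutputFormat, and disables splitting. The class and method names (EbookSaveOptionsSketch, CreateSinglePackageMobiOptions) are hypothetical; the sketch assumes version 23.4 or newer and uses only the members named in this section.

```csharp
using GroupDocs.Editor.Formats;
using GroupDocs.Editor.Options;

// Hypothetical helper class, not part of GroupDocs.Editor
public static class EbookSaveOptionsSketch
{
    public static EbookSaveOptions CreateSinglePackageMobiOptions()
    {
        // The constructor requires an output format; start with ePub
        EbookSaveOptions options = new EbookSaveOptions(EBookFormats.Epub);

        // Change the target format after the instance was created
        options.OutputFormat = EBookFormats.Mobi;

        // 0 disables splitting: all content goes into a single package
        options.SplitHeadingLevel = 0;

        // ExportDocumentProperties is left at its default (false)
        return options;
    }
}
```

The returned instance can be passed to Editor.Save() in the same way as the options in the example below.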

The code example below loads a single ePub file into an Editor instance, edits it with default options, and saves it to ePub, AZW3, and MOBI with different options for each one.

string epubPath = "path/to/Alices Adventures in Wonderland.epub";

string epubOutputPath = "Output_ePub.epub";
string azw3OutputPath = "Output_AZW3.azw3";
string mobiOutputPath = "Output_Mobi.mobi";

GroupDocs.Editor.Editor editor = new Editor(epubPath);

//edit with default EbookEditOptions
EditableDocument edited = editor.Edit();

Options.EbookSaveOptions epubSaveOptions = new Options.EbookSaveOptions(Formats.EBookFormats.Epub);
epubSaveOptions.ExportDocumentProperties = true;
epubSaveOptions.SplitHeadingLevel = 5;

Options.EbookSaveOptions azw3SaveOptions = new Options.EbookSaveOptions(Formats.EBookFormats.Azw3);
azw3SaveOptions.SplitHeadingLevel = 1;

Options.EbookSaveOptions mobiSaveOptions = new Options.EbookSaveOptions(Formats.EBookFormats.Mobi);

editor.Save(edited, epubOutputPath, epubSaveOptions);
editor.Save(edited, azw3OutputPath, azw3SaveOptions);
editor.Save(edited, mobiOutputPath, mobiSaveOptions);

// ...
// Don't forget to dispose Editor and EditableDocument when work is done
edited.Dispose();
editor.Dispose();

Saving e-Book files before version 23.4

Note
This section describes saving e-Book files in GroupDocs.Editor versions 22.9 - 23.2. Starting from version 23.4 (inclusive), this description is not valid and the source code does not work.

Starting from version 22.9 and before version 23.4, GroupDocs.Editor for .NET was able to save e-Books only in the AZW3 and ePub formats, while MOBI was not supported on export. Because of this, and unlike other format families (and unlike the single EbookEditOptions class, which is common for all e-Book formats), there was no single options class for saving into different e-Book formats at that time. Instead, there was a distinct save options class for each of the two supported output formats:

  • EpubSaveOptions for ePub,
  • Azw3SaveOptions for AZW3.

These classes had one property in common: SplitHeadingLevel, of the System.Int32 type. This property controls how (if at all) the content of an AZW3 or ePub e-Book is split into packages in the resultant file. It doesn’t affect how the file looks when opened in an e-Book reader; rather, it concerns the internal structure of the e-Book file. If you are not concerned with the internal structure of the ePub or AZW3 file, you may leave this property at its default value.

EpubSaveOptions also has an ExportDocumentProperties boolean property — it controls whether to export built-in and custom document properties inside the resultant IDPF ePub e-Book. If you have no plans to reconvert the resultant ePub to some other format, you may leave it intact — the default false value disables the exporting of the document properties, so the resultant document will be a little bit smaller in size.

The code example below loads a single ePub file into an Editor instance, edits it with default options, and saves it to ePub and AZW3 with different options for each one.

string epubPath = "path/to/Alices Adventures in Wonderland.epub";
string epubOutputPath = "Output_ePub.epub";
string azw3OutputPath = "Output_AZW3.azw3";

GroupDocs.Editor.Editor editor = new Editor(epubPath);

//edit with default EbookEditOptions
EditableDocument edited = editor.Edit();

Options.EpubSaveOptions epubSaveOptions = new Options.EpubSaveOptions();
epubSaveOptions.ExportDocumentProperties = true;
epubSaveOptions.SplitHeadingLevel = 5;

Options.Azw3SaveOptions azw3SaveOptions = new Options.Azw3SaveOptions();
azw3SaveOptions.SplitHeadingLevel = 1;

editor.Save(edited, epubOutputPath, epubSaveOptions);
editor.Save(edited, azw3OutputPath, azw3SaveOptions);

// ...
// Don't forget to dispose Editor and EditableDocument when work is done
edited.Dispose();
editor.Dispose();

Extracting metainfo from e-Book files

As with all supported formats, GroupDocs.Editor for .NET can detect the document metainfo for all supported e-Book formats through the GetDocumentInfo() method of the Editor class. When a valid e-Book is loaded into the Editor instance, GetDocumentInfo() returns an instance of the Metadata.EbookDocumentInfo class, which implements the IDocumentInfo interface, which in turn defines 4 properties: Format, PageCount, Size, and IsEncrypted.

  • The Format property returns a Formats.EBookFormats struct, which for e-Books can be the Mobi, Azw3, or Epub value.
  • The PageCount property returns an approximate number of pages for MOBI or AZW3, or the number of chapters for ePub. For MOBI and AZW3 the number is approximate because internally these formats are a set of HTML documents (chapters) that are not separated into pages and have no strict page dimensions that would allow splitting the content into page blocks and thus counting pages. The format designers made this decision intentionally, to allow a variable page size (and count) on different devices, from Full HD displays to smartphones. So, to return a page count for a MOBI/AZW3 document, GroupDocs.Editor assumes the standard A4 page size in portrait orientation, splits the existing document content into such “sheets”, and counts them. The returned number should therefore be treated as a rough estimate, and users should not rely on it.
  • The Size property returns the size of the e-Book file in bytes.
  • The IsEncrypted property always returns false, because e-Books cannot be encrypted with a password, unlike PDF or Office Open XML documents.
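
Because PageCount means different things for different formats, code that displays it may want to label the value accordingly. The sketch below shows one possible way to do that. The class and method names (EbookInfoSketch, DescribePageCount, FormatPageCount) and the label wording are hypothetical, and the sketch assumes that Formats.EBookFormats values can be compared with the Format property through Equals().

```csharp
using GroupDocs.Editor.Formats;
using GroupDocs.Editor.Metadata;

// Hypothetical helper class, not part of GroupDocs.Editor
public static class EbookInfoSketch
{
    // Labels the PageCount value according to the e-Book format
    public static string DescribePageCount(IDocumentInfo info)
    {
        // For ePub the value is a number of chapters, not pages
        bool isEpub = EBookFormats.Epub.Equals(info.Format);
        return FormatPageCount(isEpub, info.PageCount);
    }

    // Pure formatting step, kept separate so it can be checked without a file
    public static string FormatPageCount(bool countsChapters, long count)
    {
        return countsChapters
            ? count + " chapters"
            : "about " + count + " pages (A4 estimate)";
    }
}
```

For an already loaded e-Book, the helper could be called as EbookInfoSketch.DescribePageCount(editor.GetDocumentInfo(null)).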

The code example below loads 3 e-Books in different formats (MOBI, AZW3, and ePub) into 3 different instances of the Editor class, then extracts information about them and checks it with NUnit.

string mobiPath = "Ebook.mobi";
string azw3Path = "Ebook.azw3";
string epubPath = "Ebook.epub";

GroupDocs.Editor.Editor editorMobi = new Editor(mobiPath);
GroupDocs.Editor.Editor editorAzw3 = new Editor(azw3Path);
GroupDocs.Editor.Editor editorEpub = new Editor(epubPath);

GroupDocs.Editor.Metadata.IDocumentInfo mobiInfo = editorMobi.GetDocumentInfo(null);
GroupDocs.Editor.Metadata.IDocumentInfo azw3Info = editorAzw3.GetDocumentInfo(null);
GroupDocs.Editor.Metadata.IDocumentInfo epubInfo = editorEpub.GetDocumentInfo(null);

Assert.AreEqual(Formats.EBookFormats.Mobi, mobiInfo.Format);
Assert.AreEqual(Formats.EBookFormats.Azw3, azw3Info.Format);
Assert.AreEqual(Formats.EBookFormats.Epub, epubInfo.Format);

// ...
// Don't forget to dispose Editors when work is done
editorMobi.Dispose();
editorAzw3.Dispose();
editorEpub.Dispose();