How to edit e-Book files


GroupDocs.Editor for Java supports three formats from the e-Book family:

  1. MOBI (MobiPocket),
  2. AZW3, also known as Kindle Format 8 (KF8),
  3. ePub (Electronic Publication).

Before version 23.9, the AZW3 and ePub formats were supported on both import (load) and export (save), while MOBI was supported only on import.

Starting from version 23.9, the MOBI format is also fully supported on export.

Load e-Book files for edit

GroupDocs.Editor for Java has no loading options for e-Books, neither for the e-Book format family as a whole nor for any specific e-Book format. Users specify an e-Book by file path or byte stream, without any loading options.

The code example below loads three e-Books in different formats into three instances of the Editor class, each from a different source: a file path, a file stream, and an in-memory byte stream.

String mobiPath = "path/to/"; // append the name of a MOBI file
String azw3Path = "path/to/Around the World in 28 Languages.azw3";
String epubPath = "path/to/Alices Adventures in Wonderland.epub";

Editor editorMobi = new Editor(mobiPath);

FileInputStream azw3Stream = new FileInputStream(azw3Path);
Editor editorAzw3 = new Editor(azw3Stream);

byte[] epubBytes = Files.readAllBytes(Paths.get(epubPath));
ByteArrayInputStream epubStream = new ByteArrayInputStream(epubBytes);
Editor editorEpub = new Editor(epubStream);

// ...
// Don't forget to dispose Editors when work is done
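
The disposal reminder above can be made harder to forget with try-with-resources. The sketch below is a JDK-only illustration: DisposeScope and register are hypothetical names, not part of GroupDocs.Editor, and the commented usage assumes that Editor exposes a dispose() method.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Collects cleanup actions and runs them in reverse registration order on close().
 * Hypothetical helper for illustration; not part of GroupDocs.Editor.
 */
final class DisposeScope implements AutoCloseable {
    private final Deque<Runnable> actions = new ArrayDeque<>();

    /** Registers a cleanup action for a resource and returns the resource unchanged. */
    <T> T register(T resource, Runnable disposeAction) {
        actions.push(disposeAction); // push adds to the head, so the deque acts as a stack
        return resource;
    }

    @Override
    public void close() {
        // The last registered resource is disposed first
        while (!actions.isEmpty()) {
            actions.pop().run();
        }
    }
}

// Possible usage, assuming Editor has a dispose() method:
//
// try (DisposeScope scope = new DisposeScope()) {
//     Editor editorMobi = new Editor(mobiPath);
//     scope.register(editorMobi, editorMobi::dispose);
//     // ... work with editorMobi ...
// } // dispose() is called here, even if an exception was thrown
```

In this sketch, an exception thrown by one cleanup action prevents the remaining actions from running; a production version would probably collect such exceptions and continue.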

Edit e-Book files

All formats in the e-Book family share a common set of edit options: the EbookEditOptions class. Its content resembles that of the WordProcessingEditOptions class, because EbookEditOptions contains a subset of the options from WordProcessingEditOptions. Currently there are two options, getEnablePagination() and getEnableLanguageInformation(), and, as in WordProcessingEditOptions, both are disabled (false) by default.

  • getEnablePagination(): enables or disables pagination in the resultant HTML document. Disabled (false) by default. This option controls how the content of the e-Book is converted to the EditableDocument representation for editing: in float mode (false) or in paged mode (true). Ultimately, it affects the structure and presentation of the HTML/CSS document that the end user edits in the WYSIWYG editor.
  • getEnableLanguageInformation(): controls whether language information is exported (true) or not (false) to the resultant HTML markup. Disabled (false) by default. This is useful when an e-Book contains text in different languages and you want to preserve this language-specific meta-information while editing the document in the WYSIWYG editor.

As with all supported document formats and options, to edit a document the user should first load it into the Editor class (as described in the section above) and then call the edit(IEditOptions) method. The EbookEditOptions are optional: the user may call the parameterless edit() method, in which case the default EbookEditOptions are applied implicitly.

The code example below loads a single ePub file into an Editor instance and edits it twice with two different sets of edit options (default and custom), producing two different EditableDocument instances from the same input file. Two different pieces of HTML markup are then generated from these two EditableDocument instances.

String epubPath = "path/to/Alices Adventures in Wonderland.epub";

Editor editorEpub = new Editor(epubPath);

EbookEditOptions defaultEditOptions = new EbookEditOptions();

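EbookEditOptions customEditOptions = new EbookEditOptions();
// Custom options: enable both features, which are disabled by default
customEditOptions.setEnablePagination(true);
customEditOptions.setEnableLanguageInformation(true);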

EditableDocument defaultEdited = editorEpub.edit(defaultEditOptions);
EditableDocument customEdited = editorEpub.edit(customEditOptions);

String embeddedHtmlDefaultEdited = defaultEdited.getEmbeddedHtml();
String embeddedHtmlCustomEdited = customEdited.getEmbeddedHtml();

// ...
// Don't forget to dispose Editor and EditableDocuments when work is done

Save e-Book files after edit

This section describes saving e-Book files in GroupDocs.Editor version 23.9 and newer. For older versions, this description does not apply and the source code will not work.

Starting from version 23.9, GroupDocs.Editor for Java can save e-Books in all three supported formats: MOBI, AZW3, and ePub. E-Books are saved in the same way as all other formats. Once the e-Book content has been edited by the client in the WYSIWYG editor and sent back to the server side, it should be passed to an EditableDocument, and this instance should then be passed to the save() method of the Editor class.

The save() method accepts an instance of the ISaveOptions interface; for saving to any of the e-Book formats, the EbookSaveOptions class should be used. This class is common to all supported formats of the e-Book family: MOBI, AZW3, and ePub. It has one constructor with a mandatory parameter: the desired output format in which the resultant document will be stored. The format is specified as one of the EBookFormats values: Mobi, Azw3, or Epub. Once the instance has been created, the format can be read and changed through the OutputFormat property.

The EbookSaveOptions class also has two other properties: SplitHeadingLevel and ExportDocumentProperties.

  • getSplitHeadingLevel(), of type int, controls whether and how the content of the e-Book is split into packages in the resultant file. It does not affect how the file looks when opened in an e-Book reader; it only concerns the internal structure of the e-Book file. If the internal structure does not matter to you, leave this property at its default value (2). Setting it to 0 disables splitting, so all content of the e-Book is incorporated into a single package inside the resultant file.
  • getExportDocumentProperties(), of type boolean, controls whether built-in and custom document properties are exported into the resultant e-Book file. If you do not plan to convert the resultant e-Book to another format later, you may leave it unchanged: the default value false disables exporting of document properties, so the resultant document is slightly smaller.

The code example below loads a single ePub file into an Editor instance, edits it with default options, and saves it to ePub, AZW3, and MOBI, using a separate EbookSaveOptions instance for each format.

String epubPath = "path/to/Alices Adventures in Wonderland.epub";

String epubOutputPath = "Output_ePub.epub";
String azw3OutputPath = "Output_AZW3.azw3";
String mobiOutputPath = "Output_Mobi.mobi";

Editor editor = new Editor(epubPath);

//edit with default EbookEditOptions
EditableDocument edited = editor.edit();

EbookSaveOptions epubSaveOptions = new EbookSaveOptions(EBookFormats.Epub);

EbookSaveOptions azw3SaveOptions = new EbookSaveOptions(EBookFormats.Azw3);

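EbookSaveOptions mobiSaveOptions = new EbookSaveOptions(EBookFormats.Mobi);

// Save the same edited document to each of the three e-Book formats
editor.save(edited, epubOutputPath, epubSaveOptions);
editor.save(edited, azw3OutputPath, azw3SaveOptions);
editor.save(edited, mobiOutputPath, mobiSaveOptions);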

// ...
// Don't forget to dispose Editor and EditableDocument when work is done
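
When one input file is saved to several e-Book formats, it helps to derive each output name from the target format so the extension always matches the content. The sketch below is a JDK-only illustration: EbookOutputNames, Target, and outputName are hypothetical names, and Target is a stand-in for illustration, not the GroupDocs EBookFormats type.

```java
/**
 * Derives output file names for the three e-Book targets.
 * Hypothetical helper for illustration; not part of GroupDocs.Editor.
 */
final class EbookOutputNames {

    /** Stand-in for the three e-Book formats, each with its file extension. */
    enum Target {
        MOBI("mobi"), AZW3("azw3"), EPUB("epub");

        final String extension;

        Target(String extension) {
            this.extension = extension;
        }
    }

    /** Replaces the extension of the input file name with the extension of the target format. */
    static String outputName(String inputFileName, Target target) {
        int dot = inputFileName.lastIndexOf('.');
        // dot > 0 keeps names such as ".hidden" intact instead of treating them as pure extension
        String base = dot > 0 ? inputFileName.substring(0, dot) : inputFileName;
        return base + "." + target.extension;
    }

    private EbookOutputNames() {
    }
}
```

For example, calling outputName with "Alices Adventures in Wonderland.epub" and Target.MOBI yields a name ending in ".mobi", which can then be used as the file path passed when saving.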

Extracting metainfo from e-Book files

As with all supported formats, GroupDocs.Editor for Java can detect document meta-information for all supported e-Book formats through the getDocumentInfo() method of the Editor class. When a valid e-Book is loaded into the Editor instance, getDocumentInfo() returns an instance of the EbookDocumentInfo class, which implements the IDocumentInfo interface, which in turn defines four properties: getFormat(), getPageCount(), getSize(), and isEncrypted().

  • getFormat() returns an EBookFormats value, which for e-Books can be Mobi, Azw3, or Epub.
  • getPageCount() returns an approximate number of pages for MOBI or AZW3, or the number of chapters for ePub. For MOBI and AZW3 the number is approximate because these formats are internally a set of HTML documents (chapters) that are not divided into pages and have no fixed page dimensions that would allow splitting the content into page blocks and counting them. The format designers made this decision intentionally, to allow a variable page size (and count) on different devices, from Full HD displays to smartphones. To return a page count for a MOBI/AZW3 document, GroupDocs.Editor therefore assumes the standard A4 page size in portrait orientation, splits the document content into such pages, and counts them. Treat the returned number as a rough estimate and do not rely on it.
  • getSize() returns the size of the e-Book file in bytes.
  • isEncrypted() always returns false, because e-Books cannot be encrypted with a password the way PDF or Office Open XML documents can.
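
To see why a page count for reflowable content depends entirely on the assumed page size, consider the following sketch. It is not the algorithm used by GroupDocs.Editor, only an illustration of the idea: PageEstimate and estimatePages are hypothetical names, and the content height is assumed to be already known from some layout step.

```java
/**
 * Illustrates how an approximate page count follows from an assumed page height.
 * Hypothetical helper for illustration; not the GroupDocs.Editor implementation.
 */
final class PageEstimate {

    /** Height of an A4 page in portrait orientation: 297 mm, about 841.89 points (1 pt = 1/72 inch). */
    static final double A4_HEIGHT_PT = 841.89;

    /**
     * Counts how many pages of the given height are needed to hold the content.
     *
     * @param contentHeightPt total height of the laid-out content, in points
     * @param pageHeightPt    height of one page, in points; must be positive
     */
    static int estimatePages(double contentHeightPt, double pageHeightPt) {
        if (pageHeightPt <= 0) {
            throw new IllegalArgumentException("pageHeightPt must be positive");
        }
        if (contentHeightPt <= 0) {
            return 0; // no content, no pages
        }
        // A partially filled last page still counts as a page
        return (int) Math.ceil(contentHeightPt / pageHeightPt);
    }

    private PageEstimate() {
    }
}
```

With these assumptions, content that is 10000 points tall needs 12 pages at A4 height but 25 pages on a screen that shows 400 points at a time, which is why the returned page count should be treated only as an estimate.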

The code example below loads three e-Books in different formats (MOBI, AZW3, and ePub) into three instances of the Editor class, then extracts information about them and checks it with test assertions.

String mobiPath = ""; // set to the path of a MOBI file
String azw3Path = "Ebook.azw3";
String epubPath = "Ebook.epub";

Editor editorMobi = new Editor(mobiPath);
Editor editorAzw3 = new Editor(azw3Path);
Editor editorEpub = new Editor(epubPath);

IDocumentInfo mobiInfo = editorMobi.getDocumentInfo(null);
IDocumentInfo azw3Info = editorAzw3.getDocumentInfo(null);
IDocumentInfo epubInfo = editorEpub.getDocumentInfo(null);

Assert.assertEquals(EBookFormats.Mobi, mobiInfo.getFormat());
Assert.assertEquals(EBookFormats.Azw3, azw3Info.getFormat());
Assert.assertEquals(EBookFormats.Epub, epubInfo.getFormat());

// ...
// Don't forget to dispose Editors when work is done