Edit Markdown documents

This example demonstrates the standard open-edit-save pipeline with Markdown (MD) documents, using different options at each step.

Introduction

Markdown is a lightweight markup language that has become popular in recent years. Markdown files with the *.md extension are plain text files that use a special syntax to express text formatting, tables, lists, images, and so on. There are several dialects of Markdown, including GFM, CommonMark, and MultiMarkdown.

Starting from version 22.11, GroupDocs.Editor for .NET fully supports the Markdown format for both import and export, as well as automatic detection of the format.

As of version 22.11, GroupDocs.Editor for .NET supports the following Markdown features, which mostly follow the CommonMark specification and are represented as appropriate styles or direct formatting:

  • Headings
  • Blockquotes
  • Code blocks
  • Horizontal rules
  • Bold emphasis
  • Italic emphasis
  • Strikethrough formatting
  • Numbered and bulleted lists
  • Tables
  • Internal images, stored with base64 encoding
  • External images
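
To illustrate the difference between the last two items: an internal image carries its bytes directly in the Markdown image link as a base64 data URI, while an external image link only carries a path or URL. The C# sketch below builds both kinds of links. The class and method names (MarkdownImageLinks, ToInternalImage, ToExternalImage) are hypothetical helpers, not part of GroupDocs.Editor, and the exact markup that GroupDocs.Editor reads or writes may differ.

```csharp
using System;

// Hypothetical helper, not part of GroupDocs.Editor: shows what "internal"
// and "external" Markdown image links look like.
public static class MarkdownImageLinks
{
	// Internal image: the bytes are embedded in the link as a base64 data URI.
	public static string ToInternalImage(string altText, string mimeType, byte[] data)
	{
		string base64 = Convert.ToBase64String(data);
		return "![" + altText + "](data:" + mimeType + ";base64," + base64 + ")";
	}

	// External image: the link only points to a file path or URL.
	public static string ToExternalImage(string altText, string pathOrUrl)
	{
		return "![" + altText + "](" + pathOrUrl + ")";
	}
}
```

For example, ToExternalImage("logo", "images/image1.png") produces the link ![logo](images/image1.png), which only makes sense to a reader that can resolve the relative path.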

Loading

Loading Markdown documents into the Editor class works the same way as for other formats. There are no dedicated load options for the Markdown format; it is enough to specify the file itself through a file path or a byte stream.

The code example below loads the same Markdown file into two Editor instances, one through a file path and one through a byte stream, and then checks how GroupDocs.Editor detects the format of the specified file.

const string filename = "sample.md";
string inputPath = System.IO.Path.Combine("markdown folder", filename);

using (FileStream content = File.OpenRead(inputPath))
{
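	using (Editor fromStream = new Editor(content))
	using (Editor fromPath = new Editor(inputPath))
	{
		// Both instances should detect the same format, regardless of how the file was supplied
		GroupDocs.Editor.Metadata.IDocumentInfo infoFromStream = fromStream.GetDocumentInfo(null);
		GroupDocs.Editor.Metadata.IDocumentInfo infoFromPath = fromPath.GetDocumentInfo(null);
		Assert.AreEqual(Formats.TextualFormats.Md, infoFromStream.Format);
		Assert.AreEqual(Formats.TextualFormats.Md, infoFromPath.Format);
	}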
}

Editing

There is a special class, MarkdownEditOptions, for editing Markdown files. As always, it is not mandatory when editing a document, so the parameterless Editor.Edit() overload may be used; GroupDocs.Editor will then automatically detect the format and apply the default options.

However, specifying custom MarkdownEditOptions may be vital when the input Markdown file has external images. “External” means that the images are stored somewhere else, and the Markdown code only contains links to them. When MarkdownEditOptions are not specified and correctly tuned (as explained below), GroupDocs.Editor will not be able to locate such images. To resolve all external images, create an instance of MarkdownEditOptions and set its ImageLoadCallback property.

To do this, the user must create their own type that implements the IMarkdownImageLoadCallback interface. This interface defines a single method, ProcessImage, which receives a MarkdownImageLoadArgs instance and must return a value of the MarkdownImageLoadingAction enumeration.

The mechanism works as follows. GroupDocs.Editor parses the input Markdown file line by line, character by character. When it encounters a link to an external image, it creates an instance of MarkdownImageLoadArgs with the data about this image: its filename (or relative path, or URL) and a boolean flag that indicates whether it is a local link (in the filesystem) or a web link. If the ImageLoadCallback property of MarkdownEditOptions is set to a user implementation of IMarkdownImageLoadCallback, GroupDocs.Editor invokes its ProcessImage method, passing the prepared MarkdownImageLoadArgs into it, and then waits for the user code to make a decision by returning a value of the MarkdownImageLoadingAction enumeration.

If the user implementation returns MarkdownImageLoadingAction.Skip, the image is skipped: the resultant document will have an “empty area” where the image should be.

If the user implementation returns MarkdownImageLoadingAction.Default, GroupDocs.Editor will try to load the image by itself. This is possible if, for example, the link to the image is an absolute path or URL.

If the user implementation returns MarkdownImageLoadingAction.UserProvided, the user code must supply the image data itself by passing it to the SetData(byte[] data) method. In that case GroupDocs.Editor will process the specified binary data for this image.

The code example below shows exactly the last scenario, where the user creates their own implementation of the IMarkdownImageLoadCallback interface.

Let’s say we have the following file and folder structure:

  • root/input.md
  • root/images/image1.png
  • root/images/image2.jpeg

The input.md file is the Markdown document we want to open and edit. It is located in the “root” folder and references two raster images, image1.png and image2.jpeg, located in the “images” subfolder.

public void EditingMarkdown()
{
	string inputMdPath = Path.Combine("root", "input.md");
	string imagesFolder = Path.Combine("root", "images");

	// Creating the edit options
	Options.MarkdownEditOptions editOptions = new MarkdownEditOptions();
	editOptions.ImageLoadCallback = new MdImageLoader(imagesFolder);

	using (Editor editor = new Editor(inputMdPath))
	{
		EditableDocument beforeEdit = editor.Edit(editOptions);

		// Make sure there are 2 images here
		Assert.AreEqual(2, beforeEdit.Images.Count);
		Assert.AreEqual("png", beforeEdit.Images[0].Type.FileExtension);
		Assert.AreEqual("jpeg", beforeEdit.Images[1].Type.FileExtension);

		string originalHtmlContent = beforeEdit.GetEmbeddedHtml();

		// Send the 'originalHtmlContent' to the client-side WYSIWYG-editor,
		// obtain the edited version and create a new EditableDocument from it

		EditableDocument afterEdit = EditableDocument.FromMarkup(originalHtmlContent, null);

		// Make sure 2 images are still here
		Assert.AreEqual(2, afterEdit.Images.Count);
		Assert.AreEqual("png", afterEdit.Images[0].Type.FileExtension);
		Assert.AreEqual("jpeg", afterEdit.Images[1].Type.FileExtension);

		// Save to the DOCX, for example
		Options.WordProcessingSaveOptions saveOptions = new WordProcessingSaveOptions(Formats.WordProcessingFormats.Docx);
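		string outputDocxPath = Path.Combine("root", "Output." + saveOptions.OutputFormat.Extension);

		editor.Save(afterEdit, outputDocxPath, saveOptions);

		// Release the documents once they are no longer needed
		afterEdit.Dispose();
		beforeEdit.Dispose();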
	}
}

internal sealed class MdImageLoader : Options.IMarkdownImageLoadCallback
{
	private readonly string _imagesFolder;

	public MdImageLoader(string imagesFolder)
	{
		this._imagesFolder = imagesFolder;
	}

	public MarkdownImageLoadingAction ProcessImage(MarkdownImageLoadArgs args)
	{
		// Resolve the referenced image by its file name inside the known images folder
		string filePath = Path.Combine(this._imagesFolder, Path.GetFileName(args.ImageFileName));
		// File.ReadAllBytes reads the whole file; a single Stream.Read call is not guaranteed to do so
		byte[] data = File.ReadAllBytes(filePath);
		args.SetData(data);
		return MarkdownImageLoadingAction.UserProvided;
	}
}

This example uses a user-created MdImageLoader class. It is initialized with the path to the images folder, and its ProcessImage method reads the file content and passes it to the MarkdownImageLoadArgs instance through the SetData(byte[] data) method.
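
The MdImageLoader above throws an exception if a referenced file is missing from the images folder. A more defensive variant, sketched below, combines all three return values described earlier: it supplies the bytes when the file is found, leaves web links to GroupDocs.Editor, and skips everything else. It relies only on the members shown in this article (ImageFileName, SetData, and the three MarkdownImageLoadingAction values). The class name TolerantMdImageLoader, the FindLocalFile helper, the "://" check for web links, and the using directive for GroupDocs.Editor.Options are assumptions made for this sketch.

```csharp
using System.IO;
using GroupDocs.Editor.Options; // assumed namespace of the callback types

// Hypothetical variant of MdImageLoader that does not throw on missing files
internal sealed class TolerantMdImageLoader : IMarkdownImageLoadCallback
{
	private readonly string _imagesFolder;

	public TolerantMdImageLoader(string imagesFolder)
	{
		this._imagesFolder = imagesFolder;
	}

	// Returns the full path of the image inside the images folder, or null if it is not there
	internal static string FindLocalFile(string imagesFolder, string link)
	{
		string filePath = Path.Combine(imagesFolder, Path.GetFileName(link));
		return File.Exists(filePath) ? filePath : null;
	}

	public MarkdownImageLoadingAction ProcessImage(MarkdownImageLoadArgs args)
	{
		string link = args.ImageFileName;
		if (string.IsNullOrEmpty(link))
		{
			return MarkdownImageLoadingAction.Skip;
		}

		// Assumption: a link containing "://" is a web link, so GroupDocs.Editor may load it itself
		if (link.Contains("://"))
		{
			return MarkdownImageLoadingAction.Default;
		}

		string filePath = FindLocalFile(this._imagesFolder, link);
		if (filePath == null)
		{
			// Nothing to provide: leave an empty area instead of failing
			return MarkdownImageLoadingAction.Skip;
		}

		args.SetData(File.ReadAllBytes(filePath));
		return MarkdownImageLoadingAction.UserProvided;
	}
}
```

It can be assigned to MarkdownEditOptions.ImageLoadCallback in the same way as MdImageLoader. Whether a missing image should be skipped or treated as an error depends on the application.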

Saving

GroupDocs.Editor also supports saving to the Markdown format. As with any other format, to save to Markdown the user must create an instance of the MarkdownSaveOptions class and pass it to the GroupDocs.Editor.Editor.Save() method.

If the document to be saved in Markdown format has images, they should be “resolved” in a way similar to that described above. However, there is no callback for saving images. Instead, the user has three choices:

  1. Ignore images: they will be absent from the output.
  2. Save images inside the Markdown code, where they are stored in base64 encoding.
  3. Save images as separate files in the specified folder, with the Markdown code containing references to these image files.

The MarkdownSaveOptions class has several properties for this. ExportImagesAsBase64 is a boolean flag that is false by default. If set to true, the content of the images is injected into the output Markdown as base64. This flag also has the highest priority: when it is true, the ImagesFolder property is ignored.

The ImagesFolder property, in turn, takes effect only when ExportImagesAsBase64 is set to false. If specified, it should contain a valid full path to an existing folder where GroupDocs.Editor should save the images.

The example below opens an input DOCX file for editing and saves it to three different Markdown output files, each with its own save options.

const string filename = "SampleDoc.docx";
string inputPath = Path.Combine("some-path", filename);

string outputFolder = "Some-full-path-to-output-folder";

string outputMdEmbeddedFilePath = Path.Combine(outputFolder, "Output_Markdown_Embedded.md");
string outputMdExternalFilePath = Path.Combine(outputFolder, "Output_Markdown_External.md");
string outputMdAbsentFilePath = Path.Combine(outputFolder, "Output_Markdown_Absent.md");

GroupDocs.Editor.Editor editor = new Editor(inputPath);
EditableDocument beforeEdit = editor.Edit();
// Send content to client-side WYSIWYG-editor, edit it there, send back to the server-side
EditableDocument afterEdit = EditableDocument.FromMarkup(beforeEdit.GetEmbeddedHtml(), null);

{// Saving to "embedded" version, where all images are injected inside MD with base64 encoding
	Options.MarkdownSaveOptions mdSaveOptionsEmbedded = new MarkdownSaveOptions();
	mdSaveOptionsEmbedded.ExportImagesAsBase64 = true;
	editor.Save(afterEdit, outputMdEmbeddedFilePath, mdSaveOptionsEmbedded);
}

{// Saving to "external" version, where all images are stored in distinct files, while MD contains links to these files as paths in file system
	Options.MarkdownSaveOptions mdSaveOptionsExternal = new MarkdownSaveOptions();
	mdSaveOptionsExternal.ImagesFolder = outputFolder;
	editor.Save(afterEdit, outputMdExternalFilePath, mdSaveOptionsExternal);
}

{// Saving to "absent" version, where all images are skipped and are not present in MD
	Options.MarkdownSaveOptions mdSaveOptionsAbsent = new MarkdownSaveOptions();
	using (MemoryStream outputMdAbsentTemp = new MemoryStream())
	{
		editor.Save(afterEdit, outputMdAbsentTemp, mdSaveOptionsAbsent);
		outputMdAbsentTemp.Position = 0;
		using (FileStream outputMdAbsentStream = File.Create(outputMdAbsentFilePath))
		{
			outputMdAbsentTemp.CopyTo(outputMdAbsentStream);
		}
	}
}
	
// Disposing all resources
beforeEdit.Dispose();
afterEdit.Dispose();
editor.Dispose();

Note the workaround used for the “absent” version: the document content is first saved to a MemoryStream and then copied to a FileStream. This is needed because of how GroupDocs.Editor behaves when ExportImagesAsBase64 is set to false and ImagesFolder is not set: it analyzes the specified output stream. If this stream is a FileStream, GroupDocs.Editor obtains the path to its folder and saves the images there. A MemoryStream exposes no such path, so using it in this example makes GroupDocs.Editor skip the images.
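
The rules from this section can be summarized as a small decision function. The sketch below is only a model of the behavior described above, not library code: the ImageDestination enum and the MarkdownImageRules.Resolve method are hypothetical names, and the real implementation inside GroupDocs.Editor may check further conditions.

```csharp
using System.IO;

// Hypothetical model of how the image destination is chosen when saving to Markdown
public enum ImageDestination
{
	EmbeddedAsBase64,   // images are injected into the Markdown code
	ImagesFolder,       // images are saved into the folder given in ImagesFolder
	FolderOfOutputFile, // images are saved into the folder of the file behind the FileStream
	Absent              // images are skipped
}

public static class MarkdownImageRules
{
	public static ImageDestination Resolve(bool exportImagesAsBase64, string imagesFolder, Stream target)
	{
		// ExportImagesAsBase64 has the highest priority; ImagesFolder is ignored
		if (exportImagesAsBase64)
		{
			return ImageDestination.EmbeddedAsBase64;
		}

		if (!string.IsNullOrEmpty(imagesFolder))
		{
			return ImageDestination.ImagesFolder;
		}

		// Neither option is set: a FileStream reveals a folder, other streams do not
		return target is FileStream
			? ImageDestination.FolderOfOutputFile
			: ImageDestination.Absent;
	}
}
```

Read this way, the MemoryStream in the example above is simply the one input combination that leads to the Absent outcome.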

Roundtrip

Because the Markdown format is supported on both import and export, it is possible to perform a roundtrip: open a Markdown file for editing, edit it, and then save the edited version back to Markdown. The example below demonstrates such a scenario.

public void MarkdownRoundtrip()
{
	string inputFolderPath = "Some-full-path-to-input-folder";
	string outputFolder = "Some-full-path-to-output-folder";
	string outputMdPath = Path.Combine(outputFolder, "Output.md");

	const string filename = "ComplexImage.md";
	string inputPath = Path.Combine(inputFolderPath, filename);	

	Options.MarkdownEditOptions editOptions = new MarkdownEditOptions();
	editOptions.ImageLoadCallback = new MdImageLoader(inputFolderPath);

	Options.MarkdownSaveOptions saveOptions = new Options.MarkdownSaveOptions();
	saveOptions.TableContentAlignment = MarkdownTableContentAlignment.Center;
	saveOptions.ImagesFolder = outputFolder;

	using (Editor editor = new Editor(inputPath))
	{
		using (EditableDocument doc = editor.Edit(editOptions))
		{
			Assert.AreEqual(3, doc.Images.Count);
			// edit "doc" in WYSIWYG-editor and obtain its edited version

			editor.Save(doc, outputMdPath, saveOptions);
		}
	}
}

The MdImageLoader class used here is the same implementation shown in the Editing section above.