How to Edit CSV File

This guide explains all necessary aspects and options regarding processing DSV spreadsheets (Delimiter-Separated Values) using GroupDocs.Editor for Node.js via Java.

Introduction

DSV (Delimiter-Separated Values) documents are a specific form of text-based spreadsheets that use delimiters (separators) to separate values. Due to their nature, GroupDocs.Editor processes this class of documents separately from usual binary spreadsheets. Unlike typical spreadsheets, DSV documents have only a single tab (worksheet) and cannot be encoded. However, any non-empty string may be treated as a separator, so you always need to specify it explicitly.

The most common types of DSV are:

  1. CSV (Comma-Separated Values)
  2. TSV (Tab-Separated Values)
  3. Semicolon-Separated Values
  4. Whitespace-Separated Values
  5. …and any other delimiter

GroupDocs.Editor supports DSV with any separator, which can be a character or a set of characters (string).

Loading a CSV File for Editing

Unlike WordProcessing and Spreadsheet documents, DSV documents are loaded into the Editor class without any loading options. They are simple text files by their nature, so there is nothing to adjust:

// Import the necessary modules
const groupdocsEditor = require('groupdocs-editor');

// Specify the input CSV file path
const inputCsvPath = "C://input//spreadsheet.csv";

// Create an instance of the Editor
const editor = new groupdocsEditor.Editor(inputCsvPath);

Editing a CSV File

To open any DSV document for editing, you must use the DelimitedTextEditOptions class, whose constructor has one mandatory parameter—a string separator (delimiter) that should not be null or an empty string. There are also several optional properties:

  1. convertDateTimeData: Boolean flag indicating whether to recognize date-time values within cells. By default, this recognition is disabled.
  2. convertNumericData: Boolean flag indicating whether to recognize numeric values within cells. By default, this recognition is disabled.
  3. treatConsecutiveDelimitersAsOne: Boolean flag that determines how consecutive delimiters should be treated—as several (default false) or as a single one (true).
  4. optimizeMemoryUsage: Boolean flag with a completely different purpose. By default, GroupDocs.Editor algorithms are tuned for maximum performance to reduce latency time; memory consumption has a lower priority. However, in some rare cases, you may need to load and open a very large DSV file, several hundreds of MBs or even close to a GB. In such cases, GroupDocs.Editor (or Java, in fact) may throw an OutOfMemoryException. To cope with such use-cases, the optimizeMemoryUsage flag was introduced. By enabling it, you switch GroupDocs.Editor to use alternative processing algorithms, which consume relatively low amounts of memory at the cost of lower performance.

The example below demonstrates using the DelimitedTextEditOptions for editing a CSV document where the comma character is the delimiter:

// Import the necessary modules
const groupdocsEditor = require('groupdocs-editor');

// Prepare DelimitedTextEditOptions
const editOptions = new groupdocsEditor.DelimitedTextEditOptions(",");
editOptions.setConvertDateTimeData(false);
editOptions.setConvertNumericData(true);
editOptions.setTreatConsecutiveDelimitersAsOne(true);
editOptions.setOptimizeMemoryUsage(true);

// Open the document for editing
const document = editor.edit(editOptions);

Saving the Edited CSV File

After being edited, the input DSV can be saved back to DSV (not necessarily with the same separator) or to any supported Spreadsheet document. To save the document to DSV format, you must use the DelimitedTextSaveOptions class, which, like the DelimitedTextEditOptions, has a constructor with a mandatory string parameter—the separator (delimiter) that should not be null or an empty string.

There are also other properties:

  1. encoding: Allows specifying the encoding of the generated DSV. By default, if not specified, it is UTF-8.
  2. trimLeadingBlankRowAndColumn: Boolean flag that indicates whether leading blank rows and columns should be trimmed, similar to what MS Excel does.
  3. keepSeparatorsForBlankRow: Boolean flag that indicates whether separators should be output for a blank row. The default value is false, which means the content for a blank row will be empty.

The example below demonstrates loading a CSV file, opening it into an EditableDocument instance, and saving it to TSV and XLSM formats:

// Import the necessary modules
const groupdocsEditor = require('groupdocs-editor');

// Specify the input CSV file path
const inputCsvPath = "C://input//spreadsheet.csv";

// Create an instance of the Editor
const editor = new groupdocsEditor.Editor(inputCsvPath);

// Prepare DelimitedTextEditOptions
const editOptions = new groupdocsEditor.DelimitedTextEditOptions(",");
const document = editor.edit(editOptions);

// Prepare DelimitedTextSaveOptions for TSV
const tsvSaveOptions = new groupdocsEditor.DelimitedTextSaveOptions("\t");
tsvSaveOptions.setTrimLeadingBlankRowAndColumn(true);
tsvSaveOptions.setKeepSeparatorsForBlankRow(false);

// Prepare SpreadsheetSaveOptions for XLSM
const xlsmSaveOptions = new groupdocsEditor.SpreadsheetSaveOptions(groupdocsEditor.SpreadsheetFormats.Xlsm);

// Specify output paths
const tsvSavePath = "C://output//spreadsheet.tsv";
const xlsmSavePath = "C://output//spreadsheet.xlsm";

// Save the document as TSV
editor.save(document, tsvSavePath, tsvSaveOptions);

// Save the document as XLSM
editor.save(document, xlsmSavePath, xlsmSaveOptions);

In this example, the output file 'spreadsheet.tsv' will have UTF-8 encoding.


By following this guide, you can effectively edit CSV and other DSV files using GroupDocs.Editor for Node.js via Java, customizing the editing process according to your specific needs.

Note: Be sure to replace the paths in the code examples with the actual paths on your system.

For more examples and detailed information, you can refer to the GroupDocs.Editor for Node.js via Java Examples repository on GitHub.