This demonstration shows and explains all the necessary moments and options regarding processing DSV spreadsheets (Delimiter-Separated Values) like CSV and others.
Introduction
DSV (Delimiter-Separated Values) files are a specific form of text-based spreadsheets with delimiters (separators). Due to their nature, GroupDocs.Editor processes this class of documents separately from usual binary spreadsheets. In contrast to usual spreadsheets, DSV documents due to their textual nature have only a single tab (worksheet) and cannot be encoded. Any non-empty string may be treated as a separator, so the user always needs to specify it explicitly.
The most common types of DSV are:
CSV (Comma-Separated Values)
TSV (Tab-Separated Values)
Semicolon-separated values
Whitespace-separated values
…and any other
GroupDocs.Editor supports DSV with any separator, which can be a single character or a set of characters (string).
Loading a CSV file for edit
Unlike WordProcessing and Spreadsheet documents, DSV documents are loaded into the Editor class without any loading options. They are simple text files by their nature, so there is nothing to adjust:
In order to open any DSV document for editing, the user must use the DelimitedTextEditOptions class, whose single constructor has one mandatory parameter — a separator (delimiter) string, which should not be None or an empty string. There are also several optional properties:
convert_date_time_data and convert_numeric_data are boolean flags that indicate how to treat numbers. GroupDocs.Editor can recognize digits within cells and treat them as numbers or date-time values. By default this recognition is disabled, but the user can turn it on.
treat_consecutive_delimiters_as_one is a boolean flag that determines how consecutive delimiters should be treated — as several (default, False) or as a single one (True).
optimize_memory_usage is a boolean flag with a different purpose. By default, GroupDocs.Editor algorithms are tuned for maximum performance. However, in some rare cases the user may need to load a very large DSV file. By enabling this flag, the user switches GroupDocs.Editor to use other processing algorithms, which consume a relatively low amount of memory at the cost of lower performance.
The runnable example below loads a CSV file, edits it with comma as the delimiter and numeric recognition enabled, and then saves the edited content to a TSV file (a DSV with a tab separator).
importosfromgroupdocs.editorimportEditor,EditableDocument,Licensefromgroupdocs.editor.optionsimportDelimitedTextEditOptions,DelimitedTextSaveOptionsdefedit_csv():# Optionally set a licenselicense_path=os.path.abspath("./GroupDocs.Editor.lic")ifos.path.exists(license_path):License().set_license(license_path)# Load an input CSV file into the EditorwithEditor("./cars.csv")aseditor:# Create the DSV edit options with a comma separatoredit_options=DelimitedTextEditOptions(",")edit_options.convert_numeric_data=Trueedit_options.treat_consecutive_delimiters_as_one=True# Edit the CSV document and obtain an EditableDocumenteditable=editor.edit(edit_options)# Edit the content programmatically (in practice this is done in a WYSIWYG-editor)html=editable.get_content()edited=EditableDocument.from_markup(html)# Create DSV save options with a tab separator (TSV)save_options=DelimitedTextSaveOptions("\t")save_options.trim_leading_blank_row_and_column=Truesave_options.keep_separators_for_blank_row=False# Save the edited content to the TSV formateditor.save(edited,"./edited-cars.tsv",save_options)editable.dispose()edited.dispose()if__name__=="__main__":edit_csv()
cars.csv is the sample file used in this example. Click here to download it.
1997 Ford E350 ac, abs, moon 3000
1999 Chevy Venture «Extended Edition» 4900
1996 Jeep Grand Cherokee MUST SELL! air, moon roof, loaded 4799
2014 SsangYong Kyron good car! 1998
After being edited, an input DSV can be saved back to a DSV (not necessarily with the same separator) or to any supported Spreadsheet document. In order to save a document to the DSV format, the user must use the DelimitedTextSaveOptions class, which, like DelimitedTextEditOptions, has one constructor with a mandatory string parameter — a separator (delimiter), which should not be None or an empty string.
There are also other properties:
encoding — allows specifying the encoding of the generated DSV. By default, if not specified, it is UTF-8.
trim_leading_blank_row_and_column — a boolean flag that indicates whether leading blank rows and columns should be trimmed, like what MS Excel does.
keep_separators_for_blank_row — a boolean flag that indicates whether separators should be output for a blank row. The default value is False, which means the content for a blank row will be empty.
The edited content can also be saved to a Spreadsheet format such as XLSM. For this, a SpreadsheetSaveOptions instance with the desired output format is used: