To detect the encoding of a text file automatically, use the AutoDetectEncoding property of the IndexSettings class. Setting this property to true enables detection of the following encodings:
UTF-32 LE,
UTF-32 BE,
UTF-16 LE,
UTF-16 BE,
UTF-8,
UTF-7,
ANSI.
By default, automatic encoding detection for text files is disabled. Whether or not it is enabled, the encoding of a text file can be set explicitly during indexing, when the FileIndexing event is raised. If the encoding of a text file has been neither detected nor specified in the event arguments, the default encoding, UTF-8, is used. The available encodings are listed in the Encodings class. When the encoding of a text file is detected and used for indexing, it is saved in the index for later use by Index class methods such as Highlight and GetDocumentText.
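Enabling automatic detection might look like the following minimal sketch. It assumes that the Index constructor has an overload accepting an IndexSettings instance and that IndexSettings lives in the GroupDocs.Search.Options namespace. Neither detail is stated in this article, and the folder paths are placeholders.

```csharp
using GroupDocs.Search;          // assumed namespace of the Index class
using GroupDocs.Search.Options;  // assumed namespace of the IndexSettings class

string indexFolder = @"c:\MyIndex\";         // placeholder path
string documentsFolder = @"c:\MyDocuments\"; // placeholder path

// Creating index settings with encoding auto detection enabled
IndexSettings settings = new IndexSettings();
settings.AutoDetectEncoding = true;

// Creating an index with these settings
// (assumes a constructor overload that takes IndexSettings)
Index index = new Index(indexFolder, settings);

// Indexing documents from the specified folder
index.Add(documentsFolder);
```

With this setting, files whose encoding cannot be detected would still fall back to UTF-8 unless an encoding is specified in the FileIndexing event arguments.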
The example below shows how to set the encoding of a text file during indexing.
C#
string indexFolder = @"c:\MyIndex\";
string documentsFolder = @"c:\MyDocuments\";

// Creating an index
Index index = new Index(indexFolder);

// Subscribing to the event
index.Events.FileIndexing += (sender, args) =>
{
    if (args.DocumentFullPath.EndsWith(".txt", StringComparison.InvariantCultureIgnoreCase))
    {
        args.Encoding = Encodings.Windows_1253; // Setting encoding for each text file
    }
};

// Indexing documents from the specified folder
index.Add(documentsFolder);
More resources
GitHub examples
You may easily run the code from documentation articles and see the features in action in our GitHub examples: