Character replacement during indexing can be used, for example, to convert all text to lowercase or to remove diacritics from text. Such replacements can reduce the size of an index on disk when character case or diacritics are not significant. See also the Character replacements page in the Managing dictionaries section.
The example below demonstrates how to configure and use character replacements during indexing.
String indexFolder = "c:\\MyIndex\\";
String documentFolder = "c:\\MyDocuments\\";

// Enabling character replacements in the index settings
IndexSettings settings = new IndexSettings();
settings.setUseCharacterReplacements(true);

// Creating an index in the specified folder
Index index = new Index(indexFolder, settings);

// Configuring character replacements
// Deleting all existing character replacements from the dictionary
index.getDictionaries().getCharacterReplacements().clear();

// Creating new character replacements
CharacterReplacementPair[] characterReplacements = new CharacterReplacementPair[Character.MAX_VALUE + 1];
for (int i = 0; i < characterReplacements.length; i++) {
    char character = (char) i;
    char replacement = Character.toLowerCase(character);
    characterReplacements[i] = new CharacterReplacementPair(character, replacement);
}

// Adding character replacements to the dictionary
index.getDictionaries().getCharacterReplacements().addRange(characterReplacements);

// Indexing documents from the specified folder
index.add(documentFolder);

// Searching in the index
// Case-sensitive search is no longer possible for this index, since all characters are lowercase
// By default, case-insensitive search is performed
SearchResult result = index.search("Einstein");
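The example above covers lowercasing only. The other use mentioned earlier, removing diacritics, could be sketched with the standard JDK class java.text.Normalizer as shown below. The class name DiacriticFolding and its methods fold and buildTable are hypothetical helpers introduced for illustration and are not part of the library. The sketch assumes that stripping Unicode combining marks after canonical decomposition is an acceptable definition of "removing diacritics", which may not suit every language.

```java
import java.text.Normalizer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the library: computes replacement
// characters with plain JDK calls only.
public class DiacriticFolding {

    // Returns the character with its combining marks stripped, or the
    // original character when no single-character base form exists.
    public static char fold(char c) {
        if (Character.isSurrogate(c)) {
            return c; // half of a surrogate pair, leave untouched
        }
        String decomposed = Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD);
        String stripped = decomposed.replaceAll("\\p{M}", "");
        return stripped.length() == 1 ? stripped.charAt(0) : c;
    }

    // Collects only the characters that actually change when diacritics
    // are removed and the result is lowercased.
    public static Map<Character, Character> buildTable() {
        Map<Character, Character> table = new LinkedHashMap<>();
        for (int i = 0; i <= Character.MAX_VALUE; i++) {
            char original = (char) i;
            char replacement = Character.toLowerCase(fold(original));
            if (replacement != original) {
                table.put(original, replacement);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        Map<Character, Character> table = buildTable();
        System.out.println("Characters with a replacement: " + table.size());
        System.out.println("U+00C9 maps to: " + table.get('\u00C9'));
    }
}
```

Each entry of the resulting table could then be turned into a CharacterReplacementPair and passed to addRange in the same way as in the example above. Storing only the characters that change keeps the table small, under the assumption that characters without an entry are left as they are during indexing.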