GroupDocs.Search supports indexing of many document formats. You can also add support for a format that is not handled out of the box by implementing a custom text extractor.
The following example demonstrates how to implement a custom text extractor and use it for indexing.
const fs = require('fs');
const path = require('path');
// `groupdocs` and `java` are assumed to be loaded already

const indexFolder = 'c:/MyIndex/'; // Specify path to the index folder
const documentsFolder = 'c:/MyDocuments/'; // Specify path to a folder containing documents to search

const settings = new groupdocs.search.IndexSettings();
const logExtractor = java.newProxy('com.groupdocs.search.common.IFieldExtractor', {
  getExtensions: function () {
    const array = java.newArray('java.lang.String', ['.log']);
    return array;
  },
  getFields: function (filePath) {
    const fileName = path.resolve(filePath);
    const contents = fs.readFileSync(filePath, 'utf8');
    const fields = java.newArray('com.groupdocs.search.common.DocumentField', [
      new groupdocs.search.DocumentField('FileName', fileName),
      new groupdocs.search.DocumentField('Content', contents),
    ]);
    return fields;
  },
});
settings.getCustomExtractors().addItem(logExtractor); // Adding custom text extractor to the index settings

const index = new groupdocs.search.Index(indexFolder, settings, true); // Creating or loading an index
index.add(documentsFolder); // Indexing documents from the specified folder

const query1 = 'objection';
const result1 = index.search(query1); // Searching
const query2 = 'log';
const result2 = index.search(query2); // Searching
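The file-reading logic inside `getFields` can be tried out on its own, without the Java bridge. The following is a minimal sketch rather than part of the GroupDocs.Search API: `extractLogFields` is a hypothetical helper, and plain `{ name, value }` objects stand in for `DocumentField` instances.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper (not a GroupDocs.Search API): mirrors the body of getFields
// but returns plain objects instead of DocumentField instances.
function extractLogFields(filePath) {
  const fileName = path.resolve(filePath); // absolute path, as in the example
  const contents = fs.readFileSync(filePath, 'utf8');
  return [
    { name: 'FileName', value: fileName },
    { name: 'Content', value: contents },
  ];
}

// Usage: write a small sample log to a temporary folder and inspect the fields
const sampleDir = fs.mkdtempSync(path.join(os.tmpdir(), 'log-extractor-'));
const samplePath = path.join(sampleDir, 'sample.log');
fs.writeFileSync(samplePath, 'objection raised at startup\n', 'utf8');

const sampleFields = extractLogFields(samplePath);
console.log(sampleFields.map((field) => field.name).join(', ')); // prints "FileName, Content"
```

Checking the extraction step this way makes it easier to tell file-reading problems apart from indexing problems before the function is wrapped in a proxy.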
Note that custom extractors are not saved in an index, so they must be created and added each time the index is created or loaded. The same code can nevertheless be used both to create a new index and to open an existing one. When an existing index is opened, the custom extractors are taken from the index settings passed to the constructor, and the remaining index settings are loaded from disk.
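One way to make sure the extractors are always registered is to route every index creation or opening through a single function. The sketch below assumes a hypothetical helper named `openIndexWithExtractors`. It uses only the calls shown in the example, and the third constructor argument is copied from it unchanged. The `groupdocs` object is passed in as a parameter, so the helper makes no assumption about how the library was loaded.

```javascript
// Hypothetical helper (not a GroupDocs.Search API): builds fresh index settings,
// registers every given extractor, then calls the Index constructor.
function openIndexWithExtractors(groupdocs, indexFolder, extractors) {
  const settings = new groupdocs.search.IndexSettings();
  for (const extractor of extractors) {
    // Extractors are not stored in the index, so they are added on every call
    settings.getCustomExtractors().addItem(extractor);
  }
  // Same constructor call as in the example
  return new groupdocs.search.Index(indexFolder, settings, true);
}
```

With the extractor from the example, the call would look like `openIndexWithExtractors(groupdocs, indexFolder, [logExtractor])`.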
More resources
GitHub examples
You may easily run the code from documentation articles and see the features in action in our GitHub examples: