Extracting data from documents is the most resource-intensive operation of the indexing process. If memory runs out during extraction, an exception is thrown that interrupts indexing of the entire group of documents. To prevent this, data can be extracted in a separate process. In that case, running out of memory while extracting data from one document causes indexing to fail only for that document; indexing of the remaining documents in the group continues.
Extracting data in a separate process also solves another problem: indexing of a document can be interrupted immediately once the time limit allocated for indexing one document is exceeded.
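The isolation idea described above can be illustrated with a minimal sketch built only on the standard .NET System.Diagnostics.Process class. This is not the GroupDocs.Search implementation; the IsolatedExtraction class, the ExtractionOutcome enum, and the ExtractionWorker.exe worker name are hypothetical and serve only to show how a crash or a timeout in a child process can be confined to a single document.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical result type: how a single extraction attempt ended.
public enum ExtractionOutcome { Succeeded, Failed, TimedOut }

public static class IsolatedExtraction
{
    // Runs one extraction in a child process and reports how it ended.
    // workerPath and workerArguments are placeholders for a worker executable.
    public static ExtractionOutcome Run(string workerPath, string workerArguments, TimeSpan timeout)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = workerPath,
            Arguments = workerArguments,
            UseShellExecute = false,
            CreateNoWindow = true,
        };

        using (Process worker = Process.Start(startInfo))
        {
            if (worker == null)
            {
                return ExtractionOutcome.Failed;
            }

            // Wait up to the time limit; if it is exceeded, stop the worker immediately.
            if (!worker.WaitForExit((int)timeout.TotalMilliseconds))
            {
                worker.Kill();
                worker.WaitForExit();
                return ExtractionOutcome.TimedOut;
            }

            // A non-zero exit code (for example, after an out-of-memory crash)
            // affects only this document, not the calling process.
            return worker.ExitCode == 0 ? ExtractionOutcome.Succeeded : ExtractionOutcome.Failed;
        }
    }

    // Processes every document; a failure or timeout skips only that document.
    public static List<string> RunAll(IEnumerable<string> documents, string workerPath, TimeSpan timeout)
    {
        var skipped = new List<string>();
        foreach (string document in documents)
        {
            ExtractionOutcome outcome = Run(workerPath, "\"" + document + "\"", timeout);
            if (outcome != ExtractionOutcome.Succeeded)
            {
                skipped.Add(document);
            }
        }
        return skipped;
    }
}
```

In this sketch, a call such as IsolatedExtraction.RunAll(files, "ExtractionWorker.exe", TimeSpan.FromMinutes(1)) would return the documents that were skipped while the rest are processed normally. With GroupDocs.Search, the library itself manages the worker process, so only the indexing options and the console application project need to be provided.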
To extract data from documents in a separate process, you need to configure the appropriate indexing options and create a simple console application project that the GroupDocs.Search library runs in a separate process as a data extraction service. Like the main project that uses the library, the console application project must have the GroupDocs.Search library as a dependency. The console application project code is presented in the following listing.
An example of setting indexing options for data extraction in a separate process is shown in the listing below.
C#
string indexFolder = @"c:\MyIndex\";
string documentFolder = @"c:\MyDocuments\";

// Getting the path to the console application location
string assemblyPath = typeof(GroupDocs.Search.Extraction.Program).Assembly.Location;

// Creating an index in the specified folder
Index index = new Index(indexFolder);

// Setting indexing options for data extraction in a separate process
IndexingOptions options = new IndexingOptions();
options.SeparateProcessOptions.ExtractInSeparateProcess = true;
options.SeparateProcessOptions.AssemblyPath = assemblyPath;
options.SeparateProcessOptions.Timeout = new TimeSpan(0, 1, 0);

// Indexing documents from the specified folder
index.Add(documentFolder, options);