Getting document text in a search network

To obtain the extracted text of documents indexed in the search network, use the getDocumentText method of the Searcher class.

The first parameter of the getDocumentText method is the document, represented by the NetworkDocumentInfo class. The indexed documents in the search network can be listed, shard by shard, with the getIndexedDocuments method of the Searcher class; the items nested inside a document can be listed with the getIndexedDocumentItems method.

The second parameter of the getDocumentText method is the output adapter that receives the result. This operation uses the same output adapter classes as the corresponding single-index operations. For information about output adapters, see the article Output adapters.

The following code example collects the documents indexed in the search network, retrieves the text of the first document whose description contains a given path fragment, and prints the result to the console.

// Also requires: import java.util.ArrayList; import java.util.Arrays;
// Assumed to be defined elsewhere:
//   node           - an already configured search network node
//   containsInPath - a String with a fragment of the target document's path

Searcher searcher = node.getSearcher();

// Collect the documents of every shard, together with their nested items
ArrayList<NetworkDocumentInfo> documents = new ArrayList<>();
int[] shardIndices = node.getShardIndices();
for (int shardIndex : shardIndices) {
    NetworkDocumentInfo[] infos = searcher.getIndexedDocuments(shardIndex);
    documents.addAll(Arrays.asList(infos));
    for (NetworkDocumentInfo info : infos) {
        // Items contained in the document (for example, entries of an archive)
        NetworkDocumentInfo[] items = searcher.getIndexedDocumentItems(info);
        documents.addAll(Arrays.asList(items));
    }
}

// Print the text of the first document whose description contains the fragment
for (NetworkDocumentInfo document : documents) {
    String description = document.getDocumentInfo().toString();
    if (description.contains(containsInPath)) {
        System.out.println();
        System.out.println(description);

        // The adapter accumulates the extracted text as a plain-text string
        StringOutputAdapter outputAdapter = new StringOutputAdapter(OutputFormat.PlainText);
        searcher.getDocumentText(document, outputAdapter);

        System.out.println(outputAdapter.getResult());
        break; // stop after the first match
    }
}
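
The collect-and-find pattern used in the example above can be factored into two small generic helpers. The following is only a sketch: the DocumentLookup class and its collect and findFirst methods are hypothetical names that are not part of the library, and plain strings stand in for NetworkDocumentInfo objects so that the code runs without a search network.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.IntFunction;

// Hypothetical helper class; not part of the search library.
class DocumentLookup {

    // For each shard, adds its top-level entries and then the nested items
    // of each entry, in the same order as the loop in the example above.
    static <T> List<T> collect(int[] shardIndices,
                               IntFunction<T[]> topLevel,
                               Function<T, T[]> nested) {
        List<T> all = new ArrayList<>();
        for (int shardIndex : shardIndices) {
            T[] infos = topLevel.apply(shardIndex);
            all.addAll(Arrays.asList(infos));
            for (T info : infos) {
                all.addAll(Arrays.asList(nested.apply(info)));
            }
        }
        return all;
    }

    // Returns the first entry whose description contains the fragment,
    // or an empty Optional when nothing matches.
    static <T> Optional<T> findFirst(List<T> documents,
                                     Function<T, String> describe,
                                     String fragment) {
        for (T document : documents) {
            if (describe.apply(document).contains(fragment)) {
                return Optional.of(document);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Strings stand in for document descriptors (an assumption made
        // purely for illustration).
        int[] shards = {0, 1};
        List<String> documents = collect(
                shards,
                shard -> new String[] {"shard" + shard + "/archive.zip"},
                path -> new String[] {path + "/readme.txt"});

        Optional<String> match = findFirst(documents, s -> s, "readme");
        System.out.println(match.orElse("no match"));
    }
}
```

With the library types, the same helpers would presumably be called with searcher::getIndexedDocuments and searcher::getIndexedDocumentItems as the two lookup functions, and with d -> d.getDocumentInfo().toString() as the description function. Returning an Optional makes the "no matching document" case explicit instead of silently printing nothing.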
