GroupDocs.Search for Java 20.6 Release Notes

Major Features

This release includes the following new features and improvements:

  • Improve formatting of text extracted from index

Full List of Issues Covering all Changes in this Release

SEARCHNET-2278    Improve formatting of text extracted from index      Improvement
SEARCHJAVA-132    Move AttributeChangeBatch class to common package    Breaking Change

Public API and Backward Incompatible Changes

Improve formatting of text extracted from index

This improvement allows you to choose between two alternatives:

  1. Raw text extraction mode: indexing is faster, but formatting quality is lost in some cases (mostly relevant for the PDF format).
  2. Raw text extraction mode disabled: text extracted during indexing is better formatted, but indexing is slower (mainly relevant for the PDF format).

By default, raw text extraction mode is used whenever possible.

Public API changes

Method boolean getUseRawTextExtraction() has been added to class.
Method void setUseRawTextExtraction(boolean) has been added to class.

Use cases

The following example demonstrates how to disable raw text extraction mode to improve formatting of extracted text:

String indexFolder = "c:\\MyIndex";
String documentFolder = "c:\\MyDocuments";

// Disabling raw text extraction mode to improve formatting of extracted text
IndexSettings settings = new IndexSettings();
settings.setUseRawTextExtraction(false);

// Creating an index
Index index = new Index(indexFolder, settings);

// Indexing documents in the document folder
index.add(documentFolder);
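
Since the two modes trade indexing speed against formatting quality, it can be convenient to pick the mode from configuration instead of hard-coding it. The following is a minimal sketch under stated assumptions: the class name ExtractionModeConfig and the system property name search.useRawTextExtraction are hypothetical and are not part of GroupDocs.Search. The returned value is intended to be passed to setUseRawTextExtraction(boolean), and it defaults to true to match the default behavior of the library.

```java
/** Hypothetical helper that decides which text extraction mode to request. */
public final class ExtractionModeConfig {
    // Hypothetical property name chosen for this sketch; not defined by GroupDocs.Search.
    public static final String PROPERTY = "search.useRawTextExtraction";

    private ExtractionModeConfig() {
    }

    /**
     * Returns true (raw mode, faster indexing) unless the property
     * is explicitly set to "false" (better formatting, slower indexing).
     */
    public static boolean useRawTextExtraction() {
        String value = System.getProperty(PROPERTY);
        return value == null || !value.trim().equalsIgnoreCase("false");
    }

    public static void main(String[] args) {
        System.out.println("useRawTextExtraction = " + useRawTextExtraction());
        // The result would then be handed to the settings object, for example:
        // settings.setUseRawTextExtraction(ExtractionModeConfig.useRawTextExtraction());
    }
}
```

Running the application with -Dsearch.useRawTextExtraction=false would then request the better-formatted, slower mode without a code change.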

Move AttributeChangeBatch class to common package

This change makes the grouping of classes into packages more consistent.

Public API changes

Class AttributeChangeBatch has been moved from to package.
Constructor AttributeChangeBatch() has been removed from the AttributeChangeBatch class.
Method AttributeChangeBatch create() has been added to the AttributeChangeBatch class.

Use cases

This example demonstrates how to add and remove attributes from indexed documents:

String indexFolder = "c:\\MyIndex";
String documentFolder = "c:\\MyDocuments";
// Creating an index
Index index = new Index(indexFolder);
// Indexing documents in a document folder
index.add(documentFolder);
// Creating an attribute change container object
AttributeChangeBatch batch = AttributeChangeBatch.create();
// Adding one attribute to all indexed documents
batch.addToAll("public");
// Removing one attribute from one indexed document
batch.remove("c:\\MyDocuments\\KeyPoints.doc", "public");
// Adding two attributes to one indexed document
batch.add("c:\\MyDocuments\\KeyPoints.doc", "main", "key");
// Applying attribute changes in the index
index.changeAttributes(batch);
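
To make the net effect of such a batch easier to see, the sketch below models it in memory. It is an illustration only and not the GroupDocs.Search implementation: AttributeBatchSketch, its applyTo method, and the second document Other.doc are hypothetical, and the sketch assumes that changes are applied in the order they were recorded. It also mirrors the new creation style, with a private constructor and a static create() method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;

/** In-memory stand-in for an attribute change batch; not the library class. */
public final class AttributeBatchSketch {
    // Each recorded change mutates a map of document path -> attribute set.
    private final List<Consumer<Map<String, Set<String>>>> changes = new ArrayList<>();

    // Private constructor: instances are obtained only through create().
    private AttributeBatchSketch() {
    }

    public static AttributeBatchSketch create() {
        return new AttributeBatchSketch();
    }

    /** Records adding the attributes to every document. */
    public void addToAll(String... attributes) {
        changes.add(docs -> docs.values().forEach(set -> set.addAll(Arrays.asList(attributes))));
    }

    /** Records adding the attributes to one document (ignored if the document is unknown). */
    public void add(String document, String... attributes) {
        changes.add(docs -> {
            Set<String> set = docs.get(document);
            if (set != null) {
                set.addAll(Arrays.asList(attributes));
            }
        });
    }

    /** Records removing the attributes from one document (ignored if the document is unknown). */
    public void remove(String document, String... attributes) {
        changes.add(docs -> {
            Set<String> set = docs.get(document);
            if (set != null) {
                set.removeAll(Arrays.asList(attributes));
            }
        });
    }

    /** Applies the recorded changes in recording order (an assumption of this sketch). */
    public void applyTo(Map<String, Set<String>> docs) {
        changes.forEach(change -> change.accept(docs));
    }

    /** Replays the same three changes as the example on two documents. */
    public static Map<String, Set<String>> demo() {
        Map<String, Set<String>> docs = new LinkedHashMap<>();
        docs.put("c:\\MyDocuments\\KeyPoints.doc", new TreeSet<>());
        docs.put("c:\\MyDocuments\\Other.doc", new TreeSet<>()); // hypothetical second document

        AttributeBatchSketch batch = AttributeBatchSketch.create();
        batch.addToAll("public");
        batch.remove("c:\\MyDocuments\\KeyPoints.doc", "public");
        batch.add("c:\\MyDocuments\\KeyPoints.doc", "main", "key");
        batch.applyTo(docs);
        return docs;
    }

    public static void main(String[] args) {
        demo().forEach((doc, attrs) -> System.out.println(doc + " -> " + attrs));
    }
}
```

Under these assumptions, KeyPoints.doc ends up with the attributes key and main, while every other document keeps only public.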