Reverse image search

Reverse image search is a search for images that are similar to a given reference image.

In a GroupDocs.Search index, reverse image search lets you find similar images in ZIP archives, various documents, and individual files. The search works by comparing the perceptual hash of the reference image with the hashes of the images in the index. The idea of a perceptual hash is that very similar images produce hashes that differ in only a few bits, while very different images produce hashes that differ in many bits.
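The bit comparison described above can be sketched in a few lines of Java. This is only an illustration, not the GroupDocs.Search implementation: the class PerceptualHashSketch and its methods are hypothetical, the hash values are made up, and a 32-bit hash is assumed here only because the largest difference accepted at the search stage is 32.

```java
// Illustrative sketch only; not part of the GroupDocs.Search API.
final class PerceptualHashSketch {

    // Counts the bit positions in which two 32-bit hashes differ (Hamming distance).
    static int differingBits(int hashA, int hashB) {
        return Integer.bitCount(hashA ^ hashB);
    }

    // Returns true when the hashes differ in at most maxDifferences bits.
    static boolean isSimilar(int hashA, int hashB, int maxDifferences) {
        return differingBits(hashA, hashB) <= maxDifferences;
    }

    public static void main(String[] args) {
        int reference = 0xF0AA550F;                  // made-up hash of the reference image
        int nearDuplicate = reference ^ 0x00000101;  // same hash with two bits flipped
        int veryDifferent = ~reference;              // every bit flipped

        System.out.println(differingBits(reference, nearDuplicate)); // prints 2
        System.out.println(differingBits(reference, veryDifferent)); // prints 32
        System.out.println(isSimilar(reference, nearDuplicate, 10)); // prints true
        System.out.println(isSimilar(reference, veryDifferent, 10)); // prints false
    }
}
```

A small threshold accepts only near-duplicates, while a threshold equal to the hash length accepts every image.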

In the GroupDocs.Search engine, reverse image search, like full-text search, consists of two stages: the indexing stage and the actual search stage.

To make images searchable, you must enable at least one of the following image indexing options at the indexing stage:

  • setEnabledForSeparateImages is for indexing images in separate files.
  • setEnabledForEmbeddedImages is for indexing images embedded in various documents.
  • setEnabledForContainerItemImages is for indexing images that are elements of container documents, such as ZIP archives and OST/PST storages.

For more information, see the Indexing options page.

The following options are available during the search stage:

  • The setHashDifferences method sets the maximum number of bits by which the hash of a found image may differ from the hash of the reference image. The value ranges from 0 to 32.
  • The setMaxResultCount method sets the maximum number of found images.
  • The setSearchDocumentFilter method sets the filter for found documents.

For more information, see the Image search options page.
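The effect of the bit-difference threshold and the result limit can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the way GroupDocs.Search works internally: the class ImageSearchOptionsSketch, the method findSimilar, and the hash values are hypothetical, hashes are assumed to be 32-bit integers, and the document filter is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; not part of the GroupDocs.Search API.
final class ImageSearchOptionsSketch {

    // Returns the positions of indexed hashes that differ from the reference
    // hash in at most maxDifferences bits, stopping after maxResultCount matches.
    static List<Integer> findSimilar(int reference, int[] indexedHashes,
                                     int maxDifferences, int maxResultCount) {
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i < indexedHashes.length && matches.size() < maxResultCount; i++) {
            if (Integer.bitCount(reference ^ indexedHashes[i]) <= maxDifferences) {
                matches.add(i);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up hashes that differ from the reference 0x0 in 0, 1, 3, 8, and 2 bits.
        int[] indexed = {0x0, 0x1, 0x7, 0xFF, 0x3};

        System.out.println(findSimilar(0x0, indexed, 2, 10)); // prints [0, 1, 4]
        System.out.println(findSimilar(0x0, indexed, 2, 2));  // prints [0, 1]
        System.out.println(findSimilar(0x0, indexed, 0, 10)); // prints [0]
    }
}
```

In this model, a threshold of 0 keeps only images whose hash is identical to the reference hash, and raising the threshold admits progressively less similar images.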

The following code example demonstrates all stages of the reverse image search:

String indexFolder = "c:\\MyIndex";
String documentsFolder = "c:\\MyDocuments";

// Creating an index
Index index = new Index(indexFolder);

// Setting the image indexing options
IndexingOptions indexingOptions = new IndexingOptions();
indexingOptions.getImageIndexingOptions().setEnabledForContainerItemImages(true);
indexingOptions.getImageIndexingOptions().setEnabledForEmbeddedImages(true);
indexingOptions.getImageIndexingOptions().setEnabledForSeparateImages(true);

// Indexing documents in a document folder
index.add(documentsFolder, indexingOptions);

// Setting the image search options
ImageSearchOptions imageSearchOptions = new ImageSearchOptions();
imageSearchOptions.setHashDifferences(10);
imageSearchOptions.setMaxResultCount(10000);
imageSearchOptions.setSearchDocumentFilter(SearchDocumentFilter.createFileExtension(".zip", ".png", ".jpg"));

// Creating a reference image for search
SearchImage searchImage = SearchImage.create("c:\\MyDocuments\\image.png");

// Searching in the index
ImageSearchResult result = index.search(searchImage, imageSearchOptions);

// Printing the number of found images and the document that contains each one
System.out.println("Images found: " + result.getImageCount());
for (int i = 0; i < result.getImageCount(); i++) {
    FoundImageFrame image = result.getFoundImage(i);
    System.out.println(image.getDocumentInfo().toString());
}

More resources

GitHub examples

You can run the code from the documentation articles and see the features in action in our GitHub examples.

Free online document search App

Along with the full-featured library, we provide simple but powerful free apps.

You are welcome to search your PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX, and other documents with our Free Online Document Search App.