
GroupDocs


Implementing Wildcard Search Functionality in your Java Applications

Indexing is a term closely associated with search engines nowadays; by definition, however, it refers to organizing data according to a specific schema. Put another way, it is the process of making data easier to present and more convenient to access.

Arranging data in an index saves time and effort during search and retrieval. For perspective, consider a book on space science that runs to hundreds of pages. Without an index, you would have to search through the whole book every time you wanted a particular piece of information. With an index, you can open the exact page that holds what you are looking for.
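
The book analogy can be sketched in a few lines of Java. This is a minimal illustration and not part of any library: the class name BookIndexDemo, the method buildIndex, and the sample page texts are all hypothetical. It builds a map from each word to the page numbers where the word occurs, so a lookup no longer needs to scan every page.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class: a tiny "back of the book" index.
class BookIndexDemo {

    // Maps every lower-cased word to the sorted set of 1-based page numbers containing it.
    static Map<String, SortedSet<Integer>> buildIndex(List<String> pages) {
        Map<String, SortedSet<Integer>> index = new HashMap<>();
        for (int i = 0; i < pages.size(); i++) {
            int pageNumber = i + 1;
            // Split on runs of non-word characters; skip empty tokens.
            for (String word : pages.get(i).toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    index.computeIfAbsent(word, k -> new TreeSet<>()).add(pageNumber);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Made-up sample pages, for illustration only.
        List<String> pages = Arrays.asList(
                "The Sun is a star",
                "Mars is a planet",
                "A star can collapse into a black hole");
        Map<String, SortedSet<Integer>> index = buildIndex(pages);
        System.out.println(index.get("star")); // prints [1, 3]
    }
}
```

Building the map costs one pass over all pages, but every later lookup is a single map access, which is the trade-off the book index makes as well.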

As a programmer, you may come across a business requirement to index the data in certain types of documents. That calls for a schema that lets you not only index information seamlessly but also run search operations where required. Add the need to do all of this on the Java platform, and you need an API that offers:

  • Support for multi-format documents
  • Ability to create multiple indices
  • Support for several types of search

GroupDocs.Search for Java is an indexing API that meets these requirements with a refined feature set and a flexible structure. It lets Java developers create and manage multiple indices and run various kinds of search queries, such as simple, boolean, regular expression (regex), or fuzzy search.

You can use blended characters with this Java API, which let you treat characters such as a hyphen as both valid letters and separators when indexing. Wildcard search is another very useful feature of this API. Two forms of wildcard are available: a single arbitrary character (?) and a range of arbitrary characters (for example, ?(2~4) for two to four characters).

The following code snippet shows how to perform a wildcard search using a text query:
// Creating index
Index index = new Index(Utilities.INDEX_PATH, true);
// Adding documents to index
index.addToIndex(Utilities.DOCUMENTS_PATH, true);
// Searching for documents that contain 'affect' or 'effect' together with 'principal', 'principle', 'principles', or 'principally'
SearchResults results1 = index.search("?ffect & princip?(2~4)");
// Searching with a single query for phrases 'assure equal opportunities', 'ensure equal opportunities', and 'sure equal opportunities'
SearchResults results2 = index.search("\"?(0~2)sure equal opportunities\"");
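
To see which words the two wildcard forms in these queries accept, the sketch below translates a pattern such as ?ffect or princip?(2~4) into a java.util.regex expression. It only illustrates the matching semantics described in the comments above and is not how GroupDocs.Search works internally: the class WildcardDemo and its methods are hypothetical, and it assumes that a wildcard stands for any single character and that matching is case-sensitive.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: wildcard pattern -> regular expression.
class WildcardDemo {

    // Recognizes a range wildcard "?(n~m)" first, then a lone "?" (exactly one character).
    private static final Pattern TOKEN = Pattern.compile("\\?\\((\\d+)~(\\d+)\\)|\\?");

    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        Matcher m = TOKEN.matcher(wildcard);
        int last = 0;
        while (m.find()) {
            // Literal text between wildcards is quoted so it matches verbatim.
            if (m.start() > last) {
                regex.append(Pattern.quote(wildcard.substring(last, m.start())));
            }
            if (m.group(1) != null) {
                // Range wildcard: between n and m arbitrary characters.
                regex.append(".{").append(m.group(1)).append(',').append(m.group(2)).append('}');
            } else {
                // Single-character wildcard.
                regex.append('.');
            }
            last = m.end();
        }
        if (last < wildcard.length()) {
            regex.append(Pattern.quote(wildcard.substring(last)));
        }
        return Pattern.compile(regex.toString());
    }

    // True when the whole word matches the wildcard pattern.
    static boolean matches(String wildcard, String word) {
        return toRegex(wildcard).matcher(word).matches();
    }

    public static void main(String[] args) {
        String[] patterns = {"?ffect", "princip?(2~4)", "?(0~2)sure"};
        String[] words = {"affect", "effect", "principal", "principally", "sure", "assure", "reassure"};
        for (String p : patterns) {
            for (String w : words) {
                if (matches(p, w)) {
                    System.out.println(p + " matches " + w);
                }
            }
        }
    }
}
```

Under these assumptions, ?(0~2)sure accepts 'sure', 'assure', and 'ensure' but rejects 'reassure', because four characters precede 'sure' there.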

To run a wildcard search with query objects instead, refer to the code snippet below:
// Creating index
Index index = new Index(Utilities.INDEX_PATH, true);
// Adding documents to index
index.addToIndex(Utilities.DOCUMENTS_PATH, true);
// Constructing query 1
// Word 1 in the query is the pattern '?ffect' for wildcard search
WordPattern pattern11 = new WordPattern();
pattern11.appendOneCharacterWildcard();
pattern11.appendString("ffect");
SearchQuery subquery11 = SearchQuery.createWordPatternQuery(pattern11);
// Word 2 in the query is the pattern 'princip?(2~4)' for wildcard search
WordPattern pattern12 = new WordPattern();
pattern12.appendString("princip");
pattern12.appendWildcard(2, 4);
SearchQuery subquery12 = SearchQuery.createWordPatternQuery(pattern12);
// Creating boolean search query
SearchQuery query1 = SearchQuery.createAndQuery(subquery11, subquery12);
// Searching with query 1
SearchResults results1 = index.search(query1, new SearchParameters());
// Constructing query 2
// Word 1 in the phrase is the pattern '?(0~2)sure' for wildcard search
WordPattern pattern21 = new WordPattern();
pattern21.appendWildcard(0, 2);
pattern21.appendString("sure");
SearchQuery subquery21 = SearchQuery.createWordPatternQuery(pattern21);
// Word 2 in the phrase is searched with different word forms ('equal', 'equals', 'equally', etc.)
SearchQuery subquery22 = SearchQuery.createWordQuery("equal");
subquery22.setSearchParameters(new SearchParameters());
subquery22.getSearchParameters().setUseWordFormsSearch(true);
// Word 3 in the phrase is searched with fuzzy search, allowing a maximum of 2 differences
SearchQuery subquery23 = SearchQuery.createWordQuery("opportunities");
subquery23.setSearchParameters(new SearchParameters());
subquery23.getSearchParameters().getFuzzySearch().setEnabled(true);
subquery23.getSearchParameters().getFuzzySearch().setFuzzyAlgorithm(new TableDiscreteFunction(2));
// Creating phrase search query
SearchQuery query2 = SearchQuery.createPhraseSearchQuery(subquery21, subquery22, subquery23);
// Searching with query 2
SearchResults results2 = index.search(query2, new SearchParameters());
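
The fuzzy step for 'opportunities' in query 2 allows at most 2 differences. One common way to count differences between two words is the Levenshtein edit distance, sketched below. Whether GroupDocs.Search counts differences exactly this way is an assumption that the snippet above does not confirm; the class EditDistanceDemo and its methods are hypothetical and only illustrate the idea.

```java
// Hypothetical demo class: counts differences between words as Levenshtein edit distance.
class EditDistanceDemo {

    // Minimum number of single-character insertions, deletions, and substitutions
    // needed to turn a into b (two-row dynamic programming).
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True when the two words differ by at most maxDifferences edits.
    static boolean withinDifferences(String a, String b, int maxDifferences) {
        return distance(a, b) <= maxDifferences;
    }

    public static void main(String[] args) {
        System.out.println(distance("opportunities", "oportunities")); // prints 1
        System.out.println(distance("opportunities", "oportunites"));  // prints 2
        System.out.println(withinDifferences("opportunities", "oportunites", 2)); // prints true
    }
}
```

With a limit of 2, a misspelling such as 'oportunites' would still count as a match under this measure, while a word three edits away would not.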

Check all available features yourself today – http://bit.ly/2TmhZRO
