I recently implemented keyword search with field-level boosting at work. It was my first time building this kind of feature, and it took some effort to get right. If you need to build something similar, this post may help you.
NOTE:
Sitecore has a built-in feature for field-level boosting, but it is not supported until Sitecore 9.4, so this post implements boosting manually in code.
UPDATE (2020/2/5):
I made a library that generates an efficient keyword search query with field-level boosting. If you are interested, see the following link.
Problem
My first attempt looked like this:
public SearchResults<SearchResultItem> Search(string[] keywords)
{
    using (var context = index.CreateSearchContext())
    {
        // the "title" field contains all keywords. (boost: 10)
        var titlePred = PredicateBuilder.True<SearchResultItem>(); 
        foreach (var keyword in keywords) {
            titlePred = titlePred.And(item => item["title"].Contains(keyword).Boost(10));
        }
        // OR the "body" field contains all keywords. (boost: 5)
        var bodyPred = PredicateBuilder.True<SearchResultItem>();  
        foreach (var keyword in keywords) {
            bodyPred = bodyPred.And(item => item["body"].Contains(keyword).Boost(5));
        }
        var keywordSearchPred = PredicateBuilder
            .False<SearchResultItem>()
            .Or(titlePred)
            .Or(bodyPred);
        return context.GetQueryable<SearchResultItem>().Where(keywordSearchPred).GetResults();
    }
}
This worked well at first, but I noticed that it fails when the keywords are spread across multiple fields.
Here is an example of an item that should match but does not:
- Keywords: Sitecore,Experience,Platform
| Field | Value | 
|---|---|
| title | What means Sitecore "XP"? | 
| body | XP stands for eXperience Platform. | 
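Neither field contains all three keywords on its own, so both predicates are false and the item is not returned. A simple fix is to enumerate every permutation with repetition of fields over the keywords, that is, every way of assigning one field to each keyword, and test each assignment. With two fields and three keywords, this generates the following query: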
   (item["title"].Contains("Sitecore").Boost(10) && item["title"].Contains("Experience").Boost(10) && item["title"].Contains("Platform").Boost(10))
|| (item["title"].Contains("Sitecore").Boost(10) && item["title"].Contains("Experience").Boost(10) && item["body"].Contains("Platform").Boost(5))
|| (item["title"].Contains("Sitecore").Boost(10) && item["body"].Contains("Experience").Boost(5) && item["title"].Contains("Platform").Boost(10))
|| (item["title"].Contains("Sitecore").Boost(10) && item["body"].Contains("Experience").Boost(5) && item["body"].Contains("Platform").Boost(5))
|| (item["body"].Contains("Sitecore").Boost(5) && item["title"].Contains("Experience").Boost(10) && item["title"].Contains("Platform").Boost(10))
|| (item["body"].Contains("Sitecore").Boost(5) && item["title"].Contains("Experience").Boost(10) && item["body"].Contains("Platform").Boost(5))
|| (item["body"].Contains("Sitecore").Boost(5) && item["body"].Contains("Experience").Boost(5) && item["title"].Contains("Platform").Boost(10))
|| (item["body"].Contains("Sitecore").Boost(5) && item["body"].Contains("Experience").Boost(5) && item["body"].Contains("Platform").Boost(5))
Too long! The number of Contains conditions is (number of fields) ^ (number of keywords) × (number of keywords).
With 5 target fields and 3 keywords, that is 5^3 × 3 = 375 conditions. So in many cases, the query ends up exceeding the request size limit.
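The growth is easy to check by enumerating the branches directly. The following is a minimal sketch in plain C# with no Sitecore dependency; `NaiveQuerySize`, `FieldAssignments`, and `CountConditions` are hypothetical names introduced here for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of Sitecore): models the size of the
// enumerated query, where each OR branch assigns one field to each keyword.
public static class NaiveQuerySize
{
    // Returns one field assignment per OR branch (permutations with repetition).
    public static IEnumerable<string[]> FieldAssignments(IReadOnlyList<string> fields, int keywordCount)
    {
        IEnumerable<string[]> assignments = new[] { Array.Empty<string>() };
        for (var i = 0; i < keywordCount; i++)
        {
            // Extend every partial assignment with every possible field.
            assignments = assignments
                .SelectMany(prefix => fields, (prefix, field) => prefix.Concat(new[] { field }).ToArray())
                .ToList();
        }
        return assignments;
    }

    // Each branch holds one Contains condition per keyword.
    public static int CountConditions(IReadOnlyList<string> fields, int keywordCount) =>
        FieldAssignments(fields, keywordCount).Sum(branch => branch.Length);
}
```

With the two fields and three keywords from the example, `NaiveQuerySize.CountConditions(new[] { "title", "body" }, 3)` gives 24 (8 branches of 3 conditions each); with five fields it gives 375.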
Solution
Now, to solve the problem, divide the query into two parts: ① checking whether all keywords are contained, and ② applying boost values to the results.
For part ①, create a "contents" field that holds the concatenated values of all the target fields. Using this field, the query can be written as follows:
item["contents"].Contains("Sitecore") && item["contents"].Contains("Experience") && item["contents"].Contains("Platform")
It's very simple.
Part ② is composed of all pairs of fields and keywords. Boost each field that contains a keyword, and combine all the boosting conditions with OR.
   item["title"].Contains("Sitecore").Boost(10)
|| item["title"].Contains("Experience").Boost(10)
|| item["title"].Contains("Platform").Boost(10)
|| item["body"].Contains("Sitecore").Boost(5)
|| item["body"].Contains("Experience").Boost(5)
|| item["body"].Contains("Platform").Boost(5)
Finally, we get the whole query by combining ① and ② with AND. This query has far fewer conditions than the previous one.
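To put a rough number on the difference, here is a count under the assumption that every field/keyword pair contributes exactly one Contains condition. The symbols $f$ (number of target fields) and $k$ (number of keywords) are introduced here for illustration.

```latex
\begin{aligned}
N_{\text{enumerated}} &= k \cdot f^{k} \\
N_{\text{split}} &= \underbrace{k}_{\text{part 1}} + \underbrace{f \cdot k}_{\text{part 2}} = k\,(f + 1) \\
f = 5,\ k = 3 &: \quad N_{\text{enumerated}} = 3 \cdot 5^{3} = 375,
\qquad N_{\text{split}} = 3 \cdot 6 = 18
\end{aligned}
```

The enumerated query grows exponentially with the number of keywords, while the split query grows only linearly.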
This query returns the right results. When part ① is true, every keyword appears in at least one target field, so at least one condition in part ② is also true, and the whole query is true. When ① is false, the whole query is naturally false.
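The logic of the two parts can be tried out on the example item from the Problem section. This is only an in-memory sketch in plain C#: it does not call Sitecore or Solr, `TwoPartQueryModel` and its methods are hypothetical names, and the plain sum of boost values is an assumption standing in for the search engine's real scoring.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory model of the two-part query (no Sitecore or Solr involved).
public static class TwoPartQueryModel
{
    private static bool ContainsIgnoreCase(string text, string keyword) =>
        text.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0;

    // Part ①: every keyword appears somewhere in the concatenated "contents" value.
    public static bool Matches(
        IReadOnlyDictionary<string, string> fields,
        IReadOnlyCollection<string> keywords)
    {
        var contents = string.Join(" ", fields.Values);
        return keywords.All(keyword => ContainsIgnoreCase(contents, keyword));
    }

    // Part ②: add the field's boost once for each keyword that field contains.
    // Assumption: a plain sum replaces the engine's actual relevance score.
    public static int BoostTotal(
        IReadOnlyDictionary<string, string> fields,
        IReadOnlyCollection<string> keywords,
        IReadOnlyDictionary<string, int> boosts)
    {
        var total = 0;
        foreach (var boost in boosts)
        {
            if (!fields.TryGetValue(boost.Key, out var value)) continue;
            total += boost.Value * keywords.Count(keyword => ContainsIgnoreCase(value, keyword));
        }
        return total;
    }
}
```

For the example item (title `What means Sitecore "XP"?`, body `XP stands for eXperience Platform.`) and the keywords Sitecore, Experience, Platform, `Matches` returns true even though no single field holds all three keywords. With boosts of 10 for title and 5 for body, `BoostTotal` returns 20: one hit in the title and two in the body.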
Implementation
First, we need to create the "contents" field used in part ①. This field can be created with a computed index field in Sitecore.
Here is sample code:
public class ContentsField : IComputedIndexField
{
    public string FieldName { get; set; }
    public string ReturnType { get; set; }
    public object ComputeFieldValue(IIndexable indexable)
    {
        if (!(indexable is SitecoreIndexableItem item))
        {
            return null;
        }
        // The fields targeted by keyword search
        var targetFields = new[] { "Title", "Body", "Summary", "Category", "Author" };
        // Concatenate the values of all target fields
        return string.Join(" ", targetFields.Select(fieldName => item.Item[fieldName]));
    }
}
This class has to be registered in the configuration. Here is a patch file that registers it:
<?xml version="1.0" encoding="utf-8"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/" xmlns:search="http://www.sitecore.net/xmlconfig/search/">
  <sitecore role:require="Standalone or ContentManagement or ContentDelivery" search:require="solr">
    <contentSearch>
      <indexConfigurations>
        <defaultSolrIndexConfiguration type="Sitecore.ContentSearch.SolrProvider.SolrIndexConfiguration, Sitecore.ContentSearch.SolrProvider">
          <documentOptions type="Sitecore.ContentSearch.SolrProvider.SolrDocumentBuilderOptions, Sitecore.ContentSearch.SolrProvider">
            <fields hint="raw:AddComputedIndexField">
              <!-- Add contents field -->
              <field fieldName="contents" returnType="string" type="NamespaceTo.ContentsField, Assembly"/>
            </fields>
          </documentOptions>
        </defaultSolrIndexConfiguration>
      </indexConfigurations>
    </contentSearch>
  </sitecore>
</configuration>
Then, run "Populate Solr Managed Schema" and "Rebuild index" in Sitecore. The "contents" field will be generated in sitecore_master_index (and in the web and core indexes).
The main program of the keyword search can be written as follows:
public class KeywordSearchApi
{
    // The target fields and their boost values for keyword search (it is better to load these from an item or configuration)
    protected static IReadOnlyDictionary<string, int> TargetFields = new Dictionary<string, int>()
    {
        ["title"] = 10,
        ["body"] = 8,
        ["summary"] = 6,
        ["category"] = 2,
        ["author"] = 1
    };
    public static SearchResults<SearchResultItem> Search(ICollection<string> keywords)
    {
        var index = ContentSearchManager.GetIndex("sitecore_master_index");
        using (var context = index.CreateSearchContext())
        {
            // The predicate for ①
            var matchPred = keywords
                .Aggregate(
                    PredicateBuilder.True<SearchResultItem>(),
                    (acc, keyword) => acc.And(item => item["contents"].Contains(keyword))); // without boosting
            // The predicate for ②
            var boostPred = TargetFields.Keys
                // Make all pairs of field/keyword with boosting value
                .SelectMany(_ => keywords, (field, keyword) => (field, keyword, boost: TargetFields[field]))
                .Aggregate(
                    PredicateBuilder.Create<SearchResultItem>(item => item.Name.MatchWildcard("*").Boost(0)), // always true
                    (acc, pair) => acc.Or(item => item[pair.field].Contains(pair.keyword).Boost(pair.boost))); // with boosting
            return context.GetQueryable<SearchResultItem>()
                .Filter(matchPred)
                .Where(boostPred) // Use 'Where' instead because 'Filter' ignores the boosting values.
                .OrderByDescending(item => item["score"])
                .GetResults();
        }
    }
}
If you use Sitecore PowerShell Extensions, you can easily test this method with the following script.
$keywords = "Sitecore","Experience","Platform"
[NamespaceTo.KeywordSearchApi]::Search($keywords)
Conclusion
This solution is only one of many possible approaches. If you have a smarter idea, let me know in the comments or in a post of your own.
Happy searching!
 



 
    