DEV Community

Cover image for Solve Elasticsearch complex sort using script
lightkun
lightkun

Posted on • Updated on

Solve Elasticsearch complex sort using script

Overview

One day, I had to write program of complex sort in elasticsearch.
At first, I tries to operate score field but I noticed that it didn't work and it cannot explain what I want to do.
Therefore, I used script to solve this question. This article is a report of this solution.

Environment

※use docker container

  • Elasticsearch version 7.1.0
  • Kibana version 7.1.0

Background

I thought simple example for you.
What do you choose cloud service? AWS? Azure?

To take an example, 4 companies using cloud service.

  • A company

    • main cloud servie : AWS
    • sub cloud service: GCP, Azure
  • B company

    • main : GCP
    • sub : AWS
  • C company

    • main : GCP
    • sub : AWS, Azure
  • D company

    • main : Azure
    • sub : GCP, AWS
company name main sub
A AWS GCP, Azure
B GCP AWS
C GCP AWS, Azure
D Azure GCP, AWS

If I input "Azure", I expect hitting company names using Azure. To sum up, A, C and D.
Additionally, I want to add 1 search option of sort. If input value is equal to main cloud service, it's on top in search result. On the contrary, if input value is equal to sub cloud service, it's on under.
Finally, search result I expect is D, A and C.

input value expect result
Azure D, A, C
AWS A,B,C,D
GCP B,C,A,D

※If hitting column is same, random order.

  • Azure : (sub)A and C
  • AWS : (sub)B, C and D
  • GCP : (main)B and C. (sub)A and D

In the case of database(MySQL)

It was difficult for me to think elasitcsearch code, so that I decided to think about using database.

※use docker container

  • mysql version 5.7

Preparation

①Create table.

CREATE TABLE cloud_service (
  id INT PRIMARY KEY,
  company_name VARCHAR(50),
  main_name VARCHAR(30)
);

CREATE TABLE cloud_service_sub (
  id INT PRIMARY KEY,
  cloud_service_main_id INT,
  sub_name VARCHAR(30)
);
Enter fullscreen mode Exit fullscreen mode

②Insert table data.

INSERT INTO cloud_service (id, company_name, main_name) VALUES
  (1, 'A', 'AWS'),
  (2, 'B', 'GCP'),
  (3, 'C', 'GCP'),
  (4, 'D', 'Azure');

INSERT INTO cloud_service_sub VALUES
  (1, 1, 'GCP',
  (2, 1, 'Azure',
  (3, 2, 'AWS',
  (4, 3, 'AWS',
  (5, 3, 'Azure',
  (6, 4, 'GCP',
  (7, 4, 'AWS');
Enter fullscreen mode Exit fullscreen mode

③Check table data.

SELECT * FROM cloud_service;

+----+--------------+-----------+
| id | company_name | main_name |
+----+--------------+-----------+
|  1 | A            | AWS       |
|  2 | B            | GCP       |
|  3 | C            | GCP       |
|  4 | D            | Azure     |
+----+--------------+-----------+
4 rows in set (0.00 sec)

SELECT * FROM cloud_service_sub;

+----+-----------------------+----------+
| id | cloud_service_main_id | sub_name |
+----+-----------------------+----------+
|  1 |                     1 | GCP      |
|  2 |                     1 | Azure    |
|  3 |                     2 | AWS      |
|  4 |                     3 | AWS      |
|  5 |                     3 | Azure    |
|  6 |                     4 | GCP      |
|  7 |                     4 | AWS      |
+----+-----------------------+----------+
7 rows in set (0.00 sec)
Enter fullscreen mode Exit fullscreen mode

sort SQL sentence

When I ran the following SQL, I got expected result.
"ORDER BY CASE ~" sentence is an extremely important point.

When input value is equal to main.main_name, this record sort no is 0. When input value isn't equal to main.main_name, this record sort no is 1.
Then, By using "ASC", main → sub.

SELECT DISTINCT company_name, main.main_name
FROM cloud_service main 
LEFT JOIN cloud_service_sub sub ON main.id = sub.cloud_service_main_id
WHERE main.main_name = "Azure" OR sub.sub_name = "Azure"
ORDER BY 
CASE main.main_name 
WHEN "Azure" THEN 0 
ELSE 1 END 
ASC;

+--------------+-----------+
| company_name | main_name |
+--------------+-----------+
| D            | Azure     |
| A            | AWS       |
| C            | GCP       |
+--------------+-----------+
3 rows in set (0.00 sec)
Enter fullscreen mode Exit fullscreen mode
SELECT DISTINCT company_name, main.main_name
FROM cloud_service main 
LEFT JOIN cloud_service_sub sub ON main.id = sub.cloud_service_main_id
WHERE main.main_name = "AWS" OR sub.sub_name = "AWS"
ORDER BY 
CASE main.main_name 
WHEN "AWS" THEN 0 
ELSE 1 END 
ASC;

+--------------+-----------+
| company_name | main_name |
+--------------+-----------+
| A            | AWS       |
| C            | GCP       |
| D            | Azure     |
| B            | GCP       |
+--------------+-----------+
4 rows in set (0.00 sec)
Enter fullscreen mode Exit fullscreen mode
SELECT DISTINCT company_name, main.main_name
FROM cloud_service main 
LEFT JOIN cloud_service_sub sub ON main.id = sub.cloud_service_main_id
WHERE main.main_name = "GCP" OR sub.sub_name = "GCP"
ORDER BY CASE main.main_name 
WHEN "GCP" THEN 0 
ELSE 1 END 
ASC;

+--------------+-----------+
| company_name | main_name |
+--------------+-----------+
| C            | GCP       |
| B            | GCP       |
| A            | AWS       |
| D            | Azure     |
+--------------+-----------+
4 rows in set (0.00 sec)
Enter fullscreen mode Exit fullscreen mode

In the case of Elasticsearch(operating kibana console)

Preparation

①create template

PUT /_template/my_template?pretty
{
  "index_patterns": "mytest",
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "company_name": {
        "type": "keyword"
      },
      "main_cloud_name": {
        "type": "keyword"
      },
      "sub_cloud_name": {
        "type": "keyword"
      }
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

②post data

POST /mytest/_doc
{
  "company_name" : "A",
  "main_cloud_name": "AWS",
  "sub_cloud_name": ["GCP","Azure"]
}

POST /mytest/_doc
{
  "company_name" : "B",
  "main_cloud_name": "GCP",
  "sub_cloud_name": ["AWS"]
}

POST /mytest/_doc
{
  "company_name" : "C",
  "main_cloud_name": "GCP",
  "sub_cloud_name": ["AWS", "Azure"]
}

POST /mytest/_doc
{
  "company_name" : "D",
  "main_cloud_name": "Azure",
  "sub_cloud_name": ["GCP", "AWS"]
}
Enter fullscreen mode Exit fullscreen mode

③check document

GET /mytest/_search?pretty
Enter fullscreen mode Exit fullscreen mode

sort Elasticsearch search API

When I ran the following search API, I got expected result.
"_script" sentence is an extremely important point.

"type" : This is the type using sort. I can choose "string" or "number".

"lang" : "painless" is original langrage of Elasticsearch and can use part of java API.(Sorry, I don't know this in detail. Maybe, "painless" is java base.)

"source" : I can write the script of "lang". The mechanism of script contents is the same as sort SQL sentence

GET /mytest/_search?pretty
{
  "query": { 
    "bool" : {
      "should" : [
        {
          "term" : {
            "main_cloud_name" : "Azure"
          }
        },
        {
          "term" : {
            "sub_cloud_name" : "Azure"
          }
        }
      ]
    }
  },
  "sort" : {
    "_script" : {
      "type" : "number",
      "script" : {
        "lang": "painless",
        "source": """
            if (params.main == (doc['main_cloud_name'].value)) {
              return 0;
            } else {
              return 1;  
            }
          """,
        "params" : {
          "main" : "Azure"
        }
      },
      "order" : "ASC"
    }
  }
}
Enter fullscreen mode Exit fullscreen mode
GET /mytest/_search?pretty
{
  "query": { 
    "bool" : {
      "should" : [
        {
          "term" : {
            "main_cloud_name" : "AWS"
          }
        },
        {
          "term" : {
            "sub_cloud_name" : "AWS"
          }
        }
      ]
    }
  },
  "sort" : {
    "_script" : {
      "type" : "number",
      "script" : {
        "lang": "painless",
        "source": """
            if (params.main == (doc['main_cloud_name'].value)) {
              return 0;
            } else {
              return 1;  
            }
          """,
        "params" : {
          "main" : "AWS"
        }
      },
      "order" : "ASC"
    }
  }
}
Enter fullscreen mode Exit fullscreen mode
GET /mytest/_search?pretty
{
  "query": { 
    "bool" : {
      "should" : [
        {
          "term" : {
            "main_cloud_name" : "GCP"
          }
        },
        {
          "term" : {
            "sub_cloud_name" : "GCP"
          }
        }
      ]
    }
  },
  "sort" : {
    "_script" : {
      "type" : "number",
      "script" : {
        "lang": "painless",
        "source": """
            if (params.main == (doc['main_cloud_name'].value)) {
              return 0;
            } else {
              return 1;  
            }
          """,
        "params" : {
          "main" : "GCP"
        }
      },
      "order" : "ASC"
    }
  }
}
Enter fullscreen mode Exit fullscreen mode

Reference

Latest comments (0)