Here's a quick look at how to use Elasticsearch's ingest processors to change data quickly.
The answers available on the internet can be difficult to follow. I recommend the official documentation, but understanding it takes some trial and error. In this post, I've included some pipeline requests that change text to lowercase or uppercase.
UPPERCASE AND LOWERCASE PROCESSORS
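The uppercase and lowercase processors accept a plain string directly, so splitting is not strictly required. The pipelines below take the longer route for a string of many words: a 'split processor' turns the string into an array of words, and a 'foreach processor' then loops through the array to convert each word to uppercase or lowercase.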
curl -XPUT "http://localhost:9200/_ingest/pipeline/uppercase_Processor" -H 'Content-Type: application/json' -d'
{
  "processors": [
    {
      "split": {
        "field": "name",
        "separator": " ",
        "target_field": "name",
        "ignore_missing": true
      }
    },
    {
      "foreach": {
        "field": "name",
        "processor": {
          "uppercase": {
            "field": "_ingest._value"
          }
        }
      }
    }
  ]
}'
curl -XPUT "http://localhost:9200/_ingest/pipeline/lowercase_Processor" -H 'Content-Type: application/json' -d'
{
  "processors": [
    {
      "split": {
        "field": "name",
        "separator": " ",
        "target_field": "name",
        "ignore_missing": true
      }
    },
    {
      "foreach": {
        "field": "name",
        "processor": {
          "lowercase": {
            "field": "_ingest._value"
          }
        }
      }
    }
  ]
}'
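Before running a pipeline on real data, it can help to try it with the simulate API. This is only a minimal sketch: it assumes Elasticsearch is reachable at localhost:9200 as in the commands above, that the uppercase_Processor pipeline has already been created, and the sample document is made up for illustration.

```shell
# Sample document (made up) to push through the pipeline without indexing it.
SIMULATE_BODY='{
  "docs": [
    { "_source": { "name": "john doe" } }
  ]
}'

# _simulate runs the stored pipeline against the sample docs and returns
# the transformed documents; nothing is written to any index.
curl -s -XPOST "http://localhost:9200/_ingest/pipeline/uppercase_Processor/_simulate?pretty" \
  -H 'Content-Type: application/json' \
  -d "$SIMULATE_BODY"
```

If the pipeline behaves as intended, the name field in the response should come back as ["JOHN", "DOE"].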
The result will be an array of values. If you're wondering what the point of doing this is, don't worry: the 'join processor' turns the array back into a single string.
curl -XPUT "http://localhost:9200/_ingest/pipeline/join_Processor" -H 'Content-Type: application/json' -d'
{
  "processors": [
    {
      "join": {
        "field": "name",
        "separator": " "
      }
    }
  ]
}'
Do you need to trim leading and trailing spaces?
In JavaScript we can call .trim(). And what about Elasticsearch? Well, we have the Trim Processor.
curl -XPUT "http://localhost:9200/_ingest/pipeline/trim_Processor" -H 'Content-Type: application/json' -d'
{
  "processors": [
    {
      "trim": {
        "field": "name"
      }
    }
  ]
}'
Need to remove special characters or convert them to a desired character?
Let's say we want to turn runs of whitespace (a double space, for example) into a single space. gsub is the processor we can use.
curl -XPUT "http://localhost:9200/_ingest/pipeline/gsub_Processor" -H 'Content-Type: application/json' -d'
{
  "processors": [
    {
      "gsub": {
        "field": "name",
        "pattern": "\\s+",
        "replacement": " "
      }
    }
  ]
}'
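Processors inside one pipeline run in order, so the cleanup steps can also be chained instead of living in separate pipelines. Here is a rough sketch using the simulate API with an inline pipeline, so nothing is stored on the cluster. The sample value is made up, and localhost:9200 is assumed as in the other commands.

```shell
# Inline pipeline: collapse whitespace runs, trim the ends, then uppercase.
# Nothing is saved; _simulate only shows what the processors would produce.
CLEANUP_BODY='{
  "pipeline": {
    "processors": [
      { "gsub": { "field": "name", "pattern": "\\s+", "replacement": " " } },
      { "trim": { "field": "name" } },
      { "uppercase": { "field": "name" } }
    ]
  },
  "docs": [
    { "_source": { "name": "  john   doe " } }
  ]
}'

curl -s -XPOST "http://localhost:9200/_ingest/pipeline/_simulate?pretty" \
  -H 'Content-Type: application/json' \
  -d "$CLEANUP_BODY"
```

With this input the name should come out as "JOHN DOE". The order matters here: gsub first leaves a single space at each end, and trim then removes it.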
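Use these requests to bulk update existing data through a pipeline. The first one updates every document in the index; the second only updates documents matching the q parameter.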
curl -XPOST "http://localhost:9200/index/_update_by_query?pipeline=join_Processor"
curl -XPOST "http://localhost:9200/index/_update_by_query?pipeline=join_Processor&q=name:your_name"
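Since the uppercase and join pipelines are separate, existing documents need to pass through both, in that order, to end up with a single uppercase string. A minimal sketch, assuming the index is literally named index as in the commands above and both pipelines have already been created:

```shell
ES_URL="http://localhost:9200"   # assumed local cluster, as in the rest of the post
INDEX="index"                    # placeholder index name taken from the commands above

# Run the pipelines one after another: first split + uppercase, then join.
# refresh=true makes the changes visible to the next request.
for PIPELINE in uppercase_Processor join_Processor; do
  curl -s -XPOST "$ES_URL/$INDEX/_update_by_query?pipeline=$PIPELINE&refresh=true"
  echo
done
```

Each request returns a JSON summary with counts such as total and updated, which is a quick way to see whether anything was actually changed.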