loizenai

Elasticsearch Character Filters

https://grokonez.com/elasticsearch/elasticsearch-character-filters

Elasticsearch character filters preprocess the stream of characters (by adding, removing, or changing characters) before it is passed to the tokenizer. In this tutorial, we look at 3 types of character filters: HTML Strip, Mapping, and Pattern Replace, which are important building blocks for custom analyzers.
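
To make the ordering concrete, here is a minimal Python sketch of the idea that character filters run on the raw string first and the tokenizer only sees their output. It is not how Elasticsearch is implemented. The helpers `mapping_filter`, `pattern_replace_filter`, `whitespace_tokenizer`, and `analyze` are hypothetical names made up for this illustration, and the sample text is invented.

```python
import re
from typing import Callable, List

CharFilter = Callable[[str], str]


def mapping_filter(mappings: dict) -> CharFilter:
    """Hypothetical stand-in for a Mapping filter: replace each key with its value."""
    # Try longer keys first so the longest match wins at a given position.
    keys = sorted(mappings, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: mappings[m.group(0)], text)


def pattern_replace_filter(pattern: str, replacement: str) -> CharFilter:
    """Hypothetical stand-in for a Pattern Replace filter, using Python regex syntax."""
    compiled = re.compile(pattern)
    return lambda text: compiled.sub(replacement, text)


def whitespace_tokenizer(text: str) -> List[str]:
    """Very rough tokenizer: split on whitespace."""
    return text.split()


def analyze(text: str, char_filters: List[CharFilter], tokenizer) -> List[str]:
    # Character filters run in order on the raw string...
    for char_filter in char_filters:
        text = char_filter(text)
    # ...and only then does the tokenizer see the text.
    return tokenizer(text)


tokens = analyze(
    "call 123-456 :)",
    [
        mapping_filter({":)": "_happy_"}),
        pattern_replace_filter(r"(\d+)-(?=\d)", r"\1_"),
    ],
    whitespace_tokenizer,
)
print(tokens)  # ['call', '123_456', '_happy_']
```

Note that Python writes the captured group as `\1`, while Elasticsearch's pattern_replace filter uses Java regex syntax, where the same reference is `$1`.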

1. HTML Strip Character Filter

The html_strip character filter can:

  • strip out HTML elements (like <b>)
  • replace HTML entities with their decoded value (&amp; becomes &).

For example:


POST _analyze
{
  "tokenizer": "keyword",
  "char_filter": ["html_strip"],
  "text": "<p>JavaSampleApproach&apos;s tutorials are <b>so</b> helpful!</p>"
}

Terms:


[ \nJavaSampleApproach's tutorials are so helpful!\n ]
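
If you want to reason about this behaviour without a running cluster, the following Python sketch approximates it with the standard library's `html.parser`. It is only an approximation, not the Lucene implementation behind html_strip. The `BLOCK_TAGS` set is an assumption (the real filter knows more block-level tags), and `HtmlStripSketch` and `html_strip` are hypothetical names. The sample string is chosen for illustration.

```python
from html.parser import HTMLParser

# Assumption: a small sample of block-level tags that turn into newlines.
BLOCK_TAGS = {"p", "div", "br", "h1", "h2", "h3", "li"}


class HtmlStripSketch(HTMLParser):
    """Drops tags, decodes entities, and turns block-level tags into newlines."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities such as &amp;
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")
        # Inline tags such as <b> are dropped without a replacement.

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_strip(text: str) -> str:
    parser = HtmlStripSketch()
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


sample = "<p>JavaSampleApproach&apos;s tutorials are <b>so</b> helpful!</p>"
print(repr(html_strip(sample)))
# '\nJavaSampleApproach\'s tutorials are so helpful!\n'
```

The leading and trailing `\n` come from the `<p>` and `</p>` tags, which matches the single term shown above.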

Configuration

escaped_tags: array of HTML tags which should not be stripped.

For example, to leave <b> and <p> tags in place:


PUT jsa_index_char_filter_html
{
  "settings": {
    "analysis": {
      "analyzer": {
        "jsa_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["jsa_char_filter"]
        }
      },
      "char_filter": {
        "jsa_char_filter": {
          "type": "html_strip",
          "escaped_tags": ["b", "p"]
        }
      }
    }
  }
}

POST jsa_index_char_filter_html/_analyze
{
  "analyzer": "jsa_analyzer",
  "text": "<p>JavaSampleApproach&apos;s tutorials are <b>so</b> helpful!</p>"
}
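
As a rough illustration of what escaped_tags changes, here is a Python sketch built on the standard library's `html.parser`. It is an approximation for intuition only, not the actual filter. `EscapedTagsSketch` and `html_strip` are hypothetical names, the `BLOCK_TAGS` set is an assumption, and the sample string is chosen for illustration.

```python
from html.parser import HTMLParser

# Assumption: a small sample of block-level tags that turn into newlines.
BLOCK_TAGS = {"p", "div", "br", "h1", "h2", "h3", "li"}


class EscapedTagsSketch(HTMLParser):
    """Strips HTML like the sketch of html_strip, but keeps tags named in escaped_tags."""

    def __init__(self, escaped_tags=()):
        super().__init__(convert_charrefs=True)  # still decode entities
        self.escaped_tags = {t.lower() for t in escaped_tags}
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.escaped_tags:
            # Keep the tag exactly as it was written in the input.
            self.parts.append(self.get_starttag_text())
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.escaped_tags:
            self.parts.append(f"</{tag}>")
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_strip(text: str, escaped_tags=()) -> str:
    parser = EscapedTagsSketch(escaped_tags)
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


sample = "<p>JavaSampleApproach&apos;s tutorials are <b>so</b> helpful!</p>"
print(html_strip(sample, escaped_tags=["b", "p"]))
# <p>JavaSampleApproach's tutorials are <b>so</b> helpful!</p>
```

In this sketch the escaped tags survive untouched, while the `&apos;` entity is still decoded, because escaped_tags only affects tags and not entities.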

More at:

https://grokonez.com/elasticsearch/elasticsearch-character-filters
