loizenai

Elasticsearch Character Filters

https://grokonez.com/elasticsearch/elasticsearch-character-filters

Elasticsearch character filters preprocess the stream of characters (by adding, removing, or changing characters) before it is passed to the tokenizer. In this tutorial, we're going to look at 3 types of character filters: HTML Strip, Mapping, and Pattern Replace. All three are important building blocks for custom analyzers.

1. HTML Strip Character Filter

The html_strip character filter can:

  • strip out HTML elements (like <b>)
  • replace HTML entities with their decoded value (&amp; becomes &).

For example:


POST _analyze
{
  "tokenizer": "keyword",
  "char_filter": [ "html_strip" ],
  "text": "<p>JavaSampleApproach's tutorials are so helpful!</p>"
}

Terms:


[ \nJavaSampleApproach's tutorials are so helpful!\n ]
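
The entity decoding described above can be checked the same way. Here is a minimal sketch; the sample sentence is an assumption made for illustration and does not come from the original tutorial:

```json
POST _analyze
{
  "tokenizer": "keyword",
  "char_filter": [ "html_strip" ],
  "text": "Tom &amp; Jerry are <b>friends</b>"
}
```

With the keyword tokenizer the whole input should come back as a single term, with &amp; decoded to & and the <b> tags removed:

[ Tom & Jerry are friends ]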

Configuration

escaped_tags: array of HTML tags which should not be stripped.

For example, we want to leave <b> and <p> tags in place:


PUT jsa_index_char_filter_html
{
  "settings": {
    "analysis": {
      "analyzer": {
        "jsa_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["jsa_char_filter"]
        }
      },
      "char_filter": {
        "jsa_char_filter": {
          "type": "html_strip",
          "escaped_tags": ["b", "p"]
        }
      }
    }
  }
}

POST jsa_index_char_filter_html/_analyze
{
  "analyzer": "jsa_analyzer",
  "text": "<p>JavaSampleApproach's tutorials are so helpful!</p>"
}

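To see the effect of escaped_tags without creating an index, the filter can also be defined inline in the _analyze request. This is a minimal sketch, assuming a sample input that contains both <p> and <b> tags and escaping only b so that the difference is visible:

```json
POST _analyze
{
  "tokenizer": "keyword",
  "char_filter": [
    {
      "type": "html_strip",
      "escaped_tags": ["b"]
    }
  ],
  "text": "<p>JavaSampleApproach's <b>tutorials</b> are so helpful!</p>"
}
```

Here the <b> tags should survive in the resulting term, while the <p> tags, which are not escaped, are still stripped and replaced with newlines:

[ \nJavaSampleApproach's <b>tutorials</b> are so helpful!\n ]
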
More at:

https://grokonez.com/elasticsearch/elasticsearch-character-filters
