Okay, so you've set up Elasticsearch. You've indexed your data. Search is super fast. All's good. But suddenly you get a requirement that means changing the mapping of your index. Maybe you need to use a different analyser, or maybe it's as simple as adding a new field to your documents, which requires you to add the associated static mapping.
If you find yourself in such a situation, here are a few approaches you can take:
- Approach 1 - with downtime; index from external data source
  - This assumes that you have an external data source, such as a database, from which you can index the data all over again, as if you were doing it for the first time.
  - When to use?
    - This approach only makes sense for testing purposes, locally or in staging. It should not be used in production, because the index is unavailable from the moment it is deleted until indexing finishes.
  - Steps
    - Delete the index using the Delete Index API.
    - Create the index and set the new mapping, either in the body of the create request or with the PUT Mapping API right after creation.
    - Index documents from the external data source. You could do this using the Bulk API.
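The Bulk API step is the one with the fiddliest request format, so here is a minimal sketch of building its body. The Bulk API expects newline-delimited JSON: one action line followed by one source line per document, with a trailing newline. The index name `my_index` and the sample rows are hypothetical, and the helper `build_bulk_body` is made up for illustration; it only builds the payload and does not talk to a cluster.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON payload for POST /_bulk from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line: which index and _id the following source line belongs to.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # Every line, including the last one, must be terminated by a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical rows, standing in for whatever the external database returns.
rows = [("1", {"title": "first post"}), ("2", {"title": "second post"})]
body = build_bulk_body("my_index", rows)
print(body, end="")
```

You would send `body` to `/_bulk` with the `Content-Type: application/x-ndjson` header, using whatever HTTP client or Elasticsearch client library you already have. For a large data source, it is usually worth sending the rows in batches rather than as one huge request.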
- Approach 2 - without downtime; index from external data source
  - When to use?
    - You could use this approach in production, but if you have a large number of documents, indexing from an external data source like a DB can be a time-consuming process.
  - Steps
    - If not done already, create an alias `index_alias` for your existing index (`old_index`) and change your code to use the alias instead of `old_index` directly.
    - Create a new index `new_index` with the new mapping.
    - Index documents from the external data source into `new_index`. You could do this using the Bulk API.
    - Move the alias `index_alias` from `old_index` to `new_index`.
  - Caveats
    - While the downtime is essentially zero, there could still be consistency issues. For example, writes that reach `old_index` while `new_index` is being populated can be missing from `new_index` once the alias moves.
    - Indexing from an external data source like a DB can be a time-consuming process if you have a large number of documents.
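Moving the alias is what makes this approach zero-downtime, so it helps to see what that request looks like. The sketch below builds the body for `POST /_aliases`, which removes the alias from one index and adds it to another in a single atomic request. The helper name `build_alias_swap` is made up for illustration; the index and alias names are the ones used in the steps above.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves `alias` in one request."""
    return {
        "actions": [
            # Both actions are applied atomically, so searches through the
            # alias never see a moment where it points at no index.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


swap = build_alias_swap("index_alias", "old_index", "new_index")
print(json.dumps(swap, indent=2))
```

Doing the remove and the add in one request, rather than as two separate calls, is the design choice that matters here: with two calls there would be a short window where `index_alias` resolves to nothing (or to both indices).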
- Approach 3 - without downtime; index from Elasticsearch
  - When to use?
    - Can be used in production when you want to change the mapping of an existing field. If you are merely adding a field mapping, prefer Approach 4.
  - Steps
    - If not done already, create an alias `index_alias` for your existing index (`old_index`) and change your code to use the alias instead of `old_index` directly.
    - Create a new index `new_index` with the new mapping.
    - Use the Elasticsearch Reindex API to copy documents from `old_index` to `new_index`.
    - Move the alias `index_alias` from `old_index` to `new_index`.
  - Caveats
    - While the downtime is essentially zero, there could still be consistency issues. For example, documents written to `old_index` after the reindex has started are not guaranteed to be copied to `new_index`.
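The copy step in Approach 3 boils down to a single Reindex API call. As a rough sketch, the request could be built like this; the helper name `build_reindex_request` is hypothetical, and it only assembles the method, path and body rather than sending anything.

```python
import json


def build_reindex_request(source_index, dest_index):
    """Build (method, path, body) for copying all docs between two indices."""
    body = {
        "source": {"index": source_index},
        # Documents are re-analysed on the way in, so they pick up whatever
        # mapping dest_index was created with.
        "dest": {"index": dest_index},
    }
    return "POST", "/_reindex", body


method, path, body = build_reindex_request("old_index", "new_index")
print(method, path)
print(json.dumps(body, indent=2))
```

Note that `new_index` should already exist with the new mapping before this runs. Otherwise Elasticsearch creates it on the fly with dynamically inferred mappings, which defeats the purpose.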
- Approach 4 - without downtime; update existing index
  - When to use?
    - Can be used in production when you merely want to add a new field mapping.
  - Steps
    - Update the mappings of the index online using the PUT Mapping API.
    - Use the `_update_by_query` API with these params:
      - `conflicts=proceed`
        - In the context of just picking up an online mapping change, documents which have been updated during the process, and therefore have a version conflict, would have picked up the new mapping anyway. Hence, version conflicts can be ignored.
      - `wait_for_completion=false` so that it runs as a background task
      - `refresh` so that all shards of the index are refreshed when the request completes.
  - Caveats
    - Can't be used if you want to change the mapping of an existing field. Use Approach 3 instead.
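To make the two steps of Approach 4 concrete, here is a small sketch that builds both requests. It assumes Elasticsearch 7.x or later, where the mapping endpoint has no type in the path. The index name `my_index`, the field `summary` and both helper names are hypothetical; nothing is sent to a cluster.

```python
import json
from urllib.parse import urlencode


def build_put_mapping(index_name, field_name, field_mapping):
    """Build (method, path, body) that adds one field to an existing mapping."""
    body = {"properties": {field_name: field_mapping}}
    return "PUT", f"/{index_name}/_mapping", body


def build_update_by_query_path(index_name):
    """Build the POST path for _update_by_query with the params listed above."""
    params = {
        "conflicts": "proceed",          # skip docs that changed mid-run
        "wait_for_completion": "false",  # run as a background task
        "refresh": "true",               # refresh affected shards afterwards
    }
    return f"/{index_name}/_update_by_query?{urlencode(params)}"


mapping_request = build_put_mapping("my_index", "summary", {"type": "text"})
update_path = build_update_by_query_path("my_index")
print(mapping_request[0], mapping_request[1], json.dumps(mapping_request[2]))
print("POST", update_path)
```

With `wait_for_completion=false`, the response contains a task ID instead of the final result, so you would check progress through the Tasks API rather than waiting on the HTTP call.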