“Logs and queries are silent narrators. Migrating them? That’s like rewriting a language mid-conversation.”
🧭 Introduction
When you work in a production environment dealing with millions of documents, real-time search, and user-facing performance, even the smallest change can shake the system.
So when our team decided to migrate from Apache Solr to Elasticsearch, I knew I wasn’t just moving data; I was rebuilding the core of how our system understood information.
This post is not just about scripts and APIs, but about:
- The real challenges we faced
- The strategic decisions I made
- And how Python became my best friend during this transformation.
📦 Why We Moved from Solr to Elasticsearch
Solr had served us well: reliable, fast, and open-source. But as our application grew, so did our need for:
- Real-time indexing
- Horizontal scalability
- Easier integrations (with tools like Kibana, Logstash, Filebeat)
- And most importantly, a richer Query DSL
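To make the “richer Query DSL” point concrete, here is a rough side-by-side sketch of the same search expressed both ways. The field names (`title`, `views`) are made up for illustration, not from our real schema:

```python
# Solr: the query lives in URL parameters and a terse query syntax
solr_params = {
    "q": 'title:"search engine"',      # phrase match on title
    "fq": "views:[100 TO *]",          # filter query: views >= 100
    "rows": 10,
}

# Elasticsearch: the same intent as a structured, composable JSON body
es_query = {
    "query": {
        "bool": {
            "must": [{"match_phrase": {"title": "search engine"}}],
            "filter": [{"range": {"views": {"gte": 100}}}],
        }
    },
    "size": 10,
}
```

Because the Query DSL is plain nested JSON, queries can be built, inspected, and combined programmatically instead of assembled as strings.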
Elasticsearch felt like the natural upgrade to keep up with modern DevOps and product needs.
🧪 Pre-Migration Analysis
Before writing a single line of code, I listed the key questions to answer:
- How much data are we moving?
- How are the fields mapped in Solr vs Elasticsearch?
- What index strategy fits Elasticsearch best?
- Are there analyzers/tokenizers to be preserved or changed?
- What queries need to be rewritten post-migration?
I also took sample data dumps to:
- Understand field types
- Plan bulk ingestion structure
- And define a custom Elasticsearch index template
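As a sketch of what such an index template can look like (the index pattern, field names, and analyzer settings here are assumptions for illustration, not our real schema), built as a Python dict ready to PUT to the `_index_template` API:

```python
import json

# Hypothetical index template: keeps Solr-like lowercase + standard
# tokenization via a custom analyzer, and pins explicit field types.
index_template = {
    "index_patterns": ["your_es_index*"],
    "template": {
        "settings": {
            "number_of_shards": 3,
            "analysis": {
                "analyzer": {
                    "default_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase"],
                    }
                }
            },
        },
        "mappings": {
            "properties": {
                "id": {"type": "keyword"},
                "title": {"type": "text", "analyzer": "default_text"},
                "created_at": {"type": "date"},
            }
        },
    },
}

# To apply it against a local cluster (template name is illustrative):
# requests.put("http://localhost:9200/_index_template/your_template",
#              json=index_template)
print(json.dumps(index_template["template"]["mappings"], indent=2))
```

Defining mappings up front avoids Elasticsearch’s dynamic mapping guessing field types from the first document it sees.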
🧰 Tools I Used
- Python 3.x
- pysolr for querying and extracting from Solr
- elasticsearch Python client
- tqdm for progress bars (life-saver during 10M+ docs)
- Shell scripting for automation & logging
"When your data runs in GBs and TBs, scripting smartly is 50% of the battle."
🔁 The Migration Workflow
Here’s how I structured the migration process:
1. Connect to Solr & Elasticsearch
2. Fetch data in batches from Solr (e.g., 1000 docs at a time)
3. Transform each doc → match target index structure
4. Index into Elasticsearch using bulk API
5. Track failures, retry, log every step
Python Snippet: Solr to Elasticsearch
```python
# 🚀 Solr to Elasticsearch Migration Script Using Python
# Author: Aman Marothiya | Purpose: Migrate data from Solr Core to Elasticsearch Index
import json

import requests
from tqdm import tqdm  # Optional: adds a nice progress bar

# 🔧 Step 1: Solr configuration
SOLR_URL = 'http://localhost:8983/solr/your_solr_core/select'
BATCH_SIZE = 1000  # fetch documents in batches instead of one huge request

# 🔧 Step 2: Elasticsearch configuration
ES_BULK_URL = 'http://localhost:9200/your_es_index/_bulk'
HEADERS = {'Content-Type': 'application/x-ndjson'}  # bulk API takes newline-delimited JSON

# 📦 Step 3: Fetch data from Solr in batches
print("📡 Fetching data from Solr...")
start = 0
total = None
while total is None or start < total:
    params = {'q': '*:*', 'start': start, 'rows': BATCH_SIZE, 'wt': 'json'}
    body = requests.get(SOLR_URL, params=params).json()['response']
    total = body['numFound']
    solr_docs = body['docs']
    if not solr_docs:
        break

    # 🛠️ Step 4: Prepare bulk payload for Elasticsearch
    lines = []
    for doc in tqdm(solr_docs, desc="🔄 Converting docs"):
        doc.pop('_version_', None)  # drop Solr's internal versioning field
        lines.append(json.dumps({"index": {"_index": "your_es_index"}}))
        lines.append(json.dumps(doc))
    bulk_data = '\n'.join(lines) + '\n'  # bulk payload must end with a newline

    # 🚀 Step 5: Send the batch to Elasticsearch
    print(f"🚀 Sending docs {start}-{start + len(solr_docs)} to Elasticsearch...")
    response = requests.post(ES_BULK_URL, headers=HEADERS, data=bulk_data)

    # ✅ Step 6: Confirmation (a 200 response can still contain per-item errors)
    if response.status_code == 200 and not response.json().get('errors'):
        print("🎉 Batch indexed!")
    else:
        print(f"❌ Batch failed! Status: {response.status_code}, Message: {response.text}")

    start += BATCH_SIZE

print("🎉 Migration complete!")
```
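The script checks each bulk response once and moves on. For step 5 of the workflow (“track failures, retry, log every step”), I’d wrap the POST in a small retry helper with backoff and collect the per-item failures the bulk API reports. A minimal sketch — the function and class names here are mine, for illustration only:

```python
import time

def post_bulk_with_retry(post_fn, payload, max_retries=3, backoff=2.0):
    """Send a bulk payload, retrying transient failures with exponential backoff.

    post_fn is any callable taking the payload and returning an object with
    .status_code and .json() (e.g. a functools.partial around requests.post).
    Returns the list of failed items; empty list means full success.
    """
    for attempt in range(1, max_retries + 1):
        resp = post_fn(payload)
        if resp.status_code == 200:
            body = resp.json()
            if not body.get("errors"):
                return []  # every item indexed cleanly
            # Collect per-item failures for logging or re-queueing
            return [item for item in body.get("items", [])
                    if item.get("index", {}).get("error")]
        time.sleep(backoff ** attempt)  # transient failure: back off, retry
    raise RuntimeError(f"bulk request failed after {max_retries} attempts")

# Tiny fake response so the flow can be shown without a live cluster:
class FakeResp:
    status_code = 200
    def json(self):
        return {"errors": False, "items": []}

failures = post_bulk_with_retry(lambda payload: FakeResp(), "dummy-payload")
print(failures)  # []
```

Returning the failed items (rather than just a boolean) makes it easy to write them to a dead-letter file and re-drive only the documents that actually failed.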
🧠 Final Thoughts
Migrating from Solr to Elasticsearch wasn’t just about switching systems; it was about redefining how we scale, search, and serve users.
Yes, there were challenges (schema mismatches, tuning analyzers, performance hits), but solving them helped me grow technically and strategically.
If you’re planning a similar migration, remember:
- Plan before you code.
- Validate with small datasets.
- And always log intelligently.
🤝 Let’s Connect
Have questions? Want to know more about this setup or DevOps/ELK-related things?
👇 Drop a comment or connect with me on
LinkedIn
🧑‍💻 [GitHub]