Aman Marothiya

🛠️ How I Migrated Data from Solr to Elasticsearch Using Python

“Logs and queries are silent narrators. Migrating them? That's like rewriting a language mid-conversation.”

🧭 Introduction
When you work in a production environment dealing with millions of documents, real-time search, and user-facing performance, even the smallest change can shake the system.

So when our team decided to migrate from Apache Solr to Elasticsearch, I knew I wasn't just moving data; I was rebuilding the core of how our system understood information.

This post is not just about scripts and APIs, but about:

  • The real challenges we faced
  • The strategic decisions I made
  • And how Python became my best friend during this transformation.

📦 Why We Moved from Solr to Elasticsearch
Solr had served us well: it was reliable, fast, and open-source. But as our application grew, so did our need for:

  • Near-real-time indexing
  • Horizontal scalability
  • Easier integrations (with tools like Kibana, Logstash, Filebeat)
  • And most importantly, a richer Query DSL

Elasticsearch felt like the natural upgrade to keep up with modern DevOps and product needs.

🧪 Pre-Migration Analysis
Before writing a single line of code, I listed the key questions to answer:

  1. How much data are we moving?
  2. How are the fields mapped in Solr vs Elasticsearch?
  3. What index strategy fits Elasticsearch best?
  4. Are there analyzers/tokenizers to be preserved or changed?
  5. What queries need to be rewritten post-migration?
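
The first question, data volume, can be answered before fetching a single document. Here is a minimal sketch, assuming the standard Solr select handler: a query with rows=0 returns no documents but still reports the total in response.numFound. The helper names, the core name, and the sample response are illustrative and not part of my original script.

```python
from urllib.parse import urlencode


def build_count_url(solr_base: str, core: str, query: str = "*:*") -> str:
    """Build a select URL that returns only the match count (rows=0)."""
    params = urlencode({"q": query, "rows": 0, "wt": "json"})
    return f"{solr_base.rstrip('/')}/{core}/select?{params}"


def parse_num_found(payload: dict) -> int:
    """Read the total hit count from a parsed Solr select response."""
    return int(payload["response"]["numFound"])


# Hypothetical response body, shaped like what Solr returns for rows=0.
sample = {
    "responseHeader": {"status": 0},
    "response": {"numFound": 2500, "start": 0, "docs": []},
}

count_url = build_count_url("http://localhost:8983/solr", "your_solr_core")
total = parse_num_found(sample)
batches = -(-total // 1000)  # ceiling division: how many 1000-doc batches
print(count_url)
print(total, batches)
```

Knowing the total up front also gives the progress bar a meaningful upper bound.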

I also took sample data dumps to:

  • Understand field types
  • Plan bulk ingestion structure
  • And define a custom Elasticsearch index template
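
To make the field-type planning concrete, here is a rough sketch of turning a list of Solr fields into an Elasticsearch index body. The translation table is an assumption: the Solr type names come from Solr's default configset, and the Elasticsearch types are reasonable first choices, not the mapping we actually shipped. The sample fields are hypothetical.

```python
import json

# Assumed translation table: Solr field type names (default configset)
# to Elasticsearch field types. Adjust it to the real schema.
SOLR_TO_ES_TYPE = {
    "string": "keyword",
    "text_general": "text",
    "pint": "integer",
    "plong": "long",
    "pfloat": "float",
    "pdouble": "double",
    "pdate": "date",
    "boolean": "boolean",
}


def build_index_body(solr_fields: dict, shards: int = 1, replicas: int = 0) -> dict:
    """Turn {field_name: solr_type} into an Elasticsearch create-index body."""
    properties = {}
    for name, solr_type in solr_fields.items():
        if solr_type not in SOLR_TO_ES_TYPE:
            # Fail loudly so that unmapped types are decided on purpose.
            raise ValueError(f"no Elasticsearch type chosen for {name!r} ({solr_type})")
        properties[name] = {"type": SOLR_TO_ES_TYPE[solr_type]}
    return {
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        "mappings": {"properties": properties},
    }


# Hypothetical fields, as they might appear in a sample dump.
body = build_index_body({"id": "string", "title": "text_general", "created_at": "pdate"})
print(json.dumps(body, indent=2))
```

The same settings and mappings can be placed under the template key of an index template if several indices should share them.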

🧰 Tools I Used

  • Python 3.x
  • pysolr for querying and extracting from Solr
  • elasticsearch Python client
  • tqdm for progress bars (a life-saver when moving 10M+ docs)
  • Shell scripting for automation & logging

"When your data runs in GBs and TBs, scripting smartly is 50% of the battle."

🔄 The Migration Workflow
Here's how I structured the migration process:

1. Connect to Solr & Elasticsearch
2. Fetch data in batches from Solr (e.g., 1000 docs at a time)
3. Transform each doc → match target index structure
4. Index into Elasticsearch using the bulk API
5. Track failures, retry, log every step
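
Steps 2 and 3 can be sketched as follows. This is a minimal illustration that assumes Solr's cursorMark deep paging, which needs a sort on the unique key and signals the end by returning an unchanged cursor. The fetch_page callable, the transform rule, and the fake pages are hypothetical stand-ins so the sketch runs without a Solr server.

```python
from typing import Callable, Iterator


def iter_solr_docs(fetch_page: Callable[[dict], dict], rows: int = 1000,
                   unique_key: str = "id") -> Iterator[dict]:
    """Yield every document using Solr's cursorMark deep paging.

    fetch_page receives the query parameters and returns the parsed JSON
    response; in a real run it would wrap an HTTP call to /select.
    """
    cursor = "*"  # "*" asks Solr for the first page
    while True:
        params = {
            "q": "*:*",
            "rows": rows,
            "wt": "json",
            "sort": f"{unique_key} asc",  # cursors require a sort on the unique key
            "cursorMark": cursor,
        }
        page = fetch_page(params)
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            break
        cursor = next_cursor


def transform(doc: dict) -> dict:
    """Drop Solr's internal _version_ field; other fields pass through."""
    return {key: value for key, value in doc.items() if key != "_version_"}


# Stand-in for Solr: two pages of documents, then an empty final page.
def fake_fetch(params: dict) -> dict:
    pages = {
        "*": {"response": {"docs": [{"id": "1", "_version_": 11},
                                    {"id": "2", "_version_": 12}]},
              "nextCursorMark": "c1"},
        "c1": {"response": {"docs": [{"id": "3", "_version_": 13}]},
               "nextCursorMark": "c2"},
        "c2": {"response": {"docs": []}, "nextCursorMark": "c2"},
    }
    return pages[params["cursorMark"]]


migrated = [transform(doc) for doc in iter_solr_docs(fake_fetch, rows=2)]
print(migrated)
```

Because the generator yields one document at a time, memory use stays flat regardless of how many documents the core holds.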

Python Snippet: Solr to Elasticsearch

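# 🚀 Solr to Elasticsearch Migration Script Using Python
# Author: Aman Marothiya | Purpose: Migrate data from Solr Core to Elasticsearch Index
# Note: this simplified version calls both HTTP APIs directly with requests.

import json

import requests
from tqdm import tqdm  # Optional: adds a nice progress bar

# 🔧 Step 1: Solr Configuration
# rows=10000 fetches a single page only; larger cores need batched paging.
solr_url = 'http://localhost:8983/solr/your_solr_core/select'
solr_params = {'q': '*:*', 'rows': 10000, 'wt': 'json'}

# 🔧 Step 2: Elasticsearch Configuration
es_url = 'http://localhost:9200/your_es_index/_bulk'
headers = {'Content-Type': 'application/x-ndjson'}  # bulk bodies are newline-delimited JSON

# 📦 Step 3: Fetch data from Solr
print("📡 Fetching data from Solr...")
response = requests.get(solr_url, params=solr_params, timeout=60)
response.raise_for_status()  # stop early if Solr returned an HTTP error
solr_docs = response.json()['response']['docs']

# 🛠️ Step 4: Prepare bulk payload for Elasticsearch
# Collect lines in a list and join once; repeated string += is slow for large batches.
bulk_lines = []
for doc in tqdm(solr_docs, desc="🔄 Converting docs"):
    meta = {
        "index": {
            "_index": "your_es_index"
        }
    }
    bulk_lines.append(json.dumps(meta))
    bulk_lines.append(json.dumps(doc))
bulk_data = '\n'.join(bulk_lines) + '\n'  # the payload must end with a newline

# 🚚 Step 5: Send data to Elasticsearch
print("🚀 Sending data to Elasticsearch...")
response = requests.post(es_url, headers=headers, data=bulk_data, timeout=300)

# ✅ Step 6: Confirmation
# The bulk API answers HTTP 200 even when individual documents are rejected,
# so the "errors" flag in the response body has to be checked as well.
if response.status_code == 200 and not response.json().get('errors'):
    print("🎉 Migration complete!")
else:
    print(f"❌ Migration failed! Status: {response.status_code}, Message: {response.text}")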
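
Step 5 of the workflow (track failures, retry) deserves its own sketch. The bulk API reports a result for every document in an items array, in the same order as the request, so rejected documents can be picked out and resent. The code below is a hedged illustration, not the production logic: send_bulk is a hypothetical callable that would wrap the HTTP call, and the fake sender only simulates one throttled document.

```python
import time
from typing import Callable


def failed_positions(bulk_response: dict) -> list:
    """Return the positions of items that Elasticsearch rejected."""
    if not bulk_response.get("errors"):
        return []
    failed = []
    for position, item in enumerate(bulk_response["items"]):
        result = next(iter(item.values()))  # each item has one key: the action name
        if result.get("status", 0) >= 300:
            failed.append(position)
    return failed


def index_with_retry(docs: list, send_bulk: Callable[[list], dict],
                     max_attempts: int = 3, base_delay: float = 1.0) -> list:
    """Send docs, resend only the rejected ones, return what still failed."""
    pending = docs
    for attempt in range(max_attempts):
        if not pending:
            break
        response = send_bulk(pending)
        pending = [pending[i] for i in failed_positions(response)]
        if pending and attempt < max_attempts - 1:
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
    return pending


# Simulated sender: the first call rejects the second document with a 429.
calls = []


def fake_send(batch: list) -> dict:
    calls.append(len(batch))
    if len(calls) == 1:
        return {"errors": True,
                "items": [{"index": {"status": 201}}, {"index": {"status": 429}}]}
    return {"errors": False, "items": [{"index": {"status": 201}} for _ in batch]}


leftover = index_with_retry([{"id": "1"}, {"id": "2"}], fake_send, base_delay=0)
print(calls, leftover)
```

In practice only throttling responses such as 429 are worth retrying; a document rejected for a mapping conflict will fail again, so it is better logged and fixed at the transform step.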


🧠 Final Thoughts
Migrating from Solr to Elasticsearch wasn't just about switching systems; it was about redefining how we scale, search, and serve users.

Yes, there were challenges (schema mismatches, analyzer tuning, performance hits), but solving them helped me grow technically and strategically.

If you're planning a similar migration, remember:

  • Plan before you code.
  • Validate with small datasets.
  • And always log intelligently.
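
As one way to act on the last two points, a count comparison after a small test run catches most silent losses. This is a minimal sketch, assuming the Solr total comes from response.numFound and the Elasticsearch total from the count field of a _count response. The function name and the sample payloads are hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("migration")


def compare_counts(solr_payload: dict, es_payload: dict) -> dict:
    """Compare Solr's numFound with the count reported by Elasticsearch."""
    solr_total = int(solr_payload["response"]["numFound"])
    es_total = int(es_payload["count"])
    report = {
        "solr": solr_total,
        "elasticsearch": es_total,
        "missing": solr_total - es_total,
        "match": solr_total == es_total,
    }
    if report["match"]:
        log.info("counts match: %d documents", solr_total)
    else:
        log.warning("count mismatch: solr=%d elasticsearch=%d", solr_total, es_total)
    return report


# Hypothetical payloads from a small validation run.
report = compare_counts({"response": {"numFound": 500}}, {"count": 498})
print(report)
```

Newly indexed documents only show up in counts after an index refresh, so a mismatch right after a bulk request is not necessarily a real loss.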

🀝 Let’s Connect
Have questions? Want to know more about this setup or DevOps/ELK-related things?

👉 Drop a comment or connect with me on
LinkedIn
🧑‍💻 [GitHub]
