DEV Community

Aman Marothiya

🛠️ How I Migrated Data from Solr to Elasticsearch Using Python

"Logs and queries are silent narrators. Migrating them? That's like rewriting a language mid-conversation."

🧭 Introduction
When you work in a production environment dealing with millions of documents, real-time search, and user-facing performance, even the smallest change can shake the system.

So when our team decided to migrate from Apache Solr to Elasticsearch, I knew I wasn't just moving data; I was rebuilding the core of how our system understood information.

This post is not just about scripts and APIs, but about:

  • The real challenges we faced
  • The strategic decisions I made
  • And how Python became my best friend during this transformation.

📦 Why We Moved from Solr to Elasticsearch
Solr had served us well. It was reliable, fast, and open source. But as our application grew, so did our need for:

  • Real-time indexing
  • Horizontal scalability
  • Easier integrations (with tools like Kibana, Logstash, Filebeat)
  • And most importantly, a richer Query DSL

Elasticsearch felt like the natural upgrade to keep up with modern DevOps and product needs.

🧪 Pre-Migration Analysis
Before writing a single line of code, I listed key things to answer:

  1. How much data are we moving?
  2. How are the fields mapped in Solr vs Elasticsearch?
  3. What index strategy fits Elasticsearch best?
  4. Are there analyzers/tokenizers to be preserved or changed?
  5. What queries need to be rewritten post-migration?

I also took sample data dumps to:

  • Understand field types
  • Plan bulk ingestion structure
  • And define a custom Elasticsearch index template
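
To make the field-type step concrete, here is a minimal sketch that turns a list of Solr field definitions into the `properties` section of an Elasticsearch mapping. The Solr type names in `SOLR_TO_ES` are the ones found in Solr's default configset, which is an assumption on my part; `build_properties` and the sample field names are hypothetical, so adjust them to your own schema.

```python
# Hypothetical lookup: Solr field type name -> Elasticsearch field type.
# Solr type names are defined per schema, so adjust these to match yours.
SOLR_TO_ES = {
    "string": "keyword",
    "text_general": "text",
    "boolean": "boolean",
    "pint": "integer",
    "plong": "long",
    "pfloat": "float",
    "pdouble": "double",
    "pdate": "date",
}


def build_properties(solr_fields, default_type="keyword"):
    """Build the "properties" section of an Elasticsearch mapping.

    solr_fields is a list of dicts with "name" and "type" keys, the shape
    Solr's Schema API returns under its "fields" key.
    """
    properties = {}
    for field in solr_fields:
        name = field["name"]
        if name.startswith("_") and name.endswith("_"):
            continue  # skip Solr-internal fields such as _version_ and _root_
        es_type = SOLR_TO_ES.get(field["type"], default_type)
        properties[name] = {"type": es_type}
    return properties


# Usage with made-up field definitions
sample_fields = [
    {"name": "id", "type": "string"},
    {"name": "title", "type": "text_general"},
    {"name": "created_at", "type": "pdate"},
    {"name": "_version_", "type": "plong"},
]
mapping = {"mappings": {"properties": build_properties(sample_fields)}}
```

Falling back to `keyword` for unknown types is only a conservative default. Analyzed text fields usually need a deliberate choice of analyzer on the Elasticsearch side.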

🧰 Tools I Used

  • Python 3.x
  • pysolr for querying and extracting from Solr
  • elasticsearch Python client
  • tqdm for progress bars (life-saver during 10M+ docs)
  • Shell scripting for automation & logging

"When your data runs in GBs and TBs, scripting smartly is 50% of the battle."

🔄 The Migration Workflow
Here's how I structured the migration process:

1. Connect to Solr & Elasticsearch
2. Fetch data in batches from Solr (e.g., 1000 docs at a time)
3. Transform each doc → match target index structure
4. Index into Elasticsearch using bulk API
5. Track failures, retry, log every step
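
Step 2 can be sketched with Solr's cursor-based paging (`cursorMark`), which avoids the cost of deep `start` offsets on large cores. This is a minimal sketch, not the exact script I ran: `fetch_page` and `iter_solr_batches` are hypothetical helper names, the unique key is assumed to be `id`, and it uses only the standard library so it runs without extra installs.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_page(select_url, params):
    """GET one page from a Solr /select handler and decode the JSON body."""
    with urlopen(f"{select_url}?{urlencode(params)}", timeout=60) as resp:
        return json.load(resp)


def iter_solr_batches(select_url, unique_key="id", rows=1000, fetch=fetch_page):
    """Yield one list of docs per request until the cursor stops advancing."""
    cursor = "*"  # "*" asks Solr for the first page
    while True:
        params = {
            "q": "*:*",
            "rows": rows,
            "wt": "json",
            # Cursors need a sort that includes the unique key field
            "sort": f"{unique_key} asc",
            "cursorMark": cursor,
        }
        body = fetch(select_url, params)
        docs = body["response"]["docs"]
        if docs:
            yield docs
        next_cursor = body["nextCursorMark"]
        if next_cursor == cursor:
            break  # an unchanged cursor means the result set is exhausted
        cursor = next_cursor
```

A loop such as `for batch in iter_solr_batches("http://localhost:8983/solr/your_solr_core/select"):` then hands each batch to the transform and bulk-index steps. The `fetch` parameter exists so the paging logic can be exercised with canned responses before pointing it at a live core.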

Python Snippet: Solr to Elasticsearch

# 🚀 Solr to Elasticsearch Migration Script Using Python
# Author: Aman Marothiya | Purpose: Migrate data from Solr Core to Elasticsearch Index

import json

import requests
from tqdm import tqdm  # Optional: adds a progress bar

# 🔧 Step 1: Solr configuration
# rows=10000 fetches a single page; larger cores need to be read in batches
solr_url = 'http://localhost:8983/solr/your_solr_core/select?q=*:*&rows=10000&wt=json'

# 🔧 Step 2: Elasticsearch configuration
es_url = 'http://localhost:9200/your_es_index/_bulk'
# The bulk API takes newline-delimited JSON
headers = {'Content-Type': 'application/x-ndjson'}

# 📦 Step 3: Fetch data from Solr
print("📡 Fetching data from Solr...")
response = requests.get(solr_url)
response.raise_for_status()  # Stop early if Solr returned an HTTP error
solr_docs = response.json()['response']['docs']

# 🛠️ Step 4: Prepare bulk payload for Elasticsearch
bulk_lines = []
for doc in tqdm(solr_docs, desc="🔄 Converting docs"):
    # Drop Solr's internal bookkeeping field; the target index does not need it
    doc.pop('_version_', None)
    meta = {
        "index": {
            "_index": "your_es_index"
        }
    }
    bulk_lines.append(json.dumps(meta))
    bulk_lines.append(json.dumps(doc))

# Each line, including the last one, must end with a newline
bulk_data = '\n'.join(bulk_lines) + '\n'

# 🚚 Step 5: Send data to Elasticsearch
print("🚀 Sending data to Elasticsearch...")
response = requests.post(es_url, headers=headers, data=bulk_data.encode('utf-8'))

# ✅ Step 6: Confirmation
# A 200 status only means the request was processed; per-document failures
# are reported through the "errors" flag in the response body.
if response.status_code == 200 and not response.json().get('errors'):
    print("🎉 Migration complete!")
else:
    print(f"❌ Migration failed! Status: {response.status_code}, Message: {response.text}")
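
For step 5 of the workflow (track failures, retry), the bulk response can be paired back to the documents that were sent. Here is a minimal sketch, assuming every action in the request was an `index` action and relying on the bulk API returning one item per action in request order. `collect_failures` is a hypothetical helper and the sample response is hand-written.

```python
def collect_failures(docs, bulk_response):
    """Return (doc, error) pairs for every document the bulk call rejected.

    Assumes docs is in the same order as the "index" actions that were sent.
    """
    failures = []
    for doc, item in zip(docs, bulk_response.get("items", [])):
        result = item["index"]
        # Treat a missing status as a failure so nothing is silently skipped
        if result.get("status", 500) >= 300:
            failures.append((doc, result.get("error")))
    return failures


# Usage with a hand-written response in the shape the bulk API returns
docs = [{"id": "1", "title": "ok"}, {"id": "2", "title": "bad"}]
sample_response = {
    "errors": True,
    "items": [
        {"index": {"_id": "a1", "status": 201}},
        {"index": {"_id": "a2", "status": 400,
                   "error": {"type": "mapper_parsing_exception"}}},
    ],
}
to_retry = collect_failures(docs, sample_response)
```

The returned pairs can be logged and the documents re-sent in a smaller batch. Mapping errors will fail again on retry, so those are better written to a file for inspection.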


🧠 Final Thoughts
Migrating from Solr to Elasticsearch wasn't just about switching systems; it was about redefining how we scale, search, and serve users.

Yes, there were challenges (schema mismatches, tuning analyzers, performance hits), but solving them helped me grow technically and strategically.

If you're planning a similar migration, remember:

  • Plan before you code.
  • Validate with small datasets.
  • And always log intelligently.
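
One cheap validation on a small dataset is to compare document counts on both sides. The sketch below assumes you already have the JSON body of a Solr `select` response (which carries `numFound`) and the body of an Elasticsearch `_count` response (which carries `count`). `compare_counts` is a hypothetical helper and the numbers are made up.

```python
def compare_counts(solr_body, es_count_body):
    """Compare Solr's numFound with the count reported by Elasticsearch."""
    solr_total = solr_body["response"]["numFound"]
    es_total = es_count_body["count"]
    return {
        "solr": solr_total,
        "elasticsearch": es_total,
        "missing": solr_total - es_total,
        "match": solr_total == es_total,
    }


# Usage with hand-written response bodies
report = compare_counts(
    {"response": {"numFound": 1000, "docs": []}},
    {"count": 998},
)
```

Newly indexed documents only show up in the count after an index refresh, so a short-lived mismatch right after a bulk load is not necessarily a failure.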

🤝 Let's Connect
Have questions? Want to know more about this setup or DevOps/ELK-related things?

👉 Drop a comment or connect with me on LinkedIn or GitHub.
