<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Aman Marothiya</title>
    <description>The latest articles on DEV Community by Aman Marothiya (@aman_marothiya7855).</description>
    <link>https://dev.to/aman_marothiya7855</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3384458%2F572769c9-9baa-41c7-9a67-68c830ceac73.png</url>
      <title>DEV Community: Aman Marothiya</title>
      <link>https://dev.to/aman_marothiya7855</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aman_marothiya7855"/>
    <language>en</language>
    <item>
      <title>🛠️ How I Migrated Data from Solr to Elasticsearch Using Python</title>
      <dc:creator>Aman Marothiya</dc:creator>
      <pubDate>Mon, 28 Jul 2025 09:03:11 +0000</pubDate>
      <link>https://dev.to/aman_marothiya7855/how-i-migrated-data-from-solr-to-elasticsearch-using-python-k4b</link>
      <guid>https://dev.to/aman_marothiya7855/how-i-migrated-data-from-solr-to-elasticsearch-using-python-k4b</guid>
      <description>&lt;p&gt;“Logs and queries are silent narrators. Migrating them? That's like rewriting a language mid-conversation.”&lt;/p&gt;

&lt;p&gt;🧭 Introduction&lt;br&gt;
When you work in a production environment dealing with millions of documents, real-time search, and user-facing performance, even the smallest change can shake the system.&lt;/p&gt;

&lt;p&gt;So when our team decided to migrate from Apache Solr to Elasticsearch, I knew I wasn’t just moving data — I was rebuilding the core of how our system understood information.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This post is not just about scripts and APIs, but about:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The real challenges we faced&lt;/li&gt;
&lt;li&gt;The strategic decisions I made&lt;/li&gt;
&lt;li&gt;And how Python became my best friend during this transformation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;📦 Why We Moved from Solr to Elasticsearch&lt;/strong&gt;&lt;br&gt;
Solr had served us well — reliable, fast, and open-source. But as our application grew, so did our need for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Near real-time indexing&lt;/li&gt;
&lt;li&gt;Horizontal scalability&lt;/li&gt;
&lt;li&gt;Easier integrations (with tools like Kibana, Logstash, Filebeat)&lt;/li&gt;
&lt;li&gt;And most importantly, a richer Query DSL&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Elasticsearch felt like the natural upgrade to keep up with modern DevOps and product needs.&lt;/p&gt;

&lt;p&gt;🧪 Pre-Migration Analysis&lt;br&gt;
Before writing a single line of code, I listed the key questions to answer:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;How much data are we moving?&lt;/li&gt;
&lt;li&gt;How are the fields mapped in Solr vs Elasticsearch?&lt;/li&gt;
&lt;li&gt;What index strategy fits Elasticsearch best?&lt;/li&gt;
&lt;li&gt;Are there analyzers/tokenizers to be preserved or changed?&lt;/li&gt;
&lt;li&gt;What queries need to be rewritten post-migration?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I also took sample data dumps to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understand field types&lt;/li&gt;
&lt;li&gt;Plan bulk ingestion structure&lt;/li&gt;
&lt;li&gt;And define a custom Elasticsearch index template&lt;/li&gt;
&lt;/ul&gt;
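
&lt;p&gt;To make the field-mapping question concrete, here is a minimal sketch of a per-document transform. The field names (title_s, created_dt) and the FIELD_MAP table are hypothetical placeholders, not a real schema. The one Solr-specific detail is that Solr attaches an internal _version_ field to each document, which serves no purpose in the target index.&lt;/p&gt;

```python
# Hypothetical mapping from Solr field names to Elasticsearch field names.
# These names are placeholders for illustration only.
FIELD_MAP = {
    "title_s": "title",
    "created_dt": "created_at",
}

# Solr bookkeeping fields that should not be copied to the target index.
DROP_FIELDS = {"_version_"}


def transform(solr_doc):
    """Return a new dict shaped for the target Elasticsearch index."""
    es_doc = {}
    for name, value in solr_doc.items():
        if name in DROP_FIELDS:
            continue
        # Rename mapped fields; pass everything else through unchanged.
        es_doc[FIELD_MAP.get(name, name)] = value
    return es_doc


sample = {"id": "42", "title_s": "Hello", "created_dt": "2025-07-28T09:03:11Z", "_version_": 1}
print(transform(sample))
```

&lt;p&gt;Keeping the mapping in a plain dictionary makes it easy to review against the index template before any data moves.&lt;/p&gt;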

&lt;p&gt;🧰 Tools I Used&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.x&lt;/li&gt;
&lt;li&gt;pysolr for querying and extracting from Solr&lt;/li&gt;
&lt;li&gt;elasticsearch Python client&lt;/li&gt;
&lt;li&gt;tqdm for progress bars (a life-saver when moving 10M+ docs)&lt;/li&gt;
&lt;li&gt;Shell scripting for automation &amp;amp; logging&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;"When your data runs in GBs and TBs, scripting smartly is 50% of the battle."&lt;/p&gt;

&lt;p&gt;🔄 The Migration Workflow&lt;br&gt;
Here’s how I structured the migration process:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Connect to Solr &amp;amp; Elasticsearch
2. Fetch data in batches from Solr (e.g., 1000 docs at a time)
3. Transform each doc → match target index structure
4. Index into Elasticsearch using bulk API
5. Track failures, retry, log every step
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
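
&lt;p&gt;The five steps above can be sketched as a loop. This is a minimal sketch under stated assumptions, not the production script: fetch_page and send_bulk are hypothetical callables that you would back with real Solr and Elasticsearch requests, and the demo uses in-memory stand-ins so it runs without either service.&lt;/p&gt;

```python
import json


def build_bulk_body(docs, index_name):
    """Serialize docs as the newline-delimited JSON the bulk API expects."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the body must end with a newline


def migrate(fetch_page, send_bulk, index_name, batch_size=1000):
    """Copy documents batch by batch and return how many were sent.

    fetch_page(start, rows) returns a list of docs (empty when exhausted).
    send_bulk(body) sends one bulk request body.
    Both are hypothetical hooks supplied by the caller.
    """
    start = 0
    total = 0
    while True:
        docs = fetch_page(start, batch_size)
        if not docs:
            break
        send_bulk(build_bulk_body(docs, index_name))
        total += len(docs)
        start += len(docs)
    return total


# In-memory stand-ins so the sketch runs without Solr or Elasticsearch.
source = [{"id": str(i)} for i in range(2500)]
sent_bodies = []
count = migrate(
    fetch_page=lambda start, rows: source[start:start + rows],
    send_bulk=sent_bodies.append,
    index_name="your_es_index",
)
print(count, "docs in", len(sent_bodies), "bulk requests")
```

&lt;p&gt;Offset-based paging like this is the simplest option. For very deep result sets, Solr's cursorMark paging is generally the better fit, at the cost of requiring a sort on the unique key.&lt;/p&gt;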



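&lt;p&gt;Python Snippet: Solr to Elasticsearch&lt;br&gt;
A simplified, single-request version of the script that talks to both systems over plain HTTP with requests: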
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# 🚀 Solr to Elasticsearch Migration Script Using Python
# Author: Aman Marothiya | Purpose: Migrate data from Solr Core to Elasticsearch Index

import json

import requests
from tqdm import tqdm  # Optional: adds a nice progress bar

# 🔧 Step 1: Solr Configuration (fetches at most 10,000 docs in a single request)
solr_url = 'http://localhost:8983/solr/your_solr_core/select'
solr_params = {'q': '*:*', 'rows': 10000, 'wt': 'json'}

# 🔧 Step 2: Elasticsearch Configuration
es_index = 'your_es_index'
es_url = f'http://localhost:9200/{es_index}/_bulk'
headers = {'Content-Type': 'application/x-ndjson'}  # bulk bodies are newline-delimited JSON

# 📦 Step 3: Fetch data from Solr
print("📡 Fetching data from Solr...")
response = requests.get(solr_url, params=solr_params, timeout=60)
response.raise_for_status()  # stop early if Solr returned an HTTP error
solr_docs = response.json()['response']['docs']

# 🛠️ Step 4: Prepare bulk payload for Elasticsearch
lines = []
for doc in tqdm(solr_docs, desc="🔄 Converting docs"):
    doc.pop('_version_', None)  # Solr's internal version field; not needed in Elasticsearch
    lines.append(json.dumps({"index": {"_index": es_index}}))
    lines.append(json.dumps(doc))
bulk_data = '\n'.join(lines) + '\n'  # the bulk body must end with a newline

# 🚚 Step 5: Send data to Elasticsearch
print("🚀 Sending data to Elasticsearch...")
response = requests.post(es_url, headers=headers, data=bulk_data.encode('utf-8'))

# ✅ Step 6: Confirmation
# A 200 status only means the request was processed; individual documents can
# still be rejected, so check the "errors" flag in the response body as well.
if response.status_code == 200 and not response.json().get('errors'):
    print("🎉 Migration complete!")
else:
    print(f"❌ Migration failed! Status: {response.status_code}, Message: {response.text}")

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
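
&lt;p&gt;One caveat on the status check: the bulk API can answer with HTTP 200 even when some documents were rejected, and it reports those problems per item in the response body. Below is a minimal sketch of pulling out the failures so they can be logged or retried. The sample response is hand-written in the shape of a bulk response, not real output, and failed_items is a hypothetical helper name.&lt;/p&gt;

```python
def failed_items(bulk_response):
    """Return (position, status, error) for every item that was not indexed."""
    failures = []
    if not bulk_response.get("errors"):
        return failures
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is keyed by its action type, e.g. {"index": {...}}.
        result = next(iter(item.values()))
        if "error" in result:
            failures.append((position, result.get("status"), result["error"]))
    return failures


# Hand-written sample in the shape of a bulk response (not real output).
sample_response = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {"index": {"_id": "2", "status": 400,
                   "error": {"type": "mapper_parsing_exception", "reason": "failed to parse"}}},
    ],
}

for position, status, error in failed_items(sample_response):
    print(f"doc #{position} failed with {status}: {error['type']}")
```

&lt;p&gt;Because items come back in the same order as the actions that were sent, the position is enough to look up the original document for a retry.&lt;/p&gt;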



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhj4mkjoq1x1m3uul13vc.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhj4mkjoq1x1m3uul13vc.gif" alt=" " width="800" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;🧠 Final Thoughts&lt;br&gt;
Migrating from Solr to Elasticsearch wasn’t just about switching systems — it was about redefining how we scale, search, and serve users.&lt;/p&gt;

&lt;p&gt;Yes, there were challenges (schema mismatches, analyzer tuning, performance hits), but solving them helped me grow technically and strategically.&lt;/p&gt;

&lt;p&gt;If you’re planning a similar migration, remember:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Plan before you code.&lt;/li&gt;
&lt;li&gt;Validate with small datasets.&lt;/li&gt;
&lt;li&gt;And always log intelligently.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;🤝 Let’s Connect&lt;br&gt;
Have questions? Want to know more about this setup or other DevOps and ELK-related topics?&lt;/p&gt;

&lt;p&gt;👉 Drop a comment or connect with me on &lt;a href="https://www.linkedin.com/in/aman-marothiya78/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;br&gt;
🧑‍💻 &lt;a href="https://github.com/aman7855/" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;

</description>
      <category>elasticsearch</category>
      <category>devops</category>
      <category>python</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
