<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: SamxtnZ</title>
    <description>The latest articles on DEV Community by SamxtnZ (@samxtnz).</description>
    <link>https://dev.to/samxtnz</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3903869%2F021a4feb-7df4-4412-8786-057bbf910c70.jpeg</url>
      <title>DEV Community: SamxtnZ</title>
      <link>https://dev.to/samxtnz</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/samxtnz"/>
    <language>en</language>
    <item>
      <title>I spent 2 weeks building a free Python script to migrate IMAP email — here's what I learned</title>
      <dc:creator>SamxtnZ</dc:creator>
      <pubDate>Wed, 29 Apr 2026 08:30:03 +0000</pubDate>
      <link>https://dev.to/samxtnz/i-spent-2-weeks-building-a-free-python-script-to-migrate-imap-email-heres-what-i-learned-44bc</link>
      <guid>https://dev.to/samxtnz/i-spent-2-weeks-building-a-free-python-script-to-migrate-imap-email-heres-what-i-learned-44bc</guid>
      <description>&lt;p&gt;I needed to migrate a full email account from one server to another.&lt;/p&gt;

&lt;p&gt;Simple task, right?&lt;/p&gt;

&lt;p&gt;Three days later I had tried four different tools, corrupted two MBOX files, and accidentally uploaded 8,000 emails twice to the destination server with wrong timestamps.&lt;/p&gt;

&lt;p&gt;So I built my own. It took three weeks to get right. Here is everything I had to figure out — and the mistakes that almost nobody talks about.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem with existing tools
&lt;/h2&gt;

&lt;p&gt;Most email migration tools fall into one of three categories:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cloud-based services&lt;/strong&gt; that route your emails through their servers. You pay monthly, and your data touches infrastructure you don't control.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Desktop apps&lt;/strong&gt; that cost $50–$200 for a one-time task and lock you to Windows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generic Python IMAP scripts&lt;/strong&gt; on Stack Overflow that handle the happy path and break the moment your provider does something non-standard.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I wanted something that ran locally, cost nothing, handled the edge cases correctly, and worked on any operating system.&lt;/p&gt;




&lt;h2&gt;
  
  
  The five things I had to get right
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Aruba uses two different IMAP hostnames
&lt;/h3&gt;

&lt;p&gt;If you are in Italy and use Aruba email, you have probably hit this. Aruba has two different IMAP endpoints depending on your account type:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;imaps.aruba.it&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;imap.aruba.it&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most tools hardcode one. If yours is the other, you get a connection error with no useful explanation. The fix is simple — try both in sequence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;IMAP_SERVERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;aruba.it&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;imaps.aruba.it&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;imap.aruba.it&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;hostname&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;IMAP_SERVERS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;imap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;imaplib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;IMAP4_SSL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;993&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;imap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;login&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;imap&lt;/span&gt;  &lt;span class="c1"&gt;# success — stop trying
&lt;/span&gt;    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;     &lt;span class="c1"&gt;# try next hostname
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern works for any provider that might have multiple endpoints.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. MBOX files silently corrupt if you skip one line of code
&lt;/h3&gt;

&lt;p&gt;The MBOX format uses a line starting with &lt;code&gt;From&lt;/code&gt; (with a space) as a message separator. The problem: email bodies can also contain lines that start with &lt;code&gt;From&lt;/code&gt;. When a reader parses the MBOX file, it treats those lines as message boundaries and splits your email into phantom messages.&lt;/p&gt;

&lt;p&gt;The fix is to escape those lines by prepending &lt;code&gt;&amp;gt;&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;email_bytes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;email_message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_bytes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;escaped&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;From &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;email_bytes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every line inside the message body that starts with &lt;code&gt;From&lt;/code&gt; becomes &lt;code&gt;&amp;gt;From&lt;/code&gt;. Compliant readers know to strip the &lt;code&gt;&amp;gt;&lt;/code&gt; when displaying. Non-compliant readers at least don't corrupt the file structure.&lt;/p&gt;

&lt;p&gt;Most MBOX implementations I tested — including popular Python scripts — skip this. You only notice the corruption when you try to import the file somewhere and get double the message count.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. Email timestamps break without timezone preservation
&lt;/h3&gt;

&lt;p&gt;When you upload a message to a new IMAP server using the &lt;code&gt;APPEND&lt;/code&gt; command, you can pass an &lt;code&gt;internaldate&lt;/code&gt; — the timestamp that tells the server when the email was received.&lt;/p&gt;

&lt;p&gt;The naive approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# WRONG — loses timezone, shifts timestamps
&lt;/span&gt;&lt;span class="n"&gt;date_tuple&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parsedate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Date&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;internaldate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;imaplib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Time2Internaldate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mktime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_tuple&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;time.mktime()&lt;/code&gt; assumes local timezone. If your server is in UTC and your emails have a &lt;code&gt;+0200&lt;/code&gt; offset, every message lands two hours off. For 12,000 emails, that is 12,000 wrong timestamps in the destination's sort order.&lt;/p&gt;

&lt;p&gt;The correct approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;email.utils&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;parsedate_to_datetime&lt;/span&gt;

&lt;span class="n"&gt;date_header&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Date&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;internaldate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;date_header&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;dt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parsedate_to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_header&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Only use if timezone-aware
&lt;/span&gt;        &lt;span class="c1"&gt;# Naive datetimes get discarded — server assigns current time
&lt;/span&gt;        &lt;span class="c1"&gt;# which is wrong, but less wrong than a shifted timestamp
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tzinfo&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;internaldate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dt&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="n"&gt;imap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;folder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;internaldate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;raw_message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;parsedate_to_datetime()&lt;/code&gt; returns a timezone-aware datetime when the header includes offset information. Passing that directly to &lt;code&gt;imaplib.append()&lt;/code&gt; preserves the original timestamp correctly.&lt;/p&gt;




&lt;h3&gt;
  
  
  4. Duplicate messages accumulate if you can re-run
&lt;/h3&gt;

&lt;p&gt;If a migration gets interrupted — connection drops, rate limit hit, power cut — and you restart, a naive script uploads everything again. You end up with every email twice on the destination.&lt;/p&gt;

&lt;p&gt;The solution is to fingerprint every existing message on the destination before starting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_message_fingerprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Message-ID is globally unique per RFC 2822
&lt;/span&gt;    &lt;span class="n"&gt;mid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Message-ID&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mid&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mid&lt;/span&gt;

    &lt;span class="c1"&gt;# Fallback for messages without Message-ID
&lt;/span&gt;    &lt;span class="n"&gt;date&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Date&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;sender&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;From&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;subject&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Subject&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;size&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_bytes&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="n"&gt;raw&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sender&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;subject&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;|&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fp:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sha1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before uploading anything, fetch headers-only from the destination and build a set of fingerprints:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;imap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id_range&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;(BODY.PEEK[HEADER])&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;existing_fingerprints&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;_message_fingerprint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then skip any message whose fingerprint is already in the set. The migration becomes safe to re-run as many times as needed.&lt;/p&gt;




&lt;h3&gt;
  
  
  5. Memory usage blows up if you load the mailbox into RAM
&lt;/h3&gt;

&lt;p&gt;The obvious approach — fetch all messages, store them in a list, iterate — works fine for small mailboxes. For a 50,000-email account it uses gigabytes of RAM and often crashes.&lt;/p&gt;

&lt;p&gt;The fix is generators. Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# BAD — loads everything into memory
&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;eid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;imap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;(RFC822)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# GOOD — one message in memory at a time
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;collect_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;all_ids&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;eid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;imap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;(RFC822)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;OK&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;collect_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;imap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;all_ids&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A 200,000-email mailbox uses the same RAM as a 200-email one. The generator yields one message, it gets processed and written to disk, memory is released, next message comes in.&lt;/p&gt;




&lt;h2&gt;
  
  
  The result
&lt;/h2&gt;

&lt;p&gt;After three weeks of building and testing across Gmail, Outlook, Aruba, Yahoo, and iCloud, I packaged everything into a single Python file called &lt;strong&gt;Please Backup&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;What it does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backup&lt;/strong&gt; any IMAP mailbox to EML (one file per email) or MBOX format&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Migrate&lt;/strong&gt; between any two IMAP servers with deduplication&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full migration mode&lt;/strong&gt; — backup and upload in one run, credentials entered once&lt;/li&gt;
&lt;li&gt;Zero external dependencies — pure Python standard library&lt;/li&gt;
&lt;li&gt;Runs on Windows, macOS, Linux — anywhere Python 3.7+ is installed
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;python please_backup.py
&lt;span class="go"&gt;
╔══════════════════════════════════════════════════╗
║  Please Backup · Created By SamxtnZ · 2026      ║
╚══════════════════════════════════════════════════╝

What would you like to do?
  1. 📥 Download / Backup emails from a server
  2. 📤 Upload / Migrate emails to a new server
  3. 🔄 Full migration (backup then upload)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MIT licensed. Free forever. No cloud involved — your emails go directly from the source IMAP server to your disk or destination server.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub: &lt;a href="https://github.com/SamxtnZ/please-backup" rel="noopener noreferrer"&gt;github.com/SamxtnZ/please-backup&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What I would do differently
&lt;/h2&gt;

&lt;p&gt;If I started over, two things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Add OAuth2 support from day one.&lt;/strong&gt; Gmail and Outlook are actively deprecating App Passwords for IMAP. Right now the tool works with App Passwords, but that window is closing. OAuth2/XOAUTH2 is the right long-term solution and it is significantly more complex to implement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Add a resume/checkpoint file.&lt;/strong&gt; If a 50,000-email migration is interrupted at message 49,000, the deduplication prevents re-uploading what already made it — but the backup phase restarts from zero. Writing a checkpoint file every 500 messages would solve this cleanly.&lt;/p&gt;

&lt;p&gt;Both are on the roadmap. If you want to contribute, pull requests are open.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;p&gt;If you are building anything with IMAP:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Try multiple hostnames&lt;/strong&gt; per domain — providers like Aruba use more than one&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Escape &lt;code&gt;From&lt;/code&gt; lines&lt;/strong&gt; in MBOX output — this is not optional, it is spec&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;parsedate_to_datetime()&lt;/code&gt;&lt;/strong&gt; and check &lt;code&gt;tzinfo&lt;/code&gt; before passing to &lt;code&gt;append()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fingerprint before uploading&lt;/strong&gt; — Message-ID first, SHA1 fallback&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stream with generators&lt;/strong&gt; — never load a full mailbox into a list&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These five things are the difference between a script that works in testing and a tool that works on real email accounts in production.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Please Backup is open source and MIT licensed.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;GitHub: &lt;a href="https://github.com/SamxtnZ/please-backup" rel="noopener noreferrer"&gt;github.com/SamxtnZ/please-backup&lt;/a&gt;&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Issues, pull requests, and provider compatibility reports welcome.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
