<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Krishna Kanth Latya</title>
    <description>The latest articles on DEV Community by Krishna Kanth Latya (@krishnakanthlatya).</description>
    <link>https://dev.to/krishnakanthlatya</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3986106%2F3695b0cb-cbf0-444c-8b27-8849377f5ba3.png</url>
      <title>DEV Community: Krishna Kanth Latya</title>
      <link>https://dev.to/krishnakanthlatya</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/krishnakanthlatya"/>
    <language>en</language>
    <item>
      <title>System Design: What Actually Happens When You Upload a File to Google Drive?</title>
      <dc:creator>Krishna Kanth Latya</dc:creator>
      <pubDate>Mon, 15 Jun 2026 20:27:04 +0000</pubDate>
      <link>https://dev.to/krishnakanthlatya/system-design-what-actually-happens-when-you-upload-a-file-to-google-drive-ple</link>
      <guid>https://dev.to/krishnakanthlatya/system-design-what-actually-happens-when-you-upload-a-file-to-google-drive-ple</guid>
      <description>&lt;p&gt;Uploading a file to Google Drive feels simple. You select a file, click Upload, watch a progress bar move, and moments later the file appears in your Drive.&lt;/p&gt;

&lt;p&gt;But behind this seemingly simple action lies a highly distributed system designed to handle millions of users, billions of files, and exabytes of data while remaining reliable, scalable, and fault-tolerant.&lt;/p&gt;

&lt;p&gt;In this article, we'll explore what actually happens behind the scenes when you upload a file to Google Drive and how the system is designed to operate at global scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;At first glance, uploading a file appears straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User
  │
  ▼
Upload API
  │
  ▼
Storage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a small application, this architecture might work. However, Google Drive operates at an entirely different scale. Users upload everything from small images to massive video files and backups that can be hundreds of gigabytes in size. At the same time, millions of users may be uploading files concurrently from different parts of the world.&lt;/p&gt;

&lt;p&gt;This creates several challenges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Large file uploads can take hours&lt;/li&gt;
&lt;li&gt;Network connections may disconnect midway&lt;/li&gt;
&lt;li&gt;Millions of uploads must be handled simultaneously&lt;/li&gt;
&lt;li&gt;Uploaded data must remain accurate and uncorrupted&lt;/li&gt;
&lt;li&gt;Hardware failures should never cause data loss&lt;/li&gt;
&lt;li&gt;Storage must scale to billions of files&lt;/li&gt;
&lt;li&gt;Users expect fast and seamless uploads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A simple upload server cannot solve these problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  High-Level Solution
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs5seqx5xkx85ct537iyx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs5seqx5xkx85ct537iyx.png" alt=" " width="800" height="356"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Instead of uploading an entire file at once, Google Drive breaks the file into smaller chunks. These chunks are uploaded independently, validated, temporarily stored, and later assembled into the final file.&lt;/p&gt;

&lt;p&gt;Each upload is tracked through an upload session, allowing interrupted uploads to resume from where they stopped rather than starting over.&lt;/p&gt;

&lt;p&gt;Once the upload is complete, the file is stored in Google's distributed storage infrastructure and replicated across multiple locations to ensure durability and availability.&lt;/p&gt;

&lt;p&gt;Meanwhile, background services generate thumbnails, scan for viruses, extract metadata, and prepare previews without delaying the user experience.&lt;/p&gt;

&lt;p&gt;Let's walk through the complete upload journey.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: User Authentication
&lt;/h2&gt;

&lt;p&gt;Before an upload begins, Google must verify the user's identity. The Google Drive client sends an access token obtained during login.&lt;/p&gt;

&lt;p&gt;The authentication and authorization layer verifies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User identity&lt;/li&gt;
&lt;li&gt;Storage quota&lt;/li&gt;
&lt;li&gt;Account permissions&lt;/li&gt;
&lt;li&gt;Upload authorization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Only after successful verification can the upload proceed. This prevents unauthorized users from consuming storage resources.&lt;/p&gt;
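
&lt;p&gt;As a rough sketch of the quota and permission checks, assuming the access token has already been validated elsewhere: the Account record, its field names, and the authorize_upload function are hypothetical and only illustrate the idea, not Google's actual service.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical account record; field names are illustrative only.
    user_id: str
    quota_bytes: int
    used_bytes: int
    can_upload: bool


def authorize_upload(account: Account, file_size: int) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed upload of file_size bytes."""
    if not account.can_upload:
        return False, "upload not permitted for this account"
    if account.used_bytes + file_size > account.quota_bytes:
        return False, "storage quota exceeded"
    return True, "ok"


GIB = 1024**3
account = Account("user1", quota_bytes=15 * GIB, used_bytes=12 * GIB, can_upload=True)
print(authorize_upload(account, 5 * GIB))  # (False, 'storage quota exceeded')
print(authorize_upload(account, 1 * GIB))  # (True, 'ok')
```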

&lt;h2&gt;
  
  
  Step 2: Upload Session Creation
&lt;/h2&gt;

&lt;p&gt;Google does not immediately start receiving file data. Instead, it first creates an upload session. The upload session acts as a tracking record for the entire upload process.&lt;/p&gt;

&lt;p&gt;It stores information such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User ID&lt;/li&gt;
&lt;li&gt;File name&lt;/li&gt;
&lt;li&gt;Upload status&lt;/li&gt;
&lt;li&gt;Uploaded chunks&lt;/li&gt;
&lt;li&gt;Remaining chunks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This session becomes extremely important if the upload gets interrupted.&lt;/p&gt;
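
&lt;p&gt;One way to picture such a session record is the minimal sketch below. The UploadSession class, its fields, and the 1-based chunk numbering are assumptions made for illustration; the real session format is not public.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class UploadSession:
    # Hypothetical session record mirroring the fields listed above.
    user_id: str
    file_name: str
    total_chunks: int
    status: str = "in_progress"
    uploaded_chunks: set[int] = field(default_factory=set)

    def remaining_chunks(self) -> list[int]:
        """Chunk numbers (1-based) that have not been received yet."""
        return [n for n in range(1, self.total_chunks + 1) if n not in self.uploaded_chunks]

    def mark_uploaded(self, chunk_number: int) -> None:
        """Record a verified chunk and close the session when none remain."""
        self.uploaded_chunks.add(chunk_number)
        if not self.remaining_chunks():
            self.status = "complete"


session = UploadSession("user1", "vacation.mp4", total_chunks=4)
session.mark_uploaded(1)
session.mark_uploaded(3)
print(session.remaining_chunks())  # [2, 4]
print(session.status)              # in_progress
```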

&lt;h2&gt;
  
  
  Step 3: File Chunking
&lt;/h2&gt;

&lt;p&gt;Uploading large files as a single request is inefficient and risky. Instead, Google splits files into smaller chunks.&lt;/p&gt;

&lt;p&gt;Example: &lt;strong&gt;5 GB File&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chunk 1
Chunk 2
Chunk 3
Chunk 4
...
Chunk N
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
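
&lt;p&gt;To make the arithmetic concrete, here is a small sketch that computes chunk byte ranges. The 8 MiB chunk size is an assumption chosen for illustration (Google does not publish the size Drive uses internally), and the 5 GB file is treated as 5 GiB.&lt;/p&gt;

```python
CHUNK_SIZE = 8 * 1024 * 1024  # assumed 8 MiB chunk size, for illustration only


def chunk_ranges(file_size: int, chunk_size: int = CHUNK_SIZE):
    """Yield (chunk_number, start, end) byte ranges; end is exclusive."""
    total = (file_size + chunk_size - 1) // chunk_size  # ceiling division
    for index in range(total):
        start = index * chunk_size
        end = min(start + chunk_size, file_size)  # last chunk may be shorter
        yield index + 1, start, end


ranges = list(chunk_ranges(5 * 1024**3))
print(len(ranges))  # 640
print(ranges[0])    # (1, 0, 8388608)
```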



&lt;p&gt;Chunking provides several advantages:&lt;/p&gt;

&lt;h3&gt;
  
  
  Faster Recovery
&lt;/h3&gt;

&lt;p&gt;If a single chunk fails:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Retry Chunk 52
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Retry Entire 5 GB File
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
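
&lt;p&gt;A per-chunk retry could look roughly like the sketch below. The send callback, the attempt limit, and the exponential backoff delays are all assumptions for illustration; only the failing chunk is re-sent.&lt;/p&gt;

```python
import time


def upload_chunk_with_retry(send, number, data, max_attempts=3, base_delay=0.01):
    """Call send(number, data); on ConnectionError retry only this chunk with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            send(number, data)
            return attempt  # number of attempts it took
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical transport that fails twice before succeeding.
failures_left = {"count": 2}


def flaky_send(number, data):
    if failures_left["count"] > 0:
        failures_left["count"] -= 1
        raise ConnectionError("temporary failure")


print(upload_chunk_with_retry(flaky_send, 52, b"payload"))  # 3
```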



&lt;h3&gt;
  
  
  Parallel Uploads
&lt;/h3&gt;

&lt;p&gt;Multiple chunks can be uploaded simultaneously, which can significantly improve throughput when a single connection does not saturate the available bandwidth.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chunk 1 ──►
Chunk 2 ──►
Chunk 3 ──►
Chunk 4 ──►
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
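
&lt;p&gt;The idea can be sketched with a thread pool. The upload_chunk function here is a hypothetical placeholder that only reports the chunk size instead of performing a network call, and the limit of four workers is an arbitrary choice.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def upload_chunk(item):
    """Hypothetical upload call; here it only reports the chunk size."""
    number, data = item
    return number, len(data)


chunks = {1: b"aaaa", 2: b"bbbb", 3: b"cccc", 4: b"dd"}

# Upload up to four chunks at a time; map() returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(upload_chunk, chunks.items()))

print(results)  # [(1, 4), (2, 4), (3, 4), (4, 2)]
```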



&lt;h2&gt;
  
  
  Step 4: API Gateway and Load Balancing
&lt;/h2&gt;

&lt;p&gt;Every upload request first reaches Google's edge infrastructure.&lt;/p&gt;

&lt;p&gt;Responsibilities include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Request routing&lt;/li&gt;
&lt;li&gt;Authentication validation&lt;/li&gt;
&lt;li&gt;Rate limiting&lt;/li&gt;
&lt;li&gt;Traffic management&lt;/li&gt;
&lt;li&gt;DDoS protection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of a single upload server handling all traffic, requests are distributed across thousands of upload servers. This allows Google Drive to support millions of concurrent uploads.&lt;/p&gt;
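
&lt;p&gt;Two of these responsibilities, rate limiting and request routing, can be illustrated with a token bucket and a round-robin server list. This is a simplified sketch: the server names, bucket capacity, and refill rate are invented, and real edge infrastructure uses far more sophisticated policies.&lt;/p&gt;

```python
import itertools


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


servers = itertools.cycle(["upload-1", "upload-2", "upload-3"])  # hypothetical server names
bucket = TokenBucket(capacity=2, rate=1.0)

for now in [0.0, 0.1, 0.2, 1.5]:
    if bucket.allow(now):
        print(now, "routed to", next(servers))
    else:
        print(now, "rejected: rate limit")
```

&lt;p&gt;Passing the timestamp in explicitly keeps the example deterministic; a production limiter would read a monotonic clock instead.&lt;/p&gt;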

&lt;h2&gt;
  
  
  Step 5: Chunk Verification
&lt;/h2&gt;

&lt;p&gt;Data can become corrupted during transmission. To ensure integrity, every uploaded chunk is validated using checksums.&lt;/p&gt;

&lt;p&gt;Common verification methods include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SHA-256&lt;/li&gt;
&lt;li&gt;CRC32C&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If verification fails:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Chunk Rejected
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The client simply uploads the chunk again. Because only chunks whose checksums match are accepted, the stored data matches the original file.&lt;/p&gt;
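
&lt;p&gt;A checksum comparison of this kind can be sketched with SHA-256 from Python's standard library. The accept_chunk function name is hypothetical; CRC32C is left out because the standard library does not provide it.&lt;/p&gt;

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the chunk's bytes."""
    return hashlib.sha256(data).hexdigest()


def accept_chunk(data: bytes, expected_checksum: str) -> bool:
    """Server-side check: accept the chunk only if its digest matches the client's."""
    return sha256_hex(data) == expected_checksum


chunk = b"example chunk bytes"
checksum = sha256_hex(chunk)  # computed by the client before sending

print(accept_chunk(chunk, checksum))               # True
print(accept_chunk(chunk[:-1] + b"X", checksum))   # False: one byte was corrupted
```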

&lt;h2&gt;
  
  
  Step 6: Temporary Chunk Storage
&lt;/h2&gt;

&lt;p&gt;Successfully verified chunks are stored temporarily. At this stage, the file does not yet exist as a complete object. Google stores each chunk independently while tracking progress through the upload session.&lt;/p&gt;

&lt;p&gt;This design enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Upload recovery&lt;/li&gt;
&lt;li&gt;Parallel uploads&lt;/li&gt;
&lt;li&gt;Efficient retries&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 7: Resumable Uploads
&lt;/h2&gt;

&lt;p&gt;One of the most important features of Google Drive is resumable uploads.&lt;/p&gt;

&lt;p&gt;Imagine a network failure during upload. Without upload sessions, the user would need to start over.&lt;/p&gt;

&lt;p&gt;Instead, Google checks the upload session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Uploaded Chunks:
1 ✓
2 ✓
3 ✓
4 ✓
...
400 ✓
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When connectivity returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Resume From Chunk 401
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;rather than:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Restart From Chunk 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This dramatically improves reliability and user experience.&lt;/p&gt;
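
&lt;p&gt;The resume logic can be sketched with an in-memory stand-in for the server. FlakyServer and upload_with_resume are hypothetical names, and the simulated failure on the third request is arbitrary; the point is that chunks already received are not sent again.&lt;/p&gt;

```python
class FlakyServer:
    """Hypothetical in-memory upload service that drops one request."""

    def __init__(self, fail_on_call: int):
        self.received = {}
        self.calls = 0
        self.fail_on_call = fail_on_call

    def put_chunk(self, number: int, data: bytes) -> None:
        self.calls += 1
        if self.calls == self.fail_on_call:
            raise ConnectionError("network dropped")
        self.received[number] = data

    def uploaded_numbers(self) -> set:
        return set(self.received)


def upload_with_resume(server, chunks, max_attempts=5):
    """Upload chunks in order; after a failure, skip whatever the server already has."""
    for attempt in range(1, max_attempts + 1):
        done = server.uploaded_numbers()  # ask the session what already arrived
        try:
            for number, data in enumerate(chunks, start=1):
                if number not in done:
                    server.put_chunk(number, data)
            return attempt
        except ConnectionError:
            continue  # reconnect, then resume from the first missing chunk
    raise ConnectionError("upload did not finish")


server = FlakyServer(fail_on_call=3)
attempts = upload_with_resume(server, [b"a", b"b", b"c", b"d"])
print(attempts, server.calls)  # 2 5
```

&lt;p&gt;Five requests are made in total rather than seven, because chunks 1 and 2 are not re-sent after the failure.&lt;/p&gt;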

&lt;h2&gt;
  
  
  Step 8: File Assembly Service
&lt;/h2&gt;

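&lt;p&gt;After all chunks arrive successfully, Google assembles them into a complete file. The assembly service ensures chunks are combined in the correct order to reconstruct the original file. As Step 10 describes, this assembly is logical: the system records the chunk order rather than copying every byte into one place.&lt;/p&gt;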
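
&lt;p&gt;The ordering requirement can be shown with a small sketch that joins chunks by number. Joining bytes in memory is a simplification used only to demonstrate ordering and completeness checks; the assemble function is a hypothetical name.&lt;/p&gt;

```python
def assemble(chunks: dict[int, bytes], total_chunks: int) -> bytes:
    """Join chunks by chunk number, refusing to assemble if any are missing."""
    missing = [n for n in range(1, total_chunks + 1) if n not in chunks]
    if missing:
        raise ValueError(f"missing chunks: {missing}")
    return b"".join(chunks[n] for n in range(1, total_chunks + 1))


# Chunks may arrive out of order when uploaded in parallel.
received = {3: b"wor", 1: b"hel", 4: b"ld", 2: b"lo "}
print(assemble(received, 4))  # b'hello world'
```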

&lt;h2&gt;
  
  
  Step 9: Metadata Service
&lt;/h2&gt;

&lt;p&gt;A file consists of two parts:&lt;/p&gt;

&lt;h3&gt;
  
  
  File Content
&lt;/h3&gt;

&lt;p&gt;The actual bytes of the file.&lt;/p&gt;

&lt;h3&gt;
  
  
  Metadata
&lt;/h3&gt;

&lt;p&gt;Information about the file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fileId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"xyz123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vacation.mp4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"owner"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"5GB"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Metadata is stored separately because it allows Google Drive to provide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search&lt;/li&gt;
&lt;li&gt;Sharing&lt;/li&gt;
&lt;li&gt;Folder navigation&lt;/li&gt;
&lt;li&gt;Permission management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;without scanning the actual file contents.&lt;/p&gt;
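
&lt;p&gt;A toy example of answering a search from metadata alone is shown below. The metadata_store dictionary and search_by_name function are hypothetical; the first record reuses the values from the JSON example, with the size expressed in bytes.&lt;/p&gt;

```python
metadata_store = {
    "xyz123": {"name": "vacation.mp4", "owner": "user1", "size_bytes": 5 * 1024**3},
    "abc456": {"name": "notes.txt", "owner": "user1", "size_bytes": 2048},
    "def789": {"name": "vacation-plan.pdf", "owner": "user2", "size_bytes": 350_000},
}


def search_by_name(owner: str, term: str) -> list[str]:
    """Return file IDs owned by `owner` whose name contains `term`; content is never read."""
    return [
        file_id
        for file_id, meta in metadata_store.items()
        if meta["owner"] == owner and term.lower() in meta["name"].lower()
    ]


print(search_by_name("user1", "vacation"))  # ['xyz123']
```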

&lt;h2&gt;
  
  
  Step 10: Distributed Object Storage &amp;amp; Metadata Mapping
&lt;/h2&gt;

&lt;p&gt;Once all chunks are successfully uploaded and verified, the system logically assembles the file. Instead of physically gluing the chunks back together onto a single hard drive, the system creates a metadata map (a recipe showing how the chunks fit together) and distributes the individual chunks across Google's storage infrastructure.&lt;/p&gt;

&lt;p&gt;Instead of keeping the data on one machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Storage Node A holds Chunk 1
Storage Node B holds Chunk 2
Storage Node C holds Chunk 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Benefits include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Horizontal scalability:&lt;/strong&gt; No single server runs out of disk space.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Faster access:&lt;/strong&gt; Users can download different chunks in parallel from multiple servers simultaneously.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage efficiency:&lt;/strong&gt; Allows Google Drive to manage billions of massive files without bottlenecking individual hardware units.&lt;/li&gt;
&lt;/ul&gt;
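
&lt;p&gt;The metadata map can be pictured as an ordered list recording where each chunk lives. In this sketch the node names and the hash-based placement rule are assumptions made for illustration; the article does not describe how Google actually chooses storage nodes.&lt;/p&gt;

```python
import hashlib

NODES = ["storage-node-a", "storage-node-b", "storage-node-c"]  # hypothetical node names


def pick_node(file_id: str, chunk_number: int) -> str:
    """Choose a node deterministically by hashing the chunk's identity."""
    digest = hashlib.sha256(f"{file_id}:{chunk_number}".encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


def build_manifest(file_id: str, total_chunks: int) -> list[dict]:
    """The 'recipe': an ordered list saying where each chunk is stored."""
    return [
        {"chunk": n, "node": pick_node(file_id, n)}
        for n in range(1, total_chunks + 1)
    ]


manifest = build_manifest("xyz123", 4)
for entry in manifest:
    print(entry)
```

&lt;p&gt;Reading the file back means walking the manifest in order and fetching each chunk from the node it names.&lt;/p&gt;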

&lt;h2&gt;
  
  
  Step 11: Chunk Replication for Durability
&lt;/h2&gt;

&lt;p&gt;Hardware failures happen constantly in large-scale systems. To prevent data loss, the system does not store each distributed chunk only once. It immediately creates identical redundant copies of each chunk across different physical locations.&lt;/p&gt;

&lt;p&gt;The system clones the chunks across isolated zones:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Copy 1 of all chunks → Data Center A (e.g., Oregon)

Copy 2 of all chunks → Data Center B (e.g., Iowa)

Copy 3 of all chunks → Data Center C (e.g., Belgium)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a specific server node crashes, a rack loses power, or an entire data center goes offline due to a natural disaster, the file remains fully intact and accessible from another region.&lt;/p&gt;

&lt;p&gt;This geographic replication strategy provides very high data durability, because losing the file would require every location holding a copy to fail at the same time.&lt;/p&gt;
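
&lt;p&gt;The failover behaviour can be sketched with an in-memory store that writes every chunk to each zone and reads from the first healthy one. ReplicatedStore is a hypothetical class, the zone names come from the example above, and the synchronous three-way write is a simplification.&lt;/p&gt;

```python
ZONES = ["oregon", "iowa", "belgium"]  # zone names taken from the example above


class ReplicatedStore:
    """Keeps one copy of every chunk in each zone (in-memory stand-in)."""

    def __init__(self, zones):
        self.zones = {zone: {} for zone in zones}
        self.offline = set()

    def write(self, chunk_id: str, data: bytes) -> None:
        for store in self.zones.values():
            store[chunk_id] = data

    def read(self, chunk_id: str) -> bytes:
        # Try each healthy zone in turn until one has the chunk.
        for zone, store in self.zones.items():
            if zone not in self.offline and chunk_id in store:
                return store[chunk_id]
        raise LookupError(f"no healthy replica for {chunk_id}")


store = ReplicatedStore(ZONES)
store.write("xyz123:1", b"chunk bytes")
store.offline.update({"oregon", "iowa"})  # simulate two zones going down
print(store.read("xyz123:1"))             # b'chunk bytes'
```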

&lt;h2&gt;
  
  
  Step 12: Background Processing
&lt;/h2&gt;

&lt;p&gt;The upload may be complete, but additional work still needs to happen.&lt;/p&gt;

&lt;p&gt;Google typically performs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Virus scanning&lt;/li&gt;
&lt;li&gt;Thumbnail generation&lt;/li&gt;
&lt;li&gt;Search indexing&lt;/li&gt;
&lt;li&gt;OCR processing&lt;/li&gt;
&lt;li&gt;Video transcoding&lt;/li&gt;
&lt;li&gt;Preview generation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of blocking the upload, these tasks run asynchronously in the background. As a result, users gain access to their files quickly while additional processing continues behind the scenes.&lt;/p&gt;
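
&lt;p&gt;The asynchronous hand-off can be sketched with a job queue and a single worker thread. The task names, the finish_upload function, and the use of an in-process queue are assumptions for illustration; a system at this scale would use a distributed queue and many workers.&lt;/p&gt;

```python
import queue
import threading

jobs = queue.Queue()
completed = []


def worker():
    """Process background jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        task, file_id = job
        completed.append(f"{task}:{file_id}")  # placeholder for the real work


def finish_upload(file_id: str) -> str:
    """Enqueue post-processing and return to the user immediately."""
    for task in ("virus_scan", "thumbnail", "search_index"):
        jobs.put((task, file_id))
    return "upload complete"


thread = threading.Thread(target=worker)
thread.start()
print(finish_upload("xyz123"))  # upload complete
jobs.put(None)                  # tell the worker to stop once the queue drains
thread.join()
print(completed)
```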

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;A file upload may look simple on the surface, but behind the scenes it involves a sophisticated distributed system. This architecture enables Google Drive to provide a fast, reliable, and scalable experience while handling billions of files across the globe.&lt;/p&gt;

&lt;p&gt;The next time you drag a file into Google Drive, remember that behind a simple progress bar is a massive distributed system working together to ensure your data is uploaded safely and reliably.&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>java</category>
      <category>googledrive</category>
    </item>
  </channel>
</rss>
