Your laptop has a CPU.
A CPU runs code largely one operation at a time, very fast. For Python scripts, data processing, and small models, it is completely fine.
Training a neural network is different. A single training step involves millions of multiply-and-add operations inside large matrix multiplications. A CPU works through them mostly sequentially, a handful of cores at a time. Even a decent laptop CPU takes minutes per epoch on a small image dataset. A real training run can take hours. Or days.
A GPU does those same matrix multiplications in parallel. Thousands of cores, all working simultaneously. What took 4 hours on a CPU takes 8 minutes on a GPU.
You probably do not own a GPU. Google does, and they will let you use one for free.
That is Google Colab.
What Colab Is
Google Colab is Jupyter Notebooks running in the cloud on Google's infrastructure. Open your browser. Go to colab.research.google.com. Start coding. No installation. No setup. Python is already there. The most common data science libraries are already installed.
And you can switch on a free GPU with two clicks.
Every notebook is saved to Google Drive automatically. Share it with anyone via a link. They open the same notebook in their browser and run it too.
Getting Started in Two Minutes
Go to colab.research.google.com.
Click "New notebook."
You see a blank Jupyter-style notebook. Type in the first cell:
print("Hello from Colab")
import sys
print(f"Python {sys.version}")
Press Shift+Enter. It runs. Output appears.
Check what is already installed:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sklearn
import tensorflow as tf
import torch
print(f"Pandas: {pd.__version__}")
print(f"NumPy: {np.__version__}")
print(f"TensorFlow: {tf.__version__}")
print(f"PyTorch: {torch.__version__}")
Output:
Pandas: 2.1.4
NumPy: 1.25.2
TensorFlow: 2.15.0
PyTorch: 2.1.0+cu121
All the major libraries. Ready. Exact versions drift as Google updates the Colab image, but there is no pip install, no environment setup, no compatibility headaches. Just work.
Enabling the GPU: Two Clicks
This is the most important thing in this post.
Go to: Runtime → Change runtime type → Hardware accelerator → T4 GPU → Save
Then verify it worked:
import torch
print(f"GPU available: {torch.cuda.is_available()}")
print(f"GPU name: {torch.cuda.get_device_name(0)}")
print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
Output:
GPU available: True
GPU name: Tesla T4
GPU memory: 15.8 GB
15.8 gigabytes of GPU memory. Free. This is a Tesla T4, the same GPU used in production inference servers at major tech companies.
For TensorFlow:
import tensorflow as tf
gpus = tf.config.list_physical_devices('GPU')
print(f"GPUs available: {len(gpus)}")
for gpu in gpus:
    print(f"  {gpu.name}")
The Speed Difference Is Real
Run this comparison to feel the GPU advantage:
import torch
import time
size = 10000
A = torch.randn(size, size)
B = torch.randn(size, size)
start = time.time()
C_cpu = torch.matmul(A, B)
cpu_time = time.time() - start
A_gpu = A.cuda()
B_gpu = B.cuda()
_ = torch.matmul(A_gpu, B_gpu)  # warmup: the first CUDA matmul pays one-time setup costs
torch.cuda.synchronize()
start = time.time()
C_gpu = torch.matmul(A_gpu, B_gpu)
torch.cuda.synchronize()
gpu_time = time.time() - start
print(f"CPU: {cpu_time:.3f} seconds")
print(f"GPU: {gpu_time:.3f} seconds")
print(f"Speedup: {cpu_time/gpu_time:.0f}x")
Output:
CPU: 4.821 seconds
GPU: 0.018 seconds
Speedup: 268x
268 times faster on a 10,000×10,000 matrix multiplication. Your exact numbers will vary by session, but the gap stays enormous. Neural network training is essentially millions of these operations. This is why the GPU matters.
Connecting to Google Drive
Your Colab session is temporary. When the session ends (after 12 hours or when you disconnect), all files you created are gone.
Mount Google Drive to save work permanently:
from google.colab import drive
drive.mount('/content/drive')
A popup asks you to sign in to Google and grant permission. After that:
import pandas as pd
import os
df = pd.read_csv('/content/drive/MyDrive/data/titanic.csv')
print(df.shape)
results = df.groupby('Survived')['Age'].mean()
os.makedirs('/content/drive/MyDrive/results', exist_ok=True)  # to_csv will not create the folder for you
results.to_csv('/content/drive/MyDrive/results/survival_by_age.csv')
print("Saved to Drive")
Your Google Drive appears at /content/drive/MyDrive/. Read files from it. Write files to it. They persist after the session ends.
This is your workflow: keep datasets in Google Drive, read them into Colab, process with GPU, save results back to Drive.
Installing Packages That Are Not Pre-Installed
Most things are already there. For anything else:
!pip install -q transformers accelerate datasets
import transformers
print(f"Transformers: {transformers.__version__}")
The -q flag suppresses the verbose installation output. Installation is usually fast because Colab caches many common packages.
Installations do not persist between sessions. If you disconnect and reconnect, you need to reinstall. Put your installs in the first cell of the notebook so they run automatically when you open the session.
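One hedge against reinstalling: guard the install so rerunning the cell costs nothing. A minimal sketch, using transformers as the example package:
try:
    import transformers
except ImportError:
    # Only hits pip when the package is actually missing from this session.
    !pip install -q transformers
    import transformers
print(f"Transformers: {transformers.__version__}")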
Uploading Files Directly
For small files you want to upload once:
from google.colab import files
import pandas as pd
uploaded = files.upload()
for filename in uploaded.keys():
    print(f"Uploaded: {filename}")
    df = pd.read_csv(filename)
    print(df.head())
A file picker dialog appears. Select your CSV from your local machine. It uploads to the Colab session's /content/ folder.
Download files back to your machine:
from google.colab import files
files.download('results.csv')
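files.download takes one file at a time. For a whole folder of outputs, zip it first. A sketch, assuming your results live in /content/results:
# Bundle the folder into a single archive, then download that.
!zip -r -q results.zip /content/results
from google.colab import files
files.download('results.zip')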
GPU Memory Management
The T4 GPU has 15.8GB but it fills up fast during training. Monitor it:
!nvidia-smi
Output shows:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   52C    P0    28W /  70W |   3821MiB / 15360MiB |      0%      Default |
+-----------------------------------------------------------------------------+
3821MiB used out of 15360MiB. Plenty of headroom. When training crashes with "CUDA out of memory," reduce your batch size. Halving the batch size roughly halves the activation memory; the model's own weights stay the same size.
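With PyTorch's DataLoader, that is a one-argument change. In this sketch, train_dataset is a stand-in for whatever Dataset you built:
from torch.utils.data import DataLoader
# On "CUDA out of memory": halve batch_size and rerun.
# Activation memory scales roughly linearly with it; try 32, then 16, then 8.
loader = DataLoader(train_dataset, batch_size=32, shuffle=True)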
Free GPU memory manually:
import torch
import gc
del model                 # assumes a model object from an earlier cell
gc.collect()              # let Python release the references first
torch.cuda.empty_cache()  # then hand the cached blocks back to the GPU
print(f"GPU memory after cleanup: {torch.cuda.memory_allocated()/1e9:.2f} GB")
The Colab Limits You Need to Know
Session limit: Free Colab sessions disconnect after about 12 hours of runtime. If training takes longer than that, save checkpoints to Drive as you go (sketch after this list).
Idle disconnect: If your browser tab is inactive for too long (around 90 minutes), the session disconnects. Keep the tab open and interact with it occasionally during long runs.
GPU availability: The free tier gives you a GPU most of the time but not always. If GPUs are in high demand, you might get CPU only. Try again later or use a different account.
RAM limit: 12GB of system RAM. Large datasets can fill this. Use chunked loading for very large CSVs (also sketched after this list).
Not persistent: Files in /content/ vanish when the session ends. Only /content/drive/ persists. Always save important outputs to Drive.
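A minimal checkpointing sketch for the session limit. Here model, optimizer, and epoch are stand-ins for whatever your training loop defines:
import os
import torch
ckpt_path = '/content/drive/MyDrive/checkpoints/run1.pt'
os.makedirs(os.path.dirname(ckpt_path), exist_ok=True)
# Save at the end of every epoch: a disconnect now costs at most one epoch.
torch.save({'epoch': epoch,
            'model': model.state_dict(),
            'optimizer': optimizer.state_dict()}, ckpt_path)
# After a reconnect: restore and resume where you left off.
ckpt = torch.load(ckpt_path)
model.load_state_dict(ckpt['model'])
optimizer.load_state_dict(ckpt['optimizer'])
start_epoch = ckpt['epoch'] + 1
And for the RAM limit, pandas can stream a big CSV instead of loading it whole. The path here is hypothetical:
import pandas as pd
rows = 0
for chunk in pd.read_csv('/content/drive/MyDrive/datasets/big.csv', chunksize=100_000):
    rows += len(chunk)  # replace with your real per-chunk processing
print(f"Rows processed: {rows}")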
For serious training that takes days, Colab Pro (around $10/month) gives longer sessions, more RAM, and guaranteed GPU access. Worth it when you are in the deep learning phase of this series.
Sharing Your Notebook
Every Colab notebook has a share button in the top right, just like Google Docs.
Click Share → change "Restricted" to "Anyone with the link can view."
Now anyone with the link can open your notebook, see all your code and outputs, and run it themselves on their own Colab session.
This is how you share data science work with collaborators and how you submit homework in courses that use Colab. One link. No setup on their end.
A Real Colab Workflow
Here is what the full workflow looks like when you start a deep learning project.
# Cell 1: Mount Drive and install extras
from google.colab import drive
drive.mount('/content/drive')
!pip install -q wandb
# Cell 2: Verify GPU
import torch
assert torch.cuda.is_available(), "GPU not available. Go to Runtime → Change runtime type."
print(f"GPU: {torch.cuda.get_device_name(0)}")
# Cell 3: Load data from Drive
import pandas as pd
df = pd.read_csv('/content/drive/MyDrive/datasets/train.csv')
print(f"Loaded: {df.shape}")
# Cell 4: Training setup
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Training on: {device}")
# ... training code ...
# Final cell: Save model to Drive
import os
os.makedirs('/content/drive/MyDrive/models', exist_ok=True)  # torch.save will not create the folder
torch.save(model.state_dict(), '/content/drive/MyDrive/models/model_v1.pt')
print("Model saved to Drive")
Mount Drive. Verify GPU. Load data from Drive. Train. Save to Drive. That sequence repeats for every deep learning project you do in this series.
Colab vs Local Jupyter: When to Use Which
Use Colab when:
- Training neural networks that need a GPU
- Working with large models (transformers, image classifiers)
- Sharing work with others quickly via a link
- Your local machine is slow or old
- You want someone else to review your analysis
Use local Jupyter when:
- Working offline
- Your data is sensitive and should not leave your machine
- You have a good local GPU
- You want faster iteration on small experiments
- Long sessions that would time out on Colab
In practice you use both. Explore and clean data locally. Train models on Colab. This series will move to Colab explicitly when neural networks begin in Phase 7.
A Resource Worth Knowing
Weights & Biases has a series of Colab notebooks called "W&B Colab Examples" that show professional-grade training setups with experiment tracking, GPU utilization monitoring, and model checkpointing. These are real production patterns implemented in Colab. Go to wandb.ai/tutorials and look for the PyTorch and TensorFlow Colab examples. They set the standard for how serious practitioners use Colab.
Try This
Open a new Colab notebook. Name it colab_gpu_test.ipynb.
Enable GPU runtime. Verify it is active with torch.cuda.is_available().
Run the CPU vs GPU speed comparison from this post. Print the speedup ratio.
Mount your Google Drive. Create a folder called colab_practice in your Drive. Write a small CSV file (any data you want) directly from Colab into that folder. Read it back and print the first five rows. Confirm the file appears in your Google Drive through the Drive UI.
Train a tiny neural network on the MNIST handwritten digits dataset using PyTorch on the GPU. MNIST is built into torchvision. Train for five epochs. Print training loss per epoch. Save the trained model weights to your Google Drive. A data-loading starter follows this list.
Share the notebook link (view only) and include it in the README of your GitHub repository.
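To get you moving on the MNIST exercise, here is just the data-loading half. The model and training loop are yours to write:
import torch
from torchvision import datasets, transforms
# torchvision downloads MNIST on first use; /content/data vanishes with the session.
transform = transforms.ToTensor()
train_ds = datasets.MNIST(root='/content/data', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_ds, batch_size=64, shuffle=True)
images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([64, 1, 28, 28])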
Phase 5 Complete
Five posts covering the tools that make you a professional rather than a hobbyist.
Git so you never lose code. GitHub so your work is visible and your portfolio is building. Jupyter for interactive analysis. Colab for GPU-powered experiments.
These are not exciting topics. Nobody posts on Twitter about mastering git stash. But the engineers who use these tools properly ship better work, collaborate more effectively, and get hired more consistently than engineers who treat them as afterthoughts.
Phase 6 starts now. Machine learning. Real algorithms. Real predictions. Everything from the previous five phases was preparation. This is where it gets real.