Full Ubuntu Version Upgrade (Release Upgrade)
1. What Is a “Full Software Upgrade”?
So far, we have used commands like:
- apt update
- apt upgrade
- apt dist-upgrade
- apt full-upgrade
These commands upgrade packages only, but they do NOT upgrade the Ubuntu OS version itself.
Ubuntu releases a new OS version every 6 months.
A release upgrade means upgrading from one Ubuntu version to another (for example: 22.04 → 22.10 → 23.04).
2. Important Difference: Package Upgrade vs Release Upgrade
| Type | What it upgrades | Ubuntu version changes? |
|---|---|---|
| apt upgrade | Installed packages | ❌ No |
| apt dist-upgrade | Packages + dependencies | ❌ No |
| do-release-upgrade | Entire OS version | ✅ Yes |
3. Critical Things to Check Before Upgrading
A release upgrade is risky, especially on servers. Before upgrading, always check the following.
3.1 Full Backup (Mandatory)
- Always have a verified backup
- Backup must be accessible even if the system fails to boot
- For cloud servers: snapshot + off-server backup
If the system becomes unbootable, the backup is your only recovery.
3.2 Disk Space
You need several GB of free space.
Check disk usage:
df -h
Example:
- 20% used
- 80% free → safe for upgrade
3.3 Time for Troubleshooting
- Expect problems in ~20% of upgrades
- Always plan several hours for fixing issues
- Never upgrade a critical system without downtime window
3.4 Wait After Release (Best Practice)
- Wait 1–2 weeks after a new Ubuntu release
- Early bugs get fixed quickly
- Ubuntu may even delay server upgrades until a stable point release
3.5 Third-Party Repositories
- Check if all external repos support the new Ubuntu version
- Unsupported repos cause:
  - dependency conflicts
  - broken packages
  - failed upgrades
This is a major risk factor.
3.6 Bootable Recovery Media (Desktop Systems)
- Prepare a bootable Ubuntu USB
- Make sure BIOS allows USB boot
- Know disk encryption passwords
This allows you to recover data if the OS fails.
4. LTS vs Non-LTS (Very Important)
Check your current version:
lsb_release -a
Example:
Ubuntu 22.04 LTS
What does LTS mean?
- Long Term Support
- 5 years of security updates
- Recommended for:
  - servers
  - production systems
Non-LTS versions:
- Supported for 9 months only
- Require frequent upgrades
- Not recommended for servers
⚠️ Upgrading from LTS → non-LTS means losing long-term support
5. When Should You Upgrade?
| System Type | Recommendation |
|---|---|
| Production server | Stay on LTS |
| Business-critical system | Stay on LTS |
| Workstation / testing | Optional |
| Learning / demo | Fine |
Always ask:
“What problem does this upgrade solve for me?”
6. Upgrade Preparation Steps
Step 1: Fully Update Current System
sudo apt update
sudo apt full-upgrade
This ensures:
- latest bug fixes
- clean dependency state
Step 2: Reboot (If Kernel Updated)
sudo reboot
Ensures new kernel is active.
Step 3: Install Upgrade Tool
sudo apt install update-manager-core
This provides:
do-release-upgrade
7. Running the Release Upgrade
Step 1: Start Upgrade
sudo do-release-upgrade
If no new LTS is available, you may see:
No new release found
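To check whether a new release is available without starting an upgrade, the tool also has a check-only mode (flag as documented by do-release-upgrade on recent Ubuntu releases; verify with --help on yours):
do-release-upgrade -c
It changes nothing and reports the result via its output and exit code.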
Step 2: Allow Non-LTS Upgrades (If Needed)
Edit:
sudo nano /etc/update-manager/release-upgrades
Change:
Prompt=lts
to:
Prompt=normal
Save and exit.
Step 3: Run Upgrade Again
sudo do-release-upgrade
Ubuntu will:
- check system
- detect SSH connection
- warn about risks
- download ~1–2 GB
- ask configuration questions
8. During the Upgrade
Configuration Prompts
Examples:
- Keyboard layout
- Console character set
Default choices are usually safe.
Obsolete Packages
You may be asked:
Remove obsolete packages?
- Usually safe
- Review list if system is critical
Configuration File Conflicts
Example:
Configuration file '/etc/crontab' has been modified
Options:
- Install maintainer version
- Keep your version
- View differences
Use D to inspect differences before deciding.
9. Kernel Errors (High Risk)
Kernel issues are critical because:
- kernel loads first at boot
- failure = system won’t start
Causes:
- unusual CPU architecture (ARM)
- custom kernels
- incompatible drivers
10. Final Reboot
sudo reboot
After reboot:
lsb_release -a
If successful, version is upgraded.
11. Real-World Outcome (Important Lesson)
In this case:
- Kernel upgrade failed
- System became unbootable
This is realistic and valuable:
- Upgrades can fail
- Backups matter
- Recovery skills are required
Troubleshooting an Unbootable Ubuntu System (Real Incident Walkthrough)
1. Why This Failure Is a Good Thing
This is actually a perfect real-world example.
In real DevOps work:
- Systems do break
- Upgrades do fail
- You rarely get a clean, predictable error
This is far more valuable than a “happy-path” demo.
2. Initial Situation: System Does Not Boot
Symptoms:
- System powers on
- Boot messages appear
- Kernel starts loading
- System hangs and never completes boot
This tells us:
- Hardware is OK
- Bootloader likely works
- Failure happens during kernel boot
3. First Rule of Incident Response: Stay Calm
Before touching anything:
- Accept the system is down
- Inform stakeholders if needed
- Stop rushing
- Think logically
Stress causes bad decisions.
Calm fixes systems.
4. Isolating the Failure: Bootloader vs Kernel
Observations:
- Bootloader menu appears
- “Booting Linux kernel” message appears
- No userspace logs appear
Conclusion:
👉 Kernel is failing, not the bootloader.
5. Best-Case Scenario: GRUB Menu Available
Because GRUB was enabled earlier, we could:
- Open Advanced options for Ubuntu
- Select an older kernel
- Boot successfully
Result:
- System boots
- Login works
- Problem is confirmed: new kernel is broken
This immediately isolates the issue.
6. Worst-Case Scenario: GRUB Menu NOT Available
If GRUB was hidden (default on many systems):
- You cannot select older kernels
- System appears completely dead
Solution:
👉 Boot from a Live Linux system
7. Booting from a Live Linux System
What is a Live System?
- Linux runs from USB/DVD
- No changes written to disk
- Full access to tools and terminal
Options:
- Physical machine → USB or DVD
- Virtual machine → attach ISO
- Cloud server → provider “rescue mode”
8. Choosing the Correct Live Image
Important rules:
- Desktop images usually include live mode
- Server images often install immediately
- Architecture must match your CPU
Special case (ARM systems):
- ARM64 images are harder to find
- Daily builds may be required
- Older live images may work better
9. Booting the Live System
Steps:
- Attach ISO
- Set boot order (USB/DVD first)
- Restart system
- Choose “Try Ubuntu”
If errors appear:
- Wait a few minutes
- Many hardware warnings are harmless
10. First Priority: Data Access & Backup
Once live system is running:
- Your installed system disk is mounted automatically
- You can browse:
  - /home
  - /var/www
  - /var/lib/mysql
  - application data
👉 Even if repair fails, your data is safe
This alone is a major win.
11. Verifying File System Health
Before touching boot components, rule out disk corruption.
Read-only check (recommended): the -n flag answers "no" to every repair prompt, so nothing on disk is changed. Run a full repair only on an unmounted filesystem.
sudo fsck -n /dev/sda2
Result:
- No errors → filesystem is healthy
- Problem is not disk-related
12. Accessing the Installed System via chroot
We need to work inside the broken system.
Step 1: Open terminal inside mounted system
Right-click → Open in Terminal
Step 2: Change root
sudo chroot .
What this does:
- Does not boot the system
- Redirects / to the installed OS
- Commands now act as if the system were running
This is critical for recovery.
13. Why Things Still Don’t Work Yet
Inside chroot, commands like:
update-grub
may fail with:
No such device
Why?
- /dev, /proc, /sys are kernel-managed
- They are missing inside chroot
14. Fixing Missing System Mounts (Critical Step)
We must bind system directories from the live kernel into the installed system. Exit the chroot first; these commands run from the live system, targeting the mounted installed root (the current directory, as used with chroot above):
sudo mount --bind /dev ./dev
sudo mount --bind /proc ./proc
sudo mount --bind /sys ./sys
Then re-enter the chroot with sudo chroot . before continuing.
Now the installed system can:
- See disks
- Detect kernels
- Update bootloader properly
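Putting the whole recovery sequence together, a minimal sketch (assuming the installed root is /dev/sda2 and we mount it at /mnt; adjust both to your disk layout):
sudo mount /dev/sda2 /mnt         # mount the installed system
sudo mount --bind /dev /mnt/dev   # expose the live kernel's device nodes
sudo mount --bind /proc /mnt/proc # expose process/kernel information
sudo mount --bind /sys /mnt/sys   # expose sysfs
sudo chroot /mnt                  # work "inside" the installed system
update-grub                       # now detects disks and kernels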
15. Rebuilding GRUB
Now run:
update-grub
This time:
- Kernel entries are detected
- Boot menu is regenerated correctly
16. Making the Working Kernel the Default
Step 1: Inspect GRUB menu entries
cat /boot/grub/grub.cfg
Find the exact menu entry of the working kernel.
Step 2: Set default kernel
Edit:
nano /etc/default/grub
Set:
GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.19.x"
(Exact text must match grub.cfg)
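A quick way to list the exact entry titles (sketch based on the standard grub.cfg layout on Ubuntu, where titles are quoted strings on menuentry/submenu lines):
grep -E "^(menuentry|submenu)" /boot/grub/grub.cfg | cut -d"'" -f2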
Step 3: Apply changes
update-grub
17. Reboot and Verify
Exit chroot:
exit
Restart system:
reboot
Result:
- GRUB automatically selects working kernel
- System boots normally
- Login successful
18. Important Follow-Up (Next Lecture)
We are not done yet.
Next steps:
- Prevent working kernel from being removed
- Remove broken kernel safely
- Lock kernel packages
- Avoid repeat failure
👉 This is mandatory in production
Stabilizing the System After Recovery (Kernel Safety & Cleanup)
Our system is booting again, but recovery is not finished yet.
A recovered system is still fragile unless we prevent the same failure from happening again.
1. The Risk After Recovery
Right now:
- The system boots only because an older kernel exists
- If that kernel is removed → system becomes unbootable again
Common danger:
sudo apt autoremove
This command may silently remove old kernels if they are marked as auto-installed.
We must protect the working kernel.
2. Identify the Active (Working) Kernel
Check the running kernel:
uname -r
Example:
5.19.0-xx-generic
Kernel files live in:
/boot
Important files:
- vmlinuz-<version> → kernel
- initrd.img-<version> → initial RAM disk
These files must not disappear.
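A quick way to see which kernels and initrds are currently present (paths as described above):
ls -l /boot/vmlinuz-* /boot/initrd.img-*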
3. Find Which Package Owns the Kernel File
Linux package manager knows which package created each file.
Check kernel ownership:
dpkg -S /boot/vmlinuz-5.19.0-xx-generic
Output example:
linux-image-5.19.0-xx-generic: /boot/vmlinuz-5.19.0-xx-generic
This tells us:
👉 The kernel comes from this package
4. Mark the Working Kernel as Manually Installed
This is the most important protection step.
sudo apt install linux-image-5.19.0-xx-generic
Why this works:
- Even if already installed, APT marks it as manually installed
- autoremove will never delete it
APT logic:
- Auto-installed → removable
- Manually installed → protected
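An equivalent, more explicit approach (apt-mark is part of APT; the version string is a placeholder as above):
sudo apt-mark manual linux-image-5.19.0-xx-generic
apt-mark showmanual | grep linux-image   # verify the kernel is now protected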
5. Why initrd.img Does Not Have a Package
You may notice:
dpkg -S /boot/initrd.img-5.19.0-xx-generic
returns nothing.
That is normal.
Reason:
- initrd.img is generated dynamically
- It is created by the kernel package's post-install scripts
Verify:
sudo dpkg-reconfigure linux-image-5.19.0-xx-generic
You will see:
- initramfs regeneration
As long as the kernel package stays installed → initrd stays too.
6. Removing the Broken Kernel (Optional but Recommended)
If a newer kernel breaks boot, remove it.
Step 1: Identify broken kernel package
dpkg -S /boot/vmlinuz-6.x.x-generic
Step 2: Remove dependent headers first
sudo apt remove linux-headers-generic
Step 3: Remove the broken kernel
sudo apt remove linux-image-6.x.x-generic
Why headers first?
- Meta-packages depend on latest kernel
- Removing headers breaks that dependency safely
⚠️ Do this only if you are sure the kernel is broken
7. Why linux-headers-generic Exists
This package:
- Does not contain code
- Always depends on the latest kernel
Installing it later:
sudo apt install linux-headers-generic
Will:
- Pull the newest kernel again
Since the newest kernel caused failure:
👉 Do not reinstall it yet
8. Reboot and Verify Stability
Always test after kernel changes.
Reboot:
sudo reboot
Test:
- Default boot entry
- Advanced options → working kernel
If both work:
✅ System is stable again
9. Optional Cleanup: Reset GRUB Default
Now that broken kernel is gone:
- First GRUB entry boots correctly
- You may reset default behavior if desired
This is optional and not urgent.
10. Operational Best Practices (Real DevOps Advice)
A. Practice Recovery on Purpose
Create test failures:
- Delete a boot file
- Break GRUB config
- Recover via live system
Practice makes incident response fast.
B. Servers Without Physical Access
In real servers:
- Use provider rescue mode
- SSH into recovery system
- Use chroot only (no GUI)
Same logic — just CLI only.
C. Always Back Up Before Fixing
Even during rescue:
- Copy /home
- Copy /var
- Copy application data
Never trust recovery until data is safe.
11. Common Causes of Boot Failures
| Category | Examples |
|---|---|
| Kernel | incompatible kernel update |
| Bootloader | broken GRUB config |
| Filesystem | disk corruption |
| Packages | broken third-party drivers |
| Hardware | disk failure, overheating |
| Security | firewall blocks SSH |
| Mounts | /etc/fstab errors |
Not all require live systems — boot failures do.
Cron Jobs in Linux — Concepts, Configuration, and Real Usage
1. Heads-Up: There Is More Than One Cron Implementation
Before working with cron, you need to know one important thing:
👉 Cron is not one single program.
Historically, multiple cron implementations evolved independently.
They all look similar, but they may differ slightly in:
- features
- defaults
- supported syntax
- email behavior
The concepts are the same, but details may vary.
2. What Is Cron?
Cron Daemon
- Cron is a background service (daemon)
- It wakes up every minute
- Checks whether any scheduled jobs must run
- Executes commands at predefined times
The name comes from the Greek word chronos, meaning time.
3. Where Cron Jobs Are Stored
Cron reads multiple locations.
3.1 User-Specific Cron Jobs (Most Common)
Stored internally in:
/var/spool/cron/crontabs/
- One file per user
- Never edit these files directly
- Permissions are intentionally restrictive
Correct way to manage them:
crontab -e
3.2 System-Wide Cron Jobs
Stored in:
/etc/crontab
Characteristics:
- Editable directly
- Must be owned by root
- Must not be writable by group or others
Used mainly for system-level tasks.
3.3 /etc/cron.d (Debian / Ubuntu)
- Directory containing cron job files
- Often used by third-party software
- You normally do not place your own jobs here
- Cron loads every file in this directory
This is Debian/Ubuntu-specific behavior.
4. Editing a User Crontab
Open your crontab:
crontab -e
- First time: you may be asked which editor to use
- Editor choice is stored
Temporarily choose an editor:
EDITOR=vim crontab -e
or
EDITOR=nano crontab -e
View your crontab:
crontab -l
This is the only safe way to read it without root access.
5. Why You Should Never Edit Cron Files Directly
Even your own crontab:
/var/spool/cron/crontabs/<username>
- Has strict permissions
- Editing it directly may:
  - break cron
  - corrupt the format
  - change ownership
👉 Always use crontab -e
6. Crontab File Structure
A crontab has two parts:
- Optional environment variables
- One or more cron job definitions
7. Environment Variables in Crontab
These apply only to cron jobs, not your shell.
Common ones:
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Why this matters:
- Cron uses a minimal PATH
- Many commands fail without a full PATH
- Default shell may not be Bash
⚠️ Not all cron implementations support this
(works on Ubuntu/Debian)
8. Cron Job Syntax (Core Knowledge)
General Format:
MINUTE HOUR DAY MONTH DAY_OF_WEEK COMMAND
| Field | Range |
|---|---|
| Minute | 0–59 |
| Hour | 0–23 |
| Day | 1–31 |
| Month | 1–12 |
| Day of week | 0–7 (Sun=0 or 7) |
Example: Every day at 03:05
5 3 * * * command
9. Wildcards (*)
* means all possible values.
Example: every minute
* * * * * command
10. Redirecting Output (Very Important)
Cron runs without a terminal.
If you don’t redirect output:
- Output may be emailed
- Or silently discarded
- Or logged elsewhere
Example:
* * * * * ping -c 1 google.com >> ~/ping.log
- >> appends
- Prevents overwriting
11. Testing a Cron Job
After saving your crontab:
- Wait one minute
- Check output file
cat ~/ping.log
If output appears → cron works.
12. Limiting Execution Frequency
Every hour at minute 0
0 * * * * command
Every 5 minutes
*/5 * * * * command
Runs at:
00, 05, 10, 15, 20, ...
Specific minutes
0,15,30,45 * * * * command
Hour range (08:00–20:00)
0 8-20 * * * command
Both ends included.
Every 2 hours
⚠️ Writing * in the minute field (* */2 * * *) makes the job run every minute during every matching hour.
Correct way:
0 */2 * * * command
Every 2 hours starting at 01:00
0 1-23/2 * * * command
13. Day of Week Filtering
Day of week acts as a filter.
Example: every Monday at midnight
0 0 * * 1 command
Values:
- 0 or 7 = Sunday
- 1 = Monday
- 6 = Saturday
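As a sketch combining the fields above, a weekday-evening job could look like:
30 18 * * 1-5 command
This runs at 18:30, Monday through Friday.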
14. Combining Fields Carefully (Common Pitfall)
This:
* */2 * * * command
Means:
- Every minute
- During every second hour
Result:
- Runs 60 times per active hour
- Silent for the next hour
Often not what you want.
15. Practical Example
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
0 * * * * ping -c 1 google.com >> ~/ping_hourly.log
Runs:
- Once per hour
- Clean output
- Predictable behavior
16. Cron Implementations You Should Know
16.1 Vixie Cron
- Default on Ubuntu/Debian
- Package name:
cron - Most common reference behavior
16.2 Anacron
- Handles missed jobs
- Runs jobs after system was offline
- Used for:
  - daily
  - weekly
  - monthly tasks
Ubuntu:
- Separate package
CentOS:
- Integrated
16.3 Cronie (CentOS / RHEL / Fedora)
- Fork of Vixie Cron
- Includes Anacron
- Same syntax
- Slightly different defaults
17. Why This Matters in Real Life
Cron is used for:
- backups
- log rotation
- monitoring
- cleanup tasks
- report generation
- automation glue
Understanding:
- timing
- output
- environment
- implementation differences
is mandatory for DevOps and SysAdmins.
Cron Output, Email Notifications, and flock (Ubuntu)
Important
This lecture is Ubuntu-specific.
CentOS behaves differently and is covered in the next lecture.
1. What Happens If We Do NOT Redirect Cron Output?
So far, we always redirected output:
>> file.log
But what if:
- we don’t redirect stdout, or
- the command writes to stderr, or
- the command fails?
Answer:
👉 Cron tries to send the output by email to the job’s owner.
2. Default Cron Mail Behavior on Ubuntu
- Output is emailed to the local user
- Email delivery requires a Mail Transfer Agent (MTA)
- Ubuntu does not install one by default
So initially:
- cron tries to send mail
- mail fails silently
- output is discarded
3. Demonstration: Cron Job With Output (No Redirect)
Edit crontab:
crontab -e
Add:
* * * * * ping -c 1 google.com
This runs every minute and produces output.
Wait one minute.
4. Checking Cron Logs (systemd)
On Ubuntu, cron logs are handled by systemd.
Follow cron logs:
journalctl -u cron -f
You will see a line like:
(CRON) info (No MTA installed, discarding output)
So:
- cron executed the job
- output existed
- but email delivery failed
5. Installing Mail Support on Ubuntu
Cron does not send mail itself.
It delegates email delivery to an MTA.
Install mail support:
sudo apt install mailutils
This installs:
- the mail command
- Postfix (mail transfer agent)
6. Postfix Configuration (Initial)
During install, choose:
General type of mail configuration: Local only
This means:
- Mail is delivered only to local users
- No internet delivery yet
Finish installation.
7. Where Local Cron Emails Are Stored
Local emails are stored as plain text files:
/var/mail/<username>
Example:
sudo cat /var/mail/youruser
You will see:
- email headers
- cron output body
- one email per execution
👉 This is not internet email
👉 This is local system mail
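Instead of reading the spool file directly, the mail command installed above can browse and delete these messages interactively:
mail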
8. Sending Cron Output to an External Email Address
Cron supports the MAILTO variable.
Edit crontab:
crontab -e
Add at the top:
MAILTO=you@example.com
Now cron will attempt to send output externally.
9. Why External Email Initially Fails
Even with MAILTO, emails may not arrive because:
- Postfix is set to local only
- External delivery is disabled
10. Reconfiguring Postfix for Internet Mail
Reconfigure postfix:
sudo dpkg-reconfigure postfix
Choose:
Internet Site
Accept defaults for:
- system mail name
- mailbox size
- delivery method
This allows:
- outbound email
- internet mail delivery
11. Why Emails Go to Spam (This Is Normal)
Your VM:
- has no valid domain
- no SPF / DKIM
- unknown sender reputation
Result:
- Gmail usually accepts the mail
- but places it in Spam
This is expected.
For production servers:
- proper DNS
- proper mail relay
- trusted domain
12. How to Stop Cron Email Spam
Option 1: Redirect output
* * * * * command >> file.log 2>&1
Option 2: Send output to /dev/null
* * * * * command > /dev/null 2>&1
Option 3: Comment out job
# * * * * * command
13. Why flock Is Important (Real Production Problem)
Cron does not prevent overlap.
If a job runs every minute:
- previous run may still be active
- next run starts anyway
- This leads to:
  - database overload
  - duplicate jobs
  - race conditions
14. What flock Does
flock:
- locks a file
- only one process can hold the lock
- others wait or exit
This allows mutual exclusion.
15. Simple flock Example
Terminal 1:
flock /tmp/test.lock ping google.com
Terminal 2:
flock /tmp/test.lock ping google.com
Result:
- second command waits
- runs only after first finishes
16. Non-Blocking flock (Cron-Safe)
Use:
flock -n /tmp/test.lock -c "command"
Behavior:
- if lock exists → exit immediately
- no overlap
- cron sees success exit code
17. Why Exit Code Matters
Cron logic:
- exit 0 = success
- exit != 0 = error → email
Using:
flock -n file -c "cmd" || true
Ensures:
- cron sees success
- no spam
- no duplicate execution
18. Real Production Cron Example
0 */2 * * * /usr/bin/flock -n /tmp/app.lock /usr/bin/php /var/www/app/artisan schedule:run
(A single line: crontab entries cannot be continued with a backslash.)
What this does:
- runs every 2 hours
- prevents overlap
- skips execution if still running
- safe for databases
19. Why Full Paths Are Used
Cron has a minimal PATH.
Best practice:
- always use absolute paths
Find executable path:
which flock
which php
20. Why This Matters in Real Systems
Without flock:
- multiple cron runs overlap
- jobs collide
- data corruption happens
With flock:
- single execution guaranteed
- predictable behavior
- safe automation
System-Wide Cron Jobs and Anacron (Ubuntu)
Important
- This lecture is Ubuntu-specific
- CentOS / RHEL handle Anacron differently (covered separately)
- Concepts are shared, implementations differ
1. User Cron Jobs vs System-Wide Cron Jobs
So far, we worked with user cron jobs:
crontab -e
Key points:
- Affects only the current user
- Stored internally in /var/spool/cron/crontabs/
- Managed only via the crontab command
Even this:
sudo crontab -e
still creates a user cron job — this time for the root user.
2. System-Wide Cron Jobs (/etc/crontab)
Ubuntu also supports system-wide cron jobs.
Location
/etc/crontab
Key differences
- Regular text file
- Edited directly (no crontab -e)
- Owned by root
- Writable only by root
- Ignored if permissions are unsafe
3. Why System-Wide Cron Is Safe
Security model:
- If someone can write /etc/crontab, they already have root
- Cron ignores the file if permissions are wrong
So this is not a security risk.
4. System-Wide Cron Syntax
Unlike user crontabs, one extra field exists.
Format
MIN HOUR DAY MONTH DOW USER COMMAND
Example:
* * * * * alice echo "----" >> /home/alice/test.txt
This means:
- Runs every minute
- Executed as user alice
- Writes to Alice’s home directory
5. Example: System-Wide Cron Job
Edit the file:
sudo nano /etc/crontab
Add:
* * * * * alice cd /home/alice && echo "----" >> test.txt
After one minute:
ls /home/alice
Result:
- test.txt exists
- File owned by alice
- Command executed with Alice’s permissions
6. When to Use System-Wide Cron
Use user crontab when:
- Job is personal
- No root privileges needed
- User manages their own tasks
Use system-wide cron when:
- Job must run as a specific service user
- Example users: www-data, postgres, mysql
- You don’t want to log in as that user
Example:
*/5 * * * * www-data php /var/www/app/artisan cleanup
7. Introducing Anacron (Why Cron Is Not Enough)
Regular cron jobs:
- Run only if the system is running
- Miss execution if the system is off
- Ignore battery state
Anacron solves this.
8. What Is Anacron?
Anacron:
- Executes jobs eventually
- Designed for:
  - laptops
  - desktops
  - non-24/7 systems
- Handles:
  - missed executions
  - delayed execution after reboot
  - power-state awareness
Typical use cases:
- log cleanup
- cache cleanup
- maintenance tasks
9. When to Use Anacron
Use Anacron when:
- Exact execution time does not matter
- Task must run at least once
- Delay is acceptable
Use cron when:
- Exact timing matters
- Task must run on schedule
- Servers are always online
10. Anacron Job Directories (Ubuntu)
Ubuntu integrates Anacron via folders:
/etc/cron.daily/
/etc/cron.weekly/
/etc/cron.monthly/
How it works:
- Place an executable file in the folder
- Anacron executes it automatically
11. Filename Restrictions (Important)
Allowed characters only:
- letters (A–Z, a–z)
- digits (0–9)
- underscore (_)
- dash (-)
Invalid filenames are ignored.
12. Example: Daily Anacron Job
List daily jobs:
ls /etc/cron.daily
Example:
/etc/cron.daily/apache2
Open it:
sudo nano /etc/cron.daily/apache2
You’ll see a shell script:
- executed once per day
- used for maintenance
13. How Anacron Is Configured
Configuration file:
/etc/anacrontab
14. Anacrontab Syntax
Format:
PERIOD DELAY JOB-ID COMMAND
Example:
1 5 cron.daily run-parts /etc/cron.daily
Meaning:
- Period: every 1 day
- Delay: wait 5 minutes after boot
- ID: unique identifier
- Command: execute all scripts in folder
15. Default Ubuntu Anacron Jobs
1        5   cron.daily    run-parts --report /etc/cron.daily
7        10  cron.weekly   run-parts --report /etc/cron.weekly
@monthly 15  cron.monthly  run-parts --report /etc/cron.monthly
This enables:
- /etc/cron.daily
- /etc/cron.weekly
- /etc/cron.monthly
16. Why /etc/cron.hourly Exists
Check /etc/crontab:
17 * * * * root cd / && run-parts --report /etc/cron.hourly
This is:
- normal cron, not Anacron
- runs hourly
- no power-state awareness
If the system is down → job is skipped.
17. Fallback Logic in /etc/crontab
You may see lines like:
25 6 * * * root test -x /usr/sbin/anacron || run-parts /etc/cron.daily
Meaning:
- If Anacron exists → do nothing
- If Anacron is missing → fallback to cron
This guarantees:
- daily jobs still run
- even without Anacron
18. Battery-Aware Execution (How It Works)
Check Anacron’s systemd unit:
systemctl cat anacron.service
You will see:
ConditionACPower=true
Meaning:
- Anacron runs only when plugged in
19. Overriding Battery Behavior (Optional)
To override:
sudo systemctl edit anacron.service
Add:
[Unit]
ConditionACPower=
Now:
- Anacron runs even on battery
20. Best Practices for Cron & Anacron
Scheduling
- Avoid peak traffic hours
- Distribute heavy jobs
- Be timezone-aware
Logging & Monitoring
- Always log output
- Review logs regularly
- Monitor after updates
Security
- Run jobs with least privilege
- Avoid root unless required
- Secure scripts and dependencies
- Never store secrets in crontab
Permissions
- Cron-created files inherit user ownership
- Match cron user with application user
- Avoid permission mismatches
Testing
- Test commands manually
- Use absolute paths
- Verify PATH differences
- Monitor first executions
21. Cron Implementation Differences
Be aware:
- Ubuntu / Debian → Vixie cron
- CentOS / RHEL → Cronie
- Features differ slightly
- Environment variables may not work everywhere
When in doubt:
- use shell scripts
- use absolute paths
- avoid assumptions
22. Final Takeaways
- User cron ≠ system cron
- /etc/crontab allows specifying the run-as user per job
- Ubuntu integrates Anacron via cron folders
- Power state matters on laptops
- Cron needs planning, logging, and discipline
What Is the Internet? (Big Picture)
Definition
The Internet is a network of networks.
- It is made of interconnected nodes (computers, routers, servers)
- These nodes form a mesh, not a single direct line
- Any node can communicate with almost any other node
- No dedicated end-to-end connection is required
This design makes the Internet:
- Scalable
- Fault-tolerant
- Efficient
Why the Internet Works Without Dedicated Connections
Imagine:
- You are in Europe
- You connect to a server in Australia
You do not have a physical cable to Australia.
Instead:
- Data is split into small packets
- Each packet is routed independently
- Routers choose the best available path at that moment
- If a link is congested or broken, traffic is rerouted automatically
This is called packet switching.
Important consequences:
- Packets may take different paths
- Paths can change dynamically
- Packet order is not guaranteed (higher layers fix this)
Visualization: How Data Reaches Google
Example Flow
- Your computer
- Home router
- ISP router
- Multiple intermediate routers (hops)
- Destination server (e.g., Google)
Each router:
- Looks only at the destination address
- Forwards the packet to the next best hop
- Does not know the full path
Routers are often called hops because packets “hop” through them.
What Must Exist for Internet Communication to Work
To send data to google.com, several things must happen:
1. Name resolution
   - Convert google.com → IP address
   - This is done by DNS
2. Local delivery
   - Your computer must send data to the local router
   - Happens inside your local network
3. Inter-network routing
   - Data must cross multiple networks
   - Handled by the IP protocol
4. Reliability
   - Lost packets must be detected and retransmitted
   - Done by TCP
5. Application communication
   - Web, SSH, email, etc.
   - Done by protocols like HTTP, HTTPS, SSH
This layered approach is intentional.
Working Bottom-Up (How This Chapter Is Structured)
We will study networking from the ground up:
- How data is placed on the wire or Wi-Fi
- How packets move inside a local network
- How packets move between networks (Internet)
- How reliability is guaranteed
- How applications use the network
This matches how real networking works.
Tool 1: The ip Command (Linux Networking)
What Is ip?
The ip command is the modern Linux networking tool.
It replaces:
- ifconfig
- route
- netstat
It is:
- More powerful
- More accurate
- Actively maintained
Showing Network Interfaces
ip address show
Output shows:
- Network interfaces
- IP addresses
- Interface state
- MAC addresses
Example interfaces:
- lo → loopback (localhost)
- eth0, ens33, wlp0s20f3 → physical or virtual NICs
Legacy Tool (Still Exists)
ifconfig -a
- Older
- Still available on some systems
- Internally uses older kernel interfaces
In this course, we use ip.
macOS Users: Using ip via Homebrew
macOS does not ship with ip.
Options:
- Use ifconfig (native)
- Install an ip wrapper
Install via Homebrew
brew install iproute2mac
After installation:
ip address show
Notes:
- Output may differ slightly
- Not all features are supported
- Good enough for learning
Tool 2: Wireshark (Traffic Analysis)
What Is Wireshark?
Wireshark is a graphical packet analyzer.
It allows you to:
- Capture live network traffic
- Inspect packets layer by layer
- Visualize real network behavior
This is critical for understanding, not just memorizing.
⚠️ Legal & Ethical Warning (Very Important)
Wireshark can:
- Capture private data
- Capture MAC addresses
- Capture credentials (if unencrypted)
Rules:
- Capture only traffic you own or are allowed to analyze
- Laws vary by country
- In some regions, MAC addresses are personal data
This lecture is for:
- Learning
- Teaching
- Ethical debugging
This is not legal advice.
Installing Wireshark (Ubuntu)
sudo apt install wireshark
Start Wireshark with required privileges:
sudo wireshark
Capturing Traffic
Steps:
- Select network interface
- Start capture
- Generate traffic (open a website)
- Stop capture
- Analyze packets
Example:
- Open google.com
- Stop capture
- Filter by protocol: http
Note:
- Modern sites use HTTPS
- Payload is encrypted
- Metadata is still visible
Why Wireshark Matters
Wireshark shows:
- Frames
- Packets
- Headers
- Protocol layers
Right now this looks overwhelming — that’s expected.
By the end of this chapter:
- Every section will make sense
- Every field will have meaning
Introducing the OSI Model
What Is the OSI Model?
The OSI (Open Systems Interconnection) model is a conceptual framework.
Purpose:
- Standardize network communication
- Enable interoperability
- Provide a shared troubleshooting language
Developed:
- Concept in the 1970s
- Formalized in the 1980s
Why the OSI Model Exists
Before standards:
- Vendors used incompatible protocols
- Networks could not interoperate
Today:
- Any phone works on any Wi-Fi
- Any laptop works on any router
- Any OS can talk to any server
That did not happen by accident.
The 7 OSI Layers (Bottom → Top)
| Layer | Name | Purpose |
|---|---|---|
| 1 | Physical | Bits on wire (cables, signals) |
| 2 | Data Link | Local delivery (MAC, Ethernet) |
| 3 | Network | Routing between networks (IP) |
| 4 | Transport | Reliability, ordering (TCP/UDP) |
| 5 | Session | Session management |
| 6 | Presentation | Encryption, compression |
| 7 | Application | HTTP, SSH, FTP, SMTP |
Key Layer Intuition
- Layer 1: Is the cable plugged in?
- Layer 2: Can I talk to my router?
- Layer 3: Can packets reach the destination?
- Layer 4: Are packets reliable?
- Layer 7: Does the application work?
This is how real troubleshooting is done.
Why the OSI Model Is Useful for You
1. Modularity
Each layer can evolve independently.
Example:
- TCP improvements do not break Ethernet
- HTTPS encryption does not affect routing
2. Interoperability
Devices from different vendors work together.
3. Troubleshooting Framework
You can say:
- “Layer 1 issue” → cable
- “Layer 3 issue” → routing
- “Layer 7 issue” → application
This saves hours in production.
OSI Layer 1 – The Physical Layer
What Is the Physical Layer?
The physical layer (Layer 1) is the foundation of networking.
It is responsible for physically transmitting bits from one device to another.
This includes:
- Ethernet cables (copper)
- Fiber-optic cables
- Wi-Fi radio signals
- Electrical voltages and light pulses
At this layer, there is no concept of IP addresses, packets, or routing—only raw bits.
Responsibilities of Layer 1
Layer 1 handles:
- Physical media
  - Copper, fiber, wireless
- Signal transmission
  - Electrical voltage
  - Light pulses
  - Radio waves
- Bit encoding
  - Converting 0s and 1s into signals
- Timing & synchronization
- Collision avoidance (basic mechanisms)
- Basic error detection
  - e.g., parity bits
Important detail:
- Signals are encoded so the average voltage is zero
- This prevents electrical potential buildup between devices
Examples of Physical Layer Failures
Common Layer 1 problems:
- Cable unplugged
- Broken cable
- Power missing
- Faulty network card
- Electromagnetic interference
- Hardware malfunction
If Layer 1 fails, nothing above it can work.
Layer 1 Hardware Examples
- Ethernet cables
- Fiber cables
- Wi-Fi antennas
- Physical splitters (old Ethernet hubs)
- Network Interface Cards (NICs)
Old Ethernet splitters literally connected wires together.
All devices shared the same electrical medium.
Influencing Layer 1 via Software
You cannot unplug a cable with software.
But you can:
- Enable or disable a network interface
This effectively shuts down Layer 1 from the OS perspective.
Enabling / Disabling a Network Interface (Linux)
Step 1: Identify interfaces
ip addr show
Example interface names:
- enp0s5 (modern, predictable)
- eth0 (older style)
- wlan0 (Wi-Fi)
Modern names are stable and tied to hardware location.
Step 2: Disable interface
sudo ip link set dev enp0s5 down
Result:
- Interface exists
- State becomes DOWN
- No traffic flows
⚠️ Warning
If this interface is your SSH connection → you will disconnect immediately.
Step 3: Enable interface
sudo ip link set dev enp0s5 up
Connectivity returns.
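A quick way to confirm the state before and after (brief output mode of iproute2; interface name is the assumption used above):
ip -br link show enp0s5
Expect UP or DOWN in the second column.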
Real Example: Remote Device Risk
On systems like:
- Raspberry Pi
- Remote servers
- Cloud VMs
If you disable:
- wlan0 (Wi-Fi)
- eth0 (Ethernet)
Your remote session will drop.
Recovery requires:
- Reboot
- Physical access
- Console access
This is a classic Layer 1 outage.
OSI Layer 2 – The Data Link Layer
What Is Layer 2?
The Data Link Layer (Layer 2) handles local communication inside one network.
Key responsibilities:
- Frame delivery
- MAC addressing
- Error detection (local)
- Collision reduction
- Traffic isolation
Layer 2 does not route between networks.
Typical Layer 2 Hardware
- Switch
- Bridge
- Wireless Access Point (WAP)
Note:
- A switch is hardware
- A bridge is usually software
- Functionally, they are similar
Switch vs Wireless Router (Important Distinction)
- Switch / Access Point
  - Layer 2 only
  - No routing
- Router
  - Layer 3 (and above)
  - Connects different networks
A Wi-Fi access point is essentially:
A Layer-2 switch with radio antennas
Why We Need Switches
Old Method: Shared Wire (Hub / Splitter)
- All devices share one cable
- Every frame reaches every device
- Devices discard frames not meant for them
- Collisions occur if multiple devices talk
Works for:
- Few machines
Fails for:
- Many machines
- High traffic
How a Switch Solves This
A switch learns MAC addresses.
- Each device has its own cable
- Switch remembers:
  - MAC → port mapping
- Frames are forwarded only where needed
Example:
- PC A sends frame to PC B
- Switch forwards frame only to PC B’s port
- Other ports remain silent
Parallel Communication with a Switch
With a switch:
- Multiple devices can transmit simultaneously
- No shared collision domain
- Massive performance improvement
This is why switches replaced hubs.
Transparency of a Switch (Very Important)
From the computer’s perspective:
- It does not know a switch exists
- It behaves as if it were directly connected to the other devices
The switch is completely transparent.
This fact is critical when later learning about:
- Routers
- Network segmentation
- Subnets
What Layer 2 Can and Cannot Do
Can do:
- Send frames inside the same network
- Reduce collisions
- Isolate traffic
Cannot do:
- Route between networks
- Reach the Internet
- Understand IP addresses
That is Layer 3.
Layer 1 vs Layer 2 Summary
| Layer | Purpose | Example |
|---|---|---|
| Layer 1 | Physical transmission | Cable, Wi-Fi |
| Layer 2 | Local delivery | Switch, MAC |
OSI Layer 3 – The Network Layer (IP, Routing, Subnets)
Why Do We Need the Network Layer?
On Layer 2 (Data Link), we learned:
- Frames are sent from one network card to another
- Communication is limited to the local network
- Switches are transparent
- MAC addresses are local only
➡️ Problem
If two computers are not in the same network, Layer 2 is not enough.
That is why we need Layer 3 – the Network Layer.
What Changes on Layer 3?
| Layer | Unit | Address Type | Scope |
|---|---|---|---|
| Layer 2 | Frame | MAC address | Local network only |
| Layer 3 | Packet | IP address | Across networks |
Key idea
- Frames cannot be routed
- Packets can be routed
Routing = forwarding data between networks
Packet Encapsulation (Very Important Concept)
When sending data:
- Application creates data
- Layer 3 wraps it into an IP packet
- Layer 2 wraps the packet into an Ethernet frame
- Frame is sent on the wire
At every router:
- Frame is removed
- Packet is inspected
- Packet is wrapped into a new frame
- Sent to the next hop
This happens extremely fast in hardware.
What Is a Network?
A network is a group of interconnected devices that can communicate.
Important network types
- LAN (Local Area Network): home, office, data center
- WAN (Wide Area Network): Internet, country, continent
➡️ The Internet is a WAN made of many LANs
Routers and Gateways
A router:
- Connects networks
- Operates on Layer 3
- Forwards packets
A default gateway:
- The router your computer sends packets to
- Used when destination is outside your local network
Inspecting Network Configuration (Linux)
Show IP address
ip addr show
You will see:
- Interface name
- IP address
- Subnet mask (CIDR notation)
Example:
192.168.1.23/24
Show routing table
ip route show
Example:
default via 192.168.1.1 dev enp0s5
Meaning:
- Anything not local → send to 192.168.1.1
- That IP is your router / gateway
Local vs Internet Traffic (Wireshark Proof)
Case 1: Ping Google (Internet)
- IP packet destination = Google IP
- Ethernet frame destination = router MAC
- Router forwards packet
Case 2: Ping local machine
- IP packet destination = local IP
- Ethernet frame destination = target MAC
- Router not involved (on Wi-Fi, the router acts only as an access point/switch)
➡️ Same IP protocol
➡️ Different Layer-2 destination
Why Frames Are Addressed Differently
| Destination | Ethernet Frame Goes To |
|---|---|
| Same network | Target device MAC |
| Different network | Router MAC |
This decision is made using the subnet mask.
Subnets – Networks Inside Networks
What Is a Subnet?
A subnet is a logical subdivision of a network.
Used to:
- Reduce broadcast traffic
- Improve performance
- Scale large networks
- Control routing
The Problem Subnets Solve
Your computer must answer:
Is the destination IP local or remote?
If local → send frame directly
If remote → send frame to gateway
Subnet Mask (Core Concept)
Example:
IP address: 192.168.1.10
Subnet mask: 255.255.255.0
CIDR: /24
Subnet mask:
- Defines network part
- Defines host part
Binary AND Logic (Conceptual)
- Subnet mask has:
  - 1 = network bits
  - 0 = host bits
- Logical AND:
  - Keep bits where mask = 1
  - Zero out bits where mask = 0
If:
(network part of source)
==
(network part of destination)
➡️ Same subnet
Otherwise ➡️ send to gateway
Computers do this instantly in hardware.
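You can reproduce the AND step yourself. A minimal sketch in plain Bash (no extra tools; values taken from the example above):
ip=192.168.1.10; mask=255.255.255.0
IFS=. read -r a b c d <<< "$ip"       # split the IP into its four octets
IFS=. read -r m1 m2 m3 m4 <<< "$mask" # split the mask the same way
echo "network: $((a & m1)).$((b & m2)).$((c & m3)).$((d & m4))"
Output: network: 192.168.1.0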
CIDR Notation (Short Form)
Instead of:
255.255.255.0
We write:
/24
Why?
- The first 24 mask bits are 1
- The remaining bits are 0
Examples:
| CIDR | Hosts (usable) |
|---|---|
| /24 | 254 |
| /23 | 510 |
| /22 | 1022 |
| /30 | 2 (point-to-point) |
Reserved Addresses in a Subnet
For /24:
- .0 → network address
- .255 → broadcast address
- .1 – .254 → usable hosts
Inspecting Subnet Mask on Linux
ip addr show
Example output:
inet 192.168.1.23/24
This tells you:
- Your IP
- Your subnet size
- How routing decisions are made
Key Mental Model (Very Important)
- IP packet → logical destination
- Ethernet frame → physical next hop
- Subnet mask → decision maker
- Router → network boundary
How Does a Computer Know Where to Send a Frame?
(ARP, IP ↔ MAC Resolution, Routes, DHCP)
The Core Question
- We know IP packets contain destination IPs
- We know Ethernet frames need destination MACs
❓ How does the system know which MAC address to use?
That is the job of ARP.
Packet vs Frame (Quick Reminder)
| Layer | Unit | Address Used |
|---|---|---|
| Layer 3 | Packet | IP address |
| Layer 2 | Frame | MAC address |
To send any IP packet, the system must:
- Decide where the packet should go
- Resolve which MAC address to send the frame to
ARP – Address Resolution Protocol
ARP answers one question only:
“Which MAC address owns this IP address?”
ARP works only inside the local network.
What Happens When You Ping a Local Machine
Step-by-step
- You run:
ping 192.168.1.50
- System checks subnet mask → Destination is local
- System does ARP request (broadcast):
Who has 192.168.1.50?
- All devices receive it
- Only the owner replies:
192.168.1.50 is at AA:BB:CC:DD:EE:FF
- MAC address is cached
- Ethernet frame is sent directly to that MAC
What Happens When You Ping the Internet
Step-by-step
- You run:
ping google.com
- DNS resolves the IP (e.g. 142.250.x.x)
- System needs gateway MAC
- ARP request:
Who has 192.168.1.1?
- Router replies with its MAC
- Ethernet frame destination = router MAC
- Router forwards packet
➡️ IP destination stays Google
➡️ MAC destination is router
Why ARP Is Always Happening
ARP traffic is normal and frequent:
- Devices announce themselves
- Devices verify IP conflicts
- Routers refresh mappings
In Wireshark:
ARP Who has 192.168.1.X?
ARP 192.168.1.X is at …
This is normal network noise, not a problem.
Changing IP Addresses Manually (Linux)
Show current IPs
ip addr show
Add a secondary IP
sudo ip addr add 192.168.1.232/24 dev enp0s5
Now the interface has two IPs.
✔️ Other devices in the same subnet can reach it
❌ Devices outside the subnet will not
Remove the IP
sudo ip addr del 192.168.1.232/24 dev enp0s5
Why Adding “Any IP” Doesn’t Work
You can add:
8.8.8.8/24
But:
- Other machines see it as Internet IP
- Frames go to the router
- Router does NOT send traffic back to your PC
➡️ IP must match subnet logic, not just syntax.
Routing Table – How the OS Decides Paths
Show routes
ip route show
Typical output:
192.168.1.0/24 dev enp0s5
default via 192.168.1.1 dev enp0s5
Meaning:
- Local network → direct
- Everything else → router
Ask the OS how it would reach an IP
ip route get 8.8.8.8
or
ip route get 192.168.1.50
You will see:
- Interface used
- Gateway (if any)
Manually Adding Routes (Advanced)
Example: Send traffic to wrong gateway
sudo ip route add 9.9.9.9/32 via 192.168.1.100 dev enp0s5
Result:
- Packet sent to wrong device
- Device does not forward
- Traffic fails
Remove it:
sudo ip route del 9.9.9.9/32
Why Manual Routes Matter (Corporate Networks)
Example:
- Subnet A: 192.168.1.0/24
- Subnet B: 192.168.2.0/24
- Router in between
Without route:
- Subnet A cannot reach Subnet B
With route:
sudo ip route add 192.168.2.0/24 via 192.168.1.5 dev enp0s5
Now traffic flows.
DHCP – How Devices Get IPs Automatically
What Is DHCP?
Dynamic Host Configuration Protocol
Automatically assigns:
- IP address
- Subnet mask
- Gateway
- DNS servers
- Lease time
Usually runs on the router.
DHCP 4-Step Process (DORA)
- Discover (broadcast)
- Offer
- Request
- Acknowledge
All initial messages are broadcast.
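On systems with the ISC DHCP client installed (not all modern Ubuntu releases ship it), you can watch DORA happen verbosely. A sketch, with the interface name as an assumption:
sudo dhclient -v -r enp0s5   # release the current lease
sudo dhclient -v enp0s5      # run Discover/Offer/Request/Acknowledge again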
Seeing DHCP in Wireshark
Filter:
dhcp
You’ll see:
- Client MAC
- Offered IP
- Lease duration
- Gateway
- DNS servers
This is your entire network configuration being delivered.
Debugging DHCP (systemd-networkd)
View logs
journalctl -u systemd-networkd
Look for:
- DHCP lease acquired
- DHCP lease lost
- Link up/down events
If:
- Cable is plugged
- Wi-Fi is connected
- But no IP
➡️ It’s almost always DHCP
Final Mental Model (Very Important)
To send data:
- Subnet mask decides:
- Local → direct MAC
- Remote → gateway MAC
- ARP resolves MAC
- Ethernet frame is built
- Router forwards if needed
- Routing table controls decisions
- DHCP provides configuration
NetworkManager vs systemd-networkd (DHCP Logs)
Why Different Linux Systems Use Different Network Tools
Not all Linux distributions manage networking the same way.
- Ubuntu Server → usually systemd-networkd
- Ubuntu Desktop and CentOS / RHEL → usually NetworkManager
Both tools:
- Configure interfaces
- Run a DHCP client
- Assign IP addresses
- Manage routes
They just do it differently.
Why NetworkManager Exists
NetworkManager is a more integrated solution.
It supports:
- DHCP
- DNS integration
- Wi-Fi
- VPNs
- Mobile connections
With systemd:
- Networking is split across multiple components:
  - systemd-networkd
  - systemd-resolved
  - others
➡️ Recommendation
Use the default tool provided by your distribution unless you have a strong reason to change it.
Inspecting DHCP Logs with NetworkManager
On systems using NetworkManager (e.g. CentOS):
sudo journalctl -u NetworkManager --boot
What you’ll see:
- Service startup
- Interface detection
- DHCP requests
- DHCP lease assignment
- Lease renewals
Understanding the Logs
Key events you’ll notice:
- Interface appears
- DHCP client starts
- IP address is assigned
- Subnet mask, gateway, DNS received
- Lease renewals every few minutes
Lease Renewal Explained
DHCP leases expire unless renewed.
The client periodically tells the router:
“I’m still here. Please keep my IP.”
This prevents:
- IP conflicts
- Stale reservations
Key Takeaway
Regardless of tool:
- A DHCP client must exist
- It must talk to a DHCP server
- Logs tell you why networking works or fails
If:
- Cable is connected
- Wi-Fi is up
- But no IP
➡️ Check DHCP logs first
Ping – The ICMP Diagnostic Tool
What Is Ping?
ping is a Layer-3 diagnostic tool.
It uses ICMP (Internet Control Message Protocol).
What ping does:
- Sends ICMP Echo Request
- Waits for ICMP Echo Reply
- Measures round-trip time
Important Warning About Ping
If ping fails, it does NOT always mean:
- The host is down
It may mean:
- ICMP blocked by firewall
- ICMP disabled on destination
- ICMP filtered in between
Ping tests reachability, not availability.
Seeing Ping in Wireshark
Filter:
icmp
You will see:
- Echo request
- Echo reply
- Sequence numbers
- Identifiers
Round-trip time (RTT):
- Gives latency estimate
- Wi-Fi fluctuates more than Ethernet
When Ping Is Useful
Ping helps answer:
- Can I reach the host?
- Is the network slow?
- Is there packet loss?
Ping does not:
- Test application health
- Guarantee connectivity beyond ICMP
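A quick loss and latency check (the gateway IP is an assumption; -c limits the count so the command terminates):
ping -c 4 192.168.1.1
The summary line reports packet loss and min/avg/max round-trip times.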
Traceroute – Tracing the Path
What Traceroute Does
Traceroute shows:
- Each router (hop) between source and destination
- Latency per hop
- Where delays appear
Command:
traceroute google.com
Output:
- Hop number
- Router IP or hostname
- Three RTT measurements
Why Three Measurements?
Network latency fluctuates.
Traceroute sends multiple probes to:
- Detect instability
- Avoid false conclusions
Interpreting Traceroute Output
Typical observations:
- First hop = your gateway
- ISP routers next
- Backbone routers later
- Destination last
* * * means:
- Router didn’t reply
- ICMP blocked
- Still forwarding traffic
Long-Distance Latency Example
When tracing overseas destinations:
- Sudden jump (e.g. 30 ms → 250 ms)
- Caused by:
  - Physical distance
  - Speed of light
  - Undersea fiber cables
This is why:
- Global services deploy servers near users
How Traceroute Actually Works (TTL Explained)
The TTL Field
Every IP packet has:
TTL – Time To Live
Purpose:
- Prevent infinite routing loops
Traceroute Algorithm (Simplified)
- Send packet with TTL = 1
- First router:
  - Decrements TTL → 0
  - Drops packet
  - Sends ICMP Time Exceeded
- Record router IP
- Increase TTL to 2
- Repeat until destination reached
➡️ Each step discovers one hop
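You can emulate a single traceroute step by hand with ping (iputils ping on Linux; -t sets the packet’s TTL):
ping -c 1 -t 1 google.com
The reply then comes from your first hop as “Time to live exceeded” instead of from Google.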
Why ICMP Appears in Captures
When a router replies:
- It embeds the original IP packet
- Inside an ICMP message
- Inside a new IP packet
- Inside a new Ethernet frame
That’s why Wireshark still matches filters.
Why Traceroute Isn’t a Single Packet
Each hop is:
- A separate probe
- Sent independently
Traceroute assumes:
- Routing path remains stable
Real-World Insight: Traceroute Reveals Topology
Traceroute can show:
- Multiple routers in your home network
- ISP router + your own router
- Hidden subnets
Example:
- ISP router (mandatory)
- Personal router behind it
- Double NAT
- Multiple internal subnets
Traceroute exposes this.
Final Layer-3 Summary
You now understand:
- DHCP (automatic configuration)
- ARP (IP → MAC resolution)
- Routing tables
- Default gateways
- Ping (ICMP reachability)
- Traceroute (path discovery)
- TTL mechanics
- Multi-subnet routing
At this point, you have solid Layer-3 knowledge, exactly what:
- DevOps engineers
- Cloud engineers
- SREs
must understand deeply.
Transport Layer (Layer 4) — Why We Need It
So far, Layer 3 (IP) solved routing across networks.
But IP alone has serious limitations:
Problems at Layer 3
- Packets can be lost
- Packets can be dropped (very common)
- Packets can arrive out of order
- No retransmission
- No flow control
- No congestion control
Routers intentionally drop packets when overloaded — this is normal and expected.
➡️ Layer 4 exists to handle these problems
UDP vs TCP — Two Different Philosophies
UDP — “Send and Forget”
UDP (User Datagram Protocol):
- No retransmission
- No ordering
- No congestion control
- No connection setup
Why Use UDP?
Because sometimes retransmission is worse than packet loss.
Examples:
- Video calls
- Live streaming
- Online gaming
- DNS
- NTP (time sync)
If a video frame arrives late → it’s already useless
➡️ Better to drop it and move on
Applications using UDP usually:
- Send extra data
- Use error correction
- Handle loss themselves
TCP — Reliable Data Stream
TCP (Transmission Control Protocol) provides:
- Reliable delivery
- Ordered data
- Retransmission
- Flow control
- Congestion control
Applications see TCP as a continuous stream, not packets.
What TCP Manages for You
- Lost packets → retransmitted
- Out-of-order packets → reordered
- Receiver overload → sender slows down
- Network congestion → speed reduced automatically
➡️ Applications don’t need to care about packet loss.
TCP Internals (High-Level, Practical View)
Each TCP segment contains:
- Source port
- Destination port
- Sequence number
- Acknowledgment number
- Flags (SYN, ACK, FIN, RST)
- Checksum
- Payload (data)
Sequence Numbers
Used to:
- Order packets
- Detect missing data
- Acknowledge received bytes
TCP does not count packets — it counts bytes.
TCP Three-Way Handshake (Connection Setup)
Before data transfer, TCP builds a connection:
Step 1 — SYN
Client → Server
- SYN flag set
- Initial Sequence Number (ISN)
Step 2 — SYN-ACK
Server → Client
- SYN + ACK flags
- Server’s ISN
- Acknowledges client’s ISN
Step 3 — ACK
Client → Server
- ACK flag
- Acknowledges server’s ISN
➡️ Connection is now established
After this:
- Data can flow both ways
- Every byte is acknowledged
Seeing the Handshake in Wireshark
When using tools like wget or a browser, you will see:
- SYN
- SYN-ACK
- ACK
- Followed by normal data packets
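To isolate the handshake among thousands of captured packets, a Wireshark display filter like this helps (standard filter syntax):
tcp.flags.syn == 1 or tcp.flags.fin == 1
It shows only connection setup and teardown packets.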
This knowledge is critical for:
- Debugging
- Firewall troubleshooting
- Port scanning (next topic)
Ports — How Applications Are Identified
Ports are Layer 4 identifiers.
- Range: 0 – 65535
- TCP and UDP have separate port spaces
A connection is uniquely identified by:
Source IP + Source Port + Destination IP + Destination Port
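You can see these 4-tuples on your own machine (ss is part of iproute2; -t = TCP, -n = numeric):
ss -tn
Each line shows Local Address:Port and Peer Address:Port for one established connection.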
Port Categories
1. Well-Known Ports (0–1023)
Binding to them requires root privileges.
Examples:
- 80 → HTTP
- 443 → HTTPS
- 22 → SSH
- 21 → FTP
- 25 → SMTP
2. Registered Ports (1024–49151)
Assigned to common services.
Examples:
- 3306 → MySQL
- 5432 → PostgreSQL
- 5900 → VNC
3. Dynamic / Ephemeral Ports (49152–65535)
Used by clients:
- Randomly chosen
- Temporary
- No special privileges required
Source Port vs Destination Port
Example:
- Client opens random high port (e.g. 46062)
- Server listens on well-known port (e.g. 80)
Server response:
- Source port = 80
- Destination port = 46062
This allows thousands of simultaneous connections.
Common TCP and UDP Ports (Overview)
Common TCP Ports
- 80 → HTTP
- 443 → HTTPS
- 22 → SSH
- 21 → FTP
- 25 → SMTP
- 110 → POP3
- 143 → IMAP
Common UDP Ports
- 53 → DNS
- 67 / 68 → DHCP
- 123 → NTP
- 161 / 162 → SNMP
- 69 → TFTP
- 5004 / 5005 → RTP (audio/video)
Why UDP here?
- Low latency
- No retransmission delays
Port Scanning — Understanding Nmap
What Is Port Scanning?
Port scanning tries to:
- Connect to many ports
- Observe responses
- Determine which services are reachable
Possible Responses
- SYN-ACK → Port open
- RST → Port closed
- No response → Port filtered (firewall)
Legal & Ethical Warning (Important)
- Port scanning is a reconnaissance technique
- Often used by attackers
- Also used by defenders
⚠️ Only scan systems you own or are authorized to scan
Laws vary by country — never assume legality
Nmap Basics
Install:
sudo apt install nmap
# or
sudo dnf install nmap
Basic scan:
sudo nmap localhost
Scans:
- Top 1000 TCP ports
Scan Specific Ports
sudo nmap -p 22,80,443 192.168.1.10
Scan All Ports
sudo nmap -p- 192.168.1.10
Scan a Network Range
sudo nmap 192.168.1.1-100
Useful for:
- Inventory
- Firewall validation
- Security hardening
Practical Security Use Case
Port scanning helps you:
- Detect unnecessary services
- Close unused ports
- Reduce attack surface
Example:
- MySQL open on all interfaces
- Not needed externally
- Disable service or firewall it
sudo systemctl stop mysql
sudo systemctl disable mysql
Re-scan:
sudo nmap localhost
➡️ Security improved
Why This Matters for DevOps & Cloud
You now understand:
- TCP vs UDP tradeoffs
- Ports & services
- Connection establishment
- How attackers discover services
- How defenders harden systems
This knowledge is mandatory for:
- Firewalls
- Kubernetes networking
- Load balancers
- Cloud security groups
- Incident response
Advanced Nmap Scan Types — Why Scan Type Matters
Not all port scans behave the same way.
Scan type directly affects:
- Speed
- Detectability
- Logging on the target
- Legal and operational risk
This is why an Nmap introduction is incomplete without scan types.
1. TCP SYN Scan (-sS) — Stealth Scan
What It Does
- Sends SYN packet only
- Waits for response
- Does NOT complete the handshake
Responses
| Response | Meaning |
|---|---|
| SYN-ACK | Port open |
| RST | Port closed |
| No reply | Port filtered (firewall) |
Why It’s Fast
- Only one packet per port
- No full TCP connection
- Minimal resource usage
Requirements
- Root privileges (raw sockets)
sudo nmap -sS localhost
Logging Behavior
- Often not logged
- No established connection
- Lower detection probability
➡️ Default and preferred scan if available
2. TCP Connect Scan (-sT) — Full Connection Scan
When Is It Used?
- When SYN scan is not possible:
  - No root access
  - IPv6 scanning
  - Restricted environments
What It Does
- Performs the full TCP handshake:
  - SYN → SYN-ACK → ACK
- Uses the OS networking stack
nmap -sT localhost
Downsides
- Slower (extra packets)
- Uses OS resources
- Almost always logged
- Can trigger alerts or IDS
- May stress poorly written services
➡️ High visibility scan — use carefully
3. UDP Scan (-sU) — Slow but Necessary
Why UDP Is Hard to Scan
- No handshake
- No ACKs
- Packet loss is normal
How Nmap Interprets Responses
| Response | Meaning |
|---|---|
| UDP reply | Port open |
| ICMP error | Port closed |
| No reply | Open or filtered |
sudo nmap -sU localhost
Important Notes
- Extremely slow
- Requires retries
- Often inconclusive
But still critical because:
- DNS
- DHCP
- NTP
- SNMP
- RTP
➡️ TCP scans alone are not enough
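In practice, TCP and UDP scans are often combined in one run. A sketch using Nmap's T:/U: port syntax:
# SYN-scan TCP 22 and 80, UDP-scan 53 and 123, in a single run
sudo nmap -sS -sU -p T:22,80,U:53,123 localhost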
Why Nmap Matters for Firewalls
Nmap answers:
- What services are exposed?
- Which ports must be blocked?
- Did my firewall work?
Security hardening workflow:
- Scan
- Identify unnecessary services
- Stop or firewall them
- Re-scan to verify
Network Address Translation (NAT)
Why NAT Exists
- IPv4 address shortage
- Many internal devices → one public IP
Internal IPs (Private)
- 192.168.0.0/16
- 10.0.0.0/8
- 172.16.0.0/12
Not routable on the Internet.
How NAT Works (Outbound)
- Internal device sends packet
- Router:
- Rewrites source IP
- Often rewrites source port
- Router remembers mapping
- Reply arrives
- Router reverses translation
➡️ Router maintains a NAT table
Why Inbound Traffic Fails by Default
Incoming traffic:
- Router has no idea where to send it
- Packet is dropped
➡️ NAT works outbound only
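On a Linux machine acting as the router, outbound NAT is typically a single masquerade rule. A minimal iptables sketch, assuming eth0 is the Internet-facing interface:
# rewrite the source address of all traffic leaving via eth0
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
# IP forwarding must also be enabled
sudo sysctl -w net.ipv4.ip_forward=1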
Port Forwarding — Allowing Inbound Access
Example:
- External: router public IP, port 80
- Internal: 192.168.1.50:8080
Router rewrites:
- Destination IP
- Destination port
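Consumer routers do this through their web UI; on a Linux router the equivalent is a DNAT rule. A sketch matching the example above (eth0 as the external interface is an assumption):
# forward inbound TCP 80 to the internal server
sudo iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to-destination 192.168.1.50:8080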
DHCP Reservation — Critical Step
Problem:
- Internal IPs change
Solution:
- Bind MAC → IP
- Prevent forwarding breakage
Always reserve IPs for:
- Servers
- NAS
- Home labs
Dynamic Public IP Problem
Most ISPs:
- Assign dynamic IPs
- Change periodically
Solution:
- Dynamic DNS (DDNS)
Router updates DNS record automatically:
myhome.ddns-provider.com → current public IP
⚠️ Not production-grade
- DNS propagation delays
- ISP NAT (CGNAT) may block inbound access entirely
OSI Layer 5 — Session Layer
Purpose
- Establish
- Maintain
- Terminate sessions
Adds:
- State
- Authentication
- Session tracking
Examples
- Network File Systems
- Remote Procedure Calls (RPC)
- Session-aware protocols
Modern reality:
- Often implemented inside applications
- Layers 5–7 are frequently merged
OSI Layer 6 — Presentation Layer
Responsibilities
- Data format
- Encoding
- Encryption
- Compression
Common Functions
Encoding
- ASCII
- UTF-8 / Unicode
Encryption
- SSL / TLS
- HTTPS
Compression
- gzip
- deflate
- brotli
MIME — Real Example
Emails require:
- Character encoding
- Attachments
- HTML + plain text
- Metadata
MIME defines:
- How data is represented
- Not how it’s transported
➡️ Transport (SMTP) ≠ Representation (MIME)
Modern Reality of OSI Layers
Important truth:
- OSI is a conceptual model
- Real protocols blur boundaries
Example:
HTTP/3 (QUIC):
- Uses UDP
- Implements encryption
- Handles congestion control
- Manages sessions internally
➡️ One protocol can span multiple OSI layers
Final Takeaways
You now understand:
- Why scan type matters in Nmap
- SYN vs Connect vs UDP scans
- NAT behavior and limitations
- Port forwarding & DHCP reservations
- Why inbound traffic fails by default
- How higher OSI layers overlap in reality
This knowledge is essential for:
- Firewall configuration
- Cloud networking
- Kubernetes ingress
- Security audits
- Incident response
OSI Layer 7 — Application Layer
The application layer (Layer 7) consists of protocols used by applications, not the applications themselves.
Important distinction
- Firefox / Chrome / Outlook → applications (software)
- HTTP, HTTPS, IMAP, SSH → application-layer protocols
Example:
- Firefox uses HTTPS
- Thunderbird uses IMAP
- Terminal uses SSH
According to the OSI model, the protocol is Layer 7, not the program you click.
Common Layer 7 Protocols
| Protocol | Purpose |
|---|---|
| HTTP / HTTPS | Web access |
| IMAP | Access emails on server |
| POP3 | Download emails |
| SMTP | Send emails |
| SSH | Remote shell, file transfer |
| FTP / SFTP | File transfer |
| DNS | Name resolution |
| Custom APIs | REST, gRPC, proprietary |
Layer 7 protocols depend on all lower layers:
- Transport (TCP for reliability, UDP for speed)
- Routing (IP)
- Switching (Ethernet)
- Physical transmission (bits)
DNS — Domain Name System (Layer 7)
DNS is an application-layer protocol that converts human-readable names into IP addresses.
Example:
google.com → 142.250.x.x
Why DNS Exists
Humans remember names better than IP addresses.
Computers require IP addresses to communicate.
DNS bridges that gap.
DNS Resolution Flow (Step by Step)
1. Browser Cache
The browser checks:
- “Have I resolved this domain recently?”
If yes → use cached IP.
2. Operating System Cache
If browser cache misses:
- Browser asks the OS
- OS checks its DNS cache
If found → return IP.
3. DNS Resolver (ISP or Custom)
If OS cache misses:
- OS queries a DNS resolver
- Usually provided by ISP or configured manually (e.g. 8.8.8.8)
Resolvers also cache results heavily.
Full Recursive Resolution (If Not Cached)
4. Root Name Servers
- Resolver queries one of 13 root servers (A–M)
- Root servers do not know google.com
- They know who manages .com
Response:
Ask the .com TLD servers
5. TLD Name Servers (.com)
- Resolver queries the .com TLD servers
- TLD servers respond:
Ask Google’s authoritative name servers
6. Authoritative Name Servers
- Resolver queries ns1.google.com
- Gets final DNS records:
google.com → IP addresses
7. Response Propagation
- Resolver → OS
- OS → Browser
- Browser connects to IP
✅ DNS resolution complete.
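You can watch the entire hierarchy in one command: dig +trace performs the recursive walk itself instead of relying on your resolver's cache:
# follow the delegation chain: root → .com → authoritative servers
dig +trace google.com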
Common DNS Record Types
| Record | Purpose |
|---|---|
| A | Domain → IPv4 |
| AAAA | Domain → IPv6 |
| CNAME | Alias to another domain |
| MX | Mail servers |
| NS | Authoritative name servers |
| TXT | Verification, SPF, DKIM, metadata |
Viewing DNS Records
host -a google.com
Example output:
- IPv4 + IPv6 addresses
- Name servers
- MX records
- TXT verification records
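dig can also query individual record types directly, which is often more convenient than host -a:
dig google.com A +short    # IPv4 addresses
dig google.com MX +short   # mail servers
dig google.com TXT +short  # SPF / verification records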
Why IPs Change?
- Load balancing
- High availability
- Different users → different servers
Manual DNS Resolution (Bonus — Deep Understanding)
Using dig shows exactly what DNS is doing.
Step 1 — Query Root Server
dig @a.root-servers.net com NS
Returns:
- List of .com TLD servers
Step 2 — Query TLD Server
dig @a.gtld-servers.net google.com NS
Returns:
- Google’s authoritative name servers
Step 3 — Query Authoritative Server
dig @ns1.google.com google.com A
Returns:
- Final IP addresses
This demonstrates DNS hierarchy and delegation clearly.
DNS Security Problems
DNS was designed before modern security threats.
Common Risks
- DNS spoofing
- Cache poisoning
- Man-in-the-middle
- ISP or government manipulation
Why HTTPS Matters
Even if DNS is spoofed:
- HTTPS verifies server identity
- Invalid certificate → browser warning
This mitigates DNS attacks, though it does not fix DNS itself.
DNSSEC (Mention Only)
- Cryptographic DNS signatures
- Protects integrity of DNS data
- Complex, not universally deployed
/etc/hosts — Manual DNS Override
File:
/etc/hosts
Format:
IP_ADDRESS hostname
Example:
127.0.0.1 myproject.local
Why Use /etc/hosts
- Local development
- Testing
- Offline resolution
- Temporary overrides
⚠️ Overrides DNS entirely
Example
sudo nano /etc/hosts
Add:
127.0.0.1 myproject.local
Test:
ping myproject.local
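To confirm the entry is used by the normal resolution path (not just ping), getent consults the same NSS lookup order as other applications:
# should print: 127.0.0.1 myproject.local
getent hosts myproject.local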
Overriding Public Domains (Not Recommended)
127.0.0.1 google.com
Result:
- Browser connects to localhost
- HTTPS fails (certificate mismatch)
Useful only for testing or demos.
DNS Cache Issues & Fixes
Sometimes /etc/hosts changes don’t apply immediately.
Reason:
- Local DNS caching
Identify Local DNS Resolver
sudo lsof -i :53
Common:
- systemd-resolved
- dnsmasq
Flush DNS Cache (systemd)
sudo resolvectl flush-caches
Verify:
resolvectl statistics
Restart dnsmasq (if used)
sudo systemctl restart dnsmasq
Final Takeaways
You now understand:
- OSI Layer 7 vs real applications
- DNS resolution hierarchy
- DNS record types
- Manual DNS resolution with dig
- DNS security weaknesses
- HTTPS mitigation
- /etc/hosts overrides
- DNS cache flushing
This knowledge is critical for:
- DevOps debugging
- Kubernetes services
- Load balancers
- Cloud networking
- Incident response
Hostnames in a Local Network
A hostname is a human-readable name assigned to a computer on a network.
Why hostnames exist
- Easier identification of devices (e.g. ubuntu, raspberrypi)
- Used during DHCP negotiation
- Displayed on routers (device lists)
- Allows hostname-based access inside local networks
Viewing the Hostname (Linux)
hostname
Example output:
ubuntu
Some shells show it automatically in the prompt, but this is configurable.
Using Hostnames in a Local Network
From another machine:
ping ubuntu
ping ubuntu.local
If hostname resolution is configured correctly, the hostname resolves to an IP.
Changing the Hostname (Linux)
Step 1 — Edit /etc/hostname
sudo nano /etc/hostname
Example:
vm-ubuntu
Step 2 — Reboot (required)
sudo reboot
After reboot:
hostname
Output:
vm-ubuntu
Why /etc/hosts Must Also Be Updated
The hostname should resolve locally to the loopback interface.
Edit:
sudo nano /etc/hosts
Correct example:
127.0.1.1 vm-ubuntu
127.0.0.1 localhost
Why 127.0.1.1?
- Used by Debian/Ubuntu for hostname binding
- Avoids conflicts with localhost
- Still loopback (local machine)
Best practice: always update /etc/hosts after hostname change
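On systemd-based distributions, hostnamectl updates /etc/hostname for you and applies the change without a reboot (you still update /etc/hosts yourself):
sudo hostnamectl set-hostname vm-ubuntu
hostnamectl    # verify the new hostname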
.local Hostnames and mDNS
What is .local?
.local is a reserved domain for multicast DNS (mDNS).
It ensures:
- No internet DNS lookup
- Local-network only resolution
- Future-proof against new public TLDs
Why .local Is Required
❌ Bad:
server.london
✔ Good:
server.london.local
.local guarantees the name never escapes your LAN.
How mDNS Works (Conceptually)
- Device sends a multicast query:
Who is raspberrypi.local?
- All devices receive it
- The correct host replies:
I am raspberrypi.local → 192.168.1.29
No central DNS server required.
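On Linux you can trigger exactly this query with the Avahi tools (package avahi-utils on Debian/Ubuntu):
# multicast query: "who is raspberrypi.local?"
avahi-resolve -n raspberrypi.local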
mDNS Implementations
| OS | Implementation |
|---|---|
| macOS | Bonjour / Zeroconf |
| Linux | Avahi |
| Windows | Partial support |
| Routers | Often integrated |
Linux Requirements (Important)
On some distributions (e.g. CentOS):
sudo dnf install nss-mdns
sudo reboot
Without this:
- Others can resolve your host
- Your system cannot resolve others
Capturing mDNS Traffic (Wireshark)
Filter:
mdns
You will see:
- Multicast IPv6 packets
- Query + response messages
- Host announcing its IP
Best Practice for Local Networking
✔ Always use:
hostname.local
✔ Or use static IPs if stability is critical
✔ Avoid bare hostnames without .local
HTTP — How the Web Actually Works
HTTP basics
- Runs on TCP
- Text-based protocol
- Request → Response model
Inspecting HTTP in Browser
- Right-click → Inspect
- Open Network tab
- Reload page
- Click request → Headers
Example request:
GET / HTTP/1.1
Host: www.google.com
User-Agent: Firefox
Accept: text/html
HTTP Response
HTTP/1.1 200 OK
Content-Type: text/html
Content-Encoding: br
Then:
- HTML
- CSS
- JS
- Images (separate requests)
Manual HTTP Using Telnet
Open TCP connection
telnet www.google.com 80
Send HTTP request
GET / HTTP/1.1
Host: www.google.com
(blank line required)
Result
- Server replies with headers + HTML
- Pure text over TCP
Why This Matters for DevOps
- Test server behavior
- Debug load balancers
- Validate HTTP compliance
- Fuzz malformed requests safely
Example malformed request:
HELLO WORLD HTTP/9.9
Expected result:
400 Bad Request
A good server never crashes.
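curl offers the same visibility with less typing; -v prints the raw request and response headers:
# show the full HTTP exchange, headers included
curl -v http://www.google.com/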
IPv4 vs IPv6 (Practical View)
IPv4
- 32-bit
- ~4.3 billion addresses
- NAT required
- Still dominant
Example:
192.168.1.10
IPv6
- 128-bit
- 3.4 × 10³⁸ addresses
- No NAT required
- Hierarchical routing
- Better scalability
Example:
2001:db8::1
IPv6 Address Shortening
Full:
2001:0db8:0000:0000:0000:0000:0000:0001
Shortened:
2001:db8::1
Rules:
- Remove leading zeros
- :: may be used only once
Why IPv6 Is Better
✔ No NAT
✔ Easier routing
✔ Every device gets public IP
✔ Firewalls replace NAT for security
Why IPv4 Still Matters
- Many ISPs still IPv4-only
- Legacy systems
- IPv6 transition is slow
Dual Stack Is the Correct Strategy
✔ IPv4 + IPv6 enabled
✔ Servers reachable on IPv4
✔ IPv6 preferred internally
✔ Fallback always works
DevOps Recommendation
| Scenario | Recommendation |
|---|---|
| Internal networks | Dual stack |
| Public servers | IPv4 mandatory |
| Future-proofing | Add IPv6 |
| Debugging | Test both stacks |
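Testing both stacks is straightforward with curl's address-family flags:
curl -4 -I https://www.google.com/   # force IPv4
curl -6 -I https://www.google.com/   # force IPv6 (fails if no IPv6 route exists)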
Wireshark IPv6 View
You’ll see:
- Longer IP headers
- ICMPv6
- mDNS heavily uses IPv6
IPv6 ≠ exotic — it’s already active.
Key Takeaways
✔ Hostnames simplify local networking
✔ Always update /etc/hosts
✔ Use .local for LAN resolution
✔ mDNS uses multicast (not DNS)
✔ HTTP is plain text over TCP
✔ Telnet is a powerful debug tool
✔ IPv6 removes NAT limitations
✔ IPv4 still required today
SSH — Secure Shell (Concepts & Real Usage)
What is SSH?
SSH (Secure Shell) is a cryptographic network protocol used to securely access and manage remote systems over a network.
SSH provides:
- Confidentiality (encryption)
- Integrity (tamper detection)
- Authentication (verifying identities)
SSH is one of the most important tools for Linux, DevOps, Cloud, and Security engineers.
Common SSH Use Cases
- Remote shell access
- Execute commands on a remote server
- Administer systems without physical access
- Secure file transfer
- scp (Secure Copy)
- sftp (SSH File Transfer Protocol)
- Tunneling / Port forwarding
- Securely forward other protocols through SSH
In this course, we focus on shell access, file transfer, and security basics.
SSH Architecture
SSH always consists of two components:
1. SSH Server (sshd)
- Runs on the remote machine
- Listens for incoming connections
- Usually installed on servers
2. SSH Client (ssh)
- Runs on your local machine
- Used to connect to the SSH server
- Preinstalled on Linux, macOS, Windows
Real-World Context
- Cloud servers do not have monitors
- You never “log in physically”
- SSH is the primary control channel
Everything you practice here applies directly to:
- AWS EC2
- Azure VMs
- Google Cloud
- Data center servers
- Raspberry Pi devices
Network Setup Options for SSH Practice
You need two machines that can reach each other.
Method 1 — Host → Virtual Machine (Recommended)
- VM uses Bridged Adapter
- VM becomes a real device on your LAN
- Host connects directly to VM via SSH
Pros
- Simple
- Realistic
- Easy debugging
Cons
- May be blocked on corporate networks
Method 2 — VM → VM (Always Works)
- Two VMs inside a NAT Network
- VMs can reach each other
- No dependency on host or corporate LAN rules
Pros
- Works everywhere
- Fully isolated
Cons
- Slightly less realistic than bridged mode
VirtualBox NAT Network Setup (Reliable)
Key steps:
- Power off VM
- Clone VM
- Generate new MAC addresses
- Create NAT Network
- Attach both VMs to that network
- Boot both machines
- Verify connectivity
Verify IP addresses
ip addr show
Verify connectivity
ping <other_vm_ip>
ping ubuntu.local
If ping works → SSH will work.
Bridged Networking (Host → VM)
What Bridged Mode Does
- VM shares physical NIC (Ethernet/Wi-Fi)
- VM gets real IP from your router
- Appears as a separate device on LAN
After enabling bridged mode
ip addr show
You should see:
192.168.x.x
Test from host
ping 192.168.x.x
ping ubuntu.local
Installing SSH Server (Ubuntu)
On the machine you want to control:
sudo apt update
sudo apt install openssh-server
Verify service:
systemctl status ssh
SSH server starts automatically.
Connecting with SSH
Basic Syntax
ssh username@host
Examples:
ssh user@192.168.1.40
ssh user@ubuntu.local
ssh user@example.com
If username is omitted:
ssh host
SSH uses your local username by default.
First Connection Warning (Fingerprint)
You may see:
The authenticity of host cannot be established.
This is normal.
Type:
yes
This stores the server’s host key fingerprint.
We will cover this security mechanism in detail later.
Successful SSH Session
Once connected:
- Your terminal controls the remote machine
- Commands behave exactly like local shell
- exit closes the connection
SSH Security: Essential Practices
SSH is encrypted, but exposure still matters.
1. Use Strong Passwords
- Long
- Unique
- Mixed characters
- Avoid dictionary words
Bad:
sanfrancisco
Good:
A9$eP7!xQm
2. Protect Active Sessions
- Never leave SSH sessions unattended
- Lock screen or disconnect
- Anyone with your open terminal has server access
3. Change Default SSH Port
Why?
- Port 22 is scanned constantly
- Automated bots attempt brute force logins
- Log files become noisy
- Changing ports reduces noise (not absolute security)
Change SSH Port (Ubuntu)
Edit config:
sudo nano /etc/ssh/sshd_config
Change:
Port 22
To:
Port 2222
Save file.
Validate & Restart SSH
Always validate before restart:
sudo sshd -t
If no output → config is valid.
Restart service:
sudo systemctl restart ssh
Existing sessions remain active.
Connect Using New Port
ssh -p 2222 user@host
Example:
ssh -p 2222 user@192.168.1.40
Important Port Warning
Some networks block uncommon ports:
- Coffee shops
- Corporate Wi-Fi
- Public hotspots
Solutions
- Choose another port
- Use VPN
- Use SSH over port 443 if required
SSH Logs & Monitoring
Ubuntu / Debian
/var/log/auth.log
CentOS / RHEL
/var/log/secure
View SSH activity:
grep sshd /var/log/auth.log
You will see:
- Successful logins
- Failed password attempts
- Source IPs
- Target usernames
Changing the SSH port makes real attacks visible, not buried in noise.
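On systemd distributions the same events are also available via the journal (the unit name is ssh on Ubuntu, sshd on RHEL):
# follow SSH login activity live
journalctl -u ssh -f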
Why This Matters in Production
SSH is:
- Your primary control channel
- Your highest-risk exposed service
- The first target of attackers
Understanding SSH deeply is non-negotiable for:
- DevOps
- Cloud Engineers
- SRE
- Security Engineers
Restrict SSH Access to Specific Users
By default:
All local users with passwords can SSH
This is not ideal.
Allow Only Specific Users
In sshd_config:
AllowUsers yannis
Multiple users:
AllowUsers yannis deploy admin
Validate & restart:
sudo sshd -t
sudo systemctl restart ssh
⚠️ Lockout Warning (Very Important)
If you mistype the username:
- SSH will reject everyone
- If this is a remote server → you are locked out
How to Avoid Locking Yourself Out (CRITICAL)
Golden Rule
Always keep one SSH session open.
Why?
- SSH sessions are independent processes
- Existing sessions survive SSH restarts
Safe Workflow
- Open Terminal A
- Connect via SSH
- Make SSH changes
- Test from Terminal B
- If broken → fix using Terminal A
- Only close Terminal A when confirmed
Example: SSH stopped
sudo systemctl stop ssh
- Existing session → still alive
- New connections → rejected
You can fix it:
sudo systemctl start ssh
This spares you a rescue-mode recovery.
SSH Key Authentication (Passwordless & Secure)
Passwords are:
- Guessable
- Brute-forceable
- Inconvenient for automation
SSH keys solve all of this.
How SSH Keys Work (Concept)
- Private key → stays on your machine
- Public key → copied to server
- Server verifies identity without passwords
- Private key is never transmitted
Generate SSH Key (Client Machine)
ssh-keygen -t rsa -b 4096
Press Enter for defaults.
Files created:
~/.ssh/id_rsa (PRIVATE – never share)
~/.ssh/id_rsa.pub (PUBLIC – safe to share)
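RSA is shown because it works everywhere; on current OpenSSH versions Ed25519 is a common alternative (shorter keys, modern cryptography):
# alternative: generate an Ed25519 key pair (~/.ssh/id_ed25519)
ssh-keygen -t ed25519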
Copy Public Key to Server
ssh-copy-id -i ~/.ssh/id_rsa.pub -p 2222 user@server
Enter your password once.
Login Without Password
ssh -p 2222 user@server
✔ No password
✔ Secure
✔ Perfect for automation
Server-Side Key Storage
Location:
~/.ssh/authorized_keys
Permissions:
~/.ssh → 700
authorized_keys → 600
Each line = one allowed public key
Comments help identify owners.
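If key login mysteriously fails, wrong permissions are the usual cause. These commands enforce the values above on the server:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys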
Why SSH Keys Are Essential
- Impossible to brute-force
- Required for CI/CD
- Required for automation
- Required for production security
SSH keys are not optional in real environments.
Disable Password Authentication for SSH (Key-Only Login)
Why Disable Password Authentication?
Now that public/private key authentication is configured, allowing password login is unnecessary and risky.
Security Benefits
- Massive attack-surface reduction
- Passwords can be brute-forced
- SSH keys are thousands of bits long (e.g. 4096-bit RSA)
- Practically impossible to guess
- Two-layer security
- SSH key → login
- Password → sudo (privilege escalation)
- Even if SSH access is compromised
- Attacker still needs the user password
- Root login is already disabled
- Privilege escalation is blocked
How Authentication Works After This Change
Login
- Uses private key
- No password accepted
System changes
sudo <command>
- Still requires user password
This means:
SSH key ≠ root access
Verify Key-Based Login Works (Before Disabling Passwords)
From your client:
ssh -p 2222 user@server
You should log in without a password prompt.
⚠️ If this does not work, STOP. Do not continue.
Disable Password Authentication
Edit SSH server configuration:
sudo nano /etc/ssh/sshd_config
Set:
PasswordAuthentication no
(Optional but recommended)
PermitEmptyPasswords no
Validate & Apply Configuration
sudo sshd -t
No output = configuration is valid
Restart SSH:
sudo systemctl restart ssh
Test Enforcement (Important)
Switch to a user without SSH keys (or another local user):
ssh -p 2222 user@server
Expected result:
Permission denied (publickey).
✔ Password login is now fully disabled
✔ Only authorized SSH keys can log in
Critical Warnings (Very Important)
1. Other Users Will Be Locked Out
If teammates still use passwords:
- They must add SSH keys
- Otherwise access is lost
2. Losing Your Private Key = Lost Access
If your laptop is:
- Lost
- Damaged
- Encrypted drive wiped
You cannot log in.
Best Practice
- Create at least two SSH keys
- Store on different devices
- Add both public keys to authorized_keys
3. If Private Key Is Leaked
If someone gets your private key:
- ALL servers using that key are compromised
- You must:
- Remove public key from all servers
- Generate a new key pair
- Re-deploy keys everywhere
Prevent SSH Connection Drops (Keep-Alive)
The Problem
SSH connections may drop if:
- No activity for a long time
- NAT, firewall, or router times out
- You take a break (lunch, meeting, coffee)
This is annoying and dangerous:
- Lost working directory
- Lost environment state
- Possible lockout during SSH changes
The Solution: Keep-Alive Packets
SSH can send empty packets periodically to keep the connection alive.
Best practice:
Configure this on the client, not the server.
Configure SSH Keep-Alive (Client Side)
Edit user SSH config:
nano ~/.ssh/config
Add:
Host *
ServerAliveInterval 60
ServerAliveCountMax 3
Meaning
- Every 60 seconds → send keep-alive packet
- Allow 3 missed responses
- Prevents idle disconnects
Secure the Config File
chmod 600 ~/.ssh/config
Result
- SSH sessions stay alive for hours
- No random disconnects
- Safe during breaks
- Extremely useful during server maintenance
As long as:
- Internet does not drop completely
- Laptop stays powered
Your SSH session remains active.
Why This Matters in Production
These features prevent:
- Locking yourself out
- Losing work mid-operation
- SSH disconnects during critical changes
This is mandatory knowledge for:
- DevOps Engineers
- Cloud Engineers
- SREs
- Linux Administrators
SSH Fingerprints: Why They Are Critical for Security
What Is an SSH Fingerprint?
- Every SSH server generates host keys when sshd is installed
- A fingerprint is a cryptographic hash of that host key
- It uniquely identifies that exact server
When you connect for the first time, SSH asks:
“Are you sure you want to continue connecting?”
Once accepted, the fingerprint is saved locally.
Where Fingerprints Are Stored (Client Side)
~/.ssh/known_hosts
This file maps:
hostname → fingerprint
From that moment on:
- SSH expects the fingerprint to remain the same
- Any change triggers a security warning
Why Fingerprint Warnings Must NEVER Be Ignored
Fingerprint Change = Red Flag
If SSH says:
WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!
Possible causes:
- DNS now resolves to a different server
- Server was reinstalled
- Man-in-the-middle attack
What Is a Man-in-the-Middle (MITM) Attack?
Instead of connecting directly:
You → Attacker → Real Server
The attacker:
- Creates their own SSH host key
- Forwards traffic to the real server
- Can read passwords or commands
SSH fingerprints are what detect this attack.
Why Encryption Alone Is Not Enough
SSH traffic is encrypted, but:
- Encryption only protects the transport
- If the endpoint is wrong, encryption does not help
Fingerprint verification proves the server identity.
How to Manually Verify an SSH Fingerprint (Best Practice)
Step 1: Get the fingerprint from the server (trusted access)
On the server:
ssh-keygen -lf /etc/ssh/ssh_host_ed25519_key.pub
(Or RSA if used)
Step 2: Compare with client warning
SSH shows:
SHA256:QOG...ELkww
Both must match exactly.
Only then should you type:
yes
Important Reality Check
If SSH is your only access:
- First connection always requires one trust decision
- Best practice: trust once, verify manually, then never ignore warnings again
After first trust:
- Any future warning = stop and investigate
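If you have confirmed that a fingerprint change is legitimate (e.g. a planned reinstall), remove only the stale entry instead of deleting known_hosts (the hostname below is a placeholder):
# delete the old fingerprint for this host, then reconnect and re-verify
ssh-keygen -R server.example.com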
SFTP: Secure File Transfer Over SSH
What Is SFTP?
- SSH File Transfer Protocol
- Built on top of SSH
- Fully encrypted
- Uses same authentication (password or SSH key)
SSH includes:
- Shell access
- SFTP access
(Some servers allow SFTP only, no shell)
GUI Access via SFTP (Linux)
In file manager:
sftp://user@hostname
- Supports SSH keys automatically
- Shows fingerprint warning on first connection
- Permissions enforced by Linux users
CLI File Transfer Using SCP
Copy file from server → local
scp user@server:/home/user/file.txt .
Copy directory recursively
scp -r user@server:/home/user/folder .
Copy local → server
scp file.txt user@server:/home/user/
Specify SSH port
scp -P 2222 file.txt user@server:/home/user/
⚠️ SCP uses uppercase -P
⚠️ SSH uses lowercase -p
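The interactive sftp client uses the same uppercase flag as scp:
# interactive file transfer session over SSH port 2222
sftp -P 2222 user@server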
Using Cyberduck (Mac & Windows)
Cyberduck supports:
- SFTP
- SSH key authentication
- Drag & drop
- Permission management
Steps:
- Select SFTP
- Enter host, port, username
- Choose SSH private key (recommended)
- Verify fingerprint
- Connect
screen: Shared & Persistent Terminal Sessions
What Is screen?
- Terminal multiplexer
- Creates a virtual terminal
- Multiple users can attach to the same session
- Session survives SSH disconnects
Why DevOps Engineers Use screen
- Collaborative debugging
- Long-running processes
- Server maintenance
- Pair troubleshooting over SSH
Install screen
# Ubuntu
sudo apt install screen
# CentOS
sudo dnf install screen
Basic screen Workflow
Start a session
screen
Detach (leave it running)
Ctrl + A, then Ctrl + D
List sessions
screen -ls
Reattach
screen -r <session-id>
(-r reattaches a detached session; -x attaches to a session that is still attached elsewhere)
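Naming sessions makes reattaching easier than juggling numeric IDs:
screen -S maintenance    # start a named session
screen -r maintenance    # reattach to it later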
Sharing a Terminal with a Colleague
- You start
screen - Colleague SSHs into same server
- Colleague runs:
screen -x
Now:
- Both see the same terminal
- Both can type
- Ideal for live collaboration
Exit vs Detach (Important)
| Action | Effect |
|---|---|
| exit | Terminates the session |
| Ctrl+A Ctrl+D | Detaches safely |
To fully stop screen:
exit
exit
(Exit shell → exit screen)
Why screen Belongs in Your Toolbox
- No external software
- Works over SSH
- Extremely reliable
- Used in real production environments
Summary
SSH Security
- Fingerprints protect against MITM
- Never ignore fingerprint warnings
- Verify once, trust forever
File Transfer
- SFTP = secure, encrypted, simple
- SCP for CLI automation
- GUI tools supported
Collaboration
-
screenenables shared terminals - Safe, fast, SSH-native
This completes a professional, real-world SSH workflow used daily by DevOps engineers.