Shafiq Ur Rehman
HTTP vs HTTPS: One Letter Between You and a Hacker's Best Day

HTTP sends your passwords in plain text. HTTPS stops that. But understanding why every mechanism in HTTPS exists makes you a sharper engineer and a better security thinker. This article breaks down the full picture, starting from what breaks without protection and working up through each fix.


1. What HTTP Actually Does (And Why That Is a Problem)

HTTP (HyperText Transfer Protocol) sends every request and response as raw, readable text. Every router, ISP node, and transit server between your device and the destination sees the full content of every request, including passwords, session tokens, and personal data.

This is not a flaw that crept in through negligence. The protocol was designed in 1991 for an academic network where trust was assumed. The internet grew into banking, healthcare, and global commerce without updating that foundational assumption.

TCP/IP, the delivery system beneath HTTP, moves packets between machines. It was never designed to hide what is inside them from the machines doing the routing.

Warning: On public Wi-Fi, anyone on the same network can read your HTTP traffic with freely available tools. HTTPS is the minimum bar for any site that handles user input.

Key problems with plain HTTP:

  • Credentials sent as readable text across every network hop
  • Session tokens visible to anyone on the same network
  • No way to confirm the server you reached is the server you intended to reach
  • No detection of content modification in transit
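To make the exposure concrete, here is what a plain HTTP login request looks like as raw bytes on the wire, sketched in Python (the host and credentials are made up for illustration):

```python
# A plain HTTP POST carrying credentials, exactly as every router
# and Wi-Fi sniffer between client and server sees it.
# (Hypothetical host and credentials, for illustration only.)
request = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 31\r\n"
    b"\r\n"
    b"username=alice&password=hunter2"
)

# Anyone capturing these bytes can read the password directly.
print(b"password=hunter2" in request)  # True
```

There is no parsing or decryption step for the attacker: the credentials are just a substring of the captured bytes.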

Real-world case: In 2010, a Firefox extension called Firesheep was released publicly. It automated the capture of unencrypted session cookies on shared Wi-Fi networks. Anyone on the same coffee shop network could hijack Facebook, Twitter, and Flickr sessions with a single click. This forced major platforms to adopt HTTPS for all traffic, not just login pages.

[Further reading: RFC 7230 - HTTP/1.1 Message Syntax and Routing]


Background: What Is TCP/IP?

TCP/IP is the foundational communication standard of the Internet. TCP (Transmission Control Protocol) splits your data into packets and ensures they arrive correctly. IP (Internet Protocol) addresses and routes those packets across networks. Together, they form the postal system of the internet. They deliver packets reliably, but they do not encrypt or authenticate what is inside those packets.


2. The Key Distribution Problem

Symmetric encryption (like AES-256) is fast and computationally strong. Both sides encrypt and decrypt using the same key. The problem: both sides must already share that key before the encrypted conversation starts.

The core paradox: To share the key securely, you need a secure channel. To have a secure channel, you need the key. You cannot solve one without the other.

If you send the key over the same network you want to protect, an attacker intercepting the key can decrypt every message that follows. You have added encryption without adding security.
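A toy sketch of the paradox, using XOR in place of a real cipher such as AES (illustration only, not real cryptography):

```python
import secrets

def xor_cipher(data: bytes, key: bytes) -> bytes:
    # Toy symmetric cipher: XOR each byte with the key. NOT real crypto;
    # it stands in for AES to show the shared-key property.
    return bytes(b ^ k for b, k in zip(data, key))

key = secrets.token_bytes(32)           # the shared symmetric key
message = b"transfer $500 to account 42"
ciphertext = xor_cipher(message, key)

# If the key itself traveled over the same untrusted network, an
# eavesdropper who recorded it decrypts every message that follows:
intercepted_key = key                   # attacker captured the key in transit
print(xor_cipher(ciphertext, intercepted_key) == message)  # True
```

The encryption step worked perfectly; the security failed anyway, because the key crossed the same channel it was supposed to protect.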

This problem blocked practical, secure internet communication for decades. It was not solved until public-key cryptography became viable.

Note: This is called the key distribution problem, and it was one of the most consequential open problems in cryptography until the 1970s. The Diffie-Hellman key exchange (1976) was the first published solution. RSA followed in 1977.

Counter-view: Some argue that pre-shared keys work fine in closed systems, such as military or enterprise networks, where physical key distribution is possible. They are right. The key distribution problem is specifically a problem for open, anonymous communication across untrusted networks, which is what the public internet requires.

[Further reading: Diffie, W. and Hellman, M. - New Directions in Cryptography (1976)]


3. Asymmetric Encryption: How the Key Exchange Problem Gets Solved

Asymmetric encryption uses two mathematically linked keys. What the public key encrypts, only the private key can decrypt. The public key is shared openly. The private key never leaves the server.

How this solves the distribution problem:

  1. The client gets the server's public key (sent openly; anyone can see it)
  2. The client encrypts a secret value with that public key
  3. Only the server holding the private key can decrypt it
  4. Both sides now share a secret that never crossed the network in usable form
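The four steps above can be sketched with classic finite-field Diffie-Hellman and Python's built-in modular exponentiation. The parameters below are illustration-sized stand-ins, not the elliptic-curve machinery TLS actually uses:

```python
import secrets

# Toy finite-field Diffie-Hellman. Real TLS uses elliptic curves and
# carefully chosen parameters; this prime is a stand-in for illustration.
p = 2**127 - 1          # a Mersenne prime, far too small for real use
g = 3

a = secrets.randbelow(p - 2) + 2    # client's ephemeral private value
b = secrets.randbelow(p - 2) + 2    # server's ephemeral private value

A = pow(g, a, p)        # public values: safe to send openly
B = pow(g, b, p)

client_secret = pow(B, a, p)        # client derives the session secret
server_secret = pow(A, b, p)        # server derives the same secret

# Both sides now hold the same secret; only A and B crossed the network.
print(client_secret == server_secret)  # True
```

An observer sees p, g, A, and B, but recovering a or b from them is the discrete logarithm problem, which is computationally infeasible at real key sizes.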

Why not use asymmetric encryption for all traffic?

RSA encryption is roughly 1,000 times slower than AES. Encrypting a video stream or a large API response with RSA would make the web unusable. TLS uses a hybrid model:

  • Asymmetric encryption handles the key exchange (one time per session)
  • Symmetric AES uses the resulting session key for all actual data transfer

ECDHE (Elliptic Curve Diffie-Hellman Ephemeral) is the modern replacement for RSA in key exchange. It produces the same security with smaller key sizes and faster computation. The "Ephemeral" part means the keys are one-time-use and discarded after each session, which is critical to Perfect Forward Secrecy (covered in section 6).

Real-world case: In 2017, the ROCA vulnerability (CVE-2017-15361) showed that RSA keys generated by a widely deployed Infineon library, including 1024-bit keys, could be factored in practical time. Incidents like this accelerated the industry-wide shift toward elliptic-curve cryptography, where a 256-bit key offers security comparable to a 3072-bit RSA key.

[Further reading: NIST SP 800-56A Rev.3 - Elliptic Curve Key Establishment Schemes]


Background: What Is a Session Key?

A session key is a temporary symmetric key generated fresh for each connection. It exists only for the duration of one TLS session. After the session ends, the key is discarded. All the actual web traffic during that session is encrypted and decrypted using this key. Because it is symmetric, encryption and decryption are fast.


4. The TLS Handshake: Four Phases, Four Problems Solved

Each phase of the TLS handshake solves a specific attack. Skipping any one phase opens a specific class of vulnerability.

Phase 1: Capability Negotiation

The client sends supported TLS versions and cipher suites, plus a random value (nonce). Without this phase, an attacker positioned between client and server could strip the negotiation and force both sides to use an older, weaker TLS version. This is called a downgrade attack. The nonce prevents replay attacks, where a recorded handshake is played back to establish a fraudulent session.

Phase 2: Identity Assertion

The server sends its certificate. The certificate contains the server's public key, its domain name, and a digital signature from a Certificate Authority (a trusted third party that verifies domain ownership). Without this phase, the client has no way to confirm it is talking to the intended server. Encrypting traffic to an impostor is functionally the same as sending it in plaintext.

Phase 3: Key Exchange

Both sides run the ECDHE algorithm using their respective key material to independently derive the same session key. The session key never travels across the network. An attacker watching the exchange sees only public parameters, from which deriving the private session key is computationally infeasible.

Phase 4: Transcript Verification

Both sides hash the complete record of every handshake message and compare the results. If any message was altered or injected mid-handshake, the hashes will not match, and the connection terminates immediately. This phase confirms that the negotiation itself was not tampered with.
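A simplified sketch of the transcript check, hashing an invented message sequence with SHA-256 (TLS 1.3 actually binds this hash into the Finished message, with more structure than shown here):

```python
import hashlib

def transcript_hash(messages):
    # Each side hashes every handshake message it saw, in order.
    h = hashlib.sha256()
    for m in messages:
        h.update(m)
    return h.hexdigest()

# Invented message sequence, standing in for real handshake records.
client_view = [b"ClientHello", b"ServerHello", b"Certificate", b"Finished"]
server_view = list(client_view)

# Untampered handshake: both transcripts match.
print(transcript_hash(client_view) == transcript_hash(server_view))  # True

# An attacker who alters or injects a message mid-handshake changes
# one side's transcript, and the hashes diverge:
server_view[1] = b"ServerHello-downgraded"
print(transcript_hash(client_view) == transcript_hash(server_view))  # False
```

This is why a downgrade attempt that survived Phase 1 still fails here: the tampered negotiation leaves a different transcript on each side.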

Warning: TLS 1.0 and 1.1 are deprecated and should be disabled on all servers. TLS 1.0 is vulnerable to attacks such as BEAST, and tolerating legacy fallback opens the door to downgrade attacks like POODLE. TLS 1.3, standardized in 2018, is the current secure baseline. It removed all cipher suites that do not provide Perfect Forward Secrecy.
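On the client side, Python's standard ssl module can refuse the deprecated versions explicitly; a minimal sketch:

```python
import ssl

# Build a client context that refuses deprecated protocol versions.
# Modern Python defaults already exclude TLS 1.0/1.1, but setting the
# floor explicitly documents the policy and survives default changes.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # reject TLS 1.0 and 1.1

print(ctx.minimum_version == ssl.TLSVersion.TLSv1_2)  # True
```

A socket wrapped with this context will fail the handshake against a server that only offers TLS 1.1 or older, rather than silently negotiating down.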

Real-world case: In 2014, the POODLE attack (Padding Oracle On Downgraded Legacy Encryption) demonstrated that an active attacker could force a TLS 1.2 connection to downgrade to SSL 3.0, then exploit a padding oracle vulnerability to decrypt session cookies. The attack required control of the network between client and server, a realistic position for an attacker on shared Wi-Fi.

[Further reading: RFC 8446 - The Transport Layer Security (TLS) Protocol Version 1.3]


Pros and Cons of TLS Overhead

  • Encryption: prevents eavesdropping on all traffic. Cost: marginal CPU overhead per connection.
  • Handshake: establishes an authenticated, shared key. Cost: adds 1-2 round trips on first connection.
  • Certificate validation: confirms server identity. Cost: requires an OCSP or CRL check for revocation status.
  • TLS 1.3 0-RTT: resumes sessions with zero round trips. Cost: replay attacks possible on non-idempotent requests.
  • PFS (ECDHE): keeps past sessions secure after a later key compromise. Cost: slightly more computation than static RSA.
  • Certificate expiration: limits damage from key theft. Cost: requires automated renewal management.

5. Certificates and Certificate Authorities: The Trust Problem

Without certificates, encryption protects the channel but not the identity at the other end. An attacker positioned between you and your bank can establish two encrypted connections: one with you and one with the real bank. They decrypt, read, and re-encrypt everything. From your perspective, the connection looks secure. You are just talking to the wrong party.

A TLS certificate solves this by binding a server's public key to its domain name, with a Certificate Authority (CA) signature as proof.

What a CA actually does:

When you request a certificate for bank.com, the CA independently verifies that you control that domain (through DNS records, HTTP challenges, or email verification). It then signs the certificate with its own private key. Every major OS and browser ships with a pre-installed list of trusted CA public keys.

When your browser connects to bank.com, it checks whether the certificate's CA signature is valid against a trusted CA it already knows. If an attacker substitutes their own public key, the CA signature fails validation, and the browser refuses the connection.

Counter-view: The CA model concentrates trust in a relatively small number of organizations. In 2011, Dutch CA DigiNotar was compromised, and attackers issued fraudulent certificates for google.com and other high-value domains. Iranian users' traffic was intercepted using these certificates. The entire DigiNotar CA was subsequently removed from trust lists. This event demonstrated that the CA model's weakest point is not the cryptography; it is the security of the CA organizations themselves.

Real-world case: Certificate Transparency (CT) logs were introduced in 2013 and became mandatory for Chrome in 2018. Every certificate issued by any CA must be logged publicly in append-only CT logs. This means fraudulent certificate issuance becomes detectable, because the certificate will appear in a public log even if the intended domain owner was not notified.

[Further reading: RFC 9162 - Certificate Transparency Version 2.0]


Background: What Is SHA-256?

SHA-256 is a hashing algorithm. You feed it any input (a document, a certificate, a password) and it produces a fixed 256-bit fingerprint. Finding two different inputs that produce the same fingerprint (a collision) is computationally infeasible, and you cannot reverse a SHA-256 hash to recover the original input. CAs sign the SHA-256 hash of a certificate rather than the certificate itself, because RSA can only sign small inputs directly and because a hash collision would allow attaching a legitimate signature to a fraudulent certificate.
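Python's hashlib shows both properties directly (the certificate bytes below are a made-up placeholder):

```python
import hashlib

# Any input maps to a fixed 256-bit (32-byte) fingerprint.
cert_bytes = b"-----BEGIN CERTIFICATE----- ...placeholder..."
fingerprint = hashlib.sha256(cert_bytes).hexdigest()
print(len(fingerprint))          # 64 hex characters = 256 bits

# Changing a single byte yields an unrelated fingerprint (the
# avalanche effect), which is what makes tampering detectable.
tampered = hashlib.sha256(cert_bytes + b".").hexdigest()
print(fingerprint == tampered)   # False
```

Because the fingerprint length is constant, a CA can sign a hash of an arbitrarily large certificate, and any later modification to the certificate invalidates the signature.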


6. Perfect Forward Secrecy: Protecting Past Sessions

Before Perfect Forward Secrecy became standard, the client sent the session secret encrypted under the server's long-term public key, so anyone who later obtained the matching private key could recover the session key for every recorded connection. This created a retroactive vulnerability.

The "record now, decrypt later" attack:

  1. An attacker records all encrypted traffic between users and a server today
  2. Five years later, the attacker obtains the server's private key (through a breach, a legal order, or social engineering)
  3. The attacker can now decrypt every session ever recorded, retroactively

ECDHE defeats this by generating fresh, independent key pairs for every session. The session key derives from these ephemeral keys, not from the server's long-term key. When the session ends, the ephemeral keys are permanently destroyed. An attacker holding the server's private key gains nothing from it for past sessions.
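A sketch of the ephemeral idea, reusing toy finite-field Diffie-Hellman in place of the elliptic curves TLS actually uses:

```python
import secrets

# Toy ephemeral key exchange (illustration-sized parameters; real TLS
# uses elliptic curves). The point: each session's secret comes only
# from one-time values that are discarded when the session ends.
p, g = 2**127 - 1, 3

def ephemeral_session_secret() -> int:
    a = secrets.randbelow(p - 2) + 2   # fresh per-session private values
    b = secrets.randbelow(p - 2) + 2
    shared = pow(pow(g, b, p), a, p)   # both sides derive this secret
    del a, b                           # ephemeral keys destroyed here
    return shared

# Two sessions produce independent secrets. Compromising a long-term
# server key later reveals nothing about either one, because neither
# secret was ever derived from that key.
print(ephemeral_session_secret() != ephemeral_session_secret())  # True
```

Contrast this with static RSA key exchange, where a single long-term private key was the master input to every recorded session.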

TLS 1.3 made PFS mandatory. Every cipher suite in TLS 1.3 requires ephemeral key exchange. All static RSA key exchange cipher suites were removed.

Warning: TLS configurations that still allow cipher suites like TLS_RSA_WITH_AES_256_CBC_SHA have no Perfect Forward Secrecy. Audit your server's TLS configuration regularly. Tools like SSL Labs' server test (ssllabs.com/ssltest) check for this explicitly.

Real-world case: The Snowden documents (2013) revealed that intelligence agencies were storing large volumes of encrypted internet traffic. The stated rationale was that future advances in cryptanalysis or access to private keys could make currently unreadable traffic readable later. PFS directly limits the value of bulk collection by ensuring that traffic encrypted with ephemeral keys cannot be retroactively decrypted.

[Further reading: RFC 7457 - Summarizing Known Attacks on TLS and DTLS]


7. What HTTPS Does Not Protect: The Application Layer

TLS secures the transport pipe between the browser and the server. It does not inspect the payload flowing through that pipe. A SQL injection string arrives at your database:

  • Encrypted in transit (TLS did its job)
  • Intact and unmodified (no tampering occurred)
  • Fully executable (TLS never looked at the content)

The payload '; DROP TABLE users; -- is delivered correctly. What your application does with it is entirely outside TLS's scope.

Threats outside TLS's responsibility:

  • SQL Injection: Malicious database commands embedded in user input, executed when the application fails to sanitize them
  • XSS (Cross-Site Scripting): Malicious scripts injected into web pages, executed in other users' browsers
  • CSRF (Cross-Site Request Forgery): Tricks authenticated users into submitting requests they did not intend to make
  • Authentication bypass: Logic flaws in how the server verifies identity, unrelated to encryption
  • DDoS at the application layer: Floods of legitimate-looking HTTPS requests that exhaust server resources

Real-world case: The 2012 LinkedIn breach exposed 6.5 million password hashes. The passwords were hashed without salt using SHA-1, making the majority crackable within hours using rainbow tables. The site used HTTPS. The encryption protected traffic in transit; it had no bearing on how the server stored passwords internally.

Warning: Deploying HTTPS and considering security "complete" is one of the most common and costly security misconceptions in web development. HTTPS handles one threat model. Your application, database, authentication system, and infrastructure each have separate attack surfaces that require separate controls.

Defense layers beyond HTTPS:

  1. Input validation and parameterized queries protect against SQL injection and XSS
  2. CSRF tokens protect against cross-origin request forgery
  3. WAF (Web Application Firewall) filters malicious patterns at the application boundary
  4. IAM and MFA control who can authenticate and what they can access
  5. DNSSEC and HSTS prevent DNS poisoning and protocol downgrade before TLS starts
  6. Logging and monitoring detect what all other layers missed
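The first defense layer, parameterized queries, can be sketched with Python's built-in sqlite3 module, using the payload from above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# The classic injection payload, delivered intact over HTTPS:
payload = "'; DROP TABLE users; --"

# Parameterized query: the driver treats the payload strictly as data,
# never as SQL, so the DROP statement is stored as a literal string.
conn.execute("INSERT INTO users (name) VALUES (?)", (payload,))

rows = conn.execute("SELECT name FROM users").fetchall()
print(rows)  # [("'; DROP TABLE users; --",)]
```

Had the query been built by string concatenation instead, the same payload would have terminated the string literal and executed the DROP statement. TLS is indifferent to both outcomes; only the application layer decides.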

[Further reading: OWASP Top Ten - owasp.org/www-project-top-ten]


8. SSH on AWS EC2: Same Cryptography, Different Trust Model

SSH connections to AWS EC2 instances use the same asymmetric cryptography as HTTPS: key pairs, encryption, and integrity checks. But the trust model is completely different.

How EC2 SSH works:

  • AWS generates a key pair when you create the instance
  • You receive the private key file (.pem) once, at creation time
  • The public key is placed in the instance's ~/.ssh/authorized_keys file
  • On connection, the client proves possession of the private key through a cryptographic challenge

No CA is involved. Trust comes from directly holding the key. You control both sides of the connection.

TOFU (Trust On First Use):

On the first SSH connection to an EC2 instance, your terminal displays the server's fingerprint (a hash of the host's public key) and asks you to verify it. You confirm manually. The fingerprint is cached locally. Future connections verify automatically against the cached value.
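The fingerprint ssh displays is the base64-encoded SHA-256 of the host's public key blob; a sketch with hashlib (the key bytes below are made up, a real blob comes from the host key file):

```python
import base64
import hashlib

# OpenSSH shows host key fingerprints as "SHA256:" plus the base64
# encoding of the SHA-256 digest of the raw public key blob, with
# the trailing "=" padding stripped.
host_key_blob = b"example-host-public-key-bytes"   # hypothetical key material

digest = hashlib.sha256(host_key_blob).digest()
fingerprint = "SHA256:" + base64.b64encode(digest).decode().rstrip("=")

print(fingerprint.startswith("SHA256:"))  # True
```

On first connect you compare this value against one obtained out of band (for EC2, the instance's console output); afterwards the ssh client checks it automatically against the cached entry in ~/.ssh/known_hosts.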

Why HTTPS cannot use TOFU:

A developer logs into perhaps five EC2 instances. Manual fingerprint verification per connection is practical. A browser user visits thousands of different websites over years of browsing. Manually verifying every server's fingerprint on the first visit is not operationally possible. The CA model automates the trust establishment that TOFU requires you to perform by hand.

Note: When SSH prints "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!" and aborts with "Host key verification failed," the server's fingerprint has changed since your last connection. This is normal after rebuilding an EC2 instance, but on a server you have not touched recently it warrants investigation, because it could indicate an MITM attack.

Real-world case: In misconfigured automated deployment pipelines, StrictHostKeyChecking=no is sometimes set to prevent SSH from prompting on first connection. This disables TOFU entirely and accepts any host key, including a forged one. In 2020, several CI/CD pipeline security audits found this configuration common in enterprise environments, leaving deployments vulnerable to supply chain attacks.

[Further reading: OpenSSH Manual - ssh_config(5)]


Summary: Each Mechanism and the Attack It Prevents

  • Encryption in transit: prevents eavesdropping at any network hop. If missing: credentials, tokens, and data are visible to every intermediary.
  • Asymmetric key exchange (ECDHE): prevents key interception during setup. If missing: there is no secure way to share the symmetric key over the very channel it is meant to secure.
  • TLS certificates: prevent MITM via an impostor public key. If missing: you get an encrypted tunnel to the wrong party.
  • Certificate Authorities: prevent self-signed certificate fraud. If missing: there is no scalable way to verify domain ownership.
  • SHA-256 in certificate chains: prevents certificate forgery via hash collision. If missing: valid CA signatures could be attached to fraudulent certificates.
  • Phased TLS handshake: prevents downgrade attacks and injected messages. If missing: each phase loses the guarantees it inherits from the previous one.
  • Perfect Forward Secrecy: prevents record-now, decrypt-later attacks. If missing: a long-term key compromise exposes all past sessions.
  • Certificate expiration: prevents indefinite use of a stolen private key. If missing: one stolen key grants permanent impersonation.
  • Application layer controls: prevent SQL injection, XSS, CSRF, and auth bypass. If missing: TLS secures the pipe but never inspects what flows through it.

HTTPS is the first defense, not the only one. Every layer listed above addresses a different attacker capability. Remove any one layer, and a specific class of attack becomes practical. That is why the architecture is built the way it is, and why "we have HTTPS" is the start of a security conversation, not the end of one.

[Further reading: OWASP Web Security Testing Guide - owasp.org/www-project-web-security-testing-guide]
