I was adding authentication to a side project and started evaluating managed auth services. Auth0 gives you 25,000 MAU free. Clerk gives you 50,000. Cognito gives you 10,000. For a side project, any of them would cost zero dollars.
But all of them meant routing my auth through someone else's infrastructure. My users, my verification flow, my data - controlled by a third party. All I needed was for users to prove they own an email or phone number. No passwords, no social login, no user profiles. Just "enter your email, get a code, get a token."
So I built one. Two Lambda functions, a DynamoDB table, and 180 lines of Python. It's been running in production since February 2023 and costs about a dollar a month.
This article walks through the real code, the trade-offs, and a build-vs-buy framework so you can decide whether owning your auth stack is worth it - or whether a managed service is the right call.
You can try the live demo and browse the source code on GitHub.
## How OTP Works
Most OTP tutorials generate codes with `random.randint(100000, 999999)`. That's not a one-time password - it's a random number with no cryptographic guarantees. An attacker who intercepts one code learns nothing about the next, but there's no mathematical relationship enforcing that.

A proper OTP uses HOTP (HMAC-based One-Time Password), defined in RFC 4226. HOTP takes two inputs: a shared secret (a random base32 string, minimum 128 bits per the RFC) and a counter (an integer that increments). Run them through HMAC-SHA1 and dynamic truncation, and you get a 6-digit code that's cryptographically tied to that specific secret and counter value. The pyotp library implements this algorithm, so you don't write any cryptography yourself - you pass in the secret and counter, and it gives you back a code.
```python
import pyotp

secret = pyotp.random_base32()
hotp = pyotp.HOTP(secret)

code = hotp.at(0)     # first code
hotp.verify(code, 0)  # True - matches counter 0
hotp.verify(code, 1)  # False - wrong counter
```
Three properties make HOTP work for authentication:
- Can't reuse. Each code is tied to a specific counter value. Once verified, the record is deleted, so the same code never works again.
- Can't predict. Without the secret, there's no way to compute the next code. The secret never leaves the server.
- Can't go backwards. A code generated for counter 5 doesn't work for counter 4.
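To demystify what pyotp is doing under the hood, here's a from-scratch sketch of RFC 4226's HMAC-and-dynamic-truncation step - illustration only, stick with pyotp in real code. It reproduces the test vectors from RFC 4226 Appendix D:

```python
import hashlib
import hmac
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick an offset,
    # then take 31 bits starting at that offset
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{code % 10 ** digits:0{digits}d}"

# RFC 4226 Appendix D test vectors for the ASCII secret "12345678901234567890"
assert hotp(b"12345678901234567890", 0) == "755224"
assert hotp(b"12345678901234567890", 1) == "287082"
```

pyotp's version additionally handles base32 decoding of the secret and, among other details, comparison of codes during `verify` - which is why you shouldn't roll this yourself outside of a learning exercise.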
### HOTP vs TOTP

You've probably used TOTP - Time-based One-Time Password - with authenticator apps like Google Authenticator. TOTP is HOTP with time as the counter. Codes rotate every 30 seconds.

| | HOTP (counter-based) | TOTP (time-based) |
|---|---|---|
| Code valid until | Used, or expired by TTL | ~30 seconds |
| Use case | "Send me a code" flows | Authenticator apps |
| Clock sync needed | No | Yes |
| User pressure | None - enter when ready | Must type within window |
For "type your email, get a code" flows, HOTP is the right choice. TOTP would mean the user has 30 seconds to check their email and type the code - that's stressful and leads to failed attempts. With HOTP, the code stays valid until the record expires (5 minutes in my implementation) or the user enters it successfully.
## The Architecture
The whole system is four components: two Lambda functions, one DynamoDB table, and whichever delivery service you prefer for sending codes. I use Mailgun for email and Twilio for SMS.
I chose serverless because OTP verification is the definition of a bursty workload. Most of the time nobody is logging in. Then 50 people sign up after a Product Hunt launch and you need to handle them all. Lambda scales to zero when idle and handles bursts without provisioning anything. For a service that processes a few requests per day on average, paying for a running server would be wasteful.
### The Data Model
Each OTP record lives in DynamoDB with a TTL that auto-deletes expired codes. The pynamodb ORM keeps the model definition clean:
```python
from pynamodb.models import Model
from pynamodb.attributes import UnicodeAttribute, NumberAttribute

class OTP(Model):
    class Meta:
        table_name = "simple-otp-secrets"

    id = UnicodeAttribute(hash_key=True)
    otp_secret = UnicodeAttribute()
    counter = NumberAttribute()
    expires = NumberAttribute()
```
The `id` is the user's email or phone number - one record per identity. `otp_secret` is the random base32 string that seeds HOTP generation. `counter` tracks how many codes have been generated for this secret. `expires` is a Unix timestamp for DynamoDB's TTL - after 5 minutes, DynamoDB deletes the record automatically.
## What It Costs
This is where serverless gets interesting. Here's what I actually pay:
| Component | Monthly Cost |
|---|---|
| Lambda (2 functions) | $0.00 (free tier: 1M requests/mo) |
| DynamoDB (provisioned, 1 RCU/WCU) | $0.00 (free tier: 25 GB storage, 25 WCU/RCU) |
| API Gateway | $0.00 (free tier: 1M calls/mo) |
| Mailgun (email delivery) | $0.00 (free tier: 100 emails/day) |
| Twilio (SMS delivery) | ~$0.008 per SMS + $1.15/mo for a number |
| Total | <$2/mo (or $0 if email-only) |
The only component that costs real money is Twilio SMS. If you only need email verification, the entire service runs within free tiers indefinitely. You could also replace Mailgun with AWS SES ($0.10 per 1,000 emails) and Twilio with AWS SNS (~$0.007/SMS) to keep everything in AWS - I used Mailgun and Twilio because I already had accounts, but SES would make the email path truly $0. I've been running this for three years and my AWS bill has never exceeded $0.50 in a single month.
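As a sanity check on that table, the cost model is simple enough to write down - rates from my own bills, so treat the numbers as assumptions rather than current pricing:

```python
def monthly_cost(sms_sent: int) -> float:
    """Rough monthly cost: AWS stays in free tier, email is free under
    Mailgun's 100/day cap, so SMS is the only variable component."""
    twilio_number = 1.15  # monthly phone number rental
    per_sms = 0.008
    # email-only deployments skip Twilio entirely
    return 0.0 if sms_sent == 0 else twilio_number + sms_sent * per_sms

print(round(monthly_cost(0), 2))    # email-only
print(round(monthly_cost(100), 2))  # 100 SMS verifications
```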
Deployment uses the Serverless Framework, which wraps CloudFormation and handles the API Gateway + Lambda wiring. Here's the core of serverless.yml:
```yaml
functions:
  otpVerificationStart:
    handler: main.otp_verification_start
    events:
      - http:
          path: "/otp-verification/start"
          method: POST
          cors: true
    reservedConcurrency: 1
    timeout: 30
  otpVerificationComplete:
    handler: main.otp_verification_complete
    events:
      - http:
          path: "/otp-verification/complete"
          method: POST
          cors: true
    reservedConcurrency: 1
    timeout: 30

resources:
  Resources:
    simpleOtpSecrets:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: simple-otp-secrets
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
        TimeToLiveSpecification:
          AttributeName: expires
          Enabled: true
```
Two functions, one DynamoDB table with TTL enabled, IAM permissions for DynamoDB access. First deploy takes about 3 minutes. Subsequent deploys take under 30 seconds.
## The Two Endpoints
The entire service is two Lambda functions behind API Gateway. One creates the OTP and sends it. The other verifies it and issues a JWT.
### /start - Create and Send the Code
When a user requests a code, the start endpoint creates or resets an OTP record in DynamoDB, generates the next HOTP code, and sends it via Mailgun (email) or Twilio (SMS).
```python
import re
import time

import pyotp
import requests

# MAILGUN_*, TWILIO_*, and FROM_EMAIL are loaded from environment config

def otp_verification_start(event, context):
    params = event["queryStringParameters"]
    email = params.get("email", "").strip().lower() or None
    phone = re.sub(r"\D", "", params.get("phone", "").strip()) or None
    otp_id = email or phone

    try:
        # existing record: bump the counter, invalidating the previous code
        otp = OTP.get(otp_id)
        otp.counter += 1
    except OTP.DoesNotExist:
        otp = OTP(otp_id, otp_secret=pyotp.random_base32(), counter=0,
                  expires=int(time.time()) + 300)
    otp.save()

    code = pyotp.HOTP(otp.otp_secret).at(otp.counter)
    if email:
        requests.post(f"https://api.mailgun.net/v3/{MAILGUN_DOMAIN}/messages",
                      auth=("api", MAILGUN_API_KEY),
                      data={"from": FROM_EMAIL, "to": [email],
                            "subject": "Verify your email",
                            "text": f"Your PIN: {code}"})
    elif phone:
        requests.post(f"https://api.twilio.com/2010-04-01/Accounts/{TWILIO_SID}/Messages.json",
                      auth=(TWILIO_SID, TWILIO_TOKEN),
                      data={"To": f"+{phone}", "From": TWILIO_NUMBER,
                            "Body": f"Your PIN: {code}"})
    return success_response({"message": "Code sent"})  # same JSON helper as /complete
```
If a record already exists for this email or phone, I increment the counter instead of creating a new secret. This means the previous code is immediately invalidated - pyotp.HOTP.verify() checks against the exact counter value, so only the latest code works. If a user requests a second code before entering the first one, they need to use the new code.
The Mailgun and Twilio calls are raw HTTP requests. No SDK. For a function this small, pulling in boto3 or the Twilio SDK would double the deployment package for no benefit.
### /complete - Verify and Issue JWT
When the user enters their code, the complete endpoint looks up their OTP record, verifies the HOTP code, deletes the record, and returns a signed JWT.
```python
import os
import re
from datetime import datetime, timedelta, timezone

import jwt
import pyotp

def otp_verification_complete(event, context):
    params = event["queryStringParameters"]
    pin = params["pin"]
    otp_id = (params.get("email", "").strip().lower()
              or re.sub(r"\D", "", params.get("phone", "")))

    otp = OTP.get(otp_id)  # raises DoesNotExist if expired or never created
    if not pyotp.HOTP(otp.otp_secret).verify(pin, otp.counter):
        return error_response(400, "Invalid PIN")
    otp.delete()

    sub = f"email:{params['email']}" if params.get("email") else f"tel:{params['phone']}"
    token = jwt.encode({
        "iss": "https://otp.potapov.dev/",
        "aud": "https://api.potapov.dev/",
        "sub": sub,
        "iat": datetime.now(timezone.utc),
        "exp": datetime.now(timezone.utc) + timedelta(days=1),
    }, os.environ["JWT_SECRET"], algorithm="HS256")
    return success_response({"token": token})
```
I delete the record on successful verification rather than incrementing the counter. This prevents code reuse and avoids stale records piling up in DynamoDB. If verification fails, the record stays - the user can try again with the same code until TTL expires.
One shortcut I should be honest about: rate limiting. RFC 4226 explicitly requires throttling to resist brute force attacks - a 6-digit code has only a million possible values. I handle this with Lambda's reservedConcurrency set to 1 per function, which you can see in the serverless.yml above. That's not real rate limiting per user - it's a concurrency cap that serializes requests to the endpoint, but doesn't stop a patient attacker from trying codes sequentially for a single email. For a personal demo that handles a few logins per day, it's acceptable. For anything beyond that, you'd want per-IP or per-user throttling at the API Gateway level, or a DynamoDB counter that locks out after 3 failed attempts.
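A sketch of that lockout idea, with a plain dict standing in for the `failed_attempts` counter you'd actually store as an extra attribute on the DynamoDB record:

```python
MAX_ATTEMPTS = 3
failed_attempts: dict[str, int] = {}  # in production: a NumberAttribute on the OTP record

def allow_attempt(otp_id: str) -> bool:
    """Refuse further verification attempts after MAX_ATTEMPTS failures."""
    return failed_attempts.get(otp_id, 0) < MAX_ATTEMPTS

def record_failure(otp_id: str) -> None:
    failed_attempts[otp_id] = failed_attempts.get(otp_id, 0) + 1

# three wrong PINs and the identity is locked until the record's TTL expires
for _ in range(3):
    record_failure("user@example.com")
assert not allow_attempt("user@example.com")
```

Because the OTP record already has a 5-minute TTL, the lockout conveniently clears itself when the record is deleted - no extra cleanup code.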
## JWT: From Code to Client
After successful OTP verification, the service issues a signed JWT. This token is how downstream APIs know the user proved ownership of their email or phone number.
### Verifying on the Receiving End
The /complete endpoint issues the token. The interesting part is what happens when a downstream API receives it. Here's what validation looks like:
```python
import os

import jwt

claims = jwt.decode(
    token,
    os.environ["JWT_SECRET"],
    algorithms=["HS256"],
    audience="https://api.potapov.dev/",
    issuer="https://otp.potapov.dev/",
)
user_id = claims["sub"]  # "email:user@example.com"
```
The `algorithms` parameter is required in PyJWT 2.x - omitting it raises `DecodeError`. The `audience` and `issuer` parameters enforce that the token was meant for this specific API and issued by our OTP service. If either doesn't match, PyJWT raises an exception before your code ever sees the claims. The 24-hour expiry is generous for a demo - in production I'd cut it to 1-4 hours depending on the use case.
One Python detail worth noting: the `datetime.now(timezone.utc)` call in the /complete endpoint (with `from datetime import datetime, timedelta, timezone`). If you're reading older tutorials that use `datetime.utcnow()`, that's deprecated since Python 3.12 and now emits a `DeprecationWarning`. The old version returns a naive datetime, the new one returns a timezone-aware datetime. PyJWT 2.x handles both, but the new form is correct and won't spam your logs with warnings.
### HS256 vs RS256
I use HS256 (symmetric signing) because this is a single-service setup. The same secret that signs the token also verifies it. Simple, fast, one environment variable.
The limitation: only services that know the JWT_SECRET can verify tokens. If I wanted the OTP service to act as a third-party identity provider - where other apps verify tokens without knowing the signing key - I'd switch to RS256. RS256 uses a private key to sign and a public key to verify. You publish the public key, and any service can validate tokens without accessing secrets.
For a side project or internal tool, HS256 is the right call. For a product where external services consume your tokens, RS256 is worth the extra setup.
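If you do go the RS256 route, the mechanics are small. A hedged sketch with PyJWT plus the cryptography package - real deployments add key storage, publication (e.g. a JWKS endpoint), and rotation:

```python
import jwt
from cryptography.hazmat.primitives.asymmetric import rsa

# the OTP service holds the private key...
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
# ...and publishes the public key for anyone to use
public_key = private_key.public_key()

token = jwt.encode({"sub": "email:user@example.com"}, private_key, algorithm="RS256")

# any downstream service can verify with the public key alone - no shared secret
claims = jwt.decode(token, public_key, algorithms=["RS256"])
assert claims["sub"] == "email:user@example.com"
```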
### Client-Side Decoding
JWTs are signed, not encrypted. The payload is base64-encoded JSON that any client can read:
```javascript
import { jwtDecode } from "jwt-decode";

const claims = jwtDecode(token);
const userId = claims.sub; // "email:user@example.com"
```
That's by design. The client needs to know who's logged in without making a server round-trip. But it means you should never put sensitive data in JWT claims - anything in the payload is readable by anyone with the token.
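The same trick in Python drives home that the payload is just base64url-encoded JSON. `peek_claims` is a hypothetical helper for illustration - decoding only, never a substitute for server-side signature verification:

```python
import base64
import json

def peek_claims(token: str) -> dict:
    """Read a JWT's payload WITHOUT verifying the signature -
    exactly what any client holding the token can do."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```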
## Build vs Buy
Every OTP tutorial skips this part. They show you how to build it, declare victory, and leave you to figure out whether you should have used Auth0 instead. I've run this service for three years alongside projects that use Cognito and Clerk. Here's when each makes sense.
### When to Build Your Own
Build your own OTP service when all of these are true:
- Your auth needs are simple: email or phone verification, nothing else.
- You don't need SSO, SAML, or social login.
- You're comfortable deploying to Lambda (or any serverless platform).
- You want to own your auth stack. No vendor can change pricing, deprecate an API, or sunset a feature you depend on. Your data stays in your AWS account.
Cost isn't the main reason to build - managed services have generous free tiers now. The real advantage is simplicity and control. When something breaks (and it will, eventually), you can read every line of the service in 10 minutes. Try doing that with Cognito's documentation.
### When to Buy
Buy a managed auth service when any of these are true:
- You need MFA beyond OTP (FIDO2, passkeys, authenticator apps).
- Your customers require SSO/SAML integration.
- You have compliance requirements (SOC 2, HIPAA, GDPR consent flows).
- You're building a team product where auth touches user roles, permissions, organizations.
- You'd rather not think about auth at all - the free tiers are generous enough that cost isn't a factor.
I wouldn't use this OTP service for a B2B SaaS product. Enterprise customers expect SSO. Security auditors expect documented auth flows, not a Lambda function someone built on a weekend. The moment you need "forgot password" or "link social account" or "enforce password rotation," you're reinventing a wheel that Auth0 and Clerk have spent years refining.
### The Decision
| Criteria | Build (DIY OTP) | Buy (Auth0/Cognito/Clerk) |
|---|---|---|
| Monthly cost (low traffic) | <$1 | $0 (free tiers: 10-50K users) |
| Monthly cost at scale | Scales with SMS only | $35-240+/mo for paid features |
| Setup time | 2-4 hours | 30 minutes |
| SSO/SAML | No | Yes |
| MFA options | OTP only | OTP, FIDO2, passkeys |
| Vendor lock-in | None | Medium to high |
| Compliance certifications | You own it | Provider handles it |
| Maintenance burden | Near zero (serverless) | Near zero (managed) |
| Customization | Total | Limited by provider |
The honest answer is that most teams should buy. Auth is a solved problem, and the managed services handle edge cases (account recovery, brute force protection, session management) that you'll eventually need. I kept running my own because the cost is negligible and I like owning the stack. That's a preference, not a recommendation.
There's a grey area worth mentioning: projects that start with simple OTP and grow. You build this for your MVP, users love it, now you need "remember this device" and "sign in with Google" and "admin can revoke sessions." At that point you're building an auth platform, not an OTP service. The right move is to migrate to a managed provider before you've reinvented half of it. I've seen teams spend months adding features to DIY auth that Auth0 ships out of the box.
If you're a solo founder building an MVP, a side project, or anything where "user enters email, gets a code, gets a token" is the entire auth story - building takes an afternoon and costs nothing. The moment auth becomes a feature instead of plumbing, switch to a managed service and spend your time on what makes your product different.
## Three Years in Production
I deployed this service in February 2023. It's now 2026 and it's still running. Here's what that looks like in practice.
### What Went Right
The service has had exactly zero outages that I caused. Lambda functions don't go stale, DynamoDB tables don't need vacuuming, and there's no server to patch. I haven't SSH'd into anything because there's nothing to SSH into. The only maintenance I've done in three years is updating the Mailgun sending domain when I moved to a new domain registrar.
Total operational cost over three years: under $30, almost entirely Twilio SMS charges. The AWS components have never exceeded free tier.
### What Went Wrong
Twice, Mailgun rate-limited my sending domain because I was on the free tier and hit the 100 emails/day cap during a demo. Not a code problem - just a free tier limit I should have anticipated. I've since switched to a self-hosted mail server, but AWS SES would have been the simpler fix.
Once, a DynamoDB TTL deletion was delayed by about 15 minutes (TTL is best-effort, not exact), which meant an expired code still technically existed in the database. The HOTP verification still failed because the counter didn't match, so no security issue - just a confusing "OTP code not found" vs "Invalid PIN" error message distinction.
### What I'd Change
If I were hardening this for production use beyond my own projects, four things would change.
I'd switch to RS256 for JWT signing, for the reasons I described above - the moment a second service needs to verify tokens, symmetric signing becomes a liability. I'd also add real per-user rate limiting instead of the reservedConcurrency workaround I mentioned in the endpoints section. A DynamoDB counter that locks out after 3 failed attempts per email is straightforward and would actually satisfy the RFC 4226 throttling requirement.
The error responses need work. Right now they're generic ("Invalid PIN", "OTP code not found"). Distinguishing between "expired" and "never created" helps the frontend show the right UX - "Your code expired, request a new one" is more useful than "something went wrong."
And I'd wrap the OTP secret with KMS before storing it. DynamoDB encrypts at rest by default, so it's not sitting unencrypted on disk, but application-level encryption with KMS would add negligible cost and measurable security.
None of these are deal-breakers for a personal demo. They're the kind of improvements that matter when the service grows beyond "my side project" into "something other people depend on."
## What You Get
The whole service is 180 lines of Python, two Lambda functions, and one DynamoDB table. It handles email and SMS verification with proper HOTP codes, issues signed JWTs, and cleans up after itself. I've been running it for three years without touching it.
Should you build this? If your app needs auth and your budget is zero, you now have working code. If your app needs auth and your budget isn't zero, you now know exactly when a managed service is worth it - and you can make that call based on features you actually need, not on the assumption that auth is too hard to own.
Fork the repo, try the live demo to see the flow in action, and decide for yourself.
This article was created with the assistance of Claude Code.