MAGIC LINKS, OTP, AND PASSWORD RESETS
Auth flows look simple from outside and turn out to have a hundred sharp edges inside. AI generators get the happy path right and miss most of the edges. Here are the four specific edges they miss most.
The scenarios referenced below run on gapbench.vibe-eval.com, a public security benchmark we operate.
Auth flows are deceptive
Auth looks like a finished product. Users sign up, sign in, recover their password, change their email. Common-case software. The libraries exist. The pattern is well-known. AI generators reproduce the happy path quickly and convincingly.
The trouble is that auth has dozens of distinct edge cases, each with its own well-documented attack pattern, and the AI’s training corpus is heavily biased toward the happy path. The flows look correct, ship correct enough to fool a code review, and break in specific edge cases that have been written about for years.
I’ll cover four. There are more.
Brute-forceable OTP
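// Vulnerable: nothing limits how many codes a caller can try
app.post('/login/otp', async (req, res) => {
  const { phone, code } = req.body
  const stored = await db.otp.findUnique({ where: { phone } })
  // Guard against a missing record so an unknown phone returns 401 instead of throwing
  if (stored && stored.code === code && stored.expiresAt > new Date()) {
    return res.json({ token: createSession(phone) })
  }
  res.status(401).send('Invalid')
})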
Six-digit OTP, no rate limit. An attacker who knows a phone number can submit all 1,000,000 codes. Even at modest rates (say, 100 attempts per second), that’s three hours to a guaranteed login. With a botnet doing parallel attempts, it’s minutes.
Live: https://gapbench.vibe-eval.com/site/magic-link-otp/.
The fix is rate-limit-on-target. Per (phone, IP) allow N attempts then back off; per phone alone allow X attempts before locking the account; per IP allow Y attempts before captcha. Twilio and other OTP providers offer this for you — the AI doesn’t always wire it up.
Weak magic-link tokens
const token = crypto.randomBytes(8).toString('hex') // 16 hex chars = 64 bits
64 bits of entropy is fine for short-lived tokens. 64 bits of entropy on a token that’s valid for 24 hours, in a system that returns a deterministic 401 for invalid tokens, is brute-forceable in theory and merits investigation in practice. The standard recommendation is 128 bits or larger, with strict expiry (5-15 minutes for magic links is reasonable).
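A minimal sketch of the 128-bit recommendation, assuming Node's built-in crypto module. The function name, the record shape, and the 10-minute lifetime are illustrative choices of mine, not part of any library. Storing only a hash of the token is an extra precaution I'm assuming here, so that a leaked table does not contain live links.

```javascript
const crypto = require('node:crypto')

// Assumed lifetime: 10 minutes, inside the 5-15 minute range suggested above.
const MAGIC_LINK_TTL_MS = 10 * 60 * 1000

// Returns the raw token (this goes in the email) and the record to persist.
// Only a SHA-256 hash of the token is kept server-side.
function issueMagicLinkToken(userId, now = Date.now()) {
  const token = crypto.randomBytes(16).toString('hex') // 32 hex chars = 128 bits
  const tokenHash = crypto.createHash('sha256').update(token).digest('hex')
  return {
    token,
    record: {
      userId,
      tokenHash,
      expiresAt: new Date(now + MAGIC_LINK_TTL_MS),
      usedAt: null,
    },
  }
}
```

At click time the server would hash the submitted token and look the record up by tokenHash, then apply the expiry and single-use checks.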
The other variant: the token is good but the link is sent to the user’s email and stored in their email provider’s history forever. If the email provider gets compromised, or the user forwards the email by accident, the link is still live until expiry. Strict expiry is the main mitigation; making the token single-use closes the window further once the legitimate user has clicked.
The third variant: the token is bound to the email but not to the originating session. Anyone who clicks the link gets logged in, regardless of whether they were the one who requested it. Some products consider this a feature (it lets users log in from a different device by emailing themselves the link). For sensitive accounts, bind the token to a cookie set when the request was initiated, and verify both at click time.
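One way to sketch that binding, assuming Node's crypto module. Both function names and the idea of storing a hash of the cookie value next to the token are my assumptions for illustration. The raw nonce would be set as an HttpOnly cookie when the link is requested, and the click handler would refuse to create a session unless the cookie matches.

```javascript
const crypto = require('node:crypto')

const sha256 = (value) => crypto.createHash('sha256').update(value).digest()

// At request time: create a nonce for the requesting browser.
// cookieValue goes into a cookie; bindingHash is stored alongside the magic-link token.
function createBrowserBinding() {
  const nonce = crypto.randomBytes(16).toString('hex')
  return { cookieValue: nonce, bindingHash: sha256(nonce).toString('hex') }
}

// At click time: true only if the clicking browser holds the matching cookie.
function isSameBrowser(storedBindingHash, cookieValue) {
  if (typeof cookieValue !== 'string' || cookieValue.length === 0) return false
  const expected = Buffer.from(storedBindingHash, 'hex')
  const actual = sha256(cookieValue)
  // timingSafeEqual throws on unequal lengths, so check length first
  return expected.length === actual.length && crypto.timingSafeEqual(expected, actual)
}
```

The trade-off is the one described above: with this check in place, a link opened on a different device fails, so it suits sensitive accounts more than convenience-first products.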
Host-header reset poisoning
The classic. Your password-reset email is constructed like:
const resetUrl = `https://${req.headers.host}/reset?token=${token}`
sendEmail(user.email, `Reset your password: ${resetUrl}`)
That looks reasonable until an attacker sends a POST to /forgot-password with Host: attacker.example and the email body now has a link to attacker.example/reset?token=.... The user clicks. The attacker has the token.
Frameworks have varying defenses. Most modern frameworks normalize or refuse the Host header, but only if you’ve enabled the “trusted hosts” config — which the AI generator usually skips. Worth pinning your URL construction to a config value rather than trusting the request.
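A minimal sketch of pinning the URL to config. APP_BASE_URL is a hypothetical setting name and app.example.com is a placeholder; the only point is that the origin comes from deployment config and never from the request. The host allowlist check is an optional second layer that I'm adding as an assumption.

```javascript
// Hypothetical config value: set per deployment, never derived from req.headers.
const APP_BASE_URL = process.env.APP_BASE_URL || 'https://app.example.com'

// Build the reset link from the configured origin only.
function buildResetUrl(token, baseUrl = APP_BASE_URL) {
  const url = new URL('/reset', baseUrl)
  url.searchParams.set('token', token) // also URL-encodes the token
  return url.toString()
}

// Optional: reject requests whose Host header is not on an allowlist.
function isTrustedHost(hostHeader, allowedHosts = [new URL(APP_BASE_URL).host]) {
  return typeof hostHeader === 'string' && allowedHosts.includes(hostHeader.toLowerCase())
}
```

In the handler, the email body would use buildResetUrl(token) in place of the template string that reads req.headers.host, so a forged Host header has nothing to influence.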
Live: https://gapbench.vibe-eval.com/site/host-header-injection/.
Email-change without re-auth
User has a session. User PATCHes their account, changing email from victim@example.com to attacker@example.com. No password confirmation. No verification email to the old address. The change goes through.
Now the attacker triggers a password reset. The reset email goes to the new address — the attacker’s. The attacker resets the password. The original user can no longer log in.
This is account takeover by way of session theft → email rotation. It only matters if the attacker had a valid session in the first place, but sessions get stolen regularly: through XSS, through malware, through every other auth bug we list elsewhere. Re-auth on email change is defense in depth: even if the attacker has a session, they need the current password to keep the account.
Live: https://gapbench.vibe-eval.com/site/email-change-no-reauth/.
The fix: any sensitive change (email, password, MFA settings, deletion) requires either a password re-entry or a step-up auth (MFA challenge). Some products send a confirmation to the old email when an email change is requested, with a “this wasn’t me” link. Both are good. At least one is necessary.
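A sketch of the password re-entry check, assuming scrypt from Node's crypto module for hashing. hashPassword, verifyPassword, and requestEmailChange are hypothetical helpers, and the returned object is just one way to tell a route handler what to do next.

```javascript
const crypto = require('node:crypto')

// Hypothetical helper: hash a password with scrypt, stored as "saltHex:hashHex".
function hashPassword(password, salt = crypto.randomBytes(16)) {
  const hash = crypto.scryptSync(password, salt, 32)
  return `${salt.toString('hex')}:${hash.toString('hex')}`
}

// Hypothetical helper: recompute the hash with the stored salt and compare in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(':')
  const expected = Buffer.from(hashHex, 'hex')
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length)
  return crypto.timingSafeEqual(expected, actual)
}

// A valid session alone is not enough: the current password must be supplied.
function requestEmailChange(user, { newEmail, currentPassword }) {
  if (typeof currentPassword !== 'string' || !verifyPassword(currentPassword, user.passwordHash)) {
    return { ok: false, status: 401, reason: 'reauth_required' }
  }
  // The old address gets the "this wasn't me" notice.
  return { ok: true, newEmail, notifyOldAddress: user.email }
}
```

A PATCH handler would call requestEmailChange before touching the database and return the 401 as-is; an MFA challenge could stand in for the password check.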
Related: session fixation, weak password policy
Two more, short:
session-fixation (/site/session-fixation/): the session ID is not regenerated after login. An attacker pre-creates a session ID, tricks the victim into using it (via a prepared link), waits for the victim to log in, and now shares the authenticated session.
weak-password-policy (/site/weak-password-policy/): the AI accepts password123 because it didn’t add a strength check. Most users will pick passwords that fall to a 10,000-entry dictionary. Add a strength check at registration. Don’t enforce arbitrary symbol-mixing rules — those are user-hostile and don’t help — but do reject the most common passwords.
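Both fixes are small. The sketch below uses an in-memory Map as a stand-in for a real session store, and a four-entry denylist where a real one would load a list of the most common passwords; the function names and the 8-character minimum are my assumptions. In express-session, the equivalent of the rotation step is req.session.regenerate().

```javascript
const crypto = require('node:crypto')

// In-memory stand-in for a session store.
const sessions = new Map()

function createAnonymousSession() {
  const id = crypto.randomBytes(16).toString('hex')
  sessions.set(id, { userId: null })
  return id
}

// On login, discard the pre-login ID and issue a fresh one, so an ID planted
// by an attacker never becomes an authenticated session.
function loginAndRotate(oldSessionId, userId) {
  sessions.delete(oldSessionId)
  const newId = crypto.randomBytes(16).toString('hex')
  sessions.set(newId, { userId })
  return newId
}

// Tiny illustrative denylist; no symbol-mixing rules, just common-password rejection.
const COMMON_PASSWORDS = new Set(['password', '123456', 'qwerty', 'password123'])

function isAcceptablePassword(password) {
  return password.length >= 8 && !COMMON_PASSWORDS.has(password.toLowerCase())
}
```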
A specific incident — magic-link token replay
Anonymized. A SaaS used magic-link auth — user types email, gets a link, clicks the link, is logged in. The link contained a token good for 24 hours, single-use after first click.
The bug was subtle. The “single-use” was implemented as: on click, check the token’s used_at field; if null, mark used and create session; if not null, reject. Standard pattern. Single round-trip to the database between read and write. Race window: a few milliseconds.
A user accidentally double-clicked the link in their email client. The first click hit, marked the token used, created a session, started a redirect to the dashboard. The second click hit during the redirect, before the email client noticed the response — and because the read had happened on a database replica that hadn’t yet replicated the write, the second click also saw used_at = null, marked it (again), and created another session. The user ended up with two valid sessions for the same magic-link click.
This wasn’t an exploit, just an accidental discovery. But it’s the same shape as a deliberate replay — submit the magic link from two devices simultaneously, get sessions on both. Combine with a stolen email account and the attacker has a session even if the legitimate user clicked first.
The fix was a SELECT FOR UPDATE inside a transaction (proper row lock), plus reading from primary not replica for auth flows. Three lines of code. The race had been live for ~8 months with no known exploitation but plenty of opportunity.
OTP brute force in detail
The math: 6-digit OTP = 1,000,000 codes. At 100 attempts per second, the entire space takes ~3 hours. A botnet of 100 IPs each doing 100 attempts/second covers it in ~100 seconds. Without rate limiting, brute force is trivial.
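The arithmetic is easy to check. Here is a small sketch with hypothetical helper names; the second function assumes every guess is distinct and the code does not change within the window.

```javascript
// Worst-case seconds to try every code in the space.
function secondsToExhaust(digits, attemptsPerSecondPerIp, ipCount = 1) {
  const codes = 10 ** digits
  return codes / (attemptsPerSecondPerIp * ipCount)
}

// Chance of guessing right when only maxAttempts guesses are allowed per window.
function successProbabilityPerWindow(digits, maxAttempts) {
  return maxAttempts / 10 ** digits
}

console.log(secondsToExhaust(6, 100) / 3600)   // about 2.8 hours from a single source
console.log(secondsToExhaust(6, 100, 100))     // 100 seconds with 100 IPs
console.log(successProbabilityPerWindow(6, 5)) // 0.000005 with a 5-attempt budget
```

On average the attacker hits the right code halfway through, so the expected time is about half the worst case. The last line is why a small per-target budget changes the picture: five guesses per window leaves a one-in-200,000 chance.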
The naive rate limit: “after 5 failed attempts, lock the account for 15 minutes.” Insufficient because:
- The attacker tries from multiple IPs simultaneously, splitting the budget.
- The attacker tries multiple accounts, parallelizing across phone numbers.
- The lockout becomes a DoS vector — submit 5 failed attempts to lock out a victim.
The robust shape:
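// Per-phone budget
const phoneKey = `otp:phone:${phone}`
const phoneAttempts = await redis.incr(phoneKey)
if (phoneAttempts === 1) await redis.expire(phoneKey, 600) // start the 10-minute window on the first attempt only
if (phoneAttempts > 5) return res.status(429).end()
// Per-IP budget (across all phones)
const ipKey = `otp:ip:${req.ip}`
const ipAttempts = await redis.incr(ipKey)
if (ipAttempts === 1) await redis.expire(ipKey, 600)
if (ipAttempts > 50) return res.status(429).end()
// Constant-time comparison (avoid timing oracle).
// crypto.timingSafeEqual needs Buffers of equal length, and a missing code must never match.
const expectedCode = await redis.get(`otp:code:${phone}`)
const submitted = Buffer.from(String(submittedCode))
const expected = Buffer.from(expectedCode ?? '')
if (!expectedCode || submitted.length !== expected.length || !crypto.timingSafeEqual(submitted, expected)) {
  return res.status(401).end()
}
// Burn the code on success (single use)
await redis.del(`otp:code:${phone}`)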
Plus: don’t lock accounts on failed OTP; lock the device fingerprint or IP instead. Account-level lockout creates a DoS vector.
Password reset — the parts AI generators get wrong
Beyond host-header injection (covered above), the variants:
- Tokens that don’t expire. Reset link from 6 months ago still works. AI codegen sometimes omits the expiry check.
- Tokens not invalidated on password change. The user completes a reset, yet older reset tokens issued for the same account still work afterwards.
- Reset links over HTTP. If the link is first fetched over plain HTTP and only then redirected to HTTPS, the token is exposed to anyone on the network path. HSTS partially mitigates.
- No notification to the user when a reset is requested. Attacker triggers a reset, intercepts the email somehow, resets — and the legitimate user has no idea anything happened until next login.
- Email enumeration via reset-request response. “Email sent if account exists” is the safe response; “no account with that email” is the unsafe one. AI codegen sometimes returns the unsafe one.
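Three of these (expiry, invalidation, enumeration) fit in one short sketch. It uses an in-memory Map as a stand-in for a database table; the function names and the 15-minute lifetime are assumptions of mine.

```javascript
const crypto = require('node:crypto')

const RESET_TTL_MS = 15 * 60 * 1000 // assumed lifetime
const resetTokens = new Map() // token -> { userId, expiresAt }; stand-in for a DB table

// Same response whether or not the account exists, to avoid email enumeration.
const RESET_RESPONSE = { status: 200, body: 'Email sent if account exists' }

// userId is null when no account matches the submitted email.
function requestReset(userId, now = Date.now()) {
  let token = null
  if (userId !== null) {
    token = crypto.randomBytes(16).toString('hex')
    resetTokens.set(token, { userId, expiresAt: now + RESET_TTL_MS })
  }
  return { response: RESET_RESPONSE, token } // token goes into the email, never the response
}

// Returns the userId whose password may be changed, or null.
function redeemReset(token, now = Date.now()) {
  const entry = resetTokens.get(token)
  if (!entry || entry.expiresAt <= now) return null
  // Invalidate every outstanding token for this user, not just the one used.
  for (const [otherToken, other] of resetTokens) {
    if (other.userId === entry.userId) resetTokens.delete(otherToken)
  }
  return entry.userId
}
```

The notification to the account owner is not shown; it would be sent from requestReset alongside the reset email.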
Wrong fix vs right fix — magic link
// WRONG: read-then-write race
const link = await db.magicLink.findUnique({ where: { token } })
if (!link || link.usedAt || link.expiresAt < new Date()) return res.status(400).end()
await db.magicLink.update({ where: { token }, data: { usedAt: new Date() } })
const session = await createSession(link.userId)
// RIGHT: atomic conditional update
const result = await db.magicLink.updateMany({
  where: { token, usedAt: null, expiresAt: { gt: new Date() } },
  data: { usedAt: new Date() }
})
if (result.count === 0) return res.status(400).end()
const link = await db.magicLink.findUnique({ where: { token } })
const session = await createSession(link.userId)
How we detect
OTP: we hit the OTP endpoint repeatedly with synthetic codes against a known phone. If we get past 100 attempts without backoff, finding.
Magic-link tokens: we read the token format from a captured email, evaluate entropy, and probe the verification endpoint with random tokens at high rate to estimate the brute-force boundary.
Host-header poison: we send /forgot-password with a hostile Host header and read the resulting reset URL from the email (when we have email side-channel access for testing) or from the response body if the URL is rendered server-side.
Email change re-auth: we PATCH the account email without supplying the password and observe whether the change persists.
Session fixation: we get a session ID before login, log in, check whether the post-login session ID is the same.
Password policy: we attempt registration with password, 123456, qwerty — if any succeed, finding.
All of these are cheap and runtime. Static scanners won’t reliably catch any of them.
CWE / OWASP
- CWE-307: Improper Restriction of Excessive Authentication Attempts (OTP)
- CWE-640: Weak Password Recovery Mechanism for Forgotten Password (reset)
- CWE-644: Improper Neutralization of HTTP Headers for Scripting Syntax (host-header)
- CWE-384: Session Fixation
- CWE-521: Weak Password Requirements
- OWASP API Top 10: API2:2023 Broken Authentication
- OWASP Top 10: A07:2021 Identification and Authentication Failures
Reproduce it yourself
- Magic link / OTP weakness: https://gapbench.vibe-eval.com/site/magic-link-otp/
- Password reset flaws: https://gapbench.vibe-eval.com/site/password-reset-flaws/
- Host-header injection: https://gapbench.vibe-eval.com/site/host-header-injection/
- Email change without re-auth: https://gapbench.vibe-eval.com/site/email-change-no-reauth/
- Session fixation: https://gapbench.vibe-eval.com/site/session-fixation/
- Weak password policy: https://gapbench.vibe-eval.com/site/weak-password-policy/
- General auth scenario: https://gapbench.vibe-eval.com/site/auth-system/
Related reading
- Pattern: JWT alg=none is not dead
- Pattern: SSRF, open redirects, and OAuth redirect_uri
- Tool: vibe-code-scanner
TEST YOUR AUTH FLOW
We probe magic links, OTP, password reset, and email-change paths for the well-documented bugs.