What is OTP brute force and why does AI generate it?

An OTP (one-time password) is typically a 6-digit code sent to your phone or email to log in. Six digits means only one million possible codes. If your server doesn't rate-limit OTP submission, an attacker can try all one million in minutes. AI-generated OTP code typically validates the code but skips the rate limit, because the rate limit is implementation-specific and the AI doesn't always pick a strategy.

What's wrong with magic-link tokens?

Two failure modes. First, the token is short (8-12 characters of base64) and the link doesn't expire fast enough — long enough to be brute-forceable. Second, the token is bound to the email address but not to the session — meaning if the user requests a magic link, then someone else clicks it, the someone else gets the session. The fix is short tokens with strict expiry, or long tokens (>=128 bits of entropy) with looser expiry, plus binding the token to the originating browser via a separate cookie.

What is host-header reset poisoning?

Your password-reset flow sends an email containing a link like https://your-site.com/reset?token=.... Your server constructs that URL using the Host header from the incoming HTTP request. An attacker requests a password reset, but with Host: attacker.example. The reset email goes to the victim with a link to attacker.example/reset?token=.... The victim clicks. The token leaks to the attacker. The fix is to never derive the URL from the request — pin it to a config value.

What is email-change-no-reauth?

Account takeover via email rotation. The attacker has a valid session, maybe from a stolen password that is about to expire, maybe from a session-fixation chain. They change the account's email to one they control, then trigger a password reset to the new email. They now own the account. The fix is to require re-authentication (or a confirmation step on the original email) before allowing email change.

Where can I see this on a real URL?

https://gapbench.vibe-eval.com/site/magic-link-otp/ has brute-forceable OTP and weak magic-link tokens. https://gapbench.vibe-eval.com/site/password-reset-flaws/ has multiple reset-flow bugs. https://gapbench.vibe-eval.com/site/host-header-injection/ has the host-header poison. https://gapbench.vibe-eval.com/site/email-change-no-reauth/ has the takeover-via-email-rotation flow. https://gapbench.vibe-eval.com/site/session-fixation/ has session not regenerated after auth. https://gapbench.vibe-eval.com/site/weak-password-policy/ has weak credentials accepted.

What CWE does this map to?

CWE-307 (Improper Restriction of Excessive Authentication Attempts) for OTP brute force, CWE-640 (Weak Password Recovery Mechanism for Forgotten Password) for reset weakness, CWE-20 (Improper Input Validation) and CWE-644 (Improper Neutralization of HTTP Headers for Scripting Syntax) for host-header poison. OWASP API Security Top 10 API2:2023 (Broken Authentication), OWASP Top 10 A07:2021 (Identification and Authentication Failures).

Magic links, OTP, and password resets — the auth flows AI generators get half right

The scenario referenced below runs on gapbench.vibe-eval.com — a public security benchmark we operate.

Auth flows are deceptive

Auth looks like a finished product. Users sign up, sign in, recover their password, change their email. Common-case software. The libraries exist. The pattern is well-known. AI generators reproduce the happy path quickly and convincingly.

The trouble is that auth has dozens of distinct edge cases, each with its own well-documented attack pattern, and the AI’s training corpus is heavily biased toward the happy path. The flows look correct, ship correct enough to fool a code review, and break in specific edge cases that have been written about for years.

I’ll cover four. There are more.

Brute-forceable OTP

app.post('/login/otp', async (req, res) => {
  const { phone, code } = req.body
  const stored = await db.otp.findUnique({ where: { phone } })
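  // No attempt counter or rate limit anywhere in this handler: that omission is the bug.
  // The null check only avoids a crash when no OTP exists for this phone.
  if (stored && stored.code === code && stored.expiresAt > new Date()) {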
    return res.json({ token: createSession(phone) })
  }
  res.status(401).send('Invalid')
})

Six-digit OTP, no rate limit. An attacker who knows a phone number can submit all 1,000,000 codes. Even at modest rates (say, 100 attempts per second), that’s three hours to a guaranteed login. With a botnet doing parallel attempts, it’s minutes.
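To make the arithmetic concrete, here is a small sketch. The function name and the 50-client figure are illustrative assumptions, not measurements from a real attack.

```javascript
// Worst-case time to try every code in an OTP space when nothing rate-limits
// the endpoint. `digits`, `attemptsPerSecond`, and `clients` are illustrative.
function secondsToExhaust(digits, attemptsPerSecond, clients = 1) {
  const keyspace = 10 ** digits
  return keyspace / (attemptsPerSecond * clients)
}

const single = secondsToExhaust(6, 100)        // 10,000 seconds
const parallel = secondsToExhaust(6, 100, 50)  // 200 seconds

console.log(`one client: ${(single / 3600).toFixed(1)} hours`)    // 2.8 hours
console.log(`50 clients: ${(parallel / 60).toFixed(1)} minutes`)  // 3.3 minutes
```

These are worst-case figures. On average a guessing run hits the right code after covering about half the space.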

Live: https://gapbench.vibe-eval.com/site/magic-link-otp/.

The fix is rate-limit-on-target. Per (phone, IP) allow N attempts then back off; per phone alone allow X attempts before locking the account; per IP allow Y attempts before captcha. Twilio and other OTP providers offer this for you — the AI doesn’t always wire it up.

Weak magic-link tokens

const token = crypto.randomBytes(8).toString('hex')  // 16 hex chars = 64 bits

64 bits of entropy is fine for short-lived tokens. 64 bits of entropy on a token that’s valid for 24 hours, in a system that returns a deterministic 401 for invalid tokens, is brute-forceable in theory and merits investigation in practice. The standard recommendation is 128 bits or larger, with strict expiry (5-15 minutes for magic links is reasonable).
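A minimal sketch of the stronger shape, assuming Node's built-in crypto module. The helper names and the 15-minute default are assumptions chosen for illustration. 32 random bytes give 256 bits, comfortably above the 128-bit recommendation.

```javascript
const crypto = require('node:crypto')

// Hypothetical helper: returns a URL-safe token and its expiry time.
function createMagicLinkToken(ttlMinutes = 15, now = new Date()) {
  // 32 random bytes = 256 bits; base64url keeps the token safe inside a URL.
  const token = crypto.randomBytes(32).toString('base64url')
  const expiresAt = new Date(now.getTime() + ttlMinutes * 60 * 1000)
  return { token, expiresAt }
}

// Hypothetical helper: a link is live only strictly before its expiry.
function isLinkLive(expiresAt, now = new Date()) {
  return now.getTime() < expiresAt.getTime()
}
```

The caller would store the token and `expiresAt` alongside the user and check `isLinkLive` at click time.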

The other variant: the token is good but the link is sent to the user’s email and stored in their email provider’s history forever. If the email provider gets compromised — or the user forwards the email by accident — the link is still live until expiry. Strict expiry is the only mitigation.

The third variant: the token is bound to the email but not to the originating session. Anyone who clicks the link gets logged in, regardless of whether they were the one who requested it. Some products consider this a feature (it lets users log in from a different device by emailing themselves the link). For sensitive accounts, bind the token to a cookie set when the request was initiated, and verify both at click time.
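One way to sketch that binding, under the assumption that the server stores a small record per link and sets the nonce as an HttpOnly cookie on the requesting browser. The function and field names are hypothetical.

```javascript
const crypto = require('node:crypto')

const sha256 = (value) => crypto.createHash('sha256').update(value).digest()

// At request time: issue the link token plus a separate nonce for the browser.
// The caller emails `token`, sets `browserNonce` as a cookie, and stores `record`.
function issueBoundLink(email) {
  const token = crypto.randomBytes(32).toString('base64url')
  const browserNonce = crypto.randomBytes(32).toString('base64url')
  const record = { email, tokenHash: sha256(token), nonceHash: sha256(browserNonce) }
  return { token, browserNonce, record }
}

// At click time: both the emailed token and the cookie nonce must match.
function verifyBoundLink(record, token, cookieNonce) {
  if (typeof token !== 'string' || typeof cookieNonce !== 'string') return false
  // Both digests are 32-byte Buffers, so timingSafeEqual accepts them.
  const tokenOk = crypto.timingSafeEqual(record.tokenHash, sha256(token))
  const nonceOk = crypto.timingSafeEqual(record.nonceHash, sha256(cookieNonce))
  return tokenOk && nonceOk
}
```

Someone who only has the email (a forwarded message, a compromised inbox) holds the token but not the cookie, so `verifyBoundLink` returns false for them. The trade-off is the one mentioned above: cross-device login stops working.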

Host-header reset poisoning

The classic. Your password-reset email is constructed like:

const resetUrl = `https://${req.headers.host}/reset?token=${token}`
sendEmail(user.email, `Reset your password: ${resetUrl}`)

That looks reasonable until an attacker sends a POST to /forgot-password with Host: attacker.example and the email body now has a link to attacker.example/reset?token=.... The user clicks. The attacker has the token.

Frameworks have varying defenses. Most modern frameworks normalize or refuse the Host header, but only if you’ve enabled the “trusted hosts” config — which the AI generator usually skips. Worth pinning your URL construction to a config value rather than trusting the request.
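A minimal sketch of pinning the URL to configuration. `APP_BASE_URL` is an assumed environment variable name and `buildResetUrl` is a hypothetical helper; the point is only that nothing from the request reaches the link.

```javascript
// Builds the reset link from configuration, never from req.headers.host.
function buildResetUrl(token, baseUrl = process.env.APP_BASE_URL) {
  if (!baseUrl) throw new Error('APP_BASE_URL is not configured')
  const url = new URL('/reset', baseUrl)
  // searchParams.set also URL-encodes the token.
  url.searchParams.set('token', token)
  return url.toString()
}

console.log(buildResetUrl('abc123', 'https://your-site.com'))
// https://your-site.com/reset?token=abc123
```

Failing loudly when the config value is missing is deliberate: falling back to the Host header would quietly reintroduce the bug.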

Live: https://gapbench.vibe-eval.com/site/host-header-injection/.

Email-change without re-auth

User has a session. User PATCHes their account, changing email from victim@example.com to attacker@example.com. No password confirmation. No verification email to the old address. The change goes through.

Now the attacker triggers a password reset. The reset email goes to the new address — the attacker’s. The attacker resets the password. The original user can no longer log in.

This is account takeover by way of session theft → email rotation. It only matters if the attacker had a valid session in the first place — but sessions get stolen regularly, through stolen passwords, through XSS, through malware, through every other auth bug we list elsewhere. Re-auth on email change is defense in depth: even if the attacker has a session, they need the current password to keep the account.

Live: https://gapbench.vibe-eval.com/site/email-change-no-reauth/.

The fix: any sensitive change (email, password, MFA settings, deletion) requires either a password re-entry or a step-up auth (MFA challenge). Some products send a confirmation to the old email when an email change is requested, with a “this wasn’t me” link. Both are good. At least one is necessary.
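A sketch of the password re-entry variant combined with a notice to the old address. The dependencies (`users.findById`, `users.updateEmail`, `verifyPassword`, `notifyOldEmail`) are hypothetical and injected by the caller; the handler shape assumes Express-style `req` and `res`.

```javascript
// Email-change handler that refuses to act on a session alone.
function makeChangeEmailHandler({ users, verifyPassword, notifyOldEmail }) {
  return async function changeEmail(req, res) {
    const { newEmail, currentPassword } = req.body
    const user = await users.findById(req.session.userId)
    if (!user) return res.status(401).end()

    // Re-authentication: the caller must prove knowledge of the current password.
    const ok = typeof currentPassword === 'string' &&
      (await verifyPassword(user, currentPassword))
    if (!ok) return res.status(403).end()

    const oldEmail = user.email
    await users.updateEmail(user.id, newEmail)
    // Tell the old address so the real owner can object.
    await notifyOldEmail(oldEmail, newEmail)
    return res.status(204).end()
  }
}
```

A step-up MFA challenge could replace the password check at the same point in the handler.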

Two more, short:

session-fixation (/site/session-fixation/): the session ID is not regenerated after login. An attacker pre-creates a session ID, tricks the victim into using it (via a prepared link), waits for the victim to log in, and now shares the authenticated session.

weak-password-policy (/site/weak-password-policy/): the AI accepts password123 because it didn’t add a strength check. Most users will pick passwords that fall to a 10,000-entry dictionary. Add a strength check at registration. Don’t enforce arbitrary symbol-mixing rules — those are user-hostile and don’t help — but do reject the most common passwords.
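Both fixes are small. The sketch below assumes an express-session style API (`req.session.regenerate(callback)`) for the first, and uses a tiny stand-in list plus an assumed 8-character minimum for the second. The function names are hypothetical.

```javascript
// Session fixation: swap the pre-login session for a fresh one at login.
function loginAndRotateSession(req, userId) {
  return new Promise((resolve, reject) => {
    req.session.regenerate((err) => {
      if (err) return reject(err)
      // After regenerate, req.session is a new session with a new ID.
      req.session.userId = userId
      resolve(req.sessionID)
    })
  })
}

// Weak passwords: a real deployment would load a much larger list
// (for example the 10,000 most common passwords).
const COMMON_PASSWORDS = new Set(['password', 'password123', '123456', 'qwerty', 'letmein'])

function checkPassword(password, minLength = 8) {
  if (typeof password !== 'string' || password.length < minLength) {
    return { ok: false, reason: 'too_short' }
  }
  if (COMMON_PASSWORDS.has(password.toLowerCase())) {
    return { ok: false, reason: 'too_common' }
  }
  return { ok: true }
}
```

Note that `checkPassword` has no symbol-mixing rule, in line with the advice above: it only rejects short and well-known passwords.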

A specific incident — magic-link token replay

Anonymized. A SaaS used magic-link auth — user types email, gets a link, clicks the link, is logged in. The link contained a token good for 24 hours, single-use after first click.

The bug was subtle. The “single-use” was implemented as: on click, check the token’s used_at field; if null, mark used and create session; if not null, reject. Standard pattern. Single round-trip to the database between read and write. Race window: a few milliseconds.

A user accidentally double-clicked the link in their email client. The first click hit, marked the token used, created a session, started a redirect to the dashboard. The second click hit during the redirect, before the email client noticed the response — and because the read had happened on a database replica that hadn’t yet replicated the write, the second click also saw used_at = null, marked it (again), and created another session. The user ended up with two valid sessions for the same magic-link click.

This wasn’t an exploit, just an accidental discovery. But it’s the same shape as a deliberate replay — submit the magic link from two devices simultaneously, get sessions on both. Combine with a stolen email account and the attacker has a session even if the legitimate user clicked first.

The fix was a SELECT FOR UPDATE inside a transaction (proper row lock), plus reading from primary not replica for auth flows. Three lines of code. The race had been live for ~8 months with no known exploitation but plenty of opportunity.

OTP brute force in detail

The math: 6-digit OTP = 1,000,000 codes. At 100 attempts per second, the entire space takes ~3 hours. A botnet of 100 IPs each doing 100 attempts/second (10,000 per second in total) gets through it in ~100 seconds. Without rate limiting, brute force is trivial.

The naive rate limit: “after 5 failed attempts, lock the account for 15 minutes.” Insufficient because:

The attacker tries from multiple IPs simultaneously, splitting the budget.
The attacker tries multiple accounts, parallelizing across phone numbers.
The lockout becomes a DoS vector — submit 5 failed attempts to lock out a victim.

The robust shape:

// Per-phone budget
const phoneAttempts = await redis.incr(`otp:phone:${phone}`)
if (phoneAttempts > 5) return res.status(429).end()
await redis.expire(`otp:phone:${phone}`, 600)  // 10-minute window

// Per-IP budget (across all phones)
const ipAttempts = await redis.incr(`otp:ip:${req.ip}`)
if (ipAttempts > 50) return res.status(429).end()
await redis.expire(`otp:ip:${req.ip}`, 600)

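// Constant-time comparison (avoid timing oracle).
// timingSafeEqual is from node:crypto; it needs equal-length Buffers, not strings.
const submitted = Buffer.from(String(submittedCode))
const expected = Buffer.from(String(expectedCode))
if (submitted.length !== expected.length || !timingSafeEqual(submitted, expected)) {
  return res.status(401).end()
}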

// Burn the code on success — single use
await redis.del(`otp:code:${phone}`)

Plus: don’t lock accounts on failed OTP; lock the device fingerprint or IP instead. Account-level lockout creates a DoS vector.

Password reset — the parts AI generators get wrong

Beyond host-header injection (covered above), the variants:

Tokens that don’t expire. Reset link from 6 months ago still works. AI codegen sometimes omits the expiry check.
Tokens not invalidated on password change. User resets the password, and the token from an earlier reset request still works afterwards.
Reset links over HTTP. If the emailed link is plain http://, the token crosses the network in cleartext before any redirect to HTTPS happens, so anyone on the path can read it. HSTS partially mitigates.
No notification to the user when a reset is requested. Attacker triggers a reset, intercepts the email somehow, resets — and the legitimate user has no idea anything happened until next login.
Email enumeration via reset-request response. “Email sent if account exists” is the safe response; “no account with that email” is the unsafe one. AI codegen sometimes returns the unsafe one.
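Several of these variants can be addressed in one place. The following is an in-memory sketch only: the `Map` stands in for a database table, the 30-minute lifetime is an assumed value, and the function names are hypothetical.

```javascript
const crypto = require('node:crypto')

const RESET_TTL_MS = 30 * 60 * 1000  // assumed 30-minute lifetime
const resetTokens = new Map()        // token -> { userId, expiresAt }

// Returns the same message whether or not the account exists (no enumeration).
// `token` is for the mailer only and must never be sent back to the client.
function requestReset(usersByEmail, email, now = Date.now()) {
  const user = usersByEmail.get(email)
  let token = null
  if (user) {
    token = crypto.randomBytes(32).toString('base64url')
    resetTokens.set(token, { userId: user.id, expiresAt: now + RESET_TTL_MS })
  }
  return { message: 'If that account exists, a reset email has been sent.', token }
}

// Returns the userId on success, null for unknown or expired tokens.
function consumeReset(token, now = Date.now()) {
  const entry = resetTokens.get(token)
  if (!entry || entry.expiresAt <= now) return null
  // Invalidate every outstanding token for this user, not just this one.
  for (const [otherToken, other] of resetTokens) {
    if (other.userId === entry.userId) resetTokens.delete(otherToken)
  }
  return entry.userId
}
```

Notifying the user that a reset was requested would hang off `requestReset` in the branch where the account exists.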

Wrong fix vs right fix — magic link

// WRONG: read-then-write race
const link = await db.magicLink.findUnique({ where: { token } })
if (!link || link.usedAt || link.expiresAt < new Date()) return res.status(400).end()
await db.magicLink.update({ where: { token }, data: { usedAt: new Date() } })
const session = await createSession(link.userId)

// RIGHT: atomic conditional update
const result = await db.magicLink.updateMany({
  where: { token, usedAt: null, expiresAt: { gt: new Date() } },
  data: { usedAt: new Date() }
})
if (result.count === 0) return res.status(400).end()
const link = await db.magicLink.findUnique({ where: { token } })
const session = await createSession(link.userId)

How we detect

OTP: we hit the OTP endpoint repeatedly with synthetic codes against a known phone. If we get past 100 attempts without backoff, finding.

Magic-link tokens: we read the token format from a captured email, evaluate entropy, and probe the verification endpoint with random tokens at high rate to estimate the brute-force boundary.
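The entropy step can be approximated with a simple upper bound. This is an illustrative sketch rather than the scanner's actual code, and it assumes every character is drawn uniformly at random from the alphabet, which real tokens may not satisfy.

```javascript
// Upper bound on token entropy in bits for a given length and alphabet size.
function maxEntropyBits(tokenLength, alphabetSize) {
  return tokenLength * Math.log2(alphabetSize)
}

console.log(maxEntropyBits(16, 16))  // 16 hex chars: 64
console.log(maxEntropyBits(12, 64))  // 12 base64 chars: 72
console.log(maxEntropyBits(43, 64))  // 43 base64url chars: 258 (32 random bytes carry 256)
```

A token that scores well on this bound can still be weak if it comes from a predictable generator, which is why the endpoint probe is run as well.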

Host-header poison: we send /forgot-password with a hostile Host header and read the resulting reset URL from the email (when we have email side-channel access for testing) or from the response body if the URL is rendered server-side.

Email change re-auth: we PATCH the account email without supplying the password and observe whether the change persists.

Session fixation: we get a session ID before login, log in, check whether the post-login session ID is the same.

Password policy: we attempt registration with password, 123456, qwerty — if any succeed, finding.

All of these are cheap runtime checks. Static scanners won’t reliably catch any of them.

CWE / OWASP

CWE-307 — Improper Restriction of Excessive Authentication Attempts (OTP)
CWE-640 — Weak Password Recovery Mechanism for Forgotten Password (reset)
CWE-644 — Improper Neutralization of HTTP Headers for Scripting Syntax (host-header)
CWE-384 — Session Fixation
CWE-521 — Weak Password Requirements
OWASP API Top 10 — API2:2023 Broken Authentication
OWASP Top 10 — A07:2021 Identification and Authentication Failures

Reproduce it yourself

Magic link / OTP weakness: https://gapbench.vibe-eval.com/site/magic-link-otp/
Password reset flaws: https://gapbench.vibe-eval.com/site/password-reset-flaws/
Host-header injection: https://gapbench.vibe-eval.com/site/host-header-injection/
Email change without re-auth: https://gapbench.vibe-eval.com/site/email-change-no-reauth/
Session fixation: https://gapbench.vibe-eval.com/site/session-fixation/
Weak password policy: https://gapbench.vibe-eval.com/site/weak-password-policy/
General auth scenario: https://gapbench.vibe-eval.com/site/auth-system/

Pattern: JWT alg=none is not dead
Pattern: SSRF, open redirects, and OAuth redirect_uri
Tool: vibe-code-scanner
