THE FIRST 60 SECONDS: TIME-TO-FIRST-CRITICAL ACROSS 1,500 AI APPS

Across 1,514 scans, the median time from URL submission to the first proven critical finding was 27 seconds. Of the apps where a critical was found, 84% had one inside the first minute. Here is the distribution, by platform and by finding class.

This is a distribution study. We measured wall-clock time from URL submission to the first proven critical finding across 1,514 AI-built apps. The median is 27 seconds. The 90th percentile is 92 seconds. Beyond two minutes, the long tail begins — apps that needed full crawl, multiple probes, or authenticated state to surface a critical.

If your reaction is “that’s fast”, the unsettling implication is that an attacker is not slower.

Headline numbers

Metric Value
Apps scanned 1,514
Apps where a critical was found 712 (47%)
Median time-to-first-critical (where found) 27 seconds
90th percentile 92 seconds
Fastest observed 4.2 seconds
Slowest within-scan 8 minutes 11 seconds
Window Nov 2025 – Apr 2026

Distribution

Time bucket Apps reaching first critical in this bucket Cumulative share
0-15s 198 28%
15-30s 218 58%
30-60s 184 84%
60-120s 71 94%
120-300s 32 99%
300s+ 9 100%

Eighty-four percent of apps with a critical exposed it inside the first minute. The 0-15 second bucket is dominated by a single failure class — secrets in the static bundle.
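The cumulative column follows directly from the per-bucket counts; a quick sketch in Python, using the counts from the table above:

```python
# Re-derive the cumulative-share column from the per-bucket counts above.
buckets = [
    ("0-15s", 198),
    ("15-30s", 218),
    ("30-60s", 184),
    ("60-120s", 71),
    ("120-300s", 32),
    ("300s+", 9),
]
total = sum(n for _, n in buckets)  # 712 apps where a critical was found
running = 0
for label, n in buckets:
    running += n
    print(f"{label:>9}  {n:4d}  {round(100 * running / total)}%")
# cumulative shares: 28%, 58%, 84%, 94%, 99%, 100%
```

Rounding the running total against 712 reproduces the table exactly, including the 84% within-one-minute figure.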

By finding class

Finding class Median time-to-detect Why it ranks here
Secret in static bundle 6s Detectable on first parse of HTML and JS
Source map shipped to production 8s Same as above; one extra network request
Open Firebase rules 12s One unauthenticated read against the public REST endpoint
Permissive Supabase RLS 18s Anon key extraction + PostgREST table enumeration
RLS off entirely 22s Slightly slower because the scan has to enumerate the table
CORS allow-all on credentialed endpoint 31s Requires a preflight test against an API endpoint
BOLA — read 64s Two test users + cross-account fetch
BOLA — write (PATCH/PUT) 78s Same as above plus the write-back step
Self-editable role 91s Requires successful auth, profile fetch, mutated PATCH, re-fetch
Open redirect on auth callback 134s Requires triggering full auth flow

The five fastest classes — bundle secrets, source maps, open Firebase rules, permissive RLS, RLS off — together cover 76% of all critical findings. They are also the five classes that need zero authenticated probing to detect.

TTFC by detection technique

The same critical can be reached by different detection techniques. The technique determines the latency floor; pick the wrong one and the same finding takes ten times longer.

Technique Floor Used for
Static parse of HTML / JS ~4s Secrets, source maps, inline config
Single unauthenticated PostgREST / API call ~10s RLS off, open Firebase, public S3 buckets
Anon-key extraction + targeted probe ~15s Permissive RLS, naked-database backends
Header inspection ~6s CSP / HSTS missing, CORS misconfig
Two-session cross-account probe ~50s BOLA on read
Two-session probe + write-back ~70s BOLA on PATCH/PUT/DELETE
Authenticated browser flow ~120s Open redirects on auth callback, OAuth flaws
Crawl + dynamic introspection ~180s GraphQL introspection abuse, Swagger-with-bearer

The class column in the previous table is the what; the technique column here is the how. A scanner that lacks the two-session capability will report 0% BOLA findings — not because the bugs are not there, but because the technique needed to detect them is not in the scan.
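The static-parse floor exists because the technique is little more than pattern matching over the fetched bundle. A minimal sketch of that idea; the patterns and the sample bundle below are illustrative, not the scanner's actual rule set:

```python
import re

# Illustrative patterns for two of the fastest finding classes:
# hard-coded Stripe secret keys and Supabase-style anon JWTs.
PATTERNS = {
    "stripe_secret_key": re.compile(r"sk_(?:live|test)_[A-Za-z0-9]{20,}"),
    "jwt_like_token": re.compile(r"eyJ[\w-]+\.eyJ[\w-]+\.[\w-]+"),
}

def scan_bundle(text: str) -> list[tuple[str, str]]:
    """Return (finding_class, matched_token) pairs found in a JS/HTML bundle."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.findall(text):
            hits.append((name, match))
    return hits

# Hypothetical inline-config snippet of the kind the 0-15s bucket is full of.
bundle = 'const cfg = {stripeKey: "sk_live_' + "a" * 24 + '"};'
print(scan_bundle(bundle))  # one stripe_secret_key hit
```

Nothing here needs authentication or even a second request, which is why this class sets the latency floor for the whole study.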

CWE / OWASP mapping for the fastest-discovered classes

As noted above, the five fastest classes together cover 76% of all critical findings in the corpus. Every one of them is “authorization or credentials, exposed at the static-parse layer.”

Class CWE OWASP Floor TTFC
Secret in static bundle CWE-798 Hard-coded Credentials A02 / A05 ~4s
Source map shipped to production CWE-538 Externally-Accessible File A05 ~6s
Open Firebase rules CWE-862 Missing Authorization A01 / API1 ~10s
Permissive Supabase RLS CWE-863 Incorrect Authorization A01 / API1 ~12s
RLS off entirely CWE-862 Missing Authorization A01 / API1 ~14s
CORS allow-all on credentialed endpoint CWE-942 Permissive Cross-domain Policy A05 / API8 ~20s
BOLA — read CWE-639 Auth Bypass via Key A01 / API1 ~50s
Mass assignment / self-editable role CWE-915 Mass Assignment A04 / API6 ~60s
Open redirect on auth callback CWE-601 URL Redirect to Untrusted A01 / API8 ~120s

The CWE-862 / CWE-863 split is the dominant pair. The vast majority of fast-discovered criticals are missing or incorrect authorization — the scanner is not finding clever exploits, it is finding doors with no lock.

Per-platform breakdown

Median time-to-first-critical, on the apps where one was found.

Platform Apps with critical Median TTFC Modal first-critical class
Lovable 355 19s Permissive RLS
Bolt.new 156 12s Secret in bundle
Cursor 101 41s BOLA on read
Replit 88 31s Open Firebase rules
V0 33 47s Self-editable role

Bolt’s 12-second median is the shortest because Bolt-built apps fail fastest on the secret-in-bundle class — a detection that needs only a parsed bundle. Cursor’s 41-second median is the longest because Cursor-built apps tend to fail on logic-layer issues (BOLA, self-editable fields) that require authenticated probing.

Faster median is not better here. It means the failure is shallower.

What this means

For builders: the failures that are findable in under a minute are the failures that are exploitable in under a minute. Time-to-first-critical is a credible upper bound on how long your app can stay public before an attacker running a similar scanner finds the same thing.

For researchers: this is a baseline for any future scanner comparison. A claim that a new tool finds critical issues “faster” than VibeEval needs to beat 27 seconds at the median on a comparable corpus. We will share the corpus URL list under NDA on request.

For investors and acquirers: when you are doing security due diligence on a vibe-coded SaaS, the time it takes to find the first critical is a useful informal scoring axis. A clean URL ten minutes in is meaningful signal; a critical inside thirty seconds is meaningful signal too.

Methodology

Sample. All 1,514 apps in the corpus.

Timer. Started on the first outbound request from the scanner against the target URL. Stopped on the moment a critical-severity finding was captured, replayed in sandbox, and confirmed.

Severity. CVSS 3.1 with the published rubric. Critical = 9.0+.

Equipment. Scans ran from us-east-1 over a one-gigabit connection. Latencies elsewhere will be higher; we expect time-to-first-critical to scale roughly linearly with round-trip time on the bundle-parse-bound classes.
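The rough-linearity claim for bundle-parse-bound classes can be made concrete with a toy model; the parse cost and request count below are assumed values for illustration, not measurements:

```python
def ttfc_estimate(rtt_ms: float, requests: int = 3, parse_ms: float = 1500.0) -> float:
    """Toy model: TTFC for a bundle-parse-bound class is roughly a fixed
    parse cost plus one round trip per sequential request."""
    return parse_ms + requests * rtt_ms

# Increasing RTT shifts TTFC linearly; the parse constant dampens the ratio.
near = ttfc_estimate(rtt_ms=30.0)   # scanning close to the target
far = ttfc_estimate(rtt_ms=250.0)   # scanning across an ocean
print(near, far)  # 1590.0 2250.0
```

The probe-based classes do not fit this model, because their latency is dominated by scanner-side setup (test users, sessions) rather than round trips.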

Statistical handling. Where multiple criticals were found, only the first counts. Apps that timed out without a critical are excluded from this study (they are counted in the main benchmark).

Calibration against ref0. Every probe is also run against ref0, a clean reference site. A probe that fires on ref0 is by construction a false positive and is excluded from the corpus aggregation. The TTFC numbers above are net of this false-positive elimination; without it, a probe that incorrectly fires within 5 seconds against a clean target would become the headline number. This is the primary reason most “fast scanner” claims do not hold up under scrutiny.
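The calibration step reduces to a set difference over probe identifiers; a minimal sketch, with hypothetical probe names:

```python
def calibrate(target_findings: dict[str, float], ref0_findings: set[str]) -> dict[str, float]:
    """Drop any probe that also fired against the clean reference site ref0.

    target_findings maps probe id -> time-to-fire in seconds;
    ref0_findings is the set of probe ids that fired on ref0
    (false positives by construction).
    """
    return {probe: t for probe, t in target_findings.items() if probe not in ref0_findings}

# Hypothetical scan: a noisy probe fires in 3s on the target AND on ref0,
# so it must not become the headline time-to-first-critical.
target = {"noisy-header-probe": 3.0, "stripe-key-in-bundle": 6.2}
ref0 = {"noisy-header-probe"}
clean = calibrate(target, ref0)
print(min(clean.values()))  # 6.2, not 3.0
```

The fastest surviving probe, not the fastest raw firing, is what feeds the TTFC numbers above.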

Reproduce on the public benchmark

Each detection class above can be reproduced against a live gapbench scenario. The TTFC floor is roughly the same against the gapbench scenarios as it is against real corpus apps — these scenarios are deliberately shaped to mirror the same failure surfaces.

Class Scenario URL
Secret in static bundle Indie SaaS /site/indie-saas/
Permissive RLS Supabase clone /site/supabase-clone/
BOLA on read Multi-tenant SaaS /site/multi-tenant-saas/
BOLA on PATCH (balance) Fintech app /site/fintech-app/
Mass assignment / self-editable role Mass assignment /site/mass-assignment/
Open redirect on auth callback OAuth redirect_uri /site/ssrf-open-redirect-oauth/
ref0 (clean control) ref0 /site/ref0/

For the manifesto-level argument behind this style of measurement, see Why we built gapbench and False positives and the ref0 control.

How to reproduce

Run VibeEval against any URL. The scanner displays a live timer and announces each finding as it lands; the first critical timestamp is preserved in the report.

Citations

VibeEval. The First 60 Seconds: Time-to-First-Critical Across 1,500 AI Apps. May 2026. https://vibe-eval.com/data-studies/time-to-first-critical/

RUN IT YOURSELF

Each scenario below is live on the public benchmark. The commands are copy-paste ready. Outputs may evolve as we tune the scenarios; the bug stays.

Fastest class — secret in static bundle (median 6s)

time curl -s https://gapbench.vibe-eval.com/site/indie-saas/ | grep -oE 'sk_(live|test)_[A-Za-z0-9]{20,}' | head -1

Expected: Stripe key returned in well under 1s wall-clock; the 6s median TTFC for this class is end-to-end, including parse.

Permissive RLS — anon-key extraction + PostgREST probe (median 18s)

ANON=$(curl -s https://gapbench.vibe-eval.com/site/supabase-clone/ | grep -oE 'eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+' | head -1) && curl -s "https://gapbench.vibe-eval.com/site/supabase-clone/rest/v1/users?select=*&limit=1" -H "apikey: $ANON"

Expected: 200 with a row from users — bug confirmed in two requests.

BOLA on read — two test users + cross-account fetch (median 64s)

curl -s https://gapbench.vibe-eval.com/site/multi-tenant-saas/api/projects/1 -H 'Authorization: Bearer USER_B_TOKEN'

Expected: 200 with user A's project; the two-session setup accounts for most of this class's 64s median.

Clean control — ref0 produces no critical at any time horizon

time curl -s -I https://gapbench.vibe-eval.com/site/ref0/

Expected: the scanner runs the full probe set and reports no critical.

COMMON QUESTIONS

01
Why does time-to-first-critical matter?
It is a structural measure of how shallow the failures are. A critical that takes 5 seconds to find is a critical that takes 5 seconds to find for an attacker too. Long discovery times mean the failure is in a hard-to-reach corner; short discovery times mean it is on the front page.
02
What counts as 'time-to-first-critical' in this study?
Wall-clock time from the scanner's first request against the URL to the moment the first critical-severity finding has been captured, replayed, and proven. It includes network latency, page load, bundle parsing, and the first probe that lands a confirmed critical. We exclude scan-queue time.
03
Are some criticals faster to find than others?
Yes — substantially. Token leaks in the bundle are detectable as soon as the bundle is parsed. RLS gaps are detectable as soon as the anon key is extracted and a single PostgREST request is made. BOLA findings require setting up two test users and crossing them, so they take longer.
04
What is the floor — how fast can a scanner physically go?
The floor is set by the network round-trip plus bundle parse — currently around 4-6 seconds on a fast connection for any DOM-based finding. Anything faster would have to rely on cached or pre-fetched data. The fastest critical we proved was 4.2 seconds (a Stripe sk_live_ in a 12KB inline script).
05
Does this measure the scanner or the apps?
Both, jointly. A faster scanner finds the same finding faster; a more-broken app gives the same scanner a finding faster. We report both axes because builders care about the latter (how fast does my app fail) and tool comparisons care about the former (how fast can the tool detect).
06
Where can I see TTFC measured against deliberately vulnerable scenarios?
https://gapbench.vibe-eval.com/ runs 97 deliberately vulnerable scenarios. Pick any of indie-saas, supabase-clone, multi-tenant-saas, fintech-app and run a timed scan. The TTFC numbers per class above are reproducible against these targets. ref0 is the clean control — same scan, no critical, used to confirm the scanner isn't just generating noise.
07
Why is BOLA slower than RLS even though both are 'authorization' bugs?
RLS detection needs one request: extract the anon key, query a table, see rows come back. BOLA detection needs the scanner to provision two test users, complete signup for each, capture session tokens, then make the cross-account request. The setup cost is the bulk of the latency — once both sessions are warm, the actual probe is sub-second. This is why class-level TTFC is more useful than app-level TTFC for comparing scanners.
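The two-session probe described here can be sketched against a mock backend; the token-as-user-id simplification and all names are illustrative, not the scanner's internals:

```python
# Sketch of the two-session BOLA read probe, against an in-memory mock backend.
# fetch(resource_id, token) stands in for the scanner's HTTP client.

def make_mock_backend(enforce_ownership: bool):
    owners = {"project-1": "user_a", "project-2": "user_b"}
    def fetch(resource_id: str, token: str):
        owner = owners.get(resource_id)
        if owner is None:
            return 404, None
        if enforce_ownership and owner != token:
            return 403, None  # access control holds
        return 200, {"id": resource_id, "owner": owner}
    return fetch

def bola_read_probe(fetch) -> bool:
    """True if session B can read a resource owned by user A."""
    status_own, _ = fetch("project-1", "user_a")       # sanity: owner can read
    status_cross, body = fetch("project-1", "user_b")  # cross-account read
    return status_own == 200 and status_cross == 200 and body is not None

print(bola_read_probe(make_mock_backend(enforce_ownership=False)))  # True: BOLA present
print(bola_read_probe(make_mock_backend(enforce_ownership=True)))   # False: no finding
```

The probe itself is two requests; in the real scan, almost all of the 64s median is spent getting the two warm sessions that make those requests possible.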
08
How much of the median TTFC is the scanner versus the network?
For the bundle-parse-bound classes (secret leaks, source maps, open Firebase), network round-trip is the dominant cost — usually 50-70% of total TTFC. For probe-based classes (RLS, CORS, BOLA), scanner logic dominates — the test-user provisioning step alone is ~30s on a fast connection. A scanner running closer to the target, or one that cached anonymous probes, would shift the bundle-bound numbers down meaningfully but not the probe-based ones.

SEE YOUR APP'S TIME-TO-FIRST-CRITICAL

Run VibeEval against your URL and watch the timer. Most apps fail before the page is fully loaded.

RUN TIMED SCAN