THE FIRST 60 SECONDS: TIME-TO-FIRST-CRITICAL ACROSS 1,500 AI APPS
Across 1,514 scans, the median time from URL submission to the first proven critical finding was 27 seconds. Eighty-four percent of the apps with a critical exposed it inside the first minute. Here is the distribution, by platform and by finding class.
This is a distribution study. We measured wall-clock time from URL submission to the first proven critical finding across 1,514 AI-built apps. Among the 712 apps where a critical was found, the median is 27 seconds and the 90th percentile is 92 seconds. Beyond two minutes, the long tail begins: apps that needed a full crawl, multiple probes, or authenticated state to surface a critical.
If your reaction is “that’s fast”, the unsettling implication is that an attacker is not slower.
Headline numbers
| Metric | Value |
|---|---|
| Apps scanned | 1,514 |
| Apps where a critical was found | 712 (47%) |
| Median time-to-first-critical (where found) | 27 seconds |
| 90th percentile | 92 seconds |
| Fastest observed | 4.2 seconds |
| Slowest within-scan | 8 minutes 11 seconds |
| Window | Nov 2025 – Apr 2026 |
Distribution
| Time bucket | Apps reaching first critical in this bucket | Cumulative share |
|---|---|---|
| 0-15s | 198 | 28% |
| 15-30s | 218 | 58% |
| 30-60s | 184 | 84% |
| 60-120s | 71 | 94% |
| 120-300s | 32 | 99% |
| 300s+ | 9 | 100% |
Eighty-four percent of apps with a critical exposed it inside the first minute. The 0-15 second bucket is dominated by a single failure class — secrets in the static bundle.
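The cumulative shares, and rough percentile estimates, can be recomputed from the bucket counts alone. The sketch below is a minimal illustration, not the study's own tooling: the function names are made up for this example, the 491-second cap on the open-ended last bucket comes from the slowest scan in the headline table, and the percentile estimate assumes findings are spread evenly inside each bucket.

```python
from itertools import accumulate

# (lower bound s, upper bound s, apps), copied from the distribution table.
# The last bucket is open-ended; 491 s (8 min 11 s, the slowest scan in the
# headline table) is used as its upper bound. That cap is an assumption.
BUCKETS = [
    (0, 15, 198),
    (15, 30, 218),
    (30, 60, 184),
    (60, 120, 71),
    (120, 300, 32),
    (300, 491, 9),
]


def cumulative_shares(buckets):
    """Return the cumulative share per bucket as whole percentages."""
    total = sum(n for _, _, n in buckets)
    running = accumulate(n for _, _, n in buckets)
    return [round(100 * count / total) for count in running]


def interpolated_percentile(buckets, q):
    """Estimate the q-th percentile (0 < q <= 100) in seconds.

    Assumes findings are spread evenly inside each bucket, so the result
    is an approximation of the true percentile, not a measurement.
    """
    total = sum(n for _, _, n in buckets)
    target = q / 100 * total
    seen = 0
    for low, high, n in buckets:
        if seen + n >= target:
            return low + (target - seen) / n * (high - low)
        seen += n
    return buckets[-1][1]


if __name__ == "__main__":
    print(cumulative_shares(BUCKETS))
    print(interpolated_percentile(BUCKETS, 50))
    print(interpolated_percentile(BUCKETS, 90))
```

Under those assumptions the interpolated median and 90th percentile come out at roughly 26 and 94 seconds, close to the reported 27 and 92, which is a useful sanity check on the table.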
By finding class
| Finding class | Median time-to-detect | Why it ranks here |
|---|---|---|
| Secret in static bundle | 6s | Detectable on first parse of HTML and JS |
| Source map shipped to production | 8s | Same as above; one extra network request |
| Open Firebase rules | 12s | One unauthenticated read against the public REST endpoint |
| Permissive Supabase RLS | 18s | Anon key extraction + PostgREST table enumeration |
| RLS off entirely | 22s | Slightly slower because the scan has to enumerate the table |
| CORS allow-all on credentialed endpoint | 31s | Requires a preflight test against an API endpoint |
| BOLA — read | 64s | Two test users + cross-account fetch |
| BOLA — write (PATCH/PUT) | 78s | Same as above plus the write-back step |
| Self-editable role | 91s | Requires successful auth, profile fetch, mutated PATCH, re-fetch |
| Open redirect on auth callback | 134s | Requires triggering full auth flow |
The five fastest classes — bundle secrets, source maps, open Firebase rules, permissive RLS, RLS off — together cover 76% of all critical findings. They are also the five classes that need zero authenticated probing to detect.
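Static-parse detection amounts to pattern matching over the HTML and JavaScript the browser already receives. Here is a minimal sketch, assuming the two patterns used in the reproduction commands at the end of this study (a Stripe-style `sk_live_` / `sk_test_` key and a JWT-shaped token). `find_bundle_secrets` is a hypothetical helper, and a production scanner would need many more patterns plus the replay step described under Methodology.

```python
import re

# Patterns taken from the reproduction commands in this study.
# A JWT-shaped match is not a critical on its own: a Supabase anon key is
# meant to be public, and only becomes interesting as input to the RLS probe.
PATTERNS = {
    "stripe_secret_key": re.compile(r"sk_(?:live|test)_[A-Za-z0-9]{20,}"),
    "jwt_shaped_token": re.compile(
        r"eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"
    ),
}


def find_bundle_secrets(text):
    """Return a sorted list of (kind, match) pairs found in HTML or JS text."""
    hits = set()
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.add((kind, match.group(0)))
    return sorted(hits)
```

Feeding the function the raw response body of a page or bundle is enough; no rendering, crawling, or authentication is involved, which is why this class sits at the bottom of the latency table.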
TTFC by detection technique
The same critical can be reached by different detection techniques. The technique determines the latency floor; pick the wrong one and the same finding takes ten times longer.
| Technique | Floor | Used for |
|---|---|---|
| Static parse of HTML / JS | ~4s | Secrets, source maps, inline config |
| Single unauthenticated PostgREST / API call | ~10s | RLS off, open Firebase, public S3 buckets |
| Anon-key extraction + targeted probe | ~15s | Permissive RLS, naked-database backends |
| Header inspection | ~6s | CSP / HSTS missing, CORS misconfig |
| Two-session cross-account probe | ~50s | BOLA on read |
| Two-session probe + write-back | ~70s | BOLA on PATCH/PUT/DELETE |
| Authenticated browser flow | ~120s | Open redirects on auth callback, OAuth flaws |
| Crawl + dynamic introspection | ~180s | GraphQL introspection abuse, Swagger-with-bearer |
The finding-class table above is the what; this table is the how. A scanner that lacks the two-session capability will report 0% BOLA findings, not because the bugs are absent, but because the technique needed to detect them is not in the scan.
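The two-session technique can be sketched as follows. This is an illustration under assumptions, not VibeEval's implementation: `fetch` is a hypothetical callable that performs an authenticated GET and returns a status code and body, and the two tokens belong to test accounts of which only the first owns the object.

```python
from typing import Callable, Tuple

# fetch(token, path) -> (status_code, body). Hypothetical signature: wrap any
# HTTP client you like behind it.
Fetch = Callable[[str, str], Tuple[int, str]]


def bola_read_suspected(
    fetch: Fetch, owner_token: str, other_token: str, path: str
) -> bool:
    """Return True when a second account can read the owner's object."""
    owner_status, owner_body = fetch(owner_token, path)
    if owner_status != 200:
        # The owner cannot read the object either, so there is no baseline.
        return False
    other_status, other_body = fetch(other_token, path)
    # Same successful response for a non-owner suggests a missing ownership check.
    return other_status == 200 and other_body == owner_body
```

The check needs two sequential authenticated requests after two logins, which is consistent with the higher latency floor for this technique. A positive result here is only a lead; it still needs the replay and confirmation step before it counts as a proven finding.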
CWE / OWASP mapping for the fastest-discovered classes
As noted above, the five fastest classes (bundle secrets, source maps, open Firebase rules, permissive RLS, RLS off) together cover 76% of all critical findings in the corpus. Every one of them is an authorization or credential failure that is exposed without any authenticated probing.
| Class | CWE | OWASP | Floor TTFC |
|---|---|---|---|
| Secret in static bundle | CWE-798 Hard-coded Credentials | A02 / A05 | ~4s |
| Source map shipped to production | CWE-538 Externally-Accessible File | A05 | ~6s |
| Open Firebase rules | CWE-862 Missing Authorization | A01 / API1 | ~10s |
| Permissive Supabase RLS | CWE-863 Incorrect Authorization | A01 / API1 | ~12s |
| RLS off entirely | CWE-862 Missing Authorization | A01 / API1 | ~14s |
| CORS allow-all on credentialed endpoint | CWE-942 Permissive Cross-domain Policy | A05 / API8 | ~20s |
| BOLA — read | CWE-639 Auth Bypass via Key | A01 / API1 | ~50s |
| Mass assignment / self-editable role | CWE-915 Mass Assignment | A04 / API6 | ~60s |
| Open redirect on auth callback | CWE-601 URL Redirect to Untrusted | A01 / API8 | ~120s |
CWE-862 and CWE-863 are the dominant pair. The vast majority of fast-discovered criticals are missing or incorrect authorization: the scanner is not finding clever exploits, it is finding doors with no lock.
Per-platform breakdown
Median time-to-first-critical, on the apps where one was found.
| Platform | Apps with critical | Median TTFC | Modal first-critical class |
|---|---|---|---|
| Lovable | 355 | 19s | Permissive RLS |
| Bolt.new | 156 | 12s | Secret in bundle |
| Cursor | 101 | 41s | BOLA on read |
| Replit | 88 | 31s | Open Firebase rules |
| V0 | 33 | 47s | Self-editable role |
Bolt’s 12-second median is the shortest because Bolt-built apps fail fastest on the secret-in-bundle class, a detection that needs only a parsed bundle. The longest medians belong to V0 (47 seconds) and Cursor (41 seconds), whose apps tend to fail on logic-layer issues (BOLA, self-editable fields) that require authenticated probing.
Faster median is not better here. It means the failure is shallower.
What this means
For builders: the failures that are findable in under a minute are the failures that are exploitable in under a minute. Time-to-first-critical is a credible estimate of how long your app can stay public before an attacker running a similar scanner finds the same thing.
For researchers: this is a baseline for any future scanner comparison. A claim that a new tool finds critical issues “faster” than VibeEval needs to beat a 27-second median on a comparable corpus. We will share the corpus URL list under NDA on request.
For investors and acquirers: when you are doing security due diligence on a vibe-coded SaaS, the time it takes to find the first critical is a useful informal scoring axis. A clean URL ten minutes in is meaningful signal; a critical inside thirty seconds is meaningful signal too.
Methodology
Sample. All 1,514 apps in the corpus.
Timer. Started on the first outbound request from the scanner against the target URL. Stopped at the moment a critical-severity finding was captured, replayed in sandbox, and confirmed.
Severity. CVSS 3.1 with the published rubric. Critical = 9.0+.
Equipment. Scans ran from us-east-1 over a one-gigabit connection. Latencies elsewhere will be higher; we expect time-to-first-critical to scale roughly linearly with round-trip time on the bundle-parse-bound classes.
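The linear-scaling expectation can be written as a one-line estimate. This is a sketch only: the study does not report round-trip times, so both RTT arguments are values you would have to measure yourself, and the proportional model is the rough expectation stated above, not a fitted result.

```python
def scale_ttfc(measured_s: float, measured_rtt_ms: float, target_rtt_ms: float) -> float:
    """Scale a bundle-parse-bound TTFC linearly with round-trip time.

    measured_s      -- TTFC observed from the original vantage point, in seconds
    measured_rtt_ms -- round-trip time from that vantage point (your measurement)
    target_rtt_ms   -- round-trip time from the new vantage point (your measurement)
    """
    if measured_rtt_ms <= 0 or target_rtt_ms <= 0:
        raise ValueError("round-trip times must be positive")
    return measured_s * target_rtt_ms / measured_rtt_ms
```

The estimate applies only to classes whose latency is dominated by fetching and parsing the bundle; classes that wait on logins or multi-step flows will not scale this cleanly.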
Statistical handling. Where multiple criticals were found, only the first counts toward the timing. Apps that timed out without a critical are excluded from this study (they are counted in the main benchmark).
Calibration against ref0. Every probe is also run against ref0, a clean reference site. A probe that fires on ref0 is by construction a false positive and is excluded from the corpus aggregation. The TTFC numbers above are net of false-positive elimination. Without that step, a probe that incorrectly fires within 5 seconds against a clean target would become the headline number, which is the primary reason most “fast scanner” claims do not hold up under scrutiny.
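Taken together, the timer, severity, first-finding, and ref0 rules reduce to a small calculation per app. Here is a minimal sketch, assuming each confirmed finding is recorded as a hypothetical `(probe_id, cvss_score, confirmed_at)` tuple with timestamps in seconds; the record shape and function name are illustrative, not the scanner's real data model.

```python
from typing import Iterable, Optional, Set, Tuple

CRITICAL_CVSS = 9.0  # Critical threshold stated in the Severity rule.

# (probe_id, cvss_score, confirmed_at_seconds). Hypothetical record shape.
Finding = Tuple[str, float, float]


def time_to_first_critical(
    findings: Iterable[Finding],
    ref0_false_positives: Set[str],
    started_at: float,
) -> Optional[float]:
    """Return seconds from scan start to the first confirmed critical.

    Probes that also fired on the clean ref0 control are dropped first.
    Returns None when no critical survives, in which case the app is
    excluded from the timing distribution.
    """
    elapsed = [
        confirmed_at - started_at
        for probe_id, score, confirmed_at in findings
        if score >= CRITICAL_CVSS and probe_id not in ref0_false_positives
    ]
    return min(elapsed) if elapsed else None
```

The order of operations matters: filtering on ref0 before taking the minimum is what keeps a noisy five-second probe from setting the headline figure.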
Reproduce on the public benchmark
Each detection class above can be reproduced against a live gapbench scenario. The TTFC floor is roughly the same against the gapbench scenarios as it is against real corpus apps — these scenarios are deliberately shaped to mirror the same failure surfaces.
| Class | Scenario | URL |
|---|---|---|
| Secret in static bundle | Indie SaaS | /site/indie-saas/ |
| Permissive RLS | Supabase clone | /site/supabase-clone/ |
| BOLA on read | Multi-tenant SaaS | /site/multi-tenant-saas/ |
| BOLA on PATCH (balance) | Fintech app | /site/fintech-app/ |
| Mass assignment / self-editable role | Mass assignment | /site/mass-assignment/ |
| Open redirect on auth callback | OAuth redirect_uri | /site/ssrf-open-redirect-oauth/ |
| ref0 (clean control) | ref0 | /site/ref0/ |
For the manifesto-level argument behind this style of measurement, see “Why we built gapbench” and “False positives and the ref0 control”.
How to reproduce
Run VibeEval against any URL. The scanner displays a live timer and announces each finding as it lands; the first critical timestamp is preserved in the report.
Citations
VibeEval. The First 60 Seconds: Time-to-First-Critical Across 1,500 AI Apps. May 2026. https://vibe-eval.com/data-studies/time-to-first-critical/
Related
- Pattern manifesto: Why we built gapbench
- Pattern walkthrough: False positives and the ref0 control — why the TTFC numbers are not just scanner noise
- Pattern walkthrough: BOLA in AI-generated CRUD — why BOLA detection is structurally slower than RLS
- Data study: 2026 AI App Security Benchmark
- Data study: Where Vibe Coders Leak Their Keys
- Data study: BOLA in AI-Generated CRUD
- Data study: Honeypot Supabase — time-to-abuse on the attacker side
- Guide: Solo Founder Pre-Launch Security Checklist
RUN IT YOURSELF
Each scenario below is live on the public benchmark. The commands are copy-paste ready. Outputs may evolve as we tune the scenarios; the bug stays.
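```bash
# Secret in static bundle: time how long it takes to find a Stripe-style key in the page.
time curl -s https://gapbench.vibe-eval.com/site/indie-saas/ | grep -oE 'sk_(live|test)_[A-Za-z0-9]{20,}' | head -1
# Permissive RLS: extract the anon key (a JWT) from the page, then read one row of users through the REST endpoint.
ANON=$(curl -s https://gapbench.vibe-eval.com/site/supabase-clone/ | grep -oE 'eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+' | head -1) && curl -s "https://gapbench.vibe-eval.com/site/supabase-clone/rest/v1/users?select=*&limit=1" -H "apikey: $ANON"
# BOLA on read: fetch project 1 as a second user. USER_B_TOKEN is a placeholder for that user's token.
curl -s https://gapbench.vibe-eval.com/site/multi-tenant-saas/api/projects/1 -H 'Authorization: Bearer USER_B_TOKEN'
# ref0 control: time a header-only request against the clean reference site.
time curl -s -I https://gapbench.vibe-eval.com/site/ref0/
```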