AI PENTEST FOR WEB APPLICATIONS: AUTOMATED SECURITY TESTING FOR SPAS & AI-BUILT APPS | VIBEEVAL

Modern web apps fail in modern ways. SPA bundles leak service-role keys, SSR pages skip middleware on the routes that matter, and server actions trust the client to send the right user ID. An AI pentest is scoped to the architecture you actually shipped, not a generic OWASP checklist.

What an AI pentest for web applications actually covers

A modern web app is not one thing. It is a JavaScript bundle, a tree of server-rendered routes, a fleet of API handlers, an edge layer, a backend, and a stack of third-party integrations. An AI pentest covers each layer with techniques specific to that layer, not a single generic crawl that treats every URL the same.

Across the apps we audit, the bugs that matter cluster into a small set of recurring patterns. The methodology below is built around catching those patterns, not around a generic OWASP checklist run with the safety set to “polite.”

AI-generated apps are especially vulnerable

Vibe-coded apps from Lovable, Bolt, v0, and Cursor ship with predictable vulnerability patterns that AI pentest agents are trained to find. These tools generate code fast but often skip authentication checks, expose API keys in client bundles, and leave authorization wide open. See the Vibe Code Scanner for a free dynamic scan of any AI-generated app.

Web application pentest checklist

Follow these 10 steps to thoroughly pentest your web application.

  1. Map the application attack surface. Crawl the app, capture every route the client hits, and enumerate forms, API calls, server actions, and third-party endpoints.
  2. Test authentication flows. Probe login, registration, password reset, magic links, and session management for bypass and logic flaws.
  3. Probe authorization boundaries. Verify that users cannot access resources or actions beyond their assigned roles, including cross-tenant in multi-tenant apps.
  4. Scan for XSS. Test every input for reflected, stored, and DOM-based cross-site scripting, with special attention to dangerouslySetInnerHTML usage.
  5. Test SQL/NoSQL injection vectors. Attempt injection on every database query path including search, filters, and dynamic queries.
  6. Check CSRF protection. Verify state-changing requests include proper anti-CSRF tokens and SameSite cookie attributes.
  7. Analyze client-side JavaScript bundles. Inspect bundles, source maps, and chunk files for hardcoded secrets, internal endpoints, and exposed environment variables.
  8. Test file upload functionality. Attempt to upload malicious files, bypass file-type restrictions, and probe for path traversal in upload handlers.
  9. Verify security headers. Check for Content-Security-Policy, X-Frame-Options, X-Content-Type-Options, Strict-Transport-Security, and Referrer-Policy.
  10. Test WebSocket connections. Validate authentication on the handshake and probe for message injection and authorization bypass on every channel.

Benefits of AI pentest for web apps

Tests like a real attacker

AI pentest agents chain vulnerabilities together the way human attackers do, finding exploitable paths, not just individual bugs.

Covers OWASP Top 10 automatically

Every scan tests for all OWASP Top 10 categories including injection, broken access control, and security misconfiguration.

Works with any framework

Whether your app is built with React, Next.js, Remix, SvelteKit, Vue, or Astro, the methodology adapts to the rendering and routing model.

No code changes required

Point the AI agent at your running application and it discovers and tests everything without instrumenting your code.

Pentest by web architecture

The bug taxonomy stays roughly constant; the failure surface changes with each architecture. Treating an SSR app like a static SPA misses the bugs that matter most.

| Architecture | Distinctive surface | Most common failure mode |
| --- | --- | --- |
| SPA (Vite / CRA / static React) | Client bundle plus a single API | Service-role keys baked into the bundle, missing auth on the API |
| SSR (Next.js, Remix) | Middleware, route handlers, server actions, server components | Middleware skipped on a route, server action that trusts the client |
| Edge / SSG (Astro, Nuxt static) | CDN-cached JSON, ISR endpoints, edge functions | Cached responses leaking another user’s data, edge function with no auth |
| AI-generated React app | Generated CRUD wired straight to Supabase | Missing RLS, exposed secrets, BOLA on every endpoint |

SPA-specific tests

Single-page apps push everything to the browser. The pentest has to treat the bundle as the source of truth for what the client knows.

  • Bundle inspection. Pull every JS chunk, source map, and worker file. Search for secrets, internal endpoints, environment variable strings, and references to admin routes.
  • Route enumeration. Walk the client-side router. Most SPAs leak the entire route table in the bundle, including admin routes that the user never sees.
  • API-only auth. The bundle is a list of API calls. Each call gets probed without auth, with another user’s auth, and with a downgraded role.
  • Local storage and IndexedDB. Inspect what the app stores client-side. Tokens with no expiry, full user objects with PII, and feature flags that gate paid features all show up here.
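
The API-only auth probe in the list above can be sketched as a small harness. This is a minimal illustration under stated assumptions, not VibeEval's implementation: `Session`, `Handler`, `probeEndpoint`, and the two example handlers are hypothetical names, the handler is an in-memory stand-in for a real HTTP call, and the endpoint under test is assumed to be one that only a specific admin user should reach.

```typescript
type Role = "admin" | "member";
type Session = { userId: string; role: Role } | null;

// Stand-in for an HTTP request: returns the status code the API would send.
type Handler = (session: Session) => number;

interface ProbeResult {
  probe: string;
  status: number;
  suspicious: boolean; // true when a request that should be denied got a 2xx
}

// Replays one admin-only call with three sessions that should all be rejected.
function probeEndpoint(handler: Handler, adminUserId: string): ProbeResult[] {
  const probes: Array<[string, Session]> = [
    ["no auth", null],
    ["another user's auth", { userId: `${adminUserId}-other`, role: "admin" }],
    ["downgraded role", { userId: adminUserId, role: "member" }],
  ];
  return probes.map(([probe, session]) => {
    const status = handler(session);
    return { probe, status, suspicious: status >= 200 && status < 300 };
  });
}

// Hypothetical handlers: one only checks that a session exists, one checks identity and role.
const vulnerable: Handler = (s) => (s ? 200 : 401);
const hardened: Handler = (s) =>
  !s ? 401 : s.userId === "u1" && s.role === "admin" ? 200 : 403;

// The vulnerable handler fails the cross-user and downgraded-role probes.
const findings = probeEndpoint(vulnerable, "u1").filter((r) => r.suspicious);
console.log(findings.map((r) => r.probe));
```

A real run would replace the handler with authenticated HTTP requests, one per API call extracted from the bundle.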

SSR-specific tests (Next.js, Remix)

SSR apps blur the line between client and server. The pentest has to follow the request through middleware, route handlers, server actions, and back to the client.

  • Middleware coverage. Test every protected route, including dynamic segments. A single missing entry in matcher exposes a route group.
  • Server-action authorization. Server actions accept arbitrary input. Test each action with another user’s resource ID, no session, and a downgraded role.
  • /_next/data JSON exposure. SSR apps expose internal JSON endpoints under /_next/data/<build-id>/.... Test each one with no auth — they sometimes return data that the rendered HTML hid.
  • Hydration mismatches. A component rendered server-side with admin data and hydrated client-side without it leaks the admin data through the initial HTML.
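
The middleware coverage test above amounts to comparing the list of routes that should be protected against the matcher. The sketch below assumes a deliberately simplified matcher: it understands only exact paths and a trailing `/:path*`, which is not the full pattern syntax Next.js supports. `isCovered`, `uncoveredRoutes`, and the example routes are hypothetical.

```typescript
const WILDCARD_SUFFIX = "/:path*";

// Simplified semantics (an assumption): "/dashboard/:path*" covers "/dashboard"
// and everything beneath it; any other pattern must match the route exactly.
function isCovered(route: string, matcher: string[]): boolean {
  return matcher.some((pattern) => {
    if (pattern.endsWith(WILDCARD_SUFFIX)) {
      const prefix = pattern.slice(0, pattern.length - WILDCARD_SUFFIX.length);
      return route === prefix || route.startsWith(prefix + "/");
    }
    return route === pattern;
  });
}

// Routes that require auth but that no matcher entry reaches.
function uncoveredRoutes(protectedRoutes: string[], matcher: string[]): string[] {
  return protectedRoutes.filter((route) => !isCovered(route, matcher));
}

// Hypothetical inventory: the matcher protects the dashboard group only.
const matcher = ["/dashboard/:path*"];
const protectedRoutes = ["/dashboard", "/dashboard/billing", "/settings/team", "/dashboardx"];

console.log(uncoveredRoutes(protectedRoutes, matcher));
```

A static comparison like this only narrows the search. The runtime check is still to request each flagged route with no session and see what comes back.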

Edge / SSG tests

Edge and SSG add caching, which adds caching bugs.

  • Cache key analysis. A response cached on a key that does not include the user ID will leak across users. The agent requests the same URL as two different users and diffs the response.
  • ISR revalidation. Some apps expose a revalidation endpoint with a weak token. Test it.
  • Edge function isolation. Edge functions run with their own runtime and their own env. Confirm they do not have backend-only secrets that should live in a Node runtime.
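
The cache key analysis above can be illustrated with an in-memory stand-in for a CDN. This is a sketch under assumptions: `makeCachedEndpoint`, `leaksAcrossUsers`, the `/api/me` path, and the example users are all hypothetical, and a real test would issue HTTP requests with two different sessions.

```typescript
type User = { id: string; email: string };
type FetchAs = (url: string, user: User) => string;

// Hypothetical cached endpoint: the response is per-user, the cache key is configurable.
function makeCachedEndpoint(keyFn: (url: string, user: User) => string): FetchAs {
  const cache = new Map<string, string>();
  return (url, user) => {
    const key = keyFn(url, user);
    const hit = cache.get(key);
    if (hit !== undefined) return hit; // served from cache, whoever asks
    const body = JSON.stringify({ email: user.email });
    cache.set(key, body);
    return body;
  };
}

// Request the same URL as two users; a leak is user B receiving user A's data.
function leaksAcrossUsers(fetchAs: FetchAs, url: string, a: User, b: User): boolean {
  fetchAs(url, a); // warm the cache as user A
  const bodyForB = fetchAs(url, b);
  return bodyForB.includes(a.email);
}

const alice: User = { id: "u1", email: "alice@example.com" };
const bob: User = { id: "u2", email: "bob@example.com" };

const keyedByUrlOnly = makeCachedEndpoint((url) => url);
const keyedByUrlAndUser = makeCachedEndpoint((url, user) => `${url}|${user.id}`);

console.log(leaksAcrossUsers(keyedByUrlOnly, "/api/me", alice, bob)); // true
console.log(leaksAcrossUsers(keyedByUrlAndUser, "/api/me", alice, bob)); // false
```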

AI-generated React apps

The dominant failure pattern is the same regardless of generator. Code wires React to Supabase or Firebase directly, RLS or rules are missing on at least one table, and the anon key plus the missing policy equals an open database. See Vibe Pentesting for the cross-tool methodology and Lovable Pentesting for the Lovable-specific playbook.

Common vulnerabilities AI finds in web applications

Broken access control

Users accessing admin panels, viewing other users’ data, bypassing paywalls. AI tests every route with different user roles to find access control gaps. Broken access control is ranked first (A01) in the OWASP Top 10 (2021 edition).

Cross-site scripting (XSS)

Stored, reflected, and DOM-based XSS from unsanitized user inputs. AI agents inject payloads into every input field, URL parameter, and header. AI-generated apps from Lovable and Bolt frequently use dangerouslySetInnerHTML without sanitization.

SQL/NoSQL injection

AI tests every database query path for injection. Supabase apps with custom RPC functions and Firebase apps with unvalidated Firestore queries are common targets.

Exposed API keys

AI scans JavaScript bundles, source maps, and network requests for leaked Stripe keys, Supabase anon keys with overly permissive RLS, and OpenAI API keys. Vibe-coded apps leak secrets at 3x the rate of hand-coded apps. Run the Token Leak Checker for a focused scan.

Authentication bypass

Weak session handling, JWT vulnerabilities, and password reset flaws. AI tests login flows, token validation, and session management end to end.

Missing security headers

CSP, HSTS, X-Frame-Options, X-Content-Type-Options. AI checks every response for proper security header configuration. Run the Security Headers Checker for a focused scan.
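
A presence check for the four headers named above is short enough to sketch. It assumes the response headers are already available as a plain object; `REQUIRED_HEADERS` and `missingSecurityHeaders` are hypothetical names. It only confirms that each header exists and does not judge whether the value is a sound policy.

```typescript
// Lowercased names, since HTTP header names are case-insensitive.
const REQUIRED_HEADERS = [
  "content-security-policy",
  "strict-transport-security",
  "x-frame-options",
  "x-content-type-options",
];

// Returns the required headers that are absent from a response.
function missingSecurityHeaders(headers: Record<string, string>): string[] {
  const present = new Set(Object.keys(headers).map((name) => name.toLowerCase()));
  return REQUIRED_HEADERS.filter((name) => !present.has(name));
}

// Hypothetical response that sets CSP and X-Frame-Options only.
const responseHeaders = {
  "Content-Security-Policy": "default-src 'self'",
  "X-Frame-Options": "DENY",
};

console.log(missingSecurityHeaders(responseHeaders));
// [ 'strict-transport-security', 'x-content-type-options' ]
```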

Anonymized real-world findings

Examples from apps we have audited. Tenant identifiers and exact routes are altered.

Supabase service role key in _app.tsx

Endpoint: /static/js/main.<hash>.js. Evidence: a string starting with eyJ decoded as a JWT with role: "service_role". Impact: full read/write to every Supabase table, RLS bypassed entirely, no rate limit. Fix: rotate the key in the Supabase dashboard, move every server-side call into a Supabase Edge Function or a Next.js route handler, and replace the client-side import with the anon key. See the Supabase RLS Checker.
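
The evidence step in this finding, spotting an `eyJ` string and decoding its role claim, can be sketched as follows. This is an illustration only: `decodeJwtPayload`, `findServiceRoleTokens`, and the helpers that build fake tokens are hypothetical, and the code assumes a runtime with global `atob` and `btoa` (browsers and recent Node versions). The payload is decoded without verifying the signature, which is all a bundle scan needs.

```typescript
// Decode the payload (second segment) of a JWT without verifying it.
function decodeJwtPayload(token: string): Record<string, unknown> | null {
  const segment = token.split(".")[1];
  if (!segment) return null;
  const b64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
  try {
    const parsed: unknown = JSON.parse(atob(padded));
    return typeof parsed === "object" && parsed !== null
      ? (parsed as Record<string, unknown>)
      : null;
  } catch {
    return null; // not valid base64 or not JSON
  }
}

// Find JWT-shaped strings in bundle text and keep those claiming the service role.
function findServiceRoleTokens(bundleText: string): string[] {
  const jwtLike = /eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*/g;
  const candidates = bundleText.match(jwtLike) ?? [];
  return candidates.filter((t) => decodeJwtPayload(t)?.role === "service_role");
}

// Build two fake tokens (placeholder signature) to stand in for a real bundle.
const toB64Url = (obj: object): string =>
  btoa(JSON.stringify(obj)).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
const fakeToken = (role: string): string =>
  `${toB64Url({ alg: "HS256", typ: "JWT" })}.${toB64Url({ role })}.c2ln`;

const bundle = `const anon="${fakeToken("anon")}";const svc="${fakeToken("service_role")}";`;
console.log(findServiceRoleTokens(bundle).length); // 1
```

An anon key in the bundle is expected and is not flagged here. Only a token whose decoded role is `service_role` counts as a hit.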

Missing middleware auth on /dashboard/billing

Endpoint: /dashboard/billing. Evidence: the route returned 200 with the billing UI when requested with no session cookie. The middleware matcher array in source covered /dashboard/:path*, but the deployed build had a stale config. Impact: unauthenticated users could view the billing page shell and trigger downstream API calls. Those calls did re-check auth, but the page itself surfaced the org name and plan tier through SSR. Fix: redeploy so the build picks up the current matcher, confirm it covers everything under /dashboard (for example ["/dashboard/(.*)"]), and add an explicit session check inside the layout. Cross-reference AI Pentest for SaaS for tenant scoping.

IDOR in a server action

Endpoint: updateInvoice server action. Evidence: the action accepted { invoiceId, amount } and called db.invoices.update({ where: { id: invoiceId }, data: { amount } }) with no ownership check. A logged-in user could pass any other user’s invoice ID. Impact: any user could rewrite any invoice. Fix: add where: { id: invoiceId, ownerId: session.user.id } and return a 404 on miss. See AI Pentest for APIs for the BOLA section.
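
The fix can be sketched against an in-memory table. This is a minimal illustration, not the audited app's code: the `Invoice` type, the sample rows, and the `null` return (which the caller would map to a 404) are assumptions, and the array stands in for the `db.invoices` call from the finding. The point is that the ID and the owner are matched in the same lookup.

```typescript
type Invoice = { id: string; ownerId: string; amount: number };

// In-memory stand-in for the invoices table.
const invoices: Invoice[] = [
  { id: "inv_1", ownerId: "alice", amount: 100 },
  { id: "inv_2", ownerId: "bob", amount: 250 },
];

// The lookup filters on id AND owner together, mirroring
// where: { id: invoiceId, ownerId: session.user.id }.
// A foreign invoice is indistinguishable from a missing one.
function updateInvoice(sessionUserId: string, invoiceId: string, amount: number): Invoice | null {
  const invoice = invoices.find(
    (row) => row.id === invoiceId && row.ownerId === sessionUserId,
  );
  if (!invoice) return null; // caller responds with 404
  invoice.amount = amount;
  return invoice;
}

console.log(updateInvoice("alice", "inv_2", 1)); // null: inv_2 belongs to bob
console.log(updateInvoice("alice", "inv_1", 120)?.amount); // 120
```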

JWT verification skipped in API route

Endpoint: /api/admin/export. Evidence: the handler called jwt.decode instead of jwt.verify, so the signature was never checked. Any token, including unsigned ones with alg: "none", was accepted. Impact: an attacker could forge a token claiming role: "admin" and trigger the export. Fix: use jwt.verify with the secret and an explicit allowlist of algorithms, so that alg: "none" and any other unexpected algorithm is rejected.
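
To show why decode-only handling is exploitable, the sketch below forges an unsigned token and reads its claims the way a decode-only handler would. It then shows an algorithm allowlist check. All names (`toB64Url`, `decodeSegment`, `decodeOnlyRole`, `hasAllowedAlg`) are hypothetical, and global `atob` and `btoa` are assumed. The allowlist is only a pre-check: it does not verify the signature, which still has to be done by the JWT library's verify call.

```typescript
const toB64Url = (obj: object): string =>
  btoa(JSON.stringify(obj)).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");

// Decode one base64url JWT segment into an object. Throws on malformed input.
function decodeSegment(segment: string): Record<string, unknown> {
  const b64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  return JSON.parse(atob(b64 + "=".repeat((4 - (b64.length % 4)) % 4)));
}

// What a decode-only handler effectively does: trust claims with no signature check.
function decodeOnlyRole(token: string): unknown {
  const [, payload = ""] = token.split(".");
  return decodeSegment(payload).role;
}

// Pre-check before signature verification: accept only pinned algorithms.
function hasAllowedAlg(token: string, allowed: string[]): boolean {
  const [header = ""] = token.split(".");
  try {
    const alg = decodeSegment(header).alg;
    return typeof alg === "string" && allowed.includes(alg);
  } catch {
    return false; // unreadable header is rejected
  }
}

// Forged token: alg "none", attacker-chosen claims, empty signature.
const forged = `${toB64Url({ alg: "none", typ: "JWT" })}.${toB64Url({ role: "admin" })}.`;

console.log(decodeOnlyRole(forged)); // "admin": the forged claim is accepted
console.log(hasAllowedAlg(forged, ["HS256"])); // false: rejected before verification
```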

Source map shipping to production

Endpoint: /static/js/main.<hash>.js.map. Evidence: the map was world-readable and reconstructed the original TypeScript including comments naming the internal RPC endpoints. Impact: full visibility into business logic and internal endpoint URLs. Fix: set productionBrowserSourceMaps: false in next.config.js (or strip maps in your build pipeline) and rotate any secrets the map referenced.
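
One quick way to spot this finding is to look for the `sourceMappingURL` comment that bundlers append to a chunk, then request the file it names. The sketch below covers the first half only. `findSourceMapUrl` is a hypothetical helper and the file name `main.abc123.js.map` is a made-up example.

```typescript
// Return the source map reference from a bundle's trailing comment, if present.
function findSourceMapUrl(bundleText: string): string | null {
  const match = /\/\/[#@]\s*sourceMappingURL=(\S+)\s*$/.exec(bundleText);
  return match?.[1] ?? null;
}

// Hypothetical chunk contents with and without a map reference.
const withMap = 'console.log("app");\n//# sourceMappingURL=main.abc123.js.map\n';
const withoutMap = 'console.log("app");\n';

console.log(findSourceMapUrl(withMap)); // "main.abc123.js.map"
console.log(findSourceMapUrl(withoutMap)); // null
```

A missing comment does not prove the map is absent from the server, so requesting the chunk URL with `.map` appended remains a useful second check.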

Hydration leak of admin object

Endpoint: /account. Evidence: a server component rendered an AdminBadge only when the user was admin, but the SSR HTML shipped the full admin user object inside the React Server Component payload regardless of the role check. Impact: any user could read the admin user list from the rendered page. Fix: gate data fetching, not just rendering. Pull admin data only inside an admin-gated server component.
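
"Gate data fetching, not just rendering" can be sketched without any framework. In this illustration the admin fetch runs only behind the role check, so the serialized payload for a non-admin never contains the list. `loadAccountData`, `AccountData`, and the example fetcher are hypothetical names, and `JSON.stringify` stands in for whatever the server serializes into the page.

```typescript
type Role = "admin" | "member";
type AccountData = { userId: string; adminUsers?: string[] };

// The admin fetch is called only for admins, so other payloads cannot include it.
function loadAccountData(
  session: { userId: string; role: Role },
  fetchAdminUsers: () => string[],
): AccountData {
  const data: AccountData = { userId: session.userId };
  if (session.role === "admin") {
    data.adminUsers = fetchAdminUsers();
  }
  return data;
}

// Hypothetical fetcher that records how often it is called.
let adminFetchCalls = 0;
const fetchAdminUsers = (): string[] => {
  adminFetchCalls += 1;
  return ["root@example.com"];
};

const memberPayload = JSON.stringify(
  loadAccountData({ userId: "u2", role: "member" }, fetchAdminUsers),
);
console.log(memberPayload); // {"userId":"u2"}
console.log(adminFetchCalls); // 0
```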

Fix prompts you can paste

These are the prompts we hand to teams after a pentest. Paste them into Cursor, Claude Code, or Lovable next to the flagged file.

Fix exposed Supabase service role key

The Supabase service role key is currently imported in lib/supabase.ts and ends up in the client bundle. Remove the service role key from any file that ships to the browser. Move all queries that need it to a Next.js route handler in app/api/<resource>/route.ts. Use the anon key on the client. Do not introduce a separate “isomorphic” module — keep server and client imports strictly separate.

Add server-action ownership check

The server action updateInvoice in app/actions/invoices.ts does not verify that the invoice belongs to the current user. Update it to load the invoice with where: { id, ownerId: session.user.id }. If no invoice is returned, throw a 404. Do not use findUnique followed by an in-memory check — combine the query and the authorization filter into one.

Tighten Next.js middleware matcher

The middleware in middleware.ts only protects /dashboard/:path*. Audit app/ and add every route group that requires auth to the matcher. Then add an explicit await getSession() check at the top of app/dashboard/layout.tsx so a future middleware regression cannot expose the layout.

Why AI-generated web apps need extra testing

Apps built with AI coding tools ship 10x faster than traditionally coded apps. But speed comes at a cost: AI code generators optimize for functionality, not security. They generate working login flows without rate limiting, database queries without parameterization, and API endpoints without authorization middleware.

Across the AI-generated web apps we audit, the same critical vulnerabilities recur. The most common: missing Row Level Security on Supabase tables, exposed API keys in client-side code, and authentication bypass through direct API access.

Traditional web scanners like OWASP ZAP and Burp Suite find some of these issues, but they can’t understand application context. They don’t know that /api/admin should require admin authentication, or that one user shouldn’t be able to read another user’s /api/orders/:id. AI pentest agents understand these business rules and test them systematically.

Web application pentest scope

Frontend

React, Next.js, Remix, Vue, Svelte, Astro components, client-side routing, form validation bypass, local storage data exposure, source map leaks.

Backend APIs

REST and GraphQL endpoint security, authentication and authorization, input validation, rate limiting, error handling. See AI Pentest for APIs.

Database

SQL injection, NoSQL injection, RLS policy validation, data exposure through API responses. See Backend Security: Firebase Rules and the Supabase RLS Checker.

Infrastructure

HTTPS configuration, security headers, CORS policies, cookie flags, CSP directives.

Third-party integrations

Payment flows (Stripe), auth providers (Auth0, Clerk), file upload services, analytics leaks.

AI pentest vs Burp / ZAP / Snyk for web apps

| Test class | Burp / ZAP | Snyk / SAST | AI pentest |
| --- | --- | --- | --- |
| Reflected XSS | Yes | Limited | Yes |
| Stored XSS | Manual | No | Yes |
| BOLA / IDOR | Manual | No | Yes |
| Bundle secret scan | No | Partial (source) | Yes |
| Server-action authz | No | No | Yes |
| Missing middleware on a route | No | No | Yes |
| RLS / Firestore rules | No | No | Yes |
| Compliance report | No | Partial | Yes |

See the full breakdown in VibeEval vs Burp Suite, VibeEval vs OWASP ZAP, and VibeEval vs Snyk.

Pentest your web app today

VibeEval’s AI pentest agents find real vulnerabilities in your web application in minutes, not weeks. No setup, no code changes, no false positives.

COMMON QUESTIONS

01
What does an AI pentest for a web application actually test?
Route-level authentication, authorization on every action, client-bundle secrets, server-side input validation, security headers, third-party integration flows, and architecture-specific issues like SSR hydration mismatches, source-map exposure, and middleware bypasses on dynamic routes.
02
How is testing a Next.js app different from testing a single-page React app?
An SPA is a static bundle plus an API. The bundle is the client-side attack surface and the API is everything else. A Next.js app blends them: server components, server actions, middleware, route handlers, /_next/data JSON endpoints, and ISR caches all sit between the user and the data. The pentest has to cover each layer separately.
03
Why do AI-generated React apps keep shipping with the Supabase service role key in the bundle?
AI coding tools see the service role key in an example and copy it into a config file that ends up imported by client code. Vite or Next.js then bakes that string into the JavaScript bundle. Static analysis catches some of it. A runtime pentest inspects the bundle the user actually receives, so it sees any key that actually shipped.
04
Can an AI pentest find broken authentication on protected dashboard routes?
Yes. The agent enumerates the route table from the rendered app and tries each route with no session, an expired session, a different user's session, and a downgraded role. SSR apps that rely on a single middleware file frequently miss a route — the agent finds the one that slipped through.
05
Do source maps in production matter?
Yes. A production source map exposes the entire pre-minified codebase including comments, internal endpoint URLs, environment variable names, and sometimes inlined secrets. The fix is simple — disable source map upload to the public bucket — but the find rate stays high because the default in many bundlers is to ship them.
06
How does the pentest handle authenticated routes without credentials?
Either you provide a test account, or you sign up through the app. The agent then drives the same flows a logged-in user would, with the addition of automated cross-user, cross-role, and cross-tenant probes.
07
What is a server-action authorization bug?
Next.js server actions and Remix actions accept arbitrary input from the client. If the action does not re-verify ownership of the resource it is mutating, any logged-in user can call the action with any ID. We see this pattern constantly in AI-generated dashboard code.
08
Can the AI pentest run on a staging URL or does it need production?
Either. Staging is preferred for the first scan because it removes any concern about traffic in production logs. Once a clean baseline is established, running on every production deploy is the right cadence.

SCAN YOUR APP

14-day trial. No card. Results in under 60 seconds.
