AI PENTEST FOR APIS: AUTOMATED REST & GRAPHQL SECURITY TESTING | VIBEEVAL
APIs fail in ways that crawlers and scanners cannot see. BOLA needs two valid sessions and an understanding of what each ID means. Mass assignment needs an extra field that the spec does not document. GraphQL introspection needs a schema query, not a URL crawl. AI pentest agents do all three by default.
What an AI pentest for APIs actually covers
APIs are the data plane. Every web app, mobile app, integration partner, and AI agent in your stack talks to the API, and most production-impact bugs live here rather than in the UI. A pentest scoped to APIs walks the entire surface — REST, GraphQL, WebSocket, gRPC, webhooks — and probes each one for the OWASP API Top 10 plus the architecture-specific failure modes that scanners miss.
APIs are the number-one attack surface
91% of web attacks target API endpoints, and AI-generated backends often skip authorization checks entirely. A single missing auth check can expose your entire database to unauthenticated access. AI-generated code from Cursor, Lovable, and Bolt produces working endpoints that route, parse, and persist correctly — but routinely ship without the ownership check that turns an endpoint into a safe one.
API pentest checklist
- Discover all API endpoints. Crawl docs, OpenAPI specs, network traffic, source maps, and client bundles to build a complete endpoint map.
- Test authentication mechanisms. Probe JWT handling, OAuth flows, API key validation, and session management for bypass.
- Verify authorization per endpoint. Ensure every endpoint enforces proper access control and users cannot access other users' resources.
- Test rate limiting. Verify rate limits on auth, password reset, and resource-intensive endpoints.
- Probe input validation. Send malformed, oversized, and unexpected types to every parameter.
- Check for mass assignment. Test whether endpoints accept extra fields that escalate privileges or modify protected attributes.
- Test BOLA / IDOR. Systematically swap object IDs across endpoints to find broken object-level authorization.
- Analyze error responses. Check that errors do not leak stack traces, schemas, or internal information.
- Test GraphQL introspection. Confirm introspection is disabled in production and probe for depth, complexity, and batching attacks.
- Verify API versioning security. Test deprecated versions for vulnerabilities and confirm old endpoints are decommissioned.
Benefits of AI pentest for APIs
Discovers hidden endpoints automatically
AI agents crawl your application to find undocumented endpoints, admin routes, and debug interfaces that manual testing misses.
Tests every parameter combination
Exhaustively tests parameter combinations, edge cases, and boundary conditions that would take human testers weeks.
Catches BOLA / IDOR that scanners miss
AI understands application context to test object-level authorization, the number-one API vulnerability that traditional scanners cannot detect.
Supports REST, GraphQL, gRPC, and WebSocket
Works with any API architecture including REST, GraphQL, gRPC, and WebSocket endpoints out of the box.
OWASP API Security Top 10 with anonymized examples
How AI pentest agents handle each category in the OWASP API Security Top 10, with an example from a real engagement (anonymized).
API1 — Broken Object Level Authorization (BOLA)
AI swaps user IDs across every endpoint. GET /api/users/123 becomes GET /api/users/456. This is the number-one API vulnerability and the one traditional scanners miss most often.
Anonymized finding. Endpoint: /api/organizations/<id>. A Supabase-powered SaaS returned full org data including billing details and member emails. The endpoint checked authentication but not membership. The AI agent found this in 30 seconds by retrying every observed org ID with each test user. Fix: scope the query by membership.
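A minimal sketch of the membership-scoped lookup named in the fix. The in-memory `orgs` and `memberships` arrays and the `getOrgForUser` helper are hypothetical stand-ins for the real tables and query layer; they are not taken from the engagement.

```typescript
// Hypothetical in-memory data standing in for the real tables.
type Org = { id: string; name: string; billingEmail: string };
type Membership = { orgId: string; userId: string };

const orgs: Org[] = [
  { id: "org_1", name: "Acme", billingEmail: "billing@acme.example" },
  { id: "org_2", name: "Globex", billingEmail: "billing@globex.example" },
];
const memberships: Membership[] = [
  { orgId: "org_1", userId: "user_a" },
  { orgId: "org_2", userId: "user_b" },
];

// Scope the lookup by membership: the org is returned only when the
// requesting user belongs to it. A non-member gets the same result as
// a request for an org that does not exist.
function getOrgForUser(orgId: string, userId: string): Org | null {
  const isMember = memberships.some(
    (m) => m.orgId === orgId && m.userId === userId,
  );
  if (!isMember) return null;
  return orgs.find((o) => o.id === orgId) ?? null;
}
```

In a SQL-backed handler the same idea is usually a join or a `WHERE` clause on the membership table, so the check cannot be skipped by a code path that forgets to call it.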
API2 — Broken Authentication
AI probes JWT handling, tests for token reuse, checks expiration enforcement, and attempts authentication bypass through parameter manipulation.
Anonymized finding. Endpoint: /api/admin/export. The handler called jwt.decode rather than jwt.verify. The agent crafted a token with alg: "none" and the handler accepted it. Fix: replace with jwt.verify, allowlist the algorithm explicitly, reject on failure.
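The difference between decoding and verifying can be sketched with Node's built-in `crypto` module. This is a hand-rolled HS256 check for illustration only: the helper names (`decodeOnly`, `signHs256`, `verifyHs256`) are hypothetical, expiry and audience checks are omitted, and a production handler would more likely rely on a maintained JWT library with an explicit algorithm allowlist.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type Claims = Record<string, unknown>;

const toB64Url = (text: string): string =>
  Buffer.from(text, "utf8").toString("base64url");
const fromB64Url = (text: string): string =>
  Buffer.from(text, "base64url").toString("utf8");

// Decode-only: reads the payload and never looks at the signature.
function decodeOnly(token: string): Claims {
  return JSON.parse(fromB64Url(token.split(".")[1] ?? "")) as Claims;
}

// Illustrative helper that issues an HS256 token for the demo.
function signHs256(claims: Claims, secret: string): string {
  const head = toB64Url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = toB64Url(JSON.stringify(claims));
  const sig = createHmac("sha256", secret)
    .update(`${head}.${body}`)
    .digest("base64url");
  return `${head}.${body}.${sig}`;
}

// Verify: allowlist the algorithm, recompute the signature, reject on failure.
function verifyHs256(token: string, secret: string): Claims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [head, body, sig] = parts;
  try {
    const header = JSON.parse(fromB64Url(head)) as { alg?: unknown };
    if (header.alg !== "HS256") return null; // rejects alg "none"
    const expected = createHmac("sha256", secret)
      .update(`${head}.${body}`)
      .digest();
    const given = Buffer.from(sig, "base64url");
    if (given.length !== expected.length || !timingSafeEqual(given, expected)) {
      return null;
    }
    return JSON.parse(fromB64Url(body)) as Claims;
  } catch {
    return null; // malformed header or payload
  }
}

// The probe: an unsigned token that claims alg "none".
const forged =
  `${toB64Url(JSON.stringify({ alg: "none", typ: "JWT" }))}.` +
  `${toB64Url(JSON.stringify({ sub: "attacker", role: "admin" }))}.`;
```

`decodeOnly(forged)` happily returns the attacker's claims, while `verifyHs256(forged, secret)` returns `null` because the algorithm is not on the allowlist.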
API3 — Broken Object Property Level Authorization
AI sends extra fields in POST and PUT requests to test mass assignment. Can a regular user set admin: true in their profile update?
Anonymized finding. Endpoint: PUT /api/me. The handler did User.update(req.body). The agent sent { "name": "x", "role": "admin", "tenantId": "<other>" } and got a 200. The user was now an admin in another tenant. Fix: pick fields explicitly from the body and reject unknown keys.
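One way to express the fix is an explicit allowlist that rejects anything else. The sketch below assumes the only writable profile fields are `name` and `avatarUrl`; the `pickProfileUpdate` helper and its result type are hypothetical.

```typescript
// Hypothetical allowlist for a profile update.
const ALLOWED_PROFILE_FIELDS = ["name", "avatarUrl"] as const;
type ProfileField = (typeof ALLOWED_PROFILE_FIELDS)[number];
type ProfileUpdate = Partial<Record<ProfileField, string>>;

type PickResult =
  | { ok: true; update: ProfileUpdate }
  | { ok: false; rejectedKeys: string[] };

// Reject the whole request if it carries any key outside the allowlist,
// then copy only the allowlisted string fields into the update object.
function pickProfileUpdate(body: Record<string, unknown>): PickResult {
  const allowed = new Set<string>(ALLOWED_PROFILE_FIELDS);
  const rejectedKeys = Object.keys(body).filter((k) => !allowed.has(k));
  if (rejectedKeys.length > 0) return { ok: false, rejectedKeys };

  const update: ProfileUpdate = {};
  for (const field of ALLOWED_PROFILE_FIELDS) {
    const value = body[field];
    // Non-string values for allowed keys are dropped in this sketch.
    if (typeof value === "string") update[field] = value;
  }
  return { ok: true, update };
}
```

A handler would map `ok: false` to a 400 response and pass only `update` to the persistence layer, never the raw request body.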
API4 — Unrestricted Resource Consumption
AI tests rate limits by sending rapid requests to authentication, search, and data export endpoints.
Anonymized finding. Endpoint: POST /api/auth/login. No rate limit. The agent issued thousands of requests with rotating passwords against a known username. Fix: per-IP and per-account rate limit on auth endpoints, plus exponential backoff.
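A rough sketch of limiting by both IP and account, using a fixed-window counter held in memory. `FixedWindowLimiter`, `allowLogin`, and the limits used are hypothetical; exponential backoff is left out, and a real deployment would typically keep the counters in a shared store rather than in process memory.

```typescript
// Fixed-window counter keyed by an arbitrary string.
class FixedWindowLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true while the key is within its budget for the current window.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}

// Count the attempt against both the client IP and the targeted account,
// so rotating IPs does not reset the per-account budget.
function allowLogin(
  ipLimiter: FixedWindowLimiter,
  accountLimiter: FixedWindowLimiter,
  ip: string,
  username: string,
  now: number = Date.now(),
): boolean {
  const ipOk = ipLimiter.allow(`ip:${ip}`, now);
  const accountOk = accountLimiter.allow(`acct:${username}`, now);
  return ipOk && accountOk;
}
```

Both limiters are always consulted, even when the first one already says no, so every attempt is counted against the account it targeted.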
API5 — Broken Function Level Authorization
AI tests admin endpoints with non-admin tokens. Can a regular user access /api/admin/users?
Anonymized finding. Endpoint: DELETE /api/admin/users/<id>. The handler checked req.user was authenticated but did not check req.user.role === "admin". Any user could delete any other user. Fix: a single role-check middleware on every admin route, asserted in tests.
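The "asserted in tests" part of the fix can be sketched as a check over a route table. The `Route` shape, the sample `routes`, and both helpers are hypothetical; the point is that a missing role requirement on an admin path becomes a failing test rather than a production finding.

```typescript
type Role = "user" | "admin";
type Route = { method: string; path: string; requiredRole?: Role };

// Hypothetical route table.
const routes: Route[] = [
  { method: "GET", path: "/api/me" },
  { method: "GET", path: "/api/admin/users", requiredRole: "admin" },
  { method: "DELETE", path: "/api/admin/users/:id", requiredRole: "admin" },
];

// Authentication first, then function-level authorization.
function isAllowed(route: Route, user: { role: Role } | null): boolean {
  if (!user) return false;
  if (route.requiredRole && user.role !== route.requiredRole) return false;
  return true;
}

// Lists admin routes that were mounted without the admin role requirement.
function unguardedAdminRoutes(table: Route[]): Route[] {
  return table.filter(
    (r) => r.path.startsWith("/api/admin/") && r.requiredRole !== "admin",
  );
}
```

A unit test can assert that `unguardedAdminRoutes(routes)` is empty, which fails as soon as someone adds an admin path without the role requirement.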
API6 — Unrestricted Access to Sensitive Business Flows
AI tests business-critical flows like checkout, account creation, and password reset for abuse patterns.
Anonymized finding. Endpoint: POST /api/coupons/redeem. The endpoint did not deduplicate redemption per user, so a single coupon could be redeemed many times. Fix: enforce uniqueness on (userId, couponId) at the DB level.
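The uniqueness rule can be illustrated with an in-memory stand-in. `RedemptionStore` is hypothetical and only models the behaviour of a unique constraint on (userId, couponId); the actual fix belongs in the database, because an in-process check does not hold across concurrent requests or multiple instances.

```typescript
// In-memory model of a unique (userId, couponId) constraint.
class RedemptionStore {
  private readonly seen = new Set<string>();

  // Returns true only for the first redemption of a coupon by a user.
  redeem(userId: string, couponId: string): boolean {
    const key = JSON.stringify([userId, couponId]);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}
```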
API7 — Server-Side Request Forgery (SSRF)
AI tests URL parameters for SSRF, attempting to access internal services and cloud metadata endpoints (169.254.169.254).
Anonymized finding. Endpoint: POST /api/webhooks/test. The endpoint accepted a url field and made an outbound request to it from the server. The agent supplied the AWS metadata URL and received the IAM role credentials. Fix: allowlist host, block link-local and private ranges, force IMDSv2 on the host. See AI Pentest for Cloud Infrastructure.
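A sketch of the URL check side of the fix, assuming an HTTPS-only policy and a host allowlist. `ALLOWED_HOSTS`, `hooks.example.com`, and the helper names are hypothetical. The check covers IPv4 literals only; a hostname that resolves to a private address still needs a check at connection time, which this sketch does not attempt.

```typescript
// Parse a dotted-quad IPv4 literal into four octets, or null if it is not one.
function parseIpv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

// Loopback, private (RFC 1918), and link-local ranges.
function isBlockedIpv4([a, b]: number[]): boolean {
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) || // link-local, includes 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Hypothetical allowlist of webhook destinations.
const ALLOWED_HOSTS = new Set(["hooks.example.com"]);

function isSafeWebhookUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  const ip = parseIpv4(url.hostname);
  if (ip && isBlockedIpv4(ip)) return false;
  return ALLOWED_HOSTS.has(url.hostname);
}
```

The range check is redundant while the allowlist is strict, but it keeps the metadata address blocked if the allowlist is later loosened to a pattern.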
API8 — Security Misconfiguration
AI checks for debug mode, verbose errors, CORS wildcard, missing rate limits, and default credentials.
Anonymized finding. Endpoint: *. CORS was set to Access-Control-Allow-Origin: * with Access-Control-Allow-Credentials: true. Fix: explicit origin allowlist; never send credentials with a wildcard origin.
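An explicit origin allowlist can be sketched as a function that decides which CORS headers to emit. `ALLOWED_ORIGINS`, `https://app.example.com`, and `corsHeaders` are hypothetical names used for illustration.

```typescript
// Hypothetical allowlist of origins that may send credentialed requests.
const ALLOWED_ORIGINS = new Set(["https://app.example.com"]);

// Echo the origin back only when it is on the allowlist; otherwise emit
// no CORS headers at all. The wildcard is never combined with credentials.
function corsHeaders(requestOrigin: string | undefined): Record<string, string> {
  if (!requestOrigin || !ALLOWED_ORIGINS.has(requestOrigin)) return {};
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Credentials": "true",
    Vary: "Origin", // caches must not reuse the response across origins
  };
}
```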
API9 — Improper Inventory Management
AI discovers undocumented endpoints, old API versions, and debug routes that developers forgot to remove.
Anonymized finding. Endpoint: /v1/admin/impersonate. Deprecated three releases earlier, still live, no auth check. Fix: a kill list in CI that fails the build if deprecated routes remain mounted.
API10 — Unsafe Consumption of APIs
AI tests how your API handles responses from third-party services, checking for injection through upstream data.
Anonymized finding. Endpoint: POST /api/integrations/sync. The integration parsed third-party JSON and rendered fields directly into HTML email. A malicious third-party value triggered stored XSS in the email preview. Fix: treat upstream data as untrusted; sanitize at the boundary.
API pentest by architecture
Different protocols, different attack surfaces. The OWASP API Top 10 still applies, but the probes change.
| Architecture | Surface | Distinctive failure mode |
|---|---|---|
| REST | Per-endpoint, per-method | BOLA on resource IDs, missing auth on /admin paths |
| GraphQL | One endpoint, many resolvers | Introspection enabled, depth bombs, batching attacks, missing per-resolver authz |
| WebSocket | Persistent connection | Handshake-only auth, no per-message authorization |
| gRPC | Method-per-service | Lack of per-method authz, missing TLS, unary versus streaming asymmetries |
| Webhooks | Inbound integration | Missing signature verification, replay, payload mutation |
REST APIs deep dive
REST is the path of least resistance for AI-generated backends, and it shows.
- CRUD on every resource. For each GET, PUT, PATCH, DELETE on a resource, probe with another user’s ID, with no auth, with a downgraded role, and with a malformed ID.
- Pagination scope. ?cursor=<base64> cursors sometimes contain another tenant’s ID. Decode and probe.
- Filter parameters. ?filter[ownerId]=<other> and ?where[tenantId]=<other> frequently bypass server-side scoping that lives further up the stack.
- Bulk endpoints. POST /resources/bulk is often missing the per-item authz check. The agent submits a batch with mixed ownership.
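The pagination probe can be sketched as decode, swap, re-encode. This assumes the cursor is base64-encoded JSON with a `tenantId` field, which is only one of several formats seen in practice; `decodeCursor`, `encodeCursor`, and `swapTenant` are hypothetical helpers.

```typescript
// Decode a cursor assumed to be base64-encoded JSON. Returns null when the
// value does not decode to a plain object.
function decodeCursor(cursor: string): Record<string, unknown> | null {
  try {
    const parsed: unknown = JSON.parse(
      Buffer.from(cursor, "base64").toString("utf8"),
    );
    return typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)
      ? (parsed as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}

function encodeCursor(fields: Record<string, unknown>): string {
  return Buffer.from(JSON.stringify(fields), "utf8").toString("base64");
}

// Build the probe: the same cursor with the tenant swapped. Returns null
// when the cursor carries no tenantId to tamper with.
function swapTenant(cursor: string, otherTenantId: string): string | null {
  const fields = decodeCursor(cursor);
  if (!fields || !("tenantId" in fields)) return null;
  return encodeCursor({ ...fields, tenantId: otherTenantId });
}
```

If the server accepts the swapped cursor and returns rows, the tenant scope is being read from client-controlled input instead of the session.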
GraphQL APIs deep dive
GraphQL pentests are different and harder. Pull the schema first, then probe.
- Introspection. Confirm __schema is disabled in production. If enabled, the agent enumerates every type, query, mutation, and subscription.
- Per-resolver authorization. Most GraphQL bugs live at the resolver, not the endpoint. The agent tries each resolver with cross-user IDs.
- Query depth. Send a query nested 20 levels deep through a self-referential type. If the server returns it, you have a DoS vector.
- Query complexity. Send a query that requests a million child records via aliased fields. If the server returns it, you have a DoS vector.
- Batching attacks. Some GraphQL servers accept arrays of operations in one request. The agent submits an array of 1,000 mutations to bypass per-request rate limits.
- Persisted queries. If the server uses persisted queries, confirm the persisted-query allowlist is enforced and the server rejects arbitrary operations from production clients.
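The depth and batching probes above can be sketched as small payload builders. The schema names passed in (`user`, `friends`) are hypothetical, and `selectionDepth` is a rough brace counter for illustration; a server-side limit would normally walk the parsed query AST instead.

```typescript
// Build a query nested `depth` levels through a self-referential field.
function buildDepthQuery(root: string, selfField: string, depth: number): string {
  let selection = "id";
  for (let i = 0; i < depth; i++) {
    selection = `${selfField} { ${selection} }`;
  }
  return `query { ${root} { ${selection} } }`;
}

// Rough depth estimate by counting nested braces. It ignores string
// literals and comments, so treat it as an approximation only.
function selectionDepth(query: string): number {
  let depth = 0;
  let max = 0;
  for (const ch of query) {
    if (ch === "{") {
      depth += 1;
      max = Math.max(max, depth);
    } else if (ch === "}") {
      depth -= 1;
    }
  }
  return max;
}

// Request body for a batching probe: the same operation repeated `count`
// times in a single JSON array.
function buildBatch(operation: string, count: number): { query: string }[] {
  return Array.from({ length: count }, () => ({ query: operation }));
}
```

For example, `buildDepthQuery("user", "friends", 20)` nests `friends` 20 times under `user`, and `JSON.stringify(buildBatch(op, 1000))` yields the array body for a batching test.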
WebSocket APIs deep dive
WebSocket is the most underaudited surface in modern stacks. Auth on the handshake is necessary but not sufficient.
- Handshake auth. The agent connects with no token, an expired token, and another user’s token.
- Per-message authorization. A subscribe message asks for a channel. The agent subscribes to channels it should not have access to.
- Cross-tenant subscriptions. Subscribe to another tenant’s channel name. If messages stream in, you have a leak.
- Server-pushed event filtering. Confirm the server filters which events each connection receives. A single broadcast loop with no per-connection filter leaks every message to every client.
- Message injection. Send malformed JSON, oversized payloads, and unexpected message types. Confirm graceful failure rather than connection-wide crashes.
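Per-message authorization can be sketched as a check that runs on every subscribe message, not just at the handshake. The channel naming scheme (`tenant:<tenantId>:<topic>`), the `Connection` shape, and both helpers are hypothetical.

```typescript
// Identity attached to the socket when the handshake was authenticated.
type Connection = { userId: string; tenantId: string };
type Outcome = { ok: boolean; reason?: string };

// Hypothetical channel naming: "tenant:<tenantId>:<topic>".
function canSubscribe(conn: Connection, channel: string): boolean {
  const match = /^tenant:([^:]+):[^:]+$/.exec(channel);
  return match !== null && match[1] === conn.tenantId;
}

// Validate and authorize a single inbound message. Malformed input is
// answered with an error outcome instead of throwing.
function handleMessage(conn: Connection, raw: string): Outcome {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "malformed JSON" };
  }
  if (typeof msg !== "object" || msg === null) {
    return { ok: false, reason: "unexpected message" };
  }
  const { type, channel } = msg as Record<string, unknown>;
  if (type !== "subscribe" || typeof channel !== "string") {
    return { ok: false, reason: "unexpected message" };
  }
  if (!canSubscribe(conn, channel)) {
    return { ok: false, reason: "forbidden" };
  }
  return { ok: true };
}
```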
gRPC notes
If the proto is in source or shipped with the client, the agent enumerates services. Each method gets the same treatment as a REST endpoint: cross-user IDs, missing auth, downgraded role, and malformed messages. Streaming methods get the WebSocket-style probes.
What AI pentests find on REST vs GraphQL APIs
Patterns we keep finding that scanners miss.
REST findings
- BOLA on every resource ID we have not personally authorized
- Mass assignment on PUT /api/me writing role, tenantId, or verified
- Missing auth on /api/admin/* because the middleware was registered after the route was mounted
- Verbose 500 responses including the SQL query and the parameter values
- API version v1 still live with the bug v2 was created to fix
GraphQL findings
- Introspection enabled in production
- Mutations missing authorization checks because they were generated from a schema-first DSL that does not enforce auth at the resolver
- Recursive types allowing depth-based DoS
- Field aliases used to bypass per-field rate limits
- An _admin namespace exposed on the same endpoint as the public schema
Fix prompts you can paste
Add ownership scoping to a REST handler
The handler GET /api/invoices/:id in app/api/invoices/[id]/route.ts does not verify that the invoice belongs to the current user. Update it to query with where: { id, ownerId: session.user.id }. Return 404, not 403, on miss to avoid leaking existence. Add a unit test that confirms a request from user A for user B’s invoice ID returns 404.
Disable GraphQL introspection in production
Update the GraphQL server config to disable introspection when process.env.NODE_ENV === "production". For Apollo Server, set introspection: false. Add a CI check that asserts __schema returns an error against the staging URL.
Fix mass assignment on a profile update
The handler PUT /api/me in app/api/me/route.ts calls User.update(req.body). Replace with an explicit field allowlist: only name, avatarUrl. Reject unknown keys with a 400. Never accept role, tenantId, verified, or any field beginning with "is".
API pentest scope
REST
Every CRUD operation on every resource, with every supported auth state. Pagination, filtering, sorting, bulk endpoints. Verify pagination does not leak data beyond authorized scope.
GraphQL
Schema enumeration, per-query and per-mutation authorization, introspection, depth, complexity, batching, and persisted queries.
WebSocket
Handshake auth, channel subscription authorization, per-message validation, cross-tenant subscription attempts.
Webhooks
Signature verification, replay protection, payload mutation, malformed input.
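Signature verification with a freshness window can be sketched with Node's built-in `crypto` module. The signing scheme here (HMAC-SHA256 over `<timestamp>.<rawBody>`, hex-encoded, timestamp in epoch milliseconds) and the function names are assumptions for illustration; each provider defines its own header names and format.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical scheme: HMAC-SHA256 over "<timestampMs>.<rawBody>", hex-encoded.
function signWebhook(secret: string, timestampMs: string, rawBody: string): string {
  return createHmac("sha256", secret)
    .update(`${timestampMs}.${rawBody}`)
    .digest("hex");
}

// Accept the delivery only if the timestamp is fresh and the signature
// matches the raw body. The comparison is constant-time.
function verifyWebhook(
  secret: string,
  timestampMs: string,
  signatureHex: string,
  rawBody: string,
  nowMs: number,
  toleranceMs: number = 300000,
): boolean {
  const sentAt = Number(timestampMs);
  if (!Number.isFinite(sentAt) || Math.abs(nowMs - sentAt) > toleranceMs) {
    return false; // stale or unparseable timestamp
  }
  const expected = Buffer.from(signWebhook(secret, timestampMs, rawBody), "hex");
  const given = Buffer.from(signatureHex, "hex");
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

The timestamp window only narrows replay; rejecting a delivery that is replayed inside the window would additionally need a store of already-seen event IDs, which this sketch leaves out.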
Related guides
- AI Pentest for Web Applications — pentesting the frontend that calls these APIs
- AI Pentest for Cloud Infrastructure — IAM, S3, IMDS, serverless
- AI Pentest for SaaS — multi-tenant API isolation
- Vibe Pentesting — generalized methodology for AI-built apps
- AI Penetration Testing Guide — full methodology
- Vulnerability Scanning vs AI Pentest — why scanners miss BOLA
- Supabase RLS Checker — verify every table has a correct policy
- Firebase Scanner — Firestore Security Rules audit
- Backend Security: Firebase Rules — deep dive on rules
- Token Leak Checker — focused scan for keys exposed in bundles
- VibeEval vs Burp Suite — manual vs autonomous pentest
- VibeEval vs OWASP ZAP — DAST vs continuous AI pentest
- Best Security Scanner for AI Apps — comparison
Pentest your APIs today
VibeEval’s AI pentest agents discover and test every API endpoint automatically. Find BOLA, injection, and authorization flaws before attackers do.