S3 PUBLIC BUCKETS AND SUBDOMAIN TAKEOVER

Three cloud bugs the AI keeps autocompleting. S3 buckets with ListBucket and PutObject open to the world. Subdomains pointing at services nobody owns anymore. SSRF endpoints that serve up your IAM credentials in 200 OK.

The scenario referenced below runs on gapbench.vibe-eval.com — a public security benchmark we operate.

Three bugs that share a vibe

The bugs themselves are different, but they have one thing in common: they live at the cloud-config layer, not in your application code. A code review of your repo will not find them. A code-only static scanner will not find them. You only find them by looking at the deployed reality.

Which is exactly why AI-generated apps reproduce them. The AI helps you write code; the cloud configuration is somewhere else. Whoever clicked through the AWS console, or pasted the Terraform the AI wrote, or accepted whatever default the deploy provider showed first — that’s where these bugs ship from.

S3 public buckets

The classic failure mode. AWS’s S3 default has gotten more conservative — Block Public Access is on by default for new buckets — but the failure modes have just gotten more inventive.

What we still find:

  1. Bucket policy explicitly granting s3:GetObject to *. This is the “make it work for the CDN” anti-pattern.
  2. Bucket policy granting s3:ListBucket to *. Often paired with #1, sometimes alone; it lets attackers enumerate every key in the bucket.
  3. Per-object ACL set to public-read on individual files. Bucket-level Block Public Access is on, so the whole-bucket policy looks fine, but specific files were uploaded with acl: public-read and are individually fetchable.
  4. CORS policy set to * with Authorization in AllowedHeaders, which lets cross-origin JavaScript read pre-signed URL responses with the browser’s credentials.
  5. The frequent worst case: PUT also public. The AI generated an upload feature, the upload feature uses pre-signed URLs, but the bucket policy was relaxed to “make it work” and now anyone can upload anything.

Live: https://gapbench.vibe-eval.com/site/s3-public-bucket/. The bucket has both list and PUT open.

The fix is layered. Block Public Access turned on at both the bucket level and the account level. A bucket policy that grants access only to the specific principals you intend. CloudFront in front of any bucket that needs to serve over HTTP, with origin access control configured so direct-to-S3 access is blocked. Per-object ACLs disabled (setting Object Ownership to BucketOwnerEnforced does this).
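The "specific principals only" rule can be checked mechanically. Below is a minimal sketch of a bucket-policy linter that flags Allow statements granting risky S3 actions to the anonymous principal. The function name and the set of risky actions are illustrative choices, not part of any AWS tooling.

```python
import json

# Hypothetical linter: flag Allow statements that grant risky S3 actions
# to the anonymous principal ("*"). Action set is an illustrative choice.
RISKY_ACTIONS = {"s3:GetObject", "s3:ListBucket", "s3:PutObject", "s3:*"}

def public_statements(policy_json: str):
    """Return, per wildcard-principal statement, the risky actions it grants."""
    policy = json.loads(policy_json)
    findings = []
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        # Principal may be the bare string "*" or the map {"AWS": "*"}
        is_public = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        if not is_public:
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        risky = sorted(RISKY_ACTIONS.intersection(actions))
        if risky:
            findings.append(risky)
    return findings
```

A policy with `"Principal": "*"` and `s3:GetObject` comes back as a finding; a policy scoped to a named service principal does not.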

Subdomain takeover

Live: https://gapbench.vibe-eval.com/site/subdomain-takeover/. The scenario simulates a dangling CNAME pointing at a takeover-eligible service.

The bug isn’t AI-generated; it’s lifecycle. You spun up a Vercel project for a marketing landing page, pointed beta.your-site.com at it, abandoned the project, deleted it from Vercel. The CNAME stayed. Now beta.your-site.com resolves to a Vercel project that doesn’t exist. An attacker creates a Vercel project with the same name, points it at content they control, and serves whatever they want under your domain.

Why this matters: the attacker now has a subdomain that matches your CORS allowlist (if it’s permissive), your OAuth redirect_uri allowlist (if it includes wildcards), your cookie scope (if your auth cookie is set with Domain=.your-site.com). Each of those becomes a chain step.
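The cookie-scope leg of that chain follows directly from how browsers match cookie domains. A rough sketch of the RFC 6265 domain-match rule, with illustrative hostnames, shows why a Domain=.your-site.com cookie reaches an attacker-controlled subdomain:

```python
# Rough sketch of RFC 6265 domain matching: a cookie set with
# Domain=.your-site.com is sent to every subdomain, including one an
# attacker has taken over. Hostnames are illustrative.
def cookie_sent_to(cookie_domain: str, host: str) -> bool:
    """True if a cookie scoped to cookie_domain would be sent to host."""
    d = cookie_domain.lstrip(".").lower()
    h = host.lower()
    return h == d or h.endswith("." + d)

# The taken-over subdomain receives the shared auth cookie:
assert cookie_sent_to(".your-site.com", "beta.your-site.com")
# A cookie scoped to a specific host is not shared with siblings:
assert not cookie_sent_to("app.your-site.com", "beta.your-site.com")
```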

The fix:

  • Audit DNS records quarterly. Anything pointing at a third-party service, verify the service is still active.
  • Use subdomain-takeover finder tools (subjack, nuclei with the takeover templates, etc.) and run them against your domain regularly.
  • Tighten cookie Domain. Cookies that are not deliberately shared between subdomains should be scoped to the specific host, not Domain=.your-site.com.
  • Tighten OAuth redirect_uri. No wildcards. Exact match.
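The redirect_uri rule is worth making concrete: the check is byte-for-byte membership in an allowlist, nothing pattern-based. A minimal sketch, with a hypothetical allowlist entry:

```python
# Minimal sketch of "exact match, no wildcards" for OAuth redirect_uri.
# The allowlist entry is illustrative.
ALLOWED_REDIRECTS = {
    "https://app.your-site.com/oauth/callback",
}

def redirect_ok(uri: str) -> bool:
    """Accept only byte-for-byte allowlisted redirect URIs."""
    return uri in ALLOWED_REDIRECTS

assert redirect_ok("https://app.your-site.com/oauth/callback")
# A taken-over subdomain fails the exact match that a wildcard rule
# (*.your-site.com) would have let through:
assert not redirect_ok("https://beta.your-site.com/oauth/callback")
```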

GCP metadata SSRF (and the AWS variant)

A cousin of the SSRF article — listed here because the cloud-metadata variant is the highest-impact form of SSRF. If your app has any server-side fetch of a user-supplied URL, and you haven’t enabled the cloud’s metadata-protection mitigations, an attacker fetches http://169.254.169.254/latest/meta-data/iam/security-credentials/<role>/ (AWS) or http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token (GCP) through your SSRF and walks away with the workload’s credentials.

Live: https://gapbench.vibe-eval.com/site/gcp-metadata-ssrf/. See also the SSRF pattern article at /patterns/ssrf-open-redirect-oauth/.

The mitigation is per-cloud:

  • AWS: enforce IMDSv2 (instance metadata option HttpTokens=required), which requires a session token obtained via PUT before metadata GETs. Most simple SSRF chains can’t issue PUTs.
  • GCP: the v1 metadata server requires the Metadata-Flavor: Google header on requests. Most SSRF chains can’t add custom headers, so this blocks them by default.
  • Both: block egress to the metadata IP at the network layer for any workload that doesn’t legitimately need it.
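On the application side, the complement to those mitigations is refusing to fetch metadata endpoints at all. A sketch of a server-side URL filter, as defense in depth only (the function name is illustrative, and real code must also resolve DNS and re-check the resulting IP to beat rebinding):

```python
import ipaddress
from urllib.parse import urlsplit

# Defense-in-depth sketch: refuse server-side fetches aimed at cloud
# metadata endpoints before any request is made. Not a substitute for
# IMDSv2 / network-layer egress blocks described above.
BLOCKED_HOSTS = {"metadata.google.internal"}
BLOCKED_NETS = [ipaddress.ip_network("169.254.0.0/16")]  # link-local, incl. 169.254.169.254

def url_allowed(url: str) -> bool:
    host = urlsplit(url).hostname or ""
    if host.lower() in BLOCKED_HOSTS:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostname, not a literal IP. Real code must resolve it and
        # re-check the resolved address (DNS rebinding).
        return True
    return not any(addr in net for net in BLOCKED_NETS)
```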

A specific incident — S3 PutObject and subdomain takeover

Anonymized, two-bug chain. A SaaS used S3 for image hosting via pre-signed URLs. Standard pattern: user requests an upload, server signs a URL, browser PUTs directly to S3. The bucket policy was permissive — the team had relaxed it months earlier when the signing was being debugged. Specifically, the bucket allowed s3:PutObject from any principal as a fallback for “edge cases.”

Separately, the team had assets.example.com pointing at this same bucket via CloudFront, used for serving the uploaded images. They’d also at some point pointed marketing-old.example.com at a defunct Vercel project — DNS record never cleaned up.

The chain. An attacker enumerated subdomains, found marketing-old.example.com was a takeover-eligible Vercel CNAME, and registered the same Vercel project name. Now they controlled marketing-old.example.com. Browser hits to that subdomain came back from the attacker’s content. CSP on the main app was permissive — script-src included *.example.com — so scripts loaded from the taken-over subdomain executed on the main app’s pages.

For the second leg, the attacker’s content used the open s3:PutObject to upload a malicious script to assets.example.com. Scripts served from assets.example.com were also covered by the CSP. The script could now persist beyond the taken-over subdomain.

The cleanup: tighten S3 bucket policy (PutObject only via pre-signed URLs from the server, no anonymous put). Fix the dangling DNS. Tighten CSP to specific subdomains, not wildcards. Audit for any other dangling DNS records. Each fix small; the chain was the issue.
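The CSP leg of that chain is easy to spot programmatically: a wildcard subdomain in script-src covers every subdomain, including a taken-over one. A deliberately minimal sketch (real CSP parsing has more cases):

```python
# Illustrative check for the CSP weakness in the chain above: a wildcard
# subdomain source in script-src covers any subdomain, including a
# taken-over one. Parsing is deliberately minimal.
def script_src_sources(csp: str):
    for directive in csp.split(";"):
        parts = directive.split()
        if parts and parts[0] == "script-src":
            return parts[1:]
    return []

def allows_any_subdomain(csp: str, apex: str) -> bool:
    return f"*.{apex}" in script_src_sources(csp)

assert allows_any_subdomain(
    "default-src 'self'; script-src 'self' *.example.com", "example.com")
assert not allows_any_subdomain(
    "script-src 'self' assets.example.com", "example.com")
```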

A taxonomy of S3 misconfigs

The shape of S3 mistakes evolved with AWS’s defaults:

  1. Pre-2018: Bucket public-by-default. Mostly fixed by Block Public Access defaults.
  2. 2018-2020: ACLs granting public-read on individual objects. Bucket private, objects public. Addressed by BucketOwnerEnforced setting.
  3. 2020-2023: Bucket policy granting s3:GetObject to *. Common as “make CDN work” workaround. Still common.
  4. 2023-present: Bucket policy granting s3:PutObject to *. Less common but more catastrophic. Lets attackers upload.
  5. CORS allowing * with Authorization in AllowedHeaders. Lets cross-origin JS read pre-signed URL responses with browser credentials.
  6. Pre-signed URL with overlong expiry. An expiry parameter (X-Amz-Expires in SigV4) set to 24 hours or longer means a leaked URL is valid for a long time.
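The overlong-expiry probe (#6) reduces to reading one query parameter. A sketch handling the SigV4 form, where X-Amz-Expires is a duration in seconds; the 15-minute threshold is an illustrative policy choice, not an AWS rule:

```python
from urllib.parse import urlsplit, parse_qs

# Sketch of the overlong-expiry check: flag pre-signed URLs whose
# validity window exceeds a threshold. Handles SigV4's X-Amz-Expires
# (a duration in seconds); the threshold is an illustrative choice.
MAX_EXPIRY_SECONDS = 15 * 60  # 15 minutes

def expiry_too_long(presigned_url: str) -> bool:
    qs = parse_qs(urlsplit(presigned_url).query)
    expires = qs.get("X-Amz-Expires")
    if not expires:
        return False  # no SigV4 expiry parameter present
    return int(expires[0]) > MAX_EXPIRY_SECONDS

assert expiry_too_long("https://b.s3.amazonaws.com/k?X-Amz-Expires=86400")
assert not expiry_too_long("https://b.s3.amazonaws.com/k?X-Amz-Expires=300")
```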

Each of these has its own probe. We run all of them in the vibe-code-scanner S3 sweep.

Subdomain takeover — beyond Heroku/Vercel

The classic takeover targets are services where you can register a project with the same name your dangling DNS points to. The common families:

  • PaaS: Heroku, Vercel, Netlify, Render, Railway. Each has had takeover-eligible patterns.
  • CMS/site builders: WordPress.com, Squarespace, Webflow, Shopify. Each has had patterns.
  • CDN: AWS CloudFront with no origin set, Azure CDN. Same shape.
  • GitHub Pages: *.github.io pointed at a deleted repo. Common in the early 2010s, mostly mitigated by GitHub now.
  • AWS S3: Bucket name pointed at, bucket deleted, attacker re-creates with the same name. Mitigated by AWS’s bucket-name reservation policy in some regions.

The general defense: audit DNS quarterly, alert on dangling records, use subjack / nuclei takeover templates as a CI step.
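The CI step above can be approximated as a CNAME-target fingerprint match. This sketch uses a stand-in fingerprint table; real tools (subjack, the nuclei takeover templates) ship maintained fingerprints and also verify the target is actually unclaimed:

```python
# Sketch of the dangling-CNAME check: map a subdomain's CNAME target to
# a known takeover-prone service family. The suffix table is a stand-in;
# a match is only a candidate — the claim-check (is the target actually
# unregistered?) still has to happen.
TAKEOVER_SUFFIXES = {
    ".herokuapp.com": "Heroku",
    ".vercel.app": "Vercel",
    ".netlify.app": "Netlify",
    ".github.io": "GitHub Pages",
}

def takeover_candidate(cname_target: str):
    """Return the service family if the CNAME points at a takeover-prone host."""
    target = cname_target.rstrip(".").lower()
    for suffix, service in TAKEOVER_SUFFIXES.items():
        if target.endswith(suffix):
            return service
    return None

assert takeover_candidate("some-app.herokuapp.com.") == "Heroku"
assert takeover_candidate("example.cdn.cloudflare.net") is None
```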

Wrong fix vs right fix — S3 bucket policy

// WRONG: public read for "make CDN work"
{
  "Statement": [{
    "Effect": "Allow",
    "Principal": "*",
    "Action": ["s3:GetObject"],
    "Resource": "arn:aws:s3:::my-bucket/*"
  }]
}
// RIGHT: CloudFront origin access control, bucket private
{
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "cloudfront.amazonaws.com"},
    "Action": ["s3:GetObject"],
    "Resource": "arn:aws:s3:::my-bucket/*",
    "Condition": {
      "StringEquals": {
        "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/E1ABCDEFG"
      }
    }
  }]
}
// Plus: Block Public Access enabled at the bucket level, and Object Ownership set to BucketOwnerEnforced.

How we detect all three

S3 buckets: we crawl your domain for asset URLs, extract the S3 hostnames, and probe s3.amazonaws.com/<bucket>/?list-type=2. Any 200 with a list of objects is a finding. For PUT we probe PUT s3.amazonaws.com/<bucket>/test-vibeeval-marker with a small payload — if it succeeds, finding.
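That probe pair splits cleanly into request construction and response classification. A no-network sketch of the shape (the marker key and return labels are illustrative):

```python
# Sketch of the S3 list/PUT probes described above, split into request
# construction and response classification. No network calls; the
# marker key is illustrative.
def list_probe(bucket: str):
    return ("GET", f"https://{bucket}.s3.amazonaws.com/?list-type=2")

def put_probe(bucket: str):
    return ("PUT", f"https://{bucket}.s3.amazonaws.com/test-vibeeval-marker")

def classify_list(status: int, body: str) -> str:
    """Interpret the anonymous list probe's response."""
    if status == 200 and "<ListBucketResult" in body:
        return "finding: public list"
    if status == 403:
        return "bucket exists, list denied"
    return "no finding"

def classify_put(status: int) -> str:
    """Interpret the anonymous PUT probe's response."""
    return "finding: public put" if status == 200 else "no finding"
```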

Subdomain takeover: we enumerate likely subdomains via DNS brute-force and certificate transparency logs, resolve each to its CNAME target, and check whether the target is one of the known takeover-eligible services with no claim. The nuclei takeover templates are roughly the public version of this check.

Metadata SSRF: see the SSRF article — the detection is the same probe set, with metadata IPs at the top of the payload list.

Fix priority

S3 first. Subdomain takeover second. Metadata SSRF third. The reason for that order: S3 leaks data continuously and silently, takeover requires an attacker to notice and act, metadata SSRF requires the attacker to find the SSRF first. Damage on a per-incident basis is comparable; likelihood is highest for S3.

CWE / OWASP

  • CWE-732 — Incorrect Permission Assignment for Critical Resource (S3)
  • CWE-918 — Server-Side Request Forgery (metadata)
  • CWE-350 — Reliance on Reverse DNS Resolution (takeover, tangentially)
  • OWASP Top 10 — A05:2021 Security Misconfiguration, A10:2021 SSRF

Reproduce it yourself

All three scenarios are live on gapbench.vibe-eval.com: /site/s3-public-bucket/, /site/subdomain-takeover/, and /site/gcp-metadata-ssrf/.

COMMON QUESTIONS

01
What is a public S3 bucket and why is it dangerous?
S3 buckets have permission settings at the bucket and object level. The 'public' state is when ListBucket — listing the contents — and/or GetObject — reading objects — are granted to the everyone principal. The most damaging variant is also having PutObject public, which lets anyone upload, including malware. Public buckets are how user uploads, customer data, and internal backups end up indexed by GrayhatWarfare and similar tools, sometimes years before the owner notices.
02
What is subdomain takeover?
Your DNS has a CNAME for some.your-site.com pointing at some-app.herokuapp.com. You stopped using the Heroku app and deleted it. The DNS record stays. Anyone can now register some-app.herokuapp.com and serve content there — the browser will resolve some.your-site.com to the attacker's content, and your domain implicitly endorses it. The attack is widely documented and there are tools that find dangling records at scale.
03
What is GCP metadata SSRF?
Cloud providers expose an internal-only metadata endpoint that gives a workload its credentials and configuration. On GCP it's at metadata.google.internal. On AWS it's 169.254.169.254. If your app has any SSRF — a URL parameter that lets the server fetch arbitrary URLs — an attacker can point it at the metadata endpoint and read credentials. AWS shipped IMDSv2 to mitigate this; GCP requires a custom header that mostly mitigates it. Apps that haven't enabled the mitigations are wide open.
04
Why do AI generators reproduce these?
S3 buckets get configured permissively because permissive is the path of least resistance — the AI's suggestions for bucket policy are often copied from old docs that predate the public-access-block defaults. Subdomain takeover is rarely about generated code; it's about lifecycle (services deprovisioned without the corresponding DNS cleanup). SSRF, we've covered separately — the 'fetch user-supplied URL' shortcut is in every image-proxy tutorial.
05
Where can I see this on a real URL?
https://gapbench.vibe-eval.com/site/s3-public-bucket/ has the public-list-and-PUT shape. https://gapbench.vibe-eval.com/site/subdomain-takeover/ has a dangling DNS record pointing at takeover-eligible infrastructure. https://gapbench.vibe-eval.com/site/gcp-metadata-ssrf/ is the metadata SSRF.
06
What CWE does this map to?
CWE-732 (Incorrect Permission Assignment for Critical Resource) for S3, CWE-918 (SSRF) for the metadata variant, CWE-350 (Reliance on Reverse DNS Resolution) tangentially for takeover. OWASP A05:2021 (Security Misconfiguration) and A10:2021 (SSRF).

SCAN YOUR CLOUD SURFACE

We probe S3 buckets, subdomain DNS, and SSRF surfaces against your domain.

RUN THE SCAN