SSRF defense in depth
Server-Side Request Forgery is the bug that turns "user-supplied URL" into "fetch anything from your private network." Capital One 2019 ($190M settlement), GitLab 2021, Shopify 2022: all traced back to publicly documented SSRF. Defenses keep losing because every layer has edge cases. The right approach is layered, not "we filter for 169.254.0.0/16."
Why allowlists alone don't work
The naive defense: "only fetch URLs whose host resolves to a public IP." It fails against:
- DNS rebinding. The attacker's DNS server responds with a public IP on the first lookup (allowlist passes), then with 169.254.169.254 on the second lookup (when your code actually connects). Solution: resolve once, store the IP, connect to that IP. But then you lose Host: example.com matching for TLS.
- URL parser disagreement. https://attacker.com#@169.254.169.254/: your validation parser sees host attacker.com, the HTTP client library sees host 169.254.169.254 (libcurl historically did this). There are 30+ such gadgets in Orange Tsai's papers.
- IPv6. [::ffff:127.0.0.1], [::1], [fe80::%eth0] link-local. An IPv4 blocklist misses all of these; see the normalization sketch after this list.
- Octal/hex/decimal IPs. http://0177.0.0.1/ is 127.0.0.1 in octal; http://2130706433/ is 127.0.0.1 as a single decimal integer. Some HTTP clients still accept these.
- HTTP redirects. An allowed URL returns 302 Location: 169.254.169.254 and your code blindly follows. Solution: re-validate every redirect target.
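A quick demonstration of why string-level checks lose, as a sketch using Python's stdlib ipaddress module (the module choice is ours; any equivalent normalizer works):

import ipaddress

def blocked_literal(host: str) -> bool:
    """True if host is an IP literal in a non-public range."""
    try:
        ip = ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        return False  # not an IP literal; it goes through DNS resolution instead
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped:
        ip = ip.ipv4_mapped  # unwrap ::ffff:127.0.0.1 to 127.0.0.1
    return not ip.is_global

print(blocked_literal("::ffff:127.0.0.1"))  # True: mapped IPv4 is caught
print(blocked_literal("0177.0.0.1"))        # False: ipaddress (3.9.5+) rejects
# the leading-zero form, yet inet_aton-based resolvers dial it as 127.0.0.1
# anyway -- so Layer 1 below validates the resolved IPs, not the URL string.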
Layer 1: parse, resolve, validate the IP
Validate the URL with a single parser (don't use one parser for matching and another for fetching). Resolve the hostname yourself, validate every resolved IP against a denylist of private ranges, then connect to that exact IP, sending the original hostname as the Host header and TLS SNI.
# Runnable sketch of the idea, stdlib only
from urllib.parse import urlparse
import ipaddress, socket

ALLOWED_SCHEMES = {"http", "https"}

def validate(user_input):
    url = urlparse(user_input)
    if url.scheme not in ALLOWED_SCHEMES:
        raise ValueError("scheme not allowed")
    port = url.port or (443 if url.scheme == "https" else 80)
    # Resolve the hostname ourselves; may return multiple addresses -- check ALL
    infos = socket.getaddrinfo(url.hostname, port, type=socket.SOCK_STREAM)
    ips = [ipaddress.ip_address(info[4][0]) for info in infos]
    for ip in ips:
        # "not is_global" covers private, loopback, link-local, and metadata ranges
        if not ip.is_global:
            raise ValueError(f"blocked address {ip}")
    return url, ips[0], port

# Connect to ips[0] (the resolved IP, not the hostname), with SNI = url.hostname.
# On redirect, recurse with the same validation.
"Resolve and pin to the IP" prevents DNS rebinding. Re-validating on redirect prevents the 302-to-internal trick. Both are necessary.
Layer 2: egress firewall
Application-level validation will fail eventually. The fallback is a network boundary: the workload making the user-driven request runs in a network namespace that cannot route to RFC 1918, link-local, or cloud metadata addresses.
In Kubernetes:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-internal-egress
spec:
  podSelector:
    matchLabels:
      app: url-fetcher
  policyTypes: [Egress]
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 10.0.0.0/8
        - 172.16.0.0/12
        - 192.168.0.0/16
        - 169.254.0.0/16
        - 127.0.0.0/8
One caveat: with Egress in policyTypes, anything not matched is denied, and the except list above also covers typical cluster DNS, so real deployments add an explicit egress rule for kube-dns. Cilium or Calico network policies extend this with FQDN matching for an allowlist of public destinations. The fetcher pod can reach the open internet but not the rest of your VPC.
Layer 3: enforce IMDSv2
On AWS, the metadata service at 169.254.169.254 returns IAM role credentials; Capital One's 2019 SSRF exfiltrated S3 keys this way. AWS introduced IMDSv2 in 2019: every metadata read must present a session token obtained via an HTTP PUT with a TTL header, and instances can cap the token response's hop limit. A typical SSRF payload can't issue the PUT, can't set the required header, and so never gets a usable token.
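What the token dance looks like from inside an instance, as a sketch with the requests library (the endpoint and header names are AWS's documented ones):

import requests

# Step 1: PUT with a TTL header -- exactly what a URL-only SSRF cannot do
token = requests.put(
    "http://169.254.169.254/latest/api/token",
    headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    timeout=2,
).text

# Step 2: every subsequent read must present the token
roles = requests.get(
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/",
    headers={"X-aws-ec2-metadata-token": token},
    timeout=2,
).text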
# Force IMDSv2-only on every EC2 instance
aws ec2 modify-instance-metadata-options \
--instance-id i-xxx \
--http-tokens required \
--http-put-response-hop-limit 1
Set this organization-wide via a Service Control Policy (deny ec2:RunInstances unless the ec2:MetadataHttpTokens condition is required) and treat instances created without the flag as misconfigured. As of 2024, AWS makes IMDSv2 the default for new instances; older instances with v1 still enabled are where your risk lives.
GCP and Azure have similar metadata services. GCP has required a Metadata-Flavor: Google header on any metadata request since 2018, which is hard to send via SSRF; Azure IMDS likewise requires a Metadata: true header. Lock these down regardless.
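The equivalent read on GCP, sketched with requests (endpoint and header as GCP documents them; a URL-only SSRF typically can't attach the header):

import requests

r = requests.get(
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token",
    headers={"Metadata-Flavor": "Google"},  # request is rejected without this
    timeout=2,
)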
Layer 4: egress proxy with allowlist
The most robust enterprise approach: route all outbound HTTP from user-driven workloads through a forward proxy (Squid, Smokescreen, custom Envoy) that maintains a provider-specific allowlist. Even if every other layer fails, the request can only reach the explicit list of approved domains (Slack webhooks, GitHub API, etc.).
Stripe's Smokescreen is the open-source pattern: a Go HTTP CONNECT proxy that validates the destination IP at connection time, after performing the DNS resolution itself. Each microservice gets a per-purpose allowlist.
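The core idea in miniature, as a Python sketch (a toy, not Smokescreen itself: hypothetical allowlist, no ACL config, minimal error handling):

import ipaddress, socket, threading

ALLOWED_HOSTS = {"hooks.slack.com", "api.github.com"}  # hypothetical per-service list

def pipe(src, dst):
    try:
        while data := src.recv(8192):
            dst.sendall(data)
    except OSError:
        pass

def handle(client):
    # First line looks like b"CONNECT hooks.slack.com:443 HTTP/1.1"
    request_line = client.recv(4096).split(b"\r\n")[0]
    method, target, _ = request_line.split(b" ")
    host, _, port = target.decode().rpartition(":")
    if method != b"CONNECT" or host not in ALLOWED_HOSTS:
        client.sendall(b"HTTP/1.1 403 Forbidden\r\n\r\n"); client.close(); return
    # The proxy controls resolution: resolve, check the IP, then dial the IP
    ip = socket.getaddrinfo(host, int(port), type=socket.SOCK_STREAM)[0][4][0]
    if not ipaddress.ip_address(ip).is_global:
        client.sendall(b"HTTP/1.1 403 Forbidden\r\n\r\n"); client.close(); return
    upstream = socket.create_connection((ip, int(port)), timeout=10)
    client.sendall(b"HTTP/1.1 200 Connection established\r\n\r\n")
    threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
    pipe(upstream, client)

server = socket.create_server(("127.0.0.1", 3128))
while True:
    conn, _ = server.accept()
    threading.Thread(target=handle, args=(conn,), daemon=True).start()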
Common patterns we see during recon
Endpoints that scream "SSRF surface" during external recon (and that we surface as INFO findings on Extended scans):
- URL preview endpoints (/api/preview?url=...)
- Webhook test endpoints (/webhooks/test)
- Image proxies (/proxy?src=...)
- RSS/sitemap import (/import?feed_url=...)
- OG card fetchers (Slack-bot-shaped: the server fetches whatever URL the user pastes)
- Server-side PDF/screenshot generators
None of these is a vulnerability by itself; they mark where to look. Run an active SSRF audit against them with payload sets like Burp Collaborator plus a DNS-rebinding kit.
What our scanner flags
Our api_surface checker reports endpoints that accept a URL parameter (when discoverable from OpenAPI / Swagger schemas, or from response patterns). The headers checker flags X-Forwarded-Host reflection, which is adjacent to SSRF in spirit. We don't probe for SSRF directly; that's an active scan that requires consent and a callback collector.
Map your SSRF surface
The free Basic scan flags reflected URL parameters and exposed API surfaces. Pair it with active SSRF testing.
Run a scan