SSRF in the Cloud: How a URL Input Field Becomes Full Infrastructure Compromise

Jason

In 2019, a former Amazon Web Services employee exploited a server-side request forgery (SSRF) vulnerability in a misconfigured web application firewall at Capital One. The SSRF let her query the EC2 instance metadata service at 169.254.169.254 through the vulnerable application and retrieve temporary IAM credentials for the role attached to the instance. Those credentials had permission to list and read S3 buckets. She used them to access approximately 100 million customer records, including names, addresses, credit scores, and Social Security numbers. Capital One paid over $300 million in fines, settlements, and remediation costs.

The Capital One breach was not the first SSRF-to-cloud-credential incident, but it was the one that forced the industry to confront a structural problem: the cloud instance metadata service, designed as a convenience for applications to discover their own identity and credentials, is also the most efficient pivot point for any attacker who achieves server-side code execution or request forgery.

Why SSRF Is Disproportionately Dangerous in Cloud

SSRF, at its core, is simple: an application accepts a URL from user input and fetches it server-side. The vulnerability arises when the application does not adequately restrict which URLs it will fetch, allowing an attacker to direct requests to internal resources that the application's network position can reach but the attacker's cannot.

In a traditional on-premises environment, SSRF is still dangerous: it can probe internal networks, reach internal services, and, where the client supports file:// URLs, read local files. But in cloud environments, it is qualitatively more destructive because of the metadata service.

Every major cloud provider offers an instance metadata service accessible from any workload running on the platform:

  • AWS: http://169.254.169.254/latest/meta-data/ (instance identity, network configuration, and IAM role credentials)
  • GCP: http://metadata.google.internal/computeMetadata/v1/ (instance identity, project metadata, and service account tokens)
  • Azure: http://169.254.169.254/metadata/instance (instance details and managed identity tokens)

These endpoints are reachable from the instance's own network without authentication. AWS IMDSv1 answers a plain GET request; GCP and Azure require only a fixed request header (Metadata-Flavor: Google and Metadata: true, respectively). They return credentials that are valid for the IAM role or service account attached to the workload. An SSRF vulnerability that can reach 169.254.169.254 can, in a single request, obtain credentials that may have broad access to cloud resources.

The trust chain that SSRF exploits is:

  1. The application trusts user-supplied URLs enough to fetch them
  2. The cloud network trusts the application's IP address (it is running on the instance)
  3. The metadata service trusts any request from the instance's network
  4. The cloud control plane trusts the credentials returned by the metadata service

Each trust relationship is individually reasonable. The composition is catastrophic: a user-supplied URL results in cloud API credentials.

sequenceDiagram
    participant Attacker
    participant WebApp as Vulnerable Application
    participant IMDS as Instance Metadata Service
    participant CloudAPI as Cloud Control Plane
    Attacker->>WebApp: POST /fetch-preview<br/>{"url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/"}
    WebApp->>IMDS: GET /latest/meta-data/iam/security-credentials/
    IMDS-->>WebApp: RoleName
    Note over Attacker: Attacker learns the IAM role name
    Attacker->>WebApp: POST /fetch-preview<br/>{"url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/RoleName"}
    WebApp->>IMDS: GET /latest/meta-data/iam/security-credentials/RoleName
    IMDS-->>WebApp: {"AccessKeyId": "...", "SecretAccessKey": "...", "Token": "..."}
    WebApp-->>Attacker: Response containing temporary credentials
    Attacker->>CloudAPI: AWS API calls with stolen credentials
    Note over CloudAPI: s3:ListAllMyBuckets, s3:GetObject, iam:ListRoles, ...

Why Input Validation Keeps Failing

The standard advice for SSRF mitigation is "validate the URL." In practice, URL validation against SSRF is far more difficult than it appears, because the attacker controls the input and the HTTP client is more flexible than the validator assumes.

DNS rebinding. The application resolves the hostname, checks that it resolves to a public IP address, and then fetches the URL. Between the validation check and the fetch, the DNS record changes to point to 169.254.169.254. The application passes the validation (the first resolution was a public IP) but fetches from the metadata service (the second resolution is the link-local address). This is a TOCTOU (time-of-check-time-of-use) race condition, and it is practical to exploit.
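
One common way to narrow this race is to resolve the hostname exactly once, validate every address in the answer, and then connect to the validated IP instead of resolving the name again. The following is a minimal sketch of that idea, not a complete client: the function names are hypothetical, and the step of connecting to the pinned IP while sending the original hostname in the Host header and TLS SNI is left to the caller.

```python
import ipaddress
import socket


def is_public_ip(ip_text: str) -> bool:
    """Return True only for globally routable addresses."""
    return ipaddress.ip_address(ip_text).is_global


def resolve_and_pin(hostname: str, port: int = 443) -> str:
    """Resolve once, validate every answer, and return one IP to connect to.

    The caller must connect to the returned IP directly rather than
    resolving the hostname a second time.
    """
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    if not addresses:
        raise ValueError(f"{hostname} did not resolve")
    for addr in addresses:
        if not is_public_ip(addr):
            raise ValueError(f"{hostname} resolves to non-public address {addr}")
    return addresses[0]
```

Because the check and the connection use the same resolved address, a record that changes after validation no longer affects where the request goes.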

Alternate IP representations. 169.254.169.254 can be represented as 0xa9fea9fe (hexadecimal), 2852039166 (decimal), 0251.0376.0251.0376 (octal), or 169.254.169.254.xip.io (a wildcard DNS name of the kind served by xip.io or nip.io, which resolves to the embedded address). A denylist that blocks the dotted-decimal string fails against these representations. A robust validator must resolve the hostname, extract the IP address, and check it against address ranges (not string patterns), accounting for every encoding the HTTP client accepts.
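
To make the difference between string matching and range checking concrete, here is a small sketch that normalizes the numeric forms above before checking them. It assumes the platform's C library parser (exposed in Python as socket.inet_aton) accepts hexadecimal, octal, and single-integer forms, which is the case on Linux with glibc but may differ elsewhere. The helper names are hypothetical.

```python
import ipaddress
import socket

ALTERNATE_FORMS = [
    "169.254.169.254",      # dotted decimal
    "0xa9fea9fe",           # single hexadecimal integer
    "2852039166",           # single decimal integer
    "0251.0376.0251.0376",  # dotted octal
]


def normalize_ipv4(text: str) -> ipaddress.IPv4Address:
    """Parse an IPv4 literal the way the C library's inet_aton does."""
    packed = socket.inet_aton(text)  # raises OSError if not an IPv4 literal
    return ipaddress.IPv4Address(packed)


def is_blocked(address: ipaddress.IPv4Address) -> bool:
    """Range check on the parsed address, not on the input string."""
    return (
        address.is_link_local
        or address.is_private
        or address.is_loopback
        or address.is_unspecified
    )


for form in ALTERNATE_FORMS:
    parsed = normalize_ipv4(form)
    print(f"{form!r} -> {parsed} blocked={is_blocked(parsed)}")
```

The point of the sketch is the order of operations: parse with the same leniency as the client, then compare addresses. A check of the form `"169.254.169.254" in url` would pass three of the four inputs.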

Open redirects and redirect chains. The application validates that the initial URL is safe (a public hostname), but the HTTP client follows redirects. The safe URL returns a 302 Location: http://169.254.169.254/... redirect, and the client follows it without re-validation. SSRF mitigation must either disable redirect following entirely or re-validate the destination at each redirect step.
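
Re-validation at each hop can be sketched as a manual redirect loop. In this illustration, `send` and `is_allowed` are hypothetical callables supplied by the caller: `send` is assumed to perform exactly one HTTP request without following redirects, and `is_allowed` is whatever destination check the application already uses.

```python
from typing import Callable, Dict, Tuple
from urllib.parse import urljoin

Response = Tuple[int, Dict[str, str], bytes]  # (status, headers, body)
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def fetch_with_revalidation(
    url: str,
    send: Callable[[str], Response],
    is_allowed: Callable[[str], bool],
    max_hops: int = 5,
) -> Response:
    """Follow redirects manually, validating every hop before requesting it."""
    for _ in range(max_hops + 1):
        if not is_allowed(url):
            raise PermissionError(f"blocked destination: {url}")
        status, headers, body = send(url)
        if status not in REDIRECT_STATUSES:
            return status, headers, body
        location = headers.get("Location")
        if location is None:
            raise ValueError("redirect without a Location header")
        url = urljoin(url, location)  # resolve relative redirects
    raise RuntimeError("too many redirects")
```

The validation runs on the redirect target before any request is sent to it, so a 302 pointing at the metadata address is rejected rather than followed.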

Parser inconsistencies. Different URL parsers handle edge cases differently. http://public.example.com@169.254.169.254/ might be parsed as hostname public.example.com by the validator (treating 169.254.169.254 as a path) but as hostname 169.254.169.254 by the HTTP client (treating public.example.com as a username). These parser differential attacks exploit the gap between the validation parser and the fetch parser.
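
The differential is easy to reproduce. The sketch below compares a deliberately flawed, hand-rolled host extractor (a hypothetical stand-in for a homegrown validator) with Python's urllib.parse.urlsplit, which treats the text before "@" as userinfo. A simple mitigation, shown in `strict_host`, is to reject any URL that carries userinfo at all.

```python
from urllib.parse import urlsplit

URL = "http://public.example.com@169.254.169.254/"


def naive_host(url: str) -> str:
    """Deliberately flawed: take the text after '//' up to the first '@' or '/'."""
    rest = url.split("//", 1)[1]
    for sep in ("@", "/"):
        rest = rest.split(sep, 1)[0]
    return rest


def strict_host(url: str) -> str:
    """Parse with urllib and refuse URLs that carry userinfo."""
    parts = urlsplit(url)
    if parts.username is not None or parts.password is not None:
        raise ValueError("userinfo in URL is not allowed")
    if parts.hostname is None:
        raise ValueError("URL has no host")
    return parts.hostname


print(naive_host(URL))         # what the flawed validator believes
print(urlsplit(URL).hostname)  # where a standards-following client connects
```

Using the same parser for validation and for the fetch removes the gap; where that is not possible, rejecting ambiguous URLs outright is the safer default.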

Protocol smuggling. Some HTTP clients support protocols beyond HTTP/HTTPS, such as gopher://, file://, and dict://. A gopher:// URL can be crafted to send arbitrary bytes to a TCP port, enabling interaction with internal services that speak protocols other than HTTP (Redis, Memcached, SMTP). Validation that only checks the hostname and IP but not the scheme can be bypassed through protocol-level abuse.
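
Python's standard library is one example of a client that accepts more than HTTP: urllib.request.urlopen handles file: URLs by default. The sketch below shows a naive fetcher next to an explicit scheme allowlist. The function names and the choice to allow only https are assumptions for illustration.

```python
import urllib.request
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}


def check_scheme(url: str) -> str:
    """Reject any URL whose scheme is not explicitly allowed."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme {scheme!r} is not allowed")
    return url


def unsafe_fetch(url: str) -> bytes:
    """What a naive fetcher does: hand the URL straight to the client."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read()
```

Passing a file:// URL to `unsafe_fetch` returns the contents of a local file, while `check_scheme` rejects it before any I/O happens. An allowlist is preferable to a denylist here because the set of schemes a client supports can change between library versions.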

The cumulative lesson from two decades of SSRF bypass research is that URL validation, while necessary, is insufficient as a sole defense. The validation logic must account for every encoding, every redirect, every protocol, and every parser inconsistency, a set of conditions that expands with every new bypass technique discovered. Defense-in-depth is not optional.

Defense-in-Depth That Changes the Calculus

IMDSv2 (AWS) / Metadata header requirements (GCP). AWS IMDSv2 requires a PUT request with a custom header (X-aws-ec2-metadata-token-ttl-seconds) to obtain a session token, which must then be included as a header (X-aws-ec2-metadata-token) in subsequent metadata requests. Most SSRF vulnerabilities give the attacker control over only the URL of a GET request, with no way to change the method or set custom headers, so IMDSv2 blocks the majority of SSRF-based metadata access. GCP requires a Metadata-Flavor: Google header on all metadata requests, providing similar protection. These mitigations are effective against most SSRF vectors but not all: if the attacker can control the HTTP method and headers (through a more permissive SSRF or a proxy misconfiguration), the token flow can still be completed.
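
The two-step flow looks roughly like this from the point of view of a legitimate client. This is a sketch using only the standard library; the helper names are hypothetical, the 21600-second TTL is just the maximum AWS documents, and `read_metadata` is only meaningful when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"
TOKEN_TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds"
TOKEN_HEADER = "X-aws-ec2-metadata-token"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT with the TTL header asks IMDS for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={TOKEN_TTL_HEADER: str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: a GET that presents the session token as a header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        headers={TOKEN_HEADER: token},
    )


def read_metadata(path: str) -> str:
    """Run both steps in order; requires network access to IMDS."""
    with urllib.request.urlopen(build_token_request(), timeout=2) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(path, token), timeout=2) as resp:
        return resp.read().decode()
```

A URL-only SSRF primitive can reproduce neither the PUT method in step 1 nor the token header in step 2, which is why the requirement is effective against the common case.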

The critical step is enforcing IMDSv2-only access at the instance or account level. AWS allows configuring HttpTokens: required on instances and in launch templates, which disables IMDSv1 entirely. This should be a default for all new instances and a migration target for existing ones. The migration cost is non-trivial (applications and SDKs that access IMDS must support the two-step token flow), but the risk reduction is substantial.
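
As a starting point for that migration, the sketch below finds instances that still accept IMDSv1 and switches one to token-only access. It assumes a boto3 EC2 client is passed in as `ec2_client` and that the audit function receives the dictionary returned by the client's describe_instances call; the function names themselves are hypothetical, and pagination and error handling are omitted.

```python
from typing import Any, Dict, List


def instances_allowing_imdsv1(describe_instances_response: Dict[str, Any]) -> List[str]:
    """Return IDs of instances whose metadata options do not require tokens."""
    flagged = []
    for reservation in describe_instances_response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            if options.get("HttpEndpoint") == "disabled":
                continue  # IMDS is off entirely, so IMDSv1 is not reachable
            if options.get("HttpTokens") != "required":
                flagged.append(instance["InstanceId"])
    return flagged


def require_imdsv2(ec2_client: Any, instance_id: str) -> None:
    """Switch one instance to token-only metadata access."""
    ec2_client.modify_instance_metadata_options(
        InstanceId=instance_id,
        HttpTokens="required",
        HttpEndpoint="enabled",
    )
```

Running the audit first and checking which applications on the flagged instances still call IMDS without a token helps avoid breaking workloads when the setting is flipped.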

Network-level egress controls. Blocking outbound access to 169.254.169.254 from workloads that do not need metadata access is a blunt but effective control. In Kubernetes environments, network policies can prevent pods from reaching the IMDS endpoint. In EC2 environments, iptables rules on the host can restrict access. This works regardless of the SSRF technique because it operates at the network layer, below the application layer where bypass techniques operate.

Least-privilege IAM for workloads. Even if the attacker successfully exploits SSRF and retrieves credentials, the damage is bounded by the permissions of those credentials. The Capital One breach was catastrophic not because of the SSRF itself, but because the compromised role could list buckets (s3:ListAllMyBuckets) and read objects (s3:GetObject) in buckets containing customer data. A workload whose IAM role permits only s3:PutObject on a specific logging bucket produces credentials that are useless for data exfiltration, even if they are stolen.
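
A write-only logging role of that kind can be expressed as a very small policy document. The sketch below builds one as a Python dictionary; the bucket name, the statement ID, and the function name are placeholders chosen for illustration.

```python
import json
from typing import Any, Dict


def write_only_logging_policy(bucket_name: str, prefix: str = "") -> Dict[str, Any]:
    """Build an IAM policy document that permits only s3:PutObject in one bucket."""
    resource = f"arn:aws:s3:::{bucket_name}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "WriteLogsOnly",
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": resource,
            }
        ],
    }


print(json.dumps(write_only_logging_policy("example-app-logs"), indent=2))
```

Credentials issued for a role with only this policy cannot list buckets or read objects, so stealing them through SSRF yields no path to the data.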

Application-level fetch constraints. Beyond input validation, the fetch mechanism itself should be constrained: use an allowlist of permitted destination domains (not a denylist of blocked addresses), disable redirect following or re-validate at each step, restrict the permitted protocols to HTTPS only, set a short timeout, and validate the response content type before returning it to the user. These constraints do not prevent all SSRF but reduce the exploitable surface to cases where the attacker can find a way to reach the metadata service through an allowed domain via a chain of redirects, a significantly harder exploitation requirement.
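
Put together, those constraints might look like the following sketch built on urllib. The allowlisted hostnames, the permitted content types, and the function names are all hypothetical, and the example does not address DNS rebinding of an allowlisted name, which needs the resolve-and-pin approach in addition.

```python
import urllib.request
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"images.example.com", "cdn.example.net"}  # hypothetical allowlist
ALLOWED_CONTENT_TYPES = {"image/png", "image/jpeg"}


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects; urllib then raises HTTPError for 3xx."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def validate_destination(url: str) -> str:
    """Allow only https URLs to allowlisted hosts on the default port."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("only https is permitted")
    if parts.username is not None or parts.password is not None:
        raise ValueError("userinfo is not permitted")
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host {parts.hostname!r} is not on the allowlist")
    if parts.port not in (None, 443):
        raise ValueError("non-default port is not permitted")
    return url


def constrained_fetch(url: str, timeout: float = 3.0) -> bytes:
    """Fetch with validation, no redirects, a short timeout, and a type check."""
    opener = urllib.request.build_opener(NoRedirect)
    with opener.open(validate_destination(url), timeout=timeout) as response:
        content_type = response.headers.get_content_type()
        if content_type not in ALLOWED_CONTENT_TYPES:
            raise ValueError(f"unexpected content type {content_type!r}")
        return response.read()
```

Each check is cheap on its own; the value comes from applying all of them in the same code path that performs the request, so that validation and fetch cannot drift apart.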

The Structural Problem Remains

SSRF exists because modern web applications need to fetch remote resources: link previews, webhook validation, image processing, PDF generation, URL verification. Each of these features requires the server to make HTTP requests based on user input. Eliminating SSRF means eliminating this class of functionality, which is rarely acceptable.

The pragmatic approach is to accept that SSRF is a persistent risk in any application that fetches user-supplied URLs and to build the defensive stack so that a successful SSRF does not cascade into infrastructure compromise. IMDSv2 blocks the most common credential-harvesting technique. Network egress controls block access to internal services that SSRF might target. Least-privilege IAM limits what stolen credentials can do. Application-level fetch constraints raise the exploitation bar.

No single layer is sufficient. The organizations that avoid SSRF-driven breaches are the ones that deploy all of these layers and accept the operational overhead of maintaining them. The ones that deploy URL validation alone and consider the problem solved are the ones whose next breach report will describe a "small input-validation bug" that somehow led to exfiltration of their entire customer database.
