Google Cloud Monitoring (formerly called Stackdriver) is a service that provides monitoring for cloud resources (VM instances, App Engine, Cloud Functions...). It is available from the Google Cloud Console and offers monitoring, alerting, uptime checks of cloud resources, and much more. It is important to note that the Google Cloud Monitoring service itself runs on Google Cloud virtual machines.
Every virtual machine in Google Cloud stores its metadata on the metadata server. This metadata includes the project ID, service account information, information about the virtual machine, and public SSH keys. The metadata can be queried from within the instance (at the IP address 169.254.169.254) or through the Compute Engine API.
One of the services that Google Cloud Monitoring offers is Uptime checks. An Uptime check periodically sends requests to a resource to see whether it responds. A check can be used to determine the availability of an App Engine application, a VM instance, a URL, etc.
I started testing this feature for SSRF by creating an uptime check that sends a request to a URL or IP address. Most of the URLs and IP addresses that are usual SSRF targets were blocked. But since Cloud Monitoring itself runs on Google Cloud VM instances, there was a chance that I could reach the metadata endpoints, because a request to the metadata endpoint would be sent from within an instance.
When the metadata is queried from within the virtual machine, the request must include the header "Metadata-Flavor: Google" for metadata API version "v1" (older versions of the metadata API did not require this header). Luckily, there was an option to add custom headers to the request, so that was not an issue.
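For reference, a minimal sketch of what such a metadata query looks like when it is made from inside a VM. The helper names `build_metadata_request` and `fetch_metadata` are hypothetical and chosen only for illustration. The `/computeMetadata/v1/` base path and the `project/project-id` key follow the public metadata server layout. This is not the request the Uptime check service itself sends.

```python
import urllib.request

# Base URL of the v1 metadata API, reachable only from inside a Google Cloud VM.
METADATA_BASE = "http://169.254.169.254/computeMetadata/v1/"


def build_metadata_request(path: str) -> urllib.request.Request:
    """Build a GET request for a metadata key (hypothetical helper name)."""
    return urllib.request.Request(
        METADATA_BASE + path.lstrip("/"),
        # The v1 API rejects requests that lack this header.
        headers={"Metadata-Flavor": "Google"},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Send the request and return the body as text (only works on a GCE VM)."""
    with urllib.request.urlopen(build_metadata_request(path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

On a Compute Engine instance, `fetch_metadata("project/project-id")` would return the project ID. Anywhere else the call simply fails, because the address is link-local.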
I created the uptime check with the following parameters:
Custom Headers: Metadata-Flavor: Google (required for /v1/ metadata endpoints)
Then I pressed the Test button at the bottom of the Uptime check creation form, which sent a request to the metadata server and reported that the check was successful.
The response I saw was:
Because the response code was 200 and the response time was 2 ms, I was sure that the metadata endpoint was reachable via the uptime check (a request to an external URL would take much longer). The problem was that the response body was not visible. The only two things returned were the response code and the response time. At this point, this was only a blind SSRF.
To get the response body, I used another Uptime check feature - Response validation. Response validation checks whether the response body contains a specific string. An example configuration can be seen in the image below.
The method I used was the following - I started by looking for a single character that is included in the response. I did this by testing, one by one, whether the response contained each of the possible characters. Once one character was found, I tried to find a second one by appending or prepending characters to the already found character and testing again whether the result was contained in the response. This process was repeated until the full response from the metadata server was reconstructed.
For example, I would test whether the response contains the characters 'a', 'b', 'c'... Let's say I found that 'c' is contained in the response. Then I would prepend or append another character and test whether the response includes 'ca', 'cb', 'cc'... If I found that 'ca' is in the response, I would try further combinations - 'caa', 'cab', 'cac'... and repeat the process until I had the whole response.
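The search described above can be sketched as follows. This is only an illustration of the idea, not the script linked later in this post. The names `extract_body` and `_grow` are made up, and `contains` stands in for whatever sends a candidate string to the response validation and reports whether the check passed. The sketch assumes that every character of the body is in `alphabet`. Otherwise it recovers only a fragment between two unknown characters.

```python
import string


def extract_body(contains, alphabet=string.printable, max_len=4096):
    """Recover a hidden string using only a 'does the body contain X?' oracle."""
    # Step 1: find any single character that occurs in the body.
    known = next((c for c in alphabet if contains(c)), None)
    if known is None:
        return ""
    # Step 2: grow the known fragment to the right until nothing extends it.
    known = _grow(contains, known, alphabet, max_len, append=True)
    # Step 3: grow it to the left in the same way.
    return _grow(contains, known, alphabet, max_len, append=False)


def _grow(contains, known, alphabet, max_len, append):
    """Extend `known` one character at a time on one side."""
    while len(known) < max_len:
        for c in alphabet:
            candidate = known + c if append else c + known
            if contains(candidate):
                known = candidate
                break
        else:
            # No character extended the match: an edge of the body was reached.
            break
    return known
```

A local stand-in for the oracle is enough to try it out, for example `extract_body(lambda s: s in "some hidden text")`. Each step costs at most one oracle call per character of the alphabet, so the number of requests grows with the body length times the alphabet size.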
To do the response body validation, I used the endpoint that is called when the Test button in the Uptime check form is pressed.
The request looked like this:
I created a simple Python script that reconstructs the response automatically using the described method. The script is available here - https://gist.github.com/nechudav/0b2e0217ffe31a3cd1c1743c590595e6
With this script, I obtained project-level metadata - the public SSH key, the project name, and other information about the Google Cloud Monitoring project. It was also possible to get instance-level metadata that is the same for all instances (machine type, CPU platform...). But I struggled to get instance-level metadata that is unique to each instance or that is periodically refreshed (for example, service account tokens or the IP addresses of the instances). The reason was that the Uptime check service runs on multiple instances across the world (there were about 54 running instances) and the requests made to the service are load-balanced, so there was no assurance that multiple requests would be sent to the same instance. Getting unique instance-level metadata would require sending a large number of requests, which was problematic because the API was rate-limited and it would be very time-consuming. At this point, I did not continue the research.
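To put a rough number on the load-balancing problem, here is a small back-of-the-envelope sketch. It assumes, purely for illustration, that each request is routed uniformly and independently to one of the roughly 54 instances. The real load-balancing policy may well differ, and the function name is made up.

```python
def same_instance_probability(instances: int, requests: int) -> float:
    """Chance that `requests` consecutive requests all hit one instance.

    Assumes uniform, independent routing (an illustrative assumption only).
    The first request may land anywhere; each later one must match it.
    """
    if instances < 1 or requests < 1:
        raise ValueError("instances and requests must be positive")
    return (1.0 / instances) ** (requests - 1)
```

Under that assumption, `same_instance_probability(54, 2)` is 1/54, so even two consecutive probes would rarely be answered by the same machine. A value that differs per instance therefore cannot be extended character by character the way shared values can.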
I reported the issue, it was accepted, and Google VRP rewarded me $31,337 for this bug. I'd like to thank the Google VRP team for the reward and the quick response.
Time of report: June 2020