O Geez – Abusing the Open Graph Protocol
The Open Graph Protocol (OGP) is an 8-year-old standard originally developed by Facebook. It was created to help website owners integrate their websites into Facebook’s Open Graph technology as referenceable objects inside the social graph. Whenever you see a “Like this on Facebook” button on a news article, or you share a URL in a post, you are seeing an interaction with OGP in some way. Many companies and websites have adopted this protocol as an easy way to show content in a consistent manner across websites. You probably interact with OGP without even knowing it: any time you see a preview of a link inside a social post, direct message, blog post or Slack channel, that preview most likely relies on the standards developed by the OGP project. If you’re unfamiliar with the Open Graph Protocol, I’d recommend reading about it before reading any further. Here’s a general overview of how it works:
A real-world example of this is a Wired article that Mark Zuckerberg recently posted on his Facebook page:
In this Facebook post by Mark Zuckerberg, a Wired article is linked. Notice that a picture and a title are preloaded, and no actual link is displayed. This is OGP at work.
As with any technology developed to enhance the web user’s experience, we often see attackers leveraging OGP to support nefarious activities. In this post, we’ll explore the Open Graph Protocol implementations of numerous platforms with respect to their preview and card functionality. We’ll look at how these platforms crawl a website for Open Graph meta tags, and show how to construct and use an Open Graph preview object as an effective phishing tool.
Meta Tags
The OGP specification states that content is compliant if it uses specific RDFa-style attributes on <meta> tags. In simple terms, as long as you have a set of <meta property="og:xx" content="..."/> tags inside the HTML <head> section of your website, platforms can crawl and display your content according to their user experience. From the Open Graph Protocol website, the four required meta tags are as follows:
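According to the protocol documentation, those four properties are og:title, og:type, og:image, and og:url. A minimal set, with placeholder values standing in for a real page, looks something like this:

```html
<!-- The four required Open Graph meta tags; every value here is a placeholder -->
<meta property="og:title" content="Example Article Title" />
<meta property="og:type"  content="article" />
<meta property="og:url"   content="https://example.com/article" />
<meta property="og:image" content="https://example.com/preview.jpg" />
```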
There are myriad other optional tags, and different platforms tend to use different ones. To help demonstrate this, I pulled a link from a post Elon Musk made about SpaceX’s new BFR spaceship. A cURL command shows the meta tags listed above:
The title matches that of the card, the URL points to the website he linked, the type is article, and the image is the picture shown. This is where things get interesting: people specify what gets shown and linked through these tags. So, if we can specify it, then there must be some way for these platforms to crawl websites and collect the necessary information. This could be useful if we want to disguise links and give them a form of legitimacy, and since OGP is used across all sorts of social and digital channels, it can work cross-platform and be a highly effective attack vector. For example, if an attacker can figure out how these previews are loaded in social networks, they can trick users into thinking that they are following a link to a news article while instead redirecting them to an attack page.
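None of these platforms publish their crawler code, but the collection step itself is conceptually simple. Here is a minimal sketch in PHP of what such a crawler might do; the URL is a placeholder, and a real crawler would add caching, timeouts, size limits, and redirect handling:

```php
<?php
// Minimal sketch of what an OGP crawler does: fetch a page and collect its
// og:* meta tags. Illustrative only; each platform's real crawler differs.
$url  = 'https://example.com/article';   // placeholder target
$html = file_get_contents($url);

$doc = new DOMDocument();
@$doc->loadHTML($html);                   // suppress warnings from messy real-world markup

$og = [];
foreach ($doc->getElementsByTagName('meta') as $meta) {
    $property = $meta->getAttribute('property');
    if (strpos($property, 'og:') === 0) {
        $og[$property] = $meta->getAttribute('content');
    }
}

print_r($og);  // og:title, og:type, og:url, og:image => whatever the page owner chose
```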
Crawlers for Open Graph Protocol
If a platform implements Open Graph to easily integrate content into its ecosystem, then it must crawl that content to find the meta tags. These crawlers must also have some type of identifier to say they are crawling on behalf of that platform. To test this theory, you can launch a free Heroku instance loaded with a LAMP stack and copy the four HTML meta tags that were egrep’d out of the Space.com article into the stock headers.html of the Heroku PHP app (remember, OGP specifies that these tags must be in the <head> section):
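As a rough sketch, the served page only needs those tags in its <head>; the values below are placeholders standing in for the ones pulled from the Space.com article:

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Placeholder values standing in for the tags copied from the Space.com article -->
  <meta property="og:title" content="Placeholder article title" />
  <meta property="og:type"  content="article" />
  <meta property="og:url"   content="https://example.com/article" />
  <meta property="og:image" content="https://example.com/preview.jpg" />
  <title>Placeholder article title</title>
</head>
<body>
  <!-- Whatever renders here is independent of what the crawlers preview -->
</body>
</html>
```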
Sending the link on Slack, Tumblr and Facebook shows a different user experience on each platform. Listed below are the screenshots of the preview/post, as well as the Apache log from each HTTP GET.
Slack:
Apache log:
2018-10-04T01:34:23.862711+00:00 app[web.1]: 10.5.196.67 - - [04/Oct/2018:01:34:23 +0000] "GET / HTTP/1.1" 200 3407 "-" "Slackbot-LinkExpanding 1.0 (+https://api.slack.com/robots)"
Tumblr:
Apache log:
2018-10-18T14:12:46.580198+00:00 app[web.1]: 10.35.230.36 - - [18/Oct/2018:14:12:46 +0000] "GET / HTTP/1.1" 200 3407 "-" "Tumblr/14.0.835.186"
Facebook:
Apache log:
2018-10-04T01:42:51.638713+00:00 app[web.1]: 10.95.165.34 - - [04/Oct/2018:01:42:51 +0000] "GET / HTTP/1.1" 200 3407 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
Success! We can control what is previewed on each platform, and based on the UX of each respective app, some data is hidden and some is shown. For example, Slack displays a portion of the Heroku URL in light gray, Tumblr displays a portion in white, whereas Facebook does not show the URL at all. This can be useful for generating payloads that look legitimate even though the actual page renders quite differently for the end user.
Putting it All Together - Thoughts on Attack Page Infrastructure
Now that we can control what gets displayed in these Open Graph Protocol renders, there are a few different attack scenarios that defenders have to worry about. In the first scenario, the website that serves the OGP tags hosts the attack page:
This scenario was created with the previous example: load the OGP meta tags inside the <head> of the HTML document, so the page serving the OGP meta tags and the page serving the payload are one and the same. The problem here, though, is that platforms and the security community in general can track these payload websites and block them. So, as an attacker, it’d be much more efficient if we had a way to recognize the User-Agents that crawl for OGP tags and serve content based on that identifier. That way, we can split the preview content and the payload across two or more servers.
This is useful for a few reasons. First, if a page is taken down or removed, we can replace the redirection without spawning a new Evil Page. Second, these crawlers parse HTML, and we can assume they do not parse JavaScript, so the User-Agent field is a network-based indicator the Evil Pages can use to redirect legitimate victims while ignoring crawlers. This infrastructure is outlined below:
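As a rough sketch of the serving side of that setup, the page hosting the OGP tags could branch on the User-Agent, returning the preview markup to known crawlers and bouncing everyone else to the payload server. The crawler substrings below come from the Apache logs above; the URLs and markup values are placeholders:

```php
<?php
// Sketch only: serve OGP preview markup to known link-expanding crawlers and
// redirect everyone else to a separate payload server. The User-Agent
// substrings come from the Apache logs above; the URLs are placeholders.
$crawlers = ['Slackbot-LinkExpanding', 'facebookexternalhit', 'Tumblr'];
$ua       = $_SERVER['HTTP_USER_AGENT'] ?? '';

$isCrawler = false;
foreach ($crawlers as $needle) {
    if (stripos($ua, $needle) !== false) {
        $isCrawler = true;
        break;
    }
}

if (!$isCrawler) {
    // Real visitor: hand off to the Evil Page hosted elsewhere.
    header('Location: https://payload.example.com/landing');
    exit;
}

// Crawler: return only the innocuous-looking preview markup.
?>
<!DOCTYPE html>
<html>
<head>
  <meta property="og:title" content="Placeholder article title" />
  <meta property="og:type"  content="article" />
  <meta property="og:url"   content="https://example.com/article" />
  <meta property="og:image" content="https://example.com/preview.jpg" />
</head>
<body></body>
</html>
```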
In Summary
Defenders should be aware of the Open Graph Protocol and its ability to confuse victims through the user experience. Due to the wide adoption of this technology, an evil website can appear legitimate across many different social and digital platforms, making it easy to attack on multiple vectors while investing minimally in a single page. Redirection based on User-Agent also makes this harder to detect, so defenders will need ways to crawl and mimic User-Agents when scanning for attack pages.
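For example, a scanner could request a suspicious link twice, once with a browser-like User-Agent and once mimicking a known crawler, and flag pages that answer differently. A minimal sketch, using the Slackbot string from the logs above and a placeholder URL:

```php
<?php
// Defender-side sketch: request the same URL as a browser and as an OGP
// crawler, and flag divergent responses as possible cloaking. Illustrative
// only; a real scanner would also compare status codes and redirect chains.
function fetchAs(string $url, string $userAgent): string {
    $context = stream_context_create([
        'http' => [
            'header'          => "User-Agent: $userAgent\r\n",
            'follow_location' => 0,   // don't follow redirects; we want to see them
            'ignore_errors'   => true,
        ],
    ]);
    return (string) @file_get_contents($url, false, $context);
}

$url       = 'https://example.com/suspicious-link';   // placeholder
$asCrawler = fetchAs($url, 'Slackbot-LinkExpanding 1.0 (+https://api.slack.com/robots)');
$asBrowser = fetchAs($url, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)');

if ($asCrawler !== $asBrowser) {
    echo "Responses differ by User-Agent -- possible OGP cloaking.\n";
}
```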
The ZeroFox platform uses Foxscript to help find malicious payloads affecting your organization on social and digital media. It also actively finds OGP abusers and uses features of these websites to help alert on and remediate these attacks for our customers. To learn more, talk to an expert.