The Basics of JavaScript Framework SEO in AngularJS

Imagine the scene: you’re embarking on the first few moments of a website diagnostic for your SEO audit. You disable JavaScript in your toolbar of choice and off you head, hoping to discover a JS-only navigation and a bunch of site architecture problems.

If you’re still finding your feet in the world of technical SEO, you’re going to freak out when you see something like this:

[Screenshot: Bombermine with JavaScript disabled]

When you hoped you’d see something like this:

[Screenshot: Bombermine with JavaScript enabled]

Or you come across something like this:

[Screenshot: Coworks with JavaScript disabled]

When you’d hoped for something like this:

[Screenshot: Coworks with JavaScript enabled]

Or even something a bit like this:

[Screenshot: Red Bull Sound Select with JavaScript disabled]

When all you really wanted to see was this:

[Screenshot: Red Bull Sound Select with JavaScript enabled]

And finally:

[Screenshot: Jobfoundry with JavaScript disabled]

When what you’re expecting to see is:

[Screenshot: Jobfoundry with JavaScript enabled]

What you’re looking at is a mixture of websites built in the JavaScript framework AngularJS. Angular uses clever JavaScript rendering on the client. It features lots of ideas that are entirely new to us SEO folk (like “bi-directional data binding”), and it’s really powerful stuff for constructing web applications, fast. AngularJS is a framework with its own templating system, and it leaves the content rendering almost entirely down to the client (your browser).

This leaves the search engine crawler a little screwed.

Huh?

OK, think about an HTML web page (like the one you’re reading now). The HTML you’re viewing is a template, constructed and customised by the output of a few PHP files and a database lookup. The HTML itself is compiled on the host web server whenever you request it, and then it’s served via HTTP. Of course, if someone else has requested this page before, the chances are you’re reading a cached copy, built long before you knew this article even existed.

Right now you’re reading a web page that is, in essence, an HTML file that’s been served by a web server. It was delivered after you’d asked for it via an HTTP GET request, and now the deal is done. If you want to see another web page, you can ask for that too and our web server will happily let you have it. If you want to interact with it, maybe you’ll complete a form and send a POST request. This is how the web works.

That’s not quite what happens when you land on a web page built within a JS framework like Angular. Essentially, when you make a request to an Angular site, the content you see is a result of JavaScript manipulating the DOM. Sure, there’s a lot of back and forth via HTTP (using Angular’s $http service) but the client is doing most of the heavy lifting. The page rendering, the asynchronous exchange of data, content updates without a browser refresh, HTML template construction – it’s all very clever.

Angular belongs to a stack that JS developers love to work with, because it makes prototyping an application fast and relatively easy. Side note: this stack is called MEAN (MongoDB, Express, AngularJS, Node.js).

Bizarrely, some web developers insist on building websites in AngularJS when they don’t need to (they’re actually building a website, not a web application), or they find themselves building their brochureware (FAQs, landing pages, about pages etc.) in the same technology as the web application they’re hosting. In any case, if you build a website in a JavaScript framework, your SEO is going to suck out of the box, and you’re going to need to win friends in the engineering team to stand a chance of ranking for anything, ever.

Why?

Take a look at this AngularJS site. The content you can see is rendered in JavaScript, and because of that, if you “view source” you’ll see there’s an unusually small amount of HTML (much less than you’d expect, which I’ll try to explain in a moment). Here’s an example:

<div ng-view></div>

Yes, that’s *all* of the content you’re going to get when you make a request to jobfoundry.com. All of it!

That’s why you get the blank page when you visit the site with JS disabled in your browser. “ng-view” is a directive that makes the magic happen: it instructs AngularJS to begin manipulating the DOM with whatever content view it’s been bound to. It fills the container with content, basically.
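To make the blank page concrete, here is a rough sketch of what a client that doesn’t run JavaScript is left with. The `visibleText` helper and the sample server-rendered markup are made up for illustration, and stripping tags with a regex is a simplification rather than real HTML parsing.

```javascript
// Hypothetical helper: approximates the text a non-JS client can read.
// Regex stripping is a simplification, not a real HTML parser.
function visibleText(html) {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, " ") // drop inline scripts
    .replace(/<[^>]+>/g, " ")                    // drop the remaining tags
    .replace(/\s+/g, " ")                        // collapse whitespace
    .trim();
}

// Made-up example of a page rendered on the server
const serverRendered = "<div><h1>Jobs</h1><p>Find your next role.</p></div>";
// The Angular shell from the example above
const angularShell = "<div ng-view></div>";

console.log(JSON.stringify(visibleText(serverRendered)));
console.log(JSON.stringify(visibleText(angularShell)));
```

The server-rendered markup still yields readable text, while the Angular shell yields an empty string until JavaScript fills the container.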

So how does this affect your SEO? It’s pretty obvious you need a solution. Fortunately, there is one!

Making Heavily JS Dependent Websites Crawlable

It’s really important for us to understand how we can help search engines crawl JavaScript dependent websites. If you get this, you’re pretty much at the top of your technical SEO game.

Google and Bing support a scheme that allows web developers to serve HTML snapshots of their JS-heavy content via a modified URL structure. Specifically, they support the “escaped fragment”, a parameter that replaces the hashbang (#!) in a web application URL. That parameter looks like this:

?_escaped_fragment_=

So, imagine you’ve got a bunch of hashbangs in your Angular site:

http://builtvisible.com/#!/1/2/3/products/content

A search crawler should recognise the #! and automatically request a new URL with our parameter included:

http://builtvisible.com/?_escaped_fragment_=/1/2/3/products/content

And provided your web server knows what to do with the modified request (i.e., serve an HTML-rendered page), it’s all good.
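The rewrite a crawler performs can be sketched in a few lines. This is my own illustration of the mapping, not code from any search engine: the function name `toEscapedFragmentUrl` is hypothetical, and the set of characters it percent-encodes is a simplified assumption.

```javascript
// Hypothetical sketch of the hashbang -> _escaped_fragment_ mapping.
// Only a handful of special characters are escaped here (an assumption
// made to keep the example short).
function toEscapedFragmentUrl(url) {
  const hashbangIndex = url.indexOf("#!");
  if (hashbangIndex === -1) return url; // no hashbang, nothing to rewrite

  const base = url.slice(0, hashbangIndex);
  const fragment = url.slice(hashbangIndex + 2);

  // Escape characters that would otherwise change the meaning of the query string
  const escaped = fragment.replace(/[%#&+ ]/g, (ch) =>
    "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0")
  );

  // Append to an existing query string if there is one
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + escaped;
}

console.log(toEscapedFragmentUrl("http://builtvisible.com/#!/1/2/3/products/content"));
// prints "http://builtvisible.com/?_escaped_fragment_=/1/2/3/products/content"
```

Note that a URL which already carries a query string gets the parameter appended with & rather than ?.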

Here’s how Google explains the process in its “Making AJAX Applications Crawlable” documentation:

[Diagram from Google’s documentation: the exchange between crawler and server]

Obviously you’ll need to ensure that HTML pre-rendering is running on your web server – this doesn’t just happen by accident!

Most engineers prefer to devise their own solution, most frequently using PhantomJS (in my experience). If you’re able to pre-render your site as HTML, the only other thing you need to do before testing is make sure requests containing the escaped fragment parameter are routed to the HTML cache directory on your web server. That’s a trivial challenge for a good web developer, and there’s a good example in the excellent AngularJS and SEO article on Year of Moo.
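The routing step can be sketched as a small lookup function. Everything here is an assumption for illustration: the function name `snapshotPathFor`, the `snapshots` directory and the one-`index.html`-per-route layout are mine, not taken from the Year of Moo article. It also assumes POSIX-style paths.

```javascript
// Hypothetical sketch: decide which pre-rendered file (if any) answers a request.
// Assumed layout: <cacheDir>/<route>/index.html
function snapshotPathFor(requestUrl, cacheDir) {
  // The base is only used to parse relative request URLs like "/about?x=1"
  const url = new URL(requestUrl, "http://localhost");

  // Normal visitors never send the parameter, so they get the JS app as usual
  if (!url.searchParams.has("_escaped_fragment_")) return null;

  // Hashbang sites receive the route in the parameter; sites without
  // hashbangs receive it empty, so the route is the path itself
  const fragment = url.searchParams.get("_escaped_fragment_");
  const route = (fragment || url.pathname).replace(/\\/g, "/");

  // Resolve "." and ".." by hand so a request can't climb out of cacheDir
  const segments = [];
  for (const part of route.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      segments.pop();
      continue;
    }
    segments.push(part);
  }

  return [cacheDir, ...segments, "index.html"].join("/");
}

console.log(snapshotPathFor("/?_escaped_fragment_=/1/2/3/products/content", "snapshots"));
console.log(snapshotPathFor("/about?_escaped_fragment_=", "snapshots"));
console.log(snapshotPathFor("/about", "snapshots"));
```

In a real server the returned path would be handed to whatever serves static files, with a fallback to rendering on demand when the snapshot doesn’t exist yet.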

Not Using Hashbangs in Your Angular Site? Good!

If you look in the <head> of this website, here’s what you’ll get:

[Screenshot: the <head> source of the Red Bull Sound Select site]

You’ll either see it straight away or you won’t:

<meta name="fragment" content="!">

It’s tempting to jump to the conclusion that the developer of this project has a long way to go before their SEO works: the double curly brace notation {{ }} can be extremely misleading when you’re working through the source. That’s not the case, though. This developer is very smart: http://www.redbullsoundselect.com/?_escaped_fragment_= – see how the source code makes sense now?

This website doesn’t use hashbangs in the URL structure, so the “meta fragment” declaration made in the <head> is an instruction to search engines to request the URL with the ?_escaped_fragment_= parameter.
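Here is a minimal sketch of that decision from the crawler’s side. The function names are hypothetical, and the regex assumes the attributes appear in the same order as in the tag shown above, which a real parser wouldn’t need to assume.

```javascript
// Hypothetical check for the meta fragment declaration.
// Assumes name comes before content, as in the tag quoted in the article.
function hasMetaFragment(html) {
  return /<meta\s+name=["']fragment["']\s+content=["']!["']\s*\/?>/i.test(html);
}

// Returns the snapshot URL a crawler would request, or null when the page
// doesn't opt in.
function snapshotUrlFor(pageUrl, html) {
  if (!hasMetaFragment(html)) return null;
  const separator = pageUrl.includes("?") ? "&" : "?";
  return pageUrl + separator + "_escaped_fragment_=";
}

const head = '<head><meta name="fragment" content="!"></head>';
console.log(snapshotUrlFor("http://www.redbullsoundselect.com/", head));
// prints "http://www.redbullsoundselect.com/?_escaped_fragment_="
```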

Side note: hashbangs *are* stupid. Don’t use them. Not if you don’t have to. Use Angular’s $location service to construct URLs without the #! via the HTML5 History API and save yourself a lifetime of misery. Trust me.
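As a rough sketch of what that looks like in practice: `html5Mode` and `hashPrefix` are methods on AngularJS’s `$locationProvider`, while the function name `configureCleanUrls` and the module name `app` are placeholders of mine. Your web server also has to answer requests for the deep URLs this produces, otherwise a refresh on an inner page will 404.

```javascript
// Config function for AngularJS's $locationProvider.
// The function name is a placeholder; html5Mode and hashPrefix are the
// provider methods being illustrated.
function configureCleanUrls($locationProvider) {
  // Use the History API so routes look like /products/42, not /#!/products/42
  $locationProvider.html5Mode(true);
  // Prefix used if the app falls back to hash-based URLs in older browsers
  $locationProvider.hashPrefix("!");
}

// Wiring it up (assumes AngularJS is loaded and a module named "app" exists):
// angular.module("app").config(["$locationProvider", configureCleanUrls]);
```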

Side Notes: Testing

I’m not aware of a software SEO crawler that respects this directive. So in terms of testing, you need to manually add the ?_escaped_fragment_= to the end of a URL, like this: http://jobfoundry.com/?_escaped_fragment_=. This means you can’t check by site crawl, you have to test URLs one by one. I’ve found that SEO Tools for Excel is a good friend for this – you concatenate all the URLs with the escaped fragment parameter and do the usual checks via that route. If your site has a working sitemap, you can import it directly into Excel using this technique.
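If you’d rather script the list than build it in Excel, the same idea can be sketched like this. The function name is hypothetical and the regex-based `<loc>` extraction is a shortcut: it ignores XML entities such as `&amp;`, so treat it as a starting point rather than a sitemap parser.

```javascript
// Hypothetical sketch: turn sitemap entries into URLs to test by hand.
// Regex extraction is a shortcut and does not decode XML entities.
function testUrlsFromSitemap(sitemapXml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match;
  while ((match = locPattern.exec(sitemapXml)) !== null) {
    const url = match[1];
    const separator = url.includes("?") ? "&" : "?";
    urls.push(url + separator + "_escaped_fragment_=");
  }
  return urls;
}

// Made-up sitemap for illustration
const sampleSitemap = `
<urlset>
  <url><loc>http://jobfoundry.com/</loc></url>
  <url><loc>http://example.com/jobs?page=2</loc></url>
</urlset>`;

console.log(testUrlsFromSitemap(sampleSitemap));
```

Each URL in the output can then be fetched and checked for a title, headings and body copy, the same checks you’d run on any other page.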

As a second point, these principles apply to BackboneJS and EmberJS and could be applied to custom JavaScript frameworks too. Enjoy!

Resources / Further Reading

AngularJS Developer Guide - http://docs.angularjs.org/guide

AngularJS and SEO (Year of Moo) - http://www.yearofmoo.com/2012/11/angularjs-and-seo.html

Prerender.io - https://prerender.io/

Getting Started with Angular - http://lostechies.com/gabrielschenker/2013/12/05/angularjspart-1/

PhantomJS Documentation - http://phantomjs.org/documentation/




22 thoughts on “The Basics of JavaScript Framework SEO in AngularJS”

  1. Ed Fry says:

    Thanks for posting (and loving the mention of jobfoundry here :D).

    Also keen to hear a follow up on Google Analytics implementation on GA. We’ve used this library on a project… alas we’re getting zero data in at the moment.

    Love to hear more about some of the decisions and options with analytics implementation around Angular.js applications.

    Another Angular.js site of note is the new http://www.mixcloud.com/.

  2. Hey Ed! No problem – there’s a few things I’m writing up at the moment – that’s in the list mate. Thanks!

  3. Giuseppe Pastore says:

    Very nice post and very good reminder about escaped fragment URLs. I can remember I had a little headache when I wrote a post about Twitter dupes and at the time they were using hashbangs in URLs. It looks like I really have to dive in History API, anyway. Thanks, Richard.

  4. Simon Glanville says:

    Thanks Richard for this very insightful post. As someone who is keen to further my technical SEO knowledge this is great, wish there were more of these technical SEO posts out there for those of us who are still learning!

  5. YM Ousley says:

    Very interesting and informative post. Can the escaped fragment help with rendering specific elements on a page? For the longest time, I’ve been trying to figure out how to grab information that’s dependent on the selection made in a drop down menu (on JSP pages). In some cases, Google having this information may be a result of feeds, but I’m wondering if this is part of what would make it crawlable. The toughest nut to crack for me has been a site where the dropdown info is contained in an onchange event.

  6. Eli Schwartz says:

    Awesome post, Richard! I spent hours trying to teach myself this stuff for a project I was working on. One thing to note and might be worth adding to the post is that the HTML cache content has to be nearly identical to the JS content or you could have a cloaking issue.

  7. Bruce Werdschinski says:

    Thanks for sharing your insights Richard, some great examples there. We use PhantomJS daily for acceptance test-driven development and it’s been very interesting seeing how SEOgadget has used it for SEO, you’re ahead of the curve once again :)

  8. Anoop Srivastava says:

    Richard, thanks for your post. I am working as an SEO expert and did not know how to handle a JavaScript website or how to make it crawlable. Now I am going to test this and share your post with my development team.

  9. Even André Fiskvik says:

    Hi Richard. Nice article! Pinpoints something that should be a concern for a lot of the webapplications being built nowadays.
    I’d love to hear more about your hatred for the hashbang, seems like you have some bad history/relations with it you’re not telling :D
    We’re using hashbangs now, what would be the most compelling reasons to change it and is there any drawbacks not using them? (I’m guessing you are talking about using the AngularJS router’s HTML5 mode)

  10. Joonas Ruotsalainen says:

    Great article! Will this technique work with other search engines too or just Google and Bing? For example does Baidu or DuckDuckGo understand escaped_fragment?

  11. Thanks a lot Simon, really appreciate the feedback!

  12. Hi YM – the ?_escaped_fragment_= parameter is the standardised (by Google + Bing) request parameter for an HTML snapshot generated by something like PhantomJS. At first glance it sounds like your pre-render snapshots would be better linked directly in a hrefs rather than in inline JS links. I’d submit an XML sitemap with the correct URLs; that might encourage Google to index them.

  13. Indeed, I just find them a bit offensive to the very definition of a URL. Take a look at this for a good view on the issue: http://atendesigngroup.com/blog/hashbang-and-pushstate

  14. Hi Joonas, to the best of my knowledge this is something only Bing and Google support, but I’d be delighted to be wrong. Should be a trivial case of looking in your logs and seeing what the other search engines are requesting.

  15. Mike Simmons says:

    Have you considered serving up a basic version of the content within the noscript block, as described here: http://eviltrout.com/2013/06/19/adding-support-for-search-engines-to-your-javascript-applications.html

    This is how Discourse are doing it with their ember app.

  16. Robert Dunne says:

    Hi Richard – nice introduction.

    I wanted to answer the question about which search bots etc. support this approach. We implement it as a SaaS solution: https://ajaxsnapshots.com and can deduce a lot from our logs.

    We wrote up our full findings as a blog post: http://blog.ajaxsnapshots.com/2013/11/googles-crawlable-ajax-specification.html, but to summarise:

    Google: Yes
    Bing: Yes
    Yandex: Yes
    Facebook (open graph reading bot): Only for hashbang URLs
    Twitter Cards bot: No
    Google+ bot: Yes
    LinkedIn bot: No

    We don’t think Baidu support it, but we now have a significant Chinese customer so we’ll know for sure soon.

    Can’t be sure about DuckDuckGo – not seeing them in our logs though so probably not.

  17. Allie says:

    This is great stuff, thanks for putting it together.

    Have you had any experience with services like BromBone that snapshot the HTML and proxy it to search bots?

  18. Hi @Allie – all of those services are not unlike a CDN, just with an additional rendering layer in-between. In my opinion you’re adding a lot of moving parts that are out of your control, although there are obvious and immediate benefits on cost for using a 3rd party service. If you want full control of the content you’re serving to a search engine then it wouldn’t be a bad idea to build this yourselves.

  19. SEO4Ajax says:

    Hi Richard,

    For your information, there is at least one SEO tool that supports the escaped fragment specification: Clicbot http://www.clicbot.com.

    By the way you can have a look at our service at http://www.seo4ajax.com. It’s free during the beta!

  20. Richard says:

    Hi Richard,

    I am not from a very technical background. I’ll have to learn to understand all this stuff. But reading this post makes me understand how much is left to learn.
