
Either my google-fu has failed me or there really aren't too many people doing this yet. As you know, Backbone.js has an Achilles heel: it cannot serve the HTML it renders to page crawlers such as Googlebot, because they do not run JavaScript (although given that it's Google, with their resources, the V8 engine, and the sobering fact that JavaScript applications are on the rise, I expect this to change someday). I'm aware that Google has a hashbang workaround policy, but it's simply a bad idea. Plus, I'm using pushState. This is an extremely important issue for me and I would expect it to be for others as well. SEO is something that cannot be ignored, and thus Backbone cannot be considered for many applications out there that require or depend on it.

Enter Node.js. I'm only just starting to get into this craze, but it seems possible to have the same Backbone.js app that exists on the client also live on the server, holding hands with Node.js. Node.js would then be able to serve HTML rendered from the Backbone.js app to page crawlers. It seems feasible, but I'm looking for someone more experienced with Node.js, or better yet, someone who has actually done this, to advise me.

What steps do I need to take to use Node.js to serve my Backbone.js app to web crawlers? Also, my Backbone app consumes an API written in Rails, which I think should make this less of a headache.

EDIT: I failed to mention that I already have a production app written in Backbone.js. I'm looking to apply this technique to that app.

asked Sep 16, 2012 at 9:44 by axsuul (edited Sep 16, 2012 at 13:42)
  • Take a look at the talk 'The pipedream of sharing code between Node.js and the browser' by Keith Norman. AFAIK, they are using this technique at Groupon. Video: youtube.com/watch?v=jbn9c_yfuoM More info: spainjs/speakers.html – Sergio Cinos, Oct 8, 2012 at 13:53

6 Answers

First of all, let me add a disclaimer: I think this use of Node.js is a bad idea. Second disclaimer: I've done similar hacks, but only for automated testing, not for crawlers.

With that out of the way, let's go. If you intend to run your client-side app on the server, you'll need to recreate the browser environment there:

  1. Most obviously, you're missing the DOM (Document Object Model): basically the AST on top of your parsed HTML document. The Node.js solution for this is jsdom (see the sketch after this list).

  2. That, however, will not suffice. Your browser also exposes the BOM (Browser Object Model): access to browser features such as history.pushState. This is where it gets tricky. The first option: you can try to bend PhantomJS or CasperJS to run your app and then scrape the HTML off it. It's fragile, since you're running a huge full WebKit browser with the UI parts sawed off.

  3. The other option is Zombie, which is a lightweight re-implementation of browser features in JavaScript. According to its page it supports pushState, but my experience is that the browser emulation is far from complete. Still, give it a try and see how far you get.
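
To make item 1 concrete, here is a minimal sketch using the modern jsdom API; the markup, the #app id, and the URL are illustrative assumptions:

```js
// A minimal sketch using the modern jsdom API (npm install jsdom).
const { JSDOM } = require("jsdom");

const dom = new JSDOM('<!DOCTYPE html><div id="app"></div>', {
  url: "https://example.com/" // gives the document a real location,
                              // which pushState-style routing needs
});

const document = dom.window.document;

// Render into the fake DOM the way your Backbone views would.
document.getElementById("app").innerHTML = "<h1>Hello, crawler</h1>";

// Serialize the whole document back into an HTML string for the crawler.
console.log(dom.serialize());
```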

I'm going to leave it to you to decide whether pushing your rendering engine to the server side is a sound decision.

Because Node.js is built on V8 (Chrome's JavaScript engine), it will run JavaScript, including Backbone.js. Creating your models and so forth would be done in exactly the same way.

The Node.js environment, of course, lacks a DOM, so this is the part you need to recreate. I believe the most popular module is:

https://github.com/tmpvar/jsdom

Once you have an accessible DOM API in Node.js, you simply build its nodes as you would for a typical browser client (perhaps using jQuery) and respond to server requests with the rendered HTML (via $("myDOM").html() or similar); a sketch follows.
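
A sketch of that approach, assuming the jsdom, jquery, and backbone npm packages are installed; the TodoView and its model are hypothetical:

```js
const { JSDOM } = require("jsdom");
const Backbone = require("backbone");

// Build a DOM for this request and bind a jQuery instance to it.
const dom = new JSDOM("<!DOCTYPE html><body></body>");
const $ = require("jquery")(dom.window);

// Backbone's view layer reaches for a global `document` when creating
// elements, so expose the jsdom one, and hand Backbone our jQuery.
global.document = dom.window.document;
Backbone.$ = $;

// A hypothetical view, defined exactly as it would be in the browser.
const TodoView = Backbone.View.extend({
  tagName: "li",
  render: function () {
    this.$el.text(this.model.get("title"));
    return this;
  }
});

const view = new TodoView({
  model: new Backbone.Model({ title: "Render me on the server" })
});

// The markup you would send back to the crawler.
console.log(view.render().$el.prop("outerHTML"));
// => <li>Render me on the server</li>
```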

I believe you can take a fallback-strategy type of approach. Consider what happens when a link is clicked with JavaScript turned off versus turned on. Anything on your page that should be crawlable needs a reasonable fallback when JavaScript is off: your links should always have the server URL as their href, and the default action should be prevented with JavaScript (see the sketch below).
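
A sketch of that pattern; the data-pushstate attribute and the `router` instance are illustrative names:

```js
// Every link keeps a real server URL in its href, so crawlers and no-JS
// browsers can follow it. With JS on, we hijack the click instead.
$(document).on("click", "a[data-pushstate]", function (e) {
  e.preventDefault(); // stop the full page load
  router.navigate($(this).attr("href"), { trigger: true }); // Backbone route
});
```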

I wouldn't say this is necessarily Backbone's responsibility. The only things Backbone can help you with here are modifying your URL when the page changes and letting your models/collections live on both the client and the server. The views and routers, I believe, would be strictly client-side.

What you can do, though, is make your Jade pages and partials renderable from either the client side or the server side, with or without content injected. That way the same page can be rendered either way: if you replace a big chunk of your page and change the URL, the HTML you are grabbing can come from the same template as if someone had navigated directly to that page.

When your server receives a request, it should take you directly to that page, rather than going through the main entry point, loading Backbone, and having it manipulate the page into the state the URL implies.

I think you should be able to achieve this just by rearranging things in your app a bit. No real rewriting, just a good amount of moving things around. You may need to write a controller that serves your HTML files with content either injected or not. This gives your Backbone app the HTML it needs to couple with the data from its models. Like I said, the same templates can be used when you hit those links directly through the routes defined in Express/Node.js (a sketch follows).
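
A sketch of such a controller in Express with Jade as the view engine; the Post data layer and the route and template names are hypothetical:

```js
const express = require("express");
const app = express();

app.set("view engine", "jade"); // the same templates the client can render

// Direct hits (crawlers, bookmarks, first loads) get fully rendered HTML.
app.get("/posts/:id", function (req, res, next) {
  Post.find(req.params.id, function (err, post) { // hypothetical data layer
    if (err) return next(err);
    res.render("post", { post: post }); // content injected server-side
  });
});

// The Backbone app can fetch the same template empty and fill it client-side.
app.get("/templates/post", function (req, res) {
  res.render("post", { post: null }); // no content injected
});

app.listen(3000);
```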

This is on my to-do list of things to do with our app: have Node.js parse the Backbone routes (stored in memory when the app starts) and, at the very least, serve the main page's template as straight HTML. Anything more would probably be too much overhead/processing for the back end when you consider thousands of users hitting your site.

I believe Backbone apps like Airbnb's do it this way as well, but only for robots like the Google crawler. You also need this for things like Facebook likes, since Facebook sends out a crawler to read your og: tags. A sketch of such a crawler gate follows.
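
A sketch of that gate as Express middleware; the user-agent pattern is deliberately short, and renderForBot is a hypothetical helper, not a real API:

```js
// Very rough crawler sniffing; real lists are longer and change over time.
const BOT_RE = /googlebot|bingbot|facebookexternalhit|twitterbot/i;

app.use(function (req, res, next) {
  const ua = req.headers["user-agent"] || "";
  if (BOT_RE.test(ua)) {
    // Hypothetical helper that matches the URL against the Backbone routes
    // and renders the corresponding page template to static HTML.
    return renderForBot(req.path, function (err, html) {
      if (err) return next(err);
      res.send(html); // bots get static HTML
    });
  }
  next(); // everyone else gets the regular single-page app shell
});
```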

A working solution is to use backbone-everywhere (https://github.com/Morriz/backbone-everywhere), but it forces you to use Node as your backend.

Another alternative is to use the same templates on the server and the front end: the front end loads Mustache templates using the require.js text plugin, and the server renders the page using those same Mustache templates.
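
For instance (a sketch; the file names and data are illustrative, and the server side uses the mustache npm package):

```js
// Server side: render the shared template with the mustache npm package.
const fs = require("fs");
const Mustache = require("mustache");

const template = fs.readFileSync("templates/user.mustache", "utf8");
const html = Mustache.render(template, { name: "axsuul" });

// Client side: the very same file, loaded via the require.js text plugin:
//
//   define(["mustache", "text!templates/user.mustache"],
//     function (Mustache, template) {
//       return function (user) { return Mustache.render(template, user); };
//     });
```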

Another addition is to render bootstrapped model data as JSON inside a script tag, to be consumed immediately by Backbone to populate its models and collections (sketched below).
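
A sketch of both halves; the `posts` data and the App.Posts collection are hypothetical:

```js
// Server side: inline the bootstrap data into the page. In real code,
// escape "</script>" inside the JSON to avoid breaking out of the tag.
const bootstrapTag =
  "<script>window.bootstrap = " +
  JSON.stringify({ posts: posts }) + // `posts` would come from the Rails API
  ";</script>";

// Client side: on startup, populate collections without an initial fetch.
// App.Posts is a hypothetical Backbone collection.
var postCollection = new App.Posts();
postCollection.reset(window.bootstrap.posts);
```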

Basically you need to decide what it is that you're serving: is it a true app (i.e. something that could stand in as a replacement for a dedicated desktop application), or is it a presentation of content (i.e. classical "web page")? If you're concerned about SEO, it's likely that it's actually the latter ("content site") and in that case the "single-page app" model isn't appropriate; you really want the "progressively enhanced website" model instead (look up such phrases as "unobtrusive JavaScript", "progressive enhancement" and "adaptive Web design").

To amplify a little, "server sends only serialized data and client does all rendering" is only appropriate in the "true app" scenario. For the "content site" scenario, the appropriate model is "server does main rendering, client makes it look better and does some small-scale rendering to avoid disruptive page transitions when possible".

And, by the way, the objection that progressive enhancement means "making sure that a sighted user doesn't get anything better than a blind user using text-to-speech" is an expression of political resentment, not reality. Progressively enhanced sites can be as fancy as you want from the perspective of a user with a high-end rendering system.
