Wednesday, January 7, 2009

Small addition to John Resig's Bringing the Browser to the Server

After reading John Resig's article, which was written over a year ago, I wanted to try out some of the things he demonstrated.
By combining Mozilla's Rhino project with a few JavaScript files, John was able to do
a number of useful things, like automated JavaScript testing and web screen scraping.
He also outlined some pseudocode for creating a web app environment.
The one thing he ran out of time for was integrating an HTML parser into this setup.
At his suggestion, I have integrated the NekoHTML HTML parser into it.
To use this setup, follow these steps:

1. Get the jQuery source code via SVN. Important: get REVISION 2302.
2. Get Rhino (source code optional), site:
3. Get the NekoHTML source code, site:
4. Edit /jquery/jquery/build/runtest/env.js (starting at line 135)
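    window.DOMDocument = function(file){
        this._file = file;
        // Original XML-only parsing, replaced by the NekoHTML parser below:
        //this._dom = Packages.javax.xml.parsers.
        //    DocumentBuilderFactory.newInstance()
        //        .newDocumentBuilder().parse(file);

        var parser = new;
        var source = new,null,null,file,"UTF8");
        // Parse the input before asking the parser for the resulting document.
        parser.parse(source);
        this._dom = parser.getDocument();

        // Register this wrapper for the Java document node if it is not cached yet.
        if ( !obj_nodes.containsKey( this._dom ) )
            obj_nodes.put( this._dom, this );
    };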

5. Run your JavaScript file via the Rhino Debugger App from the command line
(the ";" classpath separator is Windows-style; on Unix-like systems use ":"):
java -cp build/nekohtml.jar;build/nekohtmlXni.jar;build/xml-apis.jar;build/xercesImpl.jar;build/xercesSamples.jar;build/js.jar filename.js

Note: Some further updates to env.js will be necessary to get the HTML parser running outside the Rhino Debugger App.
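
The parser calls in step 4 can be sketched as a small standalone helper. This is only a sketch under assumptions: the Java class names used here (NekoHTML's org.cyberneko.html.parsers.DOMParser and Xerces' org.apache.xerces.xni.parser.XMLInputSource) are my guess at what the edit relies on, and the helper name parseHtmlStream and its pkgs parameter are hypothetical, introduced so the function can also be exercised outside Rhino.

```javascript
// Sketch only: the class names below are assumptions, not confirmed by the post.
// `pkgs` is expected to be Rhino's global `Packages` object (the gateway to
// Java classes); it is passed in as a parameter so the helper stays testable.
function parseHtmlStream(stream, pkgs) {
    var DOMParser = pkgs.org.cyberneko.html.parsers.DOMParser;
    var XMLInputSource = pkgs.org.apache.xerces.xni.parser.XMLInputSource;

    var parser = new DOMParser();

    // publicId, systemId and baseSystemId are left as null for a raw stream;
    // "UTF8" is the encoding name used in the step 4 snippet.
    var source = new XMLInputSource(null, null, null, stream, "UTF8");

    // The document is only available after the input has been parsed.
    parser.parse(source);
    return parser.getDocument();
}
```

Inside DOMDocument this would probably be used as `this._dom = parseHtmlStream(file, Packages);`, assuming `file` is a java.io.InputStream and the NekoHTML and Xerces jars are on the classpath.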