[Project_owners] evaluate html document by XMLHttpRequest

Tommi Rautava tommi.rautava at iki.fi
Wed Aug 8 00:02:02 PDT 2007


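The problem is that you cannot parse HTML with an XML parser. An HTML 4.0
DOCTYPE with a public identifier but no system identifier is not
well-formed XML, which is why the parser stops with a syntax error at
line 1, column 62 (the closing ">" of the DOCTYPE). I struggled with the
same problem some time ago, and the best solution I have found so far is
the HTML parser example by Aaron Boodman (the original author of
Greasemonkey):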
http://youngpup.net/userscripts/htmlparserexample.user.js
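
For illustration only, here is a minimal sketch of the general idea:
hand the response text to an HTML parser instead of an XML parser, then
run the XPath query on the resulting document. It assumes a browser
whose DOMParser accepts "text/html", which may not be true of the
browser you are targeting, and it is not taken from the script linked
above. The function names splitNickAndId and extractRoundInfo are
hypothetical.

```javascript
// Split a cell text such as "Axe Decapitators       #130303"
// into a nick and an id. (Hypothetical helper name.)
function splitNickAndId(text) {
  var parts = text.trim().split("#");
  return {
    heroesnick: parts[0].trim(),
    heroesid: parts.length > 1 ? parts[1].trim() : null
  };
}

// Parse the fetched text as HTML and collect one result per table cell.
// Assumption: DOMParser supports the "text/html" type in this browser.
function extractRoundInfo(htmlText) {
  var doc = new DOMParser().parseFromString(htmlText, "text/html");

  // evaluate() is a method of the Document, so it is called on doc
  // itself rather than on doc.documentElement.
  var cells = doc.evaluate("//table/tbody/tr/td", doc, null,
      XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);

  var results = [];
  for (var k = 0; k < cells.snapshotLength; k++) {
    results.push(splitNickAndId(cells.snapshotItem(k).textContent));
  }
  return results;
}
```

Inside the onreadystatechange handler this would be called as
extractRoundInfo(req.responseText). The HTML parser adds the implied
tbody element, so the '//table/tbody/tr/td' path from the original
script should still match even though the markup has no tbody tag.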

br, Tom

2007/8/7, Foreningen Selvet - Jesper Staun Hansen <jesper at selvet.dk>:
> Hello.
>
>
> I am having some trouble using element.evaluate(properties) with
> documents fetched with XMLHttpRequest, so I am looking for a method to
> do the same as this:
>
> //*****
> var roundinfo = {
>     runde : "56",
>     heroesid : null,
>     heroesnick : null
> }
>
>         var path = '//table/tbody/tr/td';
>         var obj = document.evaluate(path,document,null,
>             XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
>         for (var k = 0; k < obj.snapshotLength; k++) {
>             obj2 = obj.snapshotItem(k).parentNode;
>             var nodes = obj2.childNodes;
>
>             dump("Found keys: "+nodes.length+"\n");
>             var menu = nodes[0].textContent.trim().split("#");
>             roundinfo.heroesnick = menu[0].trim();
>             roundinfo.heroesid = menu[1].trim();
>             dump("Nick is: "+roundinfo.heroesnick+"\n");
>             dump("ID is: "+roundinfo.heroesid+"\n\n");
>         }
> //*****
>
> And the document contains this:
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
> <html>
> <head>
> <title></title>
> </head>
> <body>
> <table>
>         <tr>
>                 <td colspan="3"><font face="arial" size="1" color="DarkTurquoise">Axe Decapitators       #130303</font></td>
>         </tr>
>
>
> </table>
> </body>
> </html>
>
> And the dump would be
>
> Nick is: Axe Decapitators
> ID is: 130303
>
> And I tried some things to make it with XMLHttpRequest:
> //******* Attempt:
>             var req = new XMLHttpRequest();
>             req.overrideMimeType('text/html');
>             req.open('GET',
> 'http://www.URL_TO_FETCH_CONTAINING_THE_ABOVE_CODE.dk', true);
>             req.onreadystatechange = function () {
>               if (req.readyState == 4) {
>                  if(req.status != 404) {
>                 var path = '//table/tbody/tr/td/font';
>                 var domParser = new DOMParser();
>                 var dom = domParser.parseFromString(req.responseText,
> "text/xml").documentElement;
>
>                 var obj =
> dom.evaluate(path,dom,null,XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
>                 for (var k = 0; k < obj.snapshotLength; k++) {
>                     obj2 = obj.snapshotItem(k).parentNode;
>                     var nodes = obj2.childNodes;
>
>                     dump("Found keys: "+nodes.length+"\n");
>                     var menu = nodes[0].textContent.trim().split("#");
>                     roundinfo.heroesnick = menu[0].trim();
>                     roundinfo.heroesid = menu[1].trim();
>                     dump("Nick is: "+roundinfo.heroesnick+"\n");
>                     dump("ID is: "+roundinfo.heroesid+"\n\n");
>                 }
>                  }
>                  else
>                   dump("Error loading page with status "+ req.status +"\n");
>               }
>             };
>             req.send(null);
>
>
> But this only results in this:
> Error: syntax error
> Source File: chrome://browser/content/browser.xul
> Line: 1, Column: 62
> Source Code:
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
> -------------------------------------------------------------^
>
> Any tips on how to do this "right"?
>
>
>
> My regards.