[Project_owners] evaluate html document by XMLHttpRequest
Tommi Rautava
tommi.rautava at iki.fi
Wed Aug 8 00:02:02 PDT 2007
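The problem is that you cannot parse HTML with an XML parser. The HTML
4.0 doctype has a public identifier but no system identifier, which is
not well-formed XML, so the parser stops with a syntax error at the end
of line 1. I struggled with the same problem some time ago, and the best
solution I have found so far is the HTML parser example made by Aaron
Boodman (the original author of Greasemonkey):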
http://youngpup.net/userscripts/htmlparserexample.user.js
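
For illustration only, here is a minimal sketch of the same extraction
done with an HTML-aware parser instead of an XML one. It is not the
code from the script linked above. It assumes a parser object whose
parseFromString() accepts "text/html" (a DOMParser does in later
browsers, but Firefox did not offer that in 2007); the function names
splitNickAndId and extractHeroes, and the XPath used, are made up for
this example.

```javascript
// Split a cell text such as "Axe Decapitators #130303" into nick and id.
function splitNickAndId(text) {
  var parts = text.trim().split("#");
  return {
    heroesnick: parts[0].trim(),
    // Assumption: a missing "#" yields an empty id instead of an error.
    heroesid: (parts[1] || "").trim()
  };
}

// Parse HTML text and collect one {heroesnick, heroesid} object per match.
// `parser` is any object with parseFromString(text, mimeType); in a
// browser that supports it, pass `new DOMParser()`.
function extractHeroes(htmlText, parser) {
  var doc = parser.parseFromString(htmlText, "text/html");
  // evaluate() is a method of the document, so keep the document itself
  // rather than its documentElement.
  var found = doc.evaluate("//table//td/font", doc, null,
      7 /* XPathResult.ORDERED_NODE_SNAPSHOT_TYPE */, null);
  var heroes = [];
  for (var k = 0; k < found.snapshotLength; k++) {
    heroes.push(splitNickAndId(found.snapshotItem(k).textContent));
  }
  return heroes;
}
```

In a supporting browser this would be called from the readystatechange
handler as extractHeroes(req.responseText, new DOMParser()). Passing
the parser in as an argument keeps the splitting logic easy to check
outside the browser.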
br, Tom
2007/8/7, Foreningen Selvet - Jesper Staun Hansen <jesper at selvet.dk>:
> Hello.
>
>
> I am having some trouble using element.evaluate(properties) with
> documents fetched with XMLHttpRequest, so I am looking for a method to
> do the same as this:
>
> //*****
> var roundinfo = {
> runde : "56",
> heroesid : null,
> heroesnick : null
> }
>
> var path = '//table/tbody/tr/td';
> var obj = document.evaluate(path,document,null,
> XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
> for (var k = 0; k < obj.snapshotLength; k++) {
> obj2 = obj.snapshotItem(k).parentNode;
> var nodes = obj2.childNodes;
>
> dump("Found keys: "+nodes.length+"\n");
> var menu = nodes[0].textContent.trim().split("#");
> roundinfo.heroesnick = menu[0].trim();
> roundinfo.heroesid = menu[1].trim();
> dump("Nick is: "+roundinfo.heroesnick+"\n");
> dump("ID is: "+roundinfo.heroesid+"\n\n");
> }
> //*****
>
> And the document contains this:
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
> <html>
> <head>
> <title></title>
> </head>
> <body>
> <table>
> <tr>
> <td colspan="3"><font face="arial" size="1" color="DarkTurquoise">Axe Decapitators #130303</font></td>
> </tr>
>
>
> </table>
> </body>
> </html>
>
> And the dump would be
>
> Nick is: Axe Decapitators
> ID is: 130303
>
> And I tried some things to make it with XMLHttpRequest:
> //******* Attempt:
> var req = new XMLHttpRequest();
> req.overrideMimeType('text/html');
> req.open('GET',
> 'http://www.URL_TO_FETCH_CONTAINING_THE_ABOVE_CODE.dk', true);
> req.onreadystatechange = function () {
> if (req.readyState == 4) {
> if(req.status != 404) {
> var path = '//table/tbody/tr/td/font';
> var domParser = new DOMParser();
> var dom = domParser.parseFromString(req.responseText,
> "text/xml").documentElement;
>
> var obj =
> dom.evaluate(path,dom,null,XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
> for (var k = 0; k < obj.snapshotLength; k++) {
> obj2 = obj.snapshotItem(k).parentNode;
> var nodes = obj2.childNodes;
>
> dump("Found keys: "+nodes.length+"\n");
> var menu = nodes[0].textContent.trim().split("#");
> roundinfo.heroesnick = menu[0].trim();
> roundinfo.heroesid = menu[1].trim();
> dump("Nick is: "+roundinfo.heroesnick+"\n");
> dump("ID is: "+roundinfo.heroesid+"\n\n");
> }
> }
> else
> dump("Error loading page with status "+ req.status +"\n");
> }
> };
> req.send(null);
>
>
> But this only results in this:
> Error: syntax error
> Source File: chrome://browser/content/browser.xul
> Line: 1, Column: 62
> Source Code:
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
> -------------------------------------------------------------^
>
> Any tips on how to do this "right"?
>
>
>
> My regards.