Parsing javascript on HTML pages retrieved with XHR

§
Posted: 2023-05-30
Edited: 2023-05-30

Hello

Is it possible to parse javascript on HTML pages retrieved with XHR?

Is there a Javascript library that parses Javascript?

I know it sounds strange but this is an actual use case.

The alternative is to load the HTML page inside an iframe, fetch the data from it, and then remove the iframe. I would rather not do that, because I want to avoid executing any JavaScript that does not come from the userscript itself.

Thank you

§
Posted: 2023-05-30

ChatGPT says (cause I am lazy to type)

Here is an example of how you can parse javascript on an HTML page retrieved with XHR using the `DOMParser` object:

```javascript
var xhr = new XMLHttpRequest();
xhr.open('GET', 'http://www.example.com/test.html', true);
xhr.onreadystatechange = function() {
    // readyState 4 means the request has finished
    if (xhr.readyState === 4) {
        // Parse the response into a document; scripts inside it are not executed
        var dom = new DOMParser().parseFromString(xhr.responseText, 'text/html');
        console.log(dom.querySelector('#someElementId').innerText);
    }
};
xhr.send();
```

In this example, an XHR request is made to retrieve an HTML page from `www.example.com/test.html`. Once the request is complete and the response is received, the `DOMParser` object is used to parse the response text into an HTML document. The `querySelector` method is then used to access an element with the id `someElementId` within the parsed HTML and log its inner text to the console.


You should try using it more often

§
Posted: 2023-06-14
Edited: 2023-06-14

I actually did so. See line 2199.

contentReady.then(
  function(request) {
    const domParser = new DOMParser();
    const rawDocument = domParser.parseFromString(request.responseText, 'text/html');

I might need to add scroll events.

Here is a test page (I will delete this link immediately, so don't delete your message before you copy it) [LINK DELETED]

Do not expect page JavaScript to run when you use DOM parsing. It simply does not work. Extract the information you need from the parsed DOM and then process it with your own code. Scripts written into the page are not reliable in this setting: their results will not match what you get when the page runs in your browser.
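
As a rough illustration of "extract the information and run your own code": if the data you need sits in an inline script as a JSON literal, you can read the script's source text and parse it yourself, without ever executing the page script. This is only a sketch. `extractJsonAssignment` and `window.pageData` are made-up names, and it only works when the assigned value is strict JSON (double-quoted keys and strings).

```javascript
// Hypothetical helper: pull a JSON object/array literal that is assigned to
// `variableName` out of script source text, without executing the script.
function extractJsonAssignment(scriptText, variableName) {
  // Escape the name so it is matched literally inside the regular expression.
  const escaped = variableName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const match = new RegExp(escaped + '\\s*=\\s*(?=[{\\[])').exec(scriptText);
  if (!match) return null;

  const start = match.index + match[0].length; // index of the opening { or [
  let depth = 0;
  let inString = false;
  for (let i = start; i < scriptText.length; i++) {
    const ch = scriptText[i];
    if (inString) {
      if (ch === '\\') i++;                  // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === '{' || ch === '[') {
      depth++;
    } else if (ch === '}' || ch === ']') {
      depth--;
      if (depth === 0) {
        try {
          return JSON.parse(scriptText.slice(start, i + 1));
        } catch (e) {
          return null; // not strict JSON (single quotes, unquoted keys, ...)
        }
      }
    }
  }
  return null; // literal never closed
}

// Assumed usage in the userscript, where rawDocument comes from DOMParser:
// for (const s of rawDocument.querySelectorAll('script')) {
//   const data = extractJsonAssignment(s.textContent, 'window.pageData');
//   if (data) console.log(data);
// }
```

If the page builds its data with real logic instead of a plain literal, this approach will not be enough and you are back to the iframe route.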

You can run pure JavaScript (not mixed with HTML) using eval(...) or (new Function(...))(), but neither is recommended, as the script might crash your browser.
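
For completeness, here is roughly what the `new Function(...)` route looks like. `runSnippet` is a hypothetical name, not something from the script above. Keep in mind that the snippet still runs with full access to globals, that a `try`/`catch` cannot stop an infinite loop, and that a page's Content Security Policy may block string evaluation altogether.

```javascript
// Hypothetical helper: run a string of plain JavaScript and report the outcome.
function runSnippet(code) {
  try {
    // new Function compiles the string as a function body in the global scope,
    // so the snippet cannot see the caller's local variables (unlike direct eval).
    const fn = new Function('"use strict";\n' + code);
    return { ok: true, value: fn() };
  } catch (error) {
    // Catches syntax errors at compile time and exceptions thrown at run time.
    return { ok: false, error: error };
  }
}

const result = runSnippet('return 6 * 7;');
console.log(result.ok, result.value);
```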

Thank you CY Fung.

Then I'm in a challenge here.

The reason I use XHR is to circumvent the page scripts, and it works great most of the time, resulting in no unsolicited popups in the viewport.

However, in some cases I do need the page's own scripts in order to get the full page. Otherwise, I'll probably begin to probe the servers and send XHR requests from the userscript itself.

You can use an embedded iframe (visually hidden in the page) to get the full page with JavaScript execution.

Execute the userscript in both the iframe and the top window.

Inside the iframe you can control the page and grab the info.

Use top.postMessage to send what you want back to the main window.

// ==UserScript==
// @name         Userscript Communication Demo
// @match        *://*/*
// ==/UserScript==

// Check if the script is running in the top window
if (window.self === window.top) {
    // Execute code in the top window
    console.log("Running in the top window");

    // Listen for messages from the iframe and reply to the sender
    window.addEventListener('message', function(event) {
        if (event.data === 'Hello from iframe') {
            console.log("Received message from iframe:", event.data);
            event.source.postMessage('Hello from top window', event.origin);
        }
    });
} else {
    // Execute code in the iframe
    console.log("Running in an iframe");

    // Send a message to the top window
    window.parent.postMessage('Hello from iframe', '*');

    // Receive message from the top window
    window.addEventListener('message', function(event) {
        if (event.data === 'Hello from top window') {
            console.log("Received message from top window:", event.data);
        }
    });
}
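
Since `@match *://*/*` runs the script on every frame and the demo posts to `'*'`, it is probably worth checking where a message came from before trusting it. A minimal sketch, assuming the iframe sends objects shaped like `{ type: 'page-data', payload: ... }`; the allow-list, the message shape and `readTrustedMessage` are all hypothetical:

```javascript
// Hypothetical allow-list: replace with the origin of the page loaded in the iframe.
const ALLOWED_ORIGINS = ['https://www.example.com'];

// Returns the payload of a message we trust, or null for anything else.
function readTrustedMessage(event, allowedOrigins = ALLOWED_ORIGINS) {
  // Ignore messages from origins we did not expect.
  if (!allowedOrigins.includes(event.origin)) return null;

  // Ignore messages that do not have the shape we agreed on.
  const data = event.data;
  if (data === null || typeof data !== 'object' || data.type !== 'page-data') {
    return null;
  }
  return data.payload;
}

// Assumed usage in the top window:
// window.addEventListener('message', function(event) {
//   const payload = readTrustedMessage(event);
//   if (payload !== null) console.log('Data from iframe:', payload);
// });
```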
§
Posted: 2023-07-07

You should use markdown. @hacker09
