My usual REPLACE code is not working on Reverso.net

senderista

Posted: 2019-02-10

Howdy,

Usually I have no problem replacing text on pages, but for some reason I am not getting it to work on Reverso.net. To demonstrate the problem, here is an absolutely barebones script:

// ==UserScript==
// @name        Reverso Debug
// @namespace   reverso
// @description Demonstrates problem with Reverso page
// @include       https://context.reverso.net/translation/spanish-english/cachorro
// @include       https://en.oxforddictionaries.com/
// @version     1
// @grant       none
// @run-at      document-idle
// ==/UserScript==

(function() {
    document.body.innerHTML= document.body.innerHTML.replace(
        new RegExp('DICTIONARY', 'g'),
        '<b>=== SUCCESS! ===</b>'  
        )
})()

As you can see:

It works on https://en.oxforddictionaries.com/ (the word 'DICTIONARY' at the top left is replaced with 'SUCCESS').
It fails on https://context.reverso.net/translation/spanish-english/cachorro (the word 'DICTIONARY' in the blue bar is not replaced).

I am using GM 3.9, which is the highest version supported by my browser. I would be hugely grateful for any help, as I am completely stuck.

In advance, big thanks!

woxxom (Moderator)

Posted: 2019-02-10

As you can see in devtools, the exact text is Dictionary, but your regexp is case-sensitive. Use 'gi' instead of 'g'.
Replacing innerHTML destroys all dynamically attached event listeners, which will break functionality on modern sites. The proper method is to enumerate the text DOM nodes and replace each one individually via the TreeWalker API: example1, example2.
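To illustrate the first point, here is a minimal sketch that runs the same replacement with and without the `i` flag. The sample string is made up for illustration and is not copied from the Reverso page.

```javascript
// Hypothetical sample text; on the real page the word appears as "Dictionary".
const sample = 'Dictionary and more';

// Case-sensitive: 'DICTIONARY' does not match 'Dictionary', so nothing changes.
const caseSensitive = sample.replace(new RegExp('DICTIONARY', 'g'), 'SUCCESS');

// Case-insensitive: the 'i' flag lets the pattern match any capitalization.
const caseInsensitive = sample.replace(new RegExp('DICTIONARY', 'gi'), 'SUCCESS');

console.log(caseSensitive);
console.log(caseInsensitive);
```

Only the second call changes the string, which matches the behavior seen on the Reverso page.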

senderista

Posted: 2019-02-10

Thank you, @wOxxOm. That sounds a lot more complex than it used to be! I will have a look at the examples you sent.

senderista

Posted: 2019-02-10

Hello again @wOxxOm,

Hope your day is going well. I spent the last few hours trying to experiment with TreeWalker. I see how it lets us modify the page one element at a time.

With a basic replace example, it is running. But when I start to beef up the script, I must be making a basic mistake with the TreeWalker constructor, and at this stage I have hit a wall. I wonder if you would be willing to take a quick look at this excerpt in case something jumps out at you?

There are two alert statements here: the first works (after defining the filter function), the second doesn't (after trying to create the TreeWalker).

This excerpt starts at the stage where we've figured out that YES we want to modify that page (the do_replace boolean) and we have defined the adequate parse_pattern and replace_pattern.

// WE'VE DONE A LOT OF SETUP AND NOW WE'RE READY TO
// CHANGE THE PAGE

if (do_replace) {
    // Before creating the TreeWalker, we create a Filter function to define what we're interested in
    // In this script we're only interested in h1 and a tags

    myfilter = function (node) {
        const allowable_elements = ["h1", "a"]

        if (allowable_elements.includes(node.tagName.toLowerCase())) {
            return NodeFilter.FILTER_ACCEPT
        }
        else {
            return NodeFilter.FILTER_SKIP
        }
    }
    alert('CREATED FILTER') // WORKS

    // INITIATE THE TREE WALKER
    const walker = document.createTreeWalker(
        document.body,  // root node
        NodeFilter.SHOW_ELEMENT,
        myfilter,
        false
    )
    alert('CREATED WALKER') // DOES NOT WORK

    while (walker.nextNode()) {
        walker.currentNode.innerHTML = walker.currentNode.innerHTML.replace(
            parse_pattern, // DEFINED OUTSIDE OF THIS EXCERPT
            replace_pattern // DEFINED OUTSIDE OF THIS EXCERPT
        )
    } // END of walking

} // END do_replace

Thank you in advance for any thoughts you can spare!

woxxom (Moderator)

Posted: 2019-02-10

Edited: 2019-02-10

Don't replace in innerHTML directly.
Normal elements may have child elements, so in your case you should enumerate the elements using the standard getElementsByTagName.
Then use a TreeWalker to enumerate the text nodes inside each one:

// Enumerate every element and keep only <a> and <h1>
for (const el of document.getElementsByTagName('*')) {
  if (el.tagName !== 'A' &&
      el.tagName !== 'H1') {
    continue;
  }
  // Walk only the text nodes inside this element
  const walker = document.createTreeWalker(el, NodeFilter.SHOW_TEXT);
  while (walker.nextNode()) {
    const text = walker.currentNode.nodeValue;
    const newText = text.replace(parse_pattern, replace_pattern);
    // Write back only when something actually changed
    if (text !== newText) {
      walker.currentNode.nodeValue = newText;
    }
  }
}

If you need to replace with HTML (that is, to create a tag or change one), it's better to rework the while loop and use direct DOM manipulation instead, which is something you can pursue yourself. A simplified and slower version of the loop for HTML is shown below:

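  // Advance the walker before replacing a node: once the current node is
  // detached from the document, the walker cannot continue from it.
  let node = walker.nextNode();
  while (node) {
    const next = walker.nextNode();
    const text = node.nodeValue;
    const newText = text.replace(parse_pattern, replace_pattern);
    if (text !== newText) {
      if (newText.includes('<') && /<\w+[^>]*>/.test(newText)) {
        // The replacement contains a tag: parse it inside a wrapper <span>
        const newNode = document.createElement('span');
        newNode.innerHTML = newText;
        node.parentNode.replaceChild(newNode, node);
      } else {
        node.nodeValue = newText;
      }
    }
    node = next;
  }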
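
As a rough sketch of what "direct DOM manipulation" could look like, the snippet below wraps each match in a `<b>` element built with `createElement` instead of parsing an HTML string. The helper names `splitByPattern` and `highlightTextNode`, the `doc` parameter, and the choice of `<b>` are assumptions made for illustration; they do not come from the thread or from any library.

```javascript
// Hypothetical helper: split text into plain and matched segments.
function splitByPattern(text, pattern) {
  // Copy the pattern with the 'g' flag so exec() advances through the text.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const re = new RegExp(pattern.source, flags);
  const segments = [];
  let lastIndex = 0;
  let match;
  while ((match = re.exec(text)) !== null) {
    if (match[0] === '') { re.lastIndex++; continue; } // skip empty matches
    if (match.index > lastIndex) {
      segments.push({ text: text.slice(lastIndex, match.index), matched: false });
    }
    segments.push({ text: match[0], matched: true });
    lastIndex = match.index + match[0].length;
  }
  if (lastIndex < text.length) {
    segments.push({ text: text.slice(lastIndex), matched: false });
  }
  return segments;
}

// Hypothetical helper: replace one text node with text nodes and <b> elements.
// `doc` is the page's document. Returns true if the node was replaced.
function highlightTextNode(node, pattern, doc) {
  const segments = splitByPattern(node.nodeValue, pattern);
  if (!segments.some(segment => segment.matched)) {
    return false;
  }
  const fragment = doc.createDocumentFragment();
  for (const segment of segments) {
    if (segment.matched) {
      const bold = doc.createElement('b');
      bold.textContent = segment.text; // inserted as text, never parsed as HTML
      fragment.appendChild(bold);
    } else {
      fragment.appendChild(doc.createTextNode(segment.text));
    }
  }
  node.parentNode.replaceChild(fragment, node);
  return true;
}
```

One way to use it is to collect the text nodes into an array first and only then call `highlightTextNode(node, /dictionary/gi, document)` on each, so that replacing nodes does not disturb the traversal.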

senderista

Posted: 2019-02-11

@wOxxOm

Thank you for explaining all this. Alright, now understanding loud and clear that in dynamic pages I shouldn't replace innerHTML directly... Thank you for insisting.

The script I was working on does add a lot of tags. Because I still had a lot to do on it, I finished it in the usual way and shared it here on GreasyFork:

Reverso Spanish Enhancer script

But as soon as I have time I'd like to dive into what you are talking about, and make that my standard way of working going forward. I think once I have one script working in that way, it will be easier, but the first one is always the hardest. :-)

Thank you for your extremely kind and careful help, which I really appreciated. Wishing you a great week!

senderista

Posted: 2019-02-16

@wOxxOm Hope you've had a great week.

I know how annoying it can be to spend time giving great advice to people who don't follow it... So I just want to let you know that I fully rewrote my Reverso Enhancer Spanish to use direct DOM manipulation instead of innerHTML.replace().

It was a steep learning curve and makes the script quite a bit longer but it was worth it. Of course that's just the beginning — I realize that I have everything to learn as I never sat down to learn JS.

Big thanks for your encouragement and support!!! Wishing you a fun weekend,

-R
