How to change exact words

§
Posted: 2017-01-10

I've modified the "Replace Text On Webpages" script into a "Japanese DLSite translation" script so that it works here, for example.
But I don't understand JavaScript (or any code, for that matter), so I don't know how it works or what to write to make it do what I want. Code like...
'/\\b年\\b/g' : '/',
'/\\b月\\b/g' : '/',
'/\\b日/g' : '',

...only changes how dates are displayed, from 2014年03月15日 to 2014/03/15, and doesn't touch the same characters elsewhere. At the same time, it won't change "検索条件" to "Search for" if I use:
'/\\b検索条件/g' : 'Search for',
Without \\b the script obviously changes everything that matches, whereas I need it to change, for example, 検索条件 but not 保存した検索条件, i.e. only words that appear on their own in the HTML, like 検索条件.

wOxxOmMod
§
Posted: 2017-01-10

Maybe that element is added dynamically after the page was loaded and processed by your userscript. You can test whether this is the case by delaying the processing: replace (function () { at the beginning with setTimeout(function() { and }()); at the end with }, 1000); to delay for 1000ms (1 second). Or maybe increase the delay. If this is the case, you'll be better off using MutationObserver, I'll show you how.
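
For reference, a MutationObserver-based version could look roughly like the following. This is a minimal sketch, not the code of the original userscript: the `replacements` dictionary and the helpers `translate`, `translateTextNode` and `translateSubtree` are hypothetical names introduced only for illustration, and the replacement here is a plain substring swap.

```javascript
// Hypothetical dictionary for illustration; not the original script's word list.
var replacements = { '検索条件': 'Search for' };

// Plain substring replacement for every dictionary entry.
function translate(text) {
    return Object.keys(replacements).reduce(function (result, word) {
        return result.split(word).join(replacements[word]);
    }, text);
}

// Translate one text node in place, writing only when something changed.
function translateTextNode(node) {
    var translated = translate(node.nodeValue);
    if (translated !== node.nodeValue) {
        node.nodeValue = translated;
    }
}

// Translate an added node: either a text node itself or every text node below an element.
function translateSubtree(root) {
    if (root.nodeType === 3) {            // 3 === Node.TEXT_NODE
        translateTextNode(root);
    } else if (root.nodeType === 1) {     // 1 === Node.ELEMENT_NODE
        var walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
        var node;
        while ((node = walker.nextNode())) {
            translateTextNode(node);
        }
    }
}

// Only runs in a browser; the guard keeps the sketch loadable elsewhere.
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
    translateSubtree(document.documentElement);   // what is already on the page
    new MutationObserver(function (mutations) {
        mutations.forEach(function (mutation) {
            Array.prototype.forEach.call(mutation.addedNodes, translateSubtree);
        });
    }).observe(document.documentElement, { childList: true, subtree: true });
}
```

The observer listens only for added nodes (`childList`), so the script's own `nodeValue` writes do not trigger it again.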

§
Posted: 2017-01-10

No. The delay happens (I even set it to 5 seconds), but neither '/\\b検索条件/g' nor '/\\b検索条件\\b/g' works (see pic. 1), while plain '検索条件' works as usual (see pic. 2).

wOxxOmMod
§
Posted: 2017-01-10

JavaScript's \b doesn't help with Japanese text: it is defined in terms of \w, which covers only ASCII letters, digits and underscore. Try '/(?:^|\\W)検索条件(?:\\W|$)/g'
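
The behaviour seen so far can be reproduced with plain strings. The sketch below shows why the date patterns work (the kanji sit between ASCII digits) while the same \b fails in front of 検索条件. It also shows a caveat of \W: it matches kana and kanji as well, so it cannot tell a separator from a preceding Japanese character. The whitespace-only pattern at the end is one possible alternative, not part of the original script.

```javascript
// \b is a boundary between a \w character ([A-Za-z0-9_]) and anything else.
// Kanji are not \w, so a boundary exists next to a digit but not next to a space.
var date = '2014年03月15日'
    .replace(/\b年\b/g, '/')
    .replace(/\b月\b/g, '/')
    .replace(/\b日/g, '');
// date === '2014/03/15'

var afterSpace = /\b検索条件/.test(' 検索条件');   // false: space and 検 are both non-\w
var afterDigit = /\b検索条件/.test('1検索条件');   // true: the digit is \w, 検 is not

// \W also matches た, so the longer phrase is still matched.
var insideLongerPhrase = /(?:^|\W)検索条件(?:\W|$)/.test('保存した検索条件'); // true

// Treating only whitespace (or the string ends) as a boundary avoids that.
var whitespaceBounded = /(?:^|\s)検索条件(?:\s|$)/;
var matchesLonger = whitespaceBounded.test('保存した検索条件'); // false
var matchesAlone = whitespaceBounded.test('検索条件');           // true
```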

§
Posted: 2017-01-10
Edited: 2017-01-10

Almost! I don't have the slightest idea what it's doing, but '/(?:^|\\s)FORK/g' works if there's nothing after the word, i.e. it seems to leave "pitchfork" (but not "forkhead") intact, and '/(|\\s)SPOON\\s/g' works if there's something before and after the word, leaving "spoonfeed" and "teaspoon" intact.
The problem with "spoon" is that \\s eats the space after it (and probably numbers and non-Japanese letters too), i.e. "spoon 100" becomes "spoon100".
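
The space disappears because everything the pattern matches, including the \s, is replaced. One way around it is sketched below: capture the prefix and put it back with $1, and check the suffix with a lookahead so it is not consumed. This assumes the userscript passes its replacement strings to String.prototype.replace, so that $1 is honoured. That is an assumption about "Replace Text On Webpages", not something confirmed in this thread. Note also that \s matches whitespace only, not digits or letters.

```javascript
var text = 'spoon 100, teaspoon, spoonfeed';

// Consuming the trailing \s: the space becomes part of the match and is thrown away.
var eaten = text.replace(/(^|\s)spoon\s/g, 'SPOON');
// eaten === 'SPOON100, teaspoon, spoonfeed'

// Capture the prefix and re-insert it with $1; the lookahead (?=...) checks
// the suffix without making it part of the match.
var kept = text.replace(/(^|\s)spoon(?=\s|$)/g, '$1SPOON');
// kept === 'SPOON 100, teaspoon, spoonfeed'

// \s is whitespace only.
var digitIsSpace = /\s/.test('1'); // false
```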

wOxxOmMod
§
Posted: 2017-01-11
Edited: 2017-01-11

Indeed, there's a problem. It got me interested and I've solved it by separating the words into three classes: separate words, words with a boundary at the start, and substrings anywhere. http://p.ip.fi/rX28 contains my code, which processes the page while it's loading, so the translated text is presented as though it was served that way. I'm not sure it works in Opera 12, which you appear to be using judging by the user-agent icon in your post, but hopefully it'll be useful at least as an example.

N.B. my code uses different userscript metablock flags!

P.S. in short, all words within a class (see above) are concatenated into a single regexp that correctly preserves preceding and following spaces, [semi]colons, and other punctuation:

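// Words replaced only when a boundary precedes them.
var wordStarts = {
    '日' : '',
    // ........................
};
// Substrings replaced anywhere in the text.
var wordContains = {
    '対象性別' : 'Audience',
    // ........................
};
// Words replaced only when delimited by boundaries on both sides.
var wordEquals = {
    '年' : '/',
    // ........................
};
// Character class (as regexp source) listing everything that counts as a word boundary.
var boundaryChars = '[\\s,./?;:\'"/\\[\\]{}\\-_=+`~!@#$%^&*()<>|\\\\]';
// One combined regexp per class; the capture groups keep the boundary characters.
var reEquals = new RegExp('(^|' + boundaryChars + ')(' + wordsAsRegexp(wordEquals) + ')(' + boundaryChars + '|$)', 'g');
var reStarts = new RegExp('(^|' + boundaryChars + ')(' + wordsAsRegexp(wordStarts) + ')', 'g');
var reContains = new RegExp(wordsAsRegexp(wordContains), 'g');

// Builds the alternation word1|word2|... from the keys of a dictionary.
function wordsAsRegexp(words) {
    function escapeStringForRegExp(s) {
        return s.replace(/[{}()\[\]\/\\.+?^$:=*!|]/g, "\\$&");
    }
    // Join with \x01 first so that escaping does not touch the | separators.
    return escapeStringForRegExp(Object.keys(words).join(String.fromCharCode(1))).replace(/\x01/g, '|');
}
// Applies all three classes to one text node, re-emitting the captured boundaries.
function doReplace(textNode) {
    var text = textNode.nodeValue;
    var newText = text.replace(reContains, function(word) {
        return wordContains[word];
    }).replace(reStarts, function(s, prefix, word) {
        return prefix + wordStarts[word];
    }).replace(reEquals, function(s, prefix, word, suffix) {
        return prefix + wordEquals[word] + suffix;
    });
    // Write back only if something changed.
    if (newText !== text) {
        textNode.nodeValue = newText;
    }
}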
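
To try the three-class idea outside a browser, here is a self-contained sketch that works on plain strings. The dictionary entries are hypothetical samples (the real lists are elided above), and `alternation`, `escapeForRegExp` and `translate` are illustrative names. It also differs from the code above in one respect: the trailing boundary of `reEquals` is a lookahead rather than a captured group, so a single separator can end one word and begin the next. With a consuming suffix, the second word in "spoon spoon" would be skipped because its leading space has already been used up.

```javascript
// Hypothetical sample entries for illustration only.
var wordEquals   = { 'spoon': 'SPOON' };        // standalone words
var wordStarts   = { 'fork': 'FORK' };          // boundary required before the word
var wordContains = { '対象性別': 'Audience' };   // substring anywhere

// Same boundary character class as in the post above.
var boundaryChars = '[\\s,./?;:\'"/\\[\\]{}\\-_=+`~!@#$%^&*()<>|\\\\]';

function escapeForRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
}

// Builds word1|word2|... from the dictionary keys.
function alternation(words) {
    return Object.keys(words).map(escapeForRegExp).join('|');
}

// (?=...) checks the trailing boundary without consuming it.
var reEquals = new RegExp('(^|' + boundaryChars + ')(' + alternation(wordEquals) + ')(?=' + boundaryChars + '|$)', 'g');
var reStarts = new RegExp('(^|' + boundaryChars + ')(' + alternation(wordStarts) + ')', 'g');
var reContains = new RegExp(alternation(wordContains), 'g');

function translate(text) {
    return text.replace(reContains, function (word) {
        return wordContains[word];
    }).replace(reStarts, function (match, prefix, word) {
        return prefix + wordStarts[word];
    }).replace(reEquals, function (match, prefix, word) {
        return prefix + wordEquals[word];
    });
}

var adjacent = translate('spoon spoon 100');       // 'SPOON SPOON 100'
var untouched = translate('teaspoon, spoonfeed');  // 'teaspoon, spoonfeed'
var starts = translate('pitchfork forkhead');      // 'pitchfork FORKhead'
var contains = translate('x対象性別y');             // 'xAudiencey'
```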
§
Posted: 2017-01-12

Nice. Yeah, it doesn't work in Opera 12, but it works in Firefox, so that's fine. While it's better than what I did last time, I've run into some other problems as well (if you're interested, I can elaborate).
Then I had a sudden inspiration: what I need from this script on this site isn't to change words in general, but only the exact text between > and < in "a href", "option value", "span" and "label" elements. Everything that's left could then be handled by the current script using '/(?:^|\\s)WORD/g' and the like, so it won't touch anything in links/titles or preceding words. On the other hand, such a script would have to deal with words that can be in both "a href" and "span" at the same time, and also somehow deal with sidebar links, which might have a space after the word so that they look like >word < instead of >word<.
I tried to write something myself, or to find a ready-made script that already does something like this to insert into the current one, but failed miserably.
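
The ">word<" idea amounts to replacing a text node only when its entire text equals a dictionary key, ignoring surrounding whitespace so that ">word <" is covered too. A minimal sketch under that reading follows. `exactWords` and `translateExact` are hypothetical names, and the function works on a string such as a text node's nodeValue.

```javascript
// Hypothetical dictionary: a key must equal the whole visible text of a node.
var exactWords = { '検索条件': 'Search for' };

// Returns the translation if the whole string (ignoring surrounding whitespace)
// is a dictionary key; otherwise returns the string unchanged.
function translateExact(text) {
    // Split into leading whitespace, core text and trailing whitespace.
    var parts = /^(\s*)([\s\S]*?)(\s*)$/.exec(text);
    var leading = parts[1], core = parts[2], trailing = parts[3];
    return Object.prototype.hasOwnProperty.call(exactWords, core)
        ? leading + exactWords[core] + trailing
        : text;
}

var alone = translateExact('検索条件');               // 'Search for'
var withSpace = translateExact('検索条件 ');          // 'Search for '
var longer = translateExact('保存した検索条件');      // unchanged
```

Because the lookup is by exact key, the same word is handled identically whether it sits in a link, a span or a label.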

§
Posted: 2017-01-13

Solved the problem by using /(:|^)WORD(:|$)/g, /\\bWORD(:|$)/g and /\\bWORD/g. Thanks, everything really helped.
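
A short illustration of why the first of these patterns does the job, assuming the script tests each pattern against one text node's value at a time (so ^ and $ mean the start and end of that node's text) and that the replacement string may refer to the captured colons with $1 and $2. Both points are assumptions about the userscript, and 'Search for' is just the sample translation from earlier in the thread.

```javascript
var exact = /(:|^)検索条件(:|$)/g;

// The whole text is the word: replaced.
var alone = '検索条件'.replace(exact, '$1Search for$2');
// alone === 'Search for'

// Preceded by other Japanese text: neither ':' nor the start, so left alone.
var longer = '保存した検索条件'.replace(exact, '$1Search for$2');
// longer === '保存した検索条件'

// Delimited by a colon: replaced, and the colon is kept via $1.
var afterColon = 'ジャンル:検索条件'.replace(exact, '$1Search for$2');
// afterColon === 'ジャンル:Search for'
```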
