Special Characters / accents

§
Posted: 19 April 2017

var textarea = document.querySelector('textarea#descr');
textarea.value = textarea.value
  .replace(/((nm=(?=[0-9])))/g, 'nm=nm');

I use this code to make changes to text in a textarea, but I can't get it to work with special characters. I tried many approaches, so I came back here for help.

I am getting things like this: Le dernier rÃ©veillon Émilie (segment "La cireuse électrique")

When it is supposed to look like: Le docteur Féraud (segment "Le roi d'Yvetot") Émilie (segment "La cireuse électrique")

As you can see, sometimes the accents are fine and at other times they get messed up.

I tried this and many variants: .replace(/Ã©/g, 'é')
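
A quick way to see why a pattern like this fails is to print the code points that are actually in the string, since garbled text can contain characters that differ from what was typed into the regexp. A minimal sketch; `describeCodePoints` is a hypothetical helper name and the sample string is only an illustration:

```javascript
// Hypothetical helper: list each character of a string as a U+XXXX code point.
function describeCodePoints(text) {
  return Array.from(text, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

// "é" stored as UTF-8 (bytes C3 A9) but decoded as Windows-1252
// shows up as two separate characters, U+00C3 and U+00A9.
console.log(describeCodePoints('r\u00C3\u00A9').join(' '));
// logs: U+0072 U+00C3 U+00A9
```

In the script above, the same helper could be pointed at textarea.value to check what the regexp really has to match.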

Any help on replacing special characters would be great.

woxxom (Mod)
§
Posted: 19 April 2017

A regexp without the u flag works on UTF-16 code units rather than whole Unicode code points. Chrome 50+ and Firefox 46+ support the u flag, e.g. /Ã©/gu

§
Posted: 19 April 2017

Tried that in both Chrome and FF and nothing changed.

woxxom (Mod)
§
Posted: 19 April 2017

I didn't read your entire post, sorry. With the u flag you can use the real letters like /Émilie/u
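
For what it's worth, letters such as É sit in the Basic Multilingual Plane, so they match with or without the u flag; the flag mainly changes behaviour for characters outside it (emoji, for instance). A small sketch to illustrate, with sample strings chosen only for the demonstration:

```javascript
// Accented Latin letters are single UTF-16 code units, so both forms match.
const plain = /Émilie/.test('Émilie (segment)');     // true
const unicode = /Émilie/u.test('Émilie (segment)');  // true

// Outside the BMP a character takes two code units (a surrogate pair).
// Without u, "." matches one code unit; with u, it matches the whole code point.
const emoji = '\u{1F600}';
const dotWithoutU = /^.$/.test(emoji);  // false
const dotWithU = /^.$/u.test(emoji);    // true

console.log(plain, unicode, dotWithoutU, dotWithU);
```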

§
Posted: 20 April 2017

There are two ways of writing accented letters.
First, you may write a single character code for 'accented A'; second, you may write 'A' followed by a special combining 'accent' code.
See https://en.wikipedia.org/wiki/Combining_character
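
The two forms described above compare as different strings even though they render the same. A minimal sketch using String.prototype.normalize to convert between them; the sample letter É is just an illustration:

```javascript
const precomposed = '\u00C9';  // É as a single code point
const combining = 'E\u0301';   // E followed by COMBINING ACUTE ACCENT

console.log(precomposed === combining);                   // false
console.log(precomposed === combining.normalize('NFC'));  // true
console.log(precomposed.normalize('NFD') === combining);  // true

// Normalizing the text first lets a single pattern catch both spellings.
const text = 'E\u0301milie'.normalize('NFC');
const replaced = text.replace(/\u00C9/g, 'E');
console.log(replaced);                                    // Emilie
```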

§
Posted: 20 April 2017

OK. Yet I still was not able to get it to see the text in the textarea.

Thanks for the responses, guys. I am guessing that at this point it is too big an issue to handle.

Also getting "You don't have permission to do that." while working in FF to make a post. Had to jump to Chrome to post this msg.

§
Posted: 11 December 2017
Edited: 11 December 2017

Finally got it figured out.

.replace(/\u00C3\u02C6/g, '\u00C8') // È
.replace(/\u00C3\u2030/g, '\u00C9') // É
.replace(/\u00C3\u0160/g, '\u00CA') // Ê (case-sensitive, so "Ãš", the garbled form of Ú, is left alone)

The only real issue with UTF-8 bytes being interpreted as Windows-1252 (or ISO 8859-1) bytes is that these characters can't be resolved.

à matches these letters í Ý Ð Ï Í Ã À and I can't figure out a way to get around it.
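
One possible way around this, sketched here under assumptions rather than offered as a definitive fix: instead of one replace per letter, turn each garbled pair back into the two bytes it came from and decode those bytes as UTF-8. The names `CP1252_EXTRAS`, `toByte`, `MOJIBAKE_PAIR` and `fixMojibake` are hypothetical. The sketch only handles two-byte UTF-8 sequences (which covers accented Latin letters), and it assumes the second character survived in the text, possibly as an invisible control character or soft hyphen.

```javascript
// Windows-1252 differs from ISO 8859-1 only for bytes 0x80-0x9F.
// Map the characters those bytes decode to back to their byte values.
const CP1252_EXTRAS = {
  '\u20AC': 0x80, '\u201A': 0x82, '\u0192': 0x83, '\u201E': 0x84,
  '\u2026': 0x85, '\u2020': 0x86, '\u2021': 0x87, '\u02C6': 0x88,
  '\u2030': 0x89, '\u0160': 0x8A, '\u2039': 0x8B, '\u0152': 0x8C,
  '\u017D': 0x8E, '\u2018': 0x91, '\u2019': 0x92, '\u201C': 0x93,
  '\u201D': 0x94, '\u2022': 0x95, '\u2013': 0x96, '\u2014': 0x97,
  '\u02DC': 0x98, '\u2122': 0x99, '\u0161': 0x9A, '\u203A': 0x9B,
  '\u0153': 0x9C, '\u017E': 0x9E, '\u0178': 0x9F,
};

// Convert one character back to the byte it was decoded from.
function toByte(ch) {
  const code = ch.charCodeAt(0);
  return code <= 0xFF ? code : CP1252_EXTRAS[ch];
}

// A lead byte C2-DF (shown as U+00C2-U+00DF) followed by a continuation
// byte 80-BF (shown as U+0080-U+00BF or as one of the characters above).
const extras = Object.keys(CP1252_EXTRAS)
  .map((ch) => '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'))
  .join('');
const MOJIBAKE_PAIR = new RegExp(
  '[\\u00C2-\\u00DF][\\u0080-\\u00BF' + extras + ']', 'g');

const utf8 = new TextDecoder('utf-8');

// Replace every garbled pair with the character its two bytes encode.
function fixMojibake(text) {
  return text.replace(MOJIBAKE_PAIR, (pair) =>
    utf8.decode(Uint8Array.from(pair, toByte)));
}

console.log(fixMojibake('Le dernier r\u00C3\u00A9veillon \u00C9milie'));
// logs: Le dernier réveillon Émilie
```

Letters that were already correct, like the É in Émilie, are left alone because the character after them is not a continuation byte. The catch is that genuine text containing such a pair (for example a real Ã directly followed by ©) would be changed too, so this is best run only on strings known to be garbled.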

Thanks for the help, all, even though it took me eight months to finally get it figured out.

woxxom (Mod)
§
Posted: 11 December 2017

@nickodemos, in new browsers you can try unicode mode in regexp: u flag, see https://mathiasbynens.be/notes/es6-unicode-regex

§
Posted: 12 December 2017

Thanks wOxxOm, that will help me out when I rewrite this later with more Unicode characters.
