Pasting from MS Word to Text Area causes error

kklosson
Posts: 1617
Joined: Sun Nov 23, 2008 3:19 pm
Location: Virginia

Pasting from MS Word to Text Area causes error

Post by kklosson »

Users frequently run into an error like the one in the attachment when pasting text from MS Word or a PDF into a text area. Has anyone nailed down a fix for this?
[Attachment: Error.gif]
V8.8
MySQL, AWS EC2, S3
PDFtk Toolkit
hpl123
Posts: 2579
Joined: Fri Feb 01, 2013 1:13 pm
Location: Scandinavia

Re: Pasting from MS Word to Text Area causes error

Post by hpl123 »

I also get this in some situations, primarily when trying to copy strings from a WordPress DB into an Aware DB. Does Support or anyone else know what this is and how to solve it? I guess it has to do with a character set incompatibility between the two systems.
Henrik (V8 Developer Ed. - Windows)
joben
Posts: 221
Joined: Wed Nov 06, 2019 9:49 pm
Location: Sweden

Re: Pasting from MS Word to Text Area causes error

Post by joben »

I have had similar issues when copy-pasting from SharePoint sites or doing CSV imports.
Most often the cause was an unusual space or line-break character.
We did not find an automatic solution. The two methods we used:
* Figuring out which characters were wrongly encoded, then search-and-replacing them in Notepad++ or Excel (if the data was a list).
* Pasting the text into a Notepad document first so that it was sanitized and stripped of formatting, then copying it from there into AwareIM.

Both of these methods were time-consuming and just felt plain wrong, but this is the best we could come up with.
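
The first method (finding out which characters are the odd ones) can be scripted outside AwareIM. The following is a minimal Python sketch, not something from this thread: the function name `suspicious_chars` and the sample string are made up for illustration, and the rule "anything that is a control character or not printable ASCII" is an assumption about what counts as suspicious.

```python
import unicodedata

def suspicious_chars(text):
    """Return (index, code point, Unicode name) for characters worth a closer look.

    Flags control characters other than tab/newline/carriage return,
    and everything above printable ASCII (code points > 0x7E).
    """
    found = []
    for i, ch in enumerate(text):
        cp = ord(ch)
        is_control = cp < 0x20 and ch not in "\t\n\r"
        if is_control or cp > 0x7E:
            # Some characters (e.g. controls) have no name, so supply a default.
            name = unicodedata.name(ch, "<unnamed>")
            found.append((i, f"U+{cp:04X}", name))
    return found

# Hypothetical sample: a no-break space, a line separator and curly quotes,
# the kind of characters Word or a web page tends to produce.
sample = "Price:\u00a0100\u2028\u201cquoted\u201d"
for row in suspicious_chars(sample):
    print(row)
```

Running this over a pasted paragraph gives a short list of positions and names to search for, which is quicker than hunting by eye in an editor.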
Regards, Joakim

kklosson
Posts: 1617
Joined: Sun Nov 23, 2008 3:19 pm
Location: Virginia

Re: Pasting from MS Word to Text Area causes error

Post by kklosson »

I have also found that you can copy the text to Notepad and save it as ANSI, then paste it into the text area okay. I could do a REPLACE_PATTERN(), but I'm not sure what I'm looking for.
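
A rough way to see what the "save as ANSI" step does to the text is the Python sketch below. It assumes that ANSI on the machine in question means Windows code page 1252 (the usual Western European default; other locales use other code pages), and the function name `ansi_roundtrip` is hypothetical.

```python
def ansi_roundtrip(text):
    """Approximate saving text as ANSI and reading it back.

    Assumption: ANSI here is Windows code page 1252. Characters that the
    code page cannot represent are replaced with '?'.
    """
    return text.encode("cp1252", errors="replace").decode("cp1252")

# Curly quotes exist in cp1252 and survive; a character outside the
# code page (here U+1F1FA) is turned into a question mark.
print(ansi_roundtrip("\u201cHi\u201d \U0001F1FA"))
```

Under that assumption the round trip keeps typical Word punctuation such as curly quotes but throws away anything outside the code page, which would explain why the paste succeeds afterwards.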
PointsWell
Posts: 1457
Joined: Tue Jan 24, 2017 5:51 am
Location: 'Stralya

Re: Pasting from MS Word to Text Area causes error

Post by PointsWell »

AIM errors like that tend to suggest a database issue. What is the underlying text encoding of the database?
kklosson
Posts: 1617
Joined: Sun Nov 23, 2008 3:19 pm
Location: Virginia

Re: Pasting from MS Word to Text Area causes error

Post by kklosson »

In my case, MySQL, utf8
[Attachment: Screenshot_4.png]
PointsWell
Posts: 1457
Joined: Tue Jan 24, 2017 5:51 am
Location: 'Stralya

Re: Pasting from MS Word to Text Area causes error

Post by PointsWell »

It seems MySQL commonly has problems with some Unicode characters (google '\xF0\x9F\x87\xBa\xF0\x9F').

What is the encoding of the text that you are pasting in?

Try this to find out the encoding:
https://www.webatic.com/encoding-explorer

(no warranties as to its efficacy)
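
The byte sequence quoted above can also be decoded directly. The sketch below is illustrative Python, not part of the original posts; the helper name `needs_four_bytes` is made up. Its first four bytes are the UTF-8 encoding of a single character outside the Basic Multilingual Plane. MySQL's `utf8` character set is an alias for `utf8mb3`, which stores at most three bytes per character, while `utf8mb4` accepts four-byte characters. Whether that is the cause of the error in this particular application is an assumption, since the thread does not confirm it.

```python
import unicodedata

# The first four bytes of the sequence quoted in the post.
raw = b"\xF0\x9F\x87\xBA"
ch = raw.decode("utf-8")
print(f"U+{ord(ch):04X}", unicodedata.name(ch), "-", len(raw), "bytes in UTF-8")

def needs_four_bytes(text):
    """Return the characters in text that take four bytes in UTF-8."""
    return [c for c in text if len(c.encode("utf-8")) == 4]

# Curly quotes take three bytes each and are not reported; the emoji is.
print(needs_four_bytes("a\u201cb\u201d\U0001F600"))
```

If pasted text only fails when `needs_four_bytes` returns something, the column character set is the likely place to look.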
BobK
Posts: 544
Joined: Thu Jan 31, 2008 2:14 pm
Location: Cincinnati, Ohio, USA

Re: Pasting from MS Word to Text Area causes error

Post by BobK »

If you decide to remove those characters, the following will do it:

Code:

ExecutiveSummary.Narrative=REPLACE_PATTERN(REPLACE_PATTERN(ExecutiveSummary.Narrative, '[\x00-\x1F]', ''), '[\x7F-\xFF]', '')
That will remove the '\xF0\x9F\x87\xBA\xF0\x9F' characters that were in your gif, plus any other formatting or control characters.
It will also remove any tabs or newline characters. If those are needed, the REPLACE_PATTERN can be tweaked to keep them.
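
One possible tweak, sketched in Python rather than in Aware IM's rule language: keep tab, line feed and carriage return by leaving `\x09`, `\x0A` and `\x0D` out of the control-character class, and strip four-byte characters explicitly. The names `CONTROL`, `NON_BMP` and `clean` are hypothetical, and whether REPLACE_PATTERN accepts the same character classes is an assumption that would need checking in Aware IM.

```python
import re

# Control characters except tab (\x09), line feed (\x0A) and
# carriage return (\x0D), plus DEL (\x7F).
CONTROL = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]")

# Characters outside the Basic Multilingual Plane, i.e. the ones that
# need four bytes in UTF-8.
NON_BMP = re.compile("[\U00010000-\U0010FFFF]")

def clean(text):
    """Drop unwanted control characters and four-byte characters."""
    return NON_BMP.sub("", CONTROL.sub("", text))

print(repr(clean("a\tb\nc\x07d\U0001F1FA\u00e9")))
```

Unlike a `[\x7F-\xFF]` class, this keeps accented letters such as é. Note also that in Python a `[\x7F-\xFF]` class matches only the code points U+007F to U+00FF and leaves a character like U+1F1FA untouched, so it is worth testing the original expression against a real failing paste before relying on it.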

FYI: here is one of many sites that display ASCII codes: https://www.ascii-code.com/
Bob
kklosson
Posts: 1617
Joined: Sun Nov 23, 2008 3:19 pm
Location: Virginia

Re: Pasting from MS Word to Text Area causes error

Post by kklosson »

Thanks BobK, but over the years, I have seen a plethora of characters presented to the user with this error. It doesn't seem I can address every potentiality.