I am suddenly receiving a pipe-separated text file with embedded pseudo-Unicode characters that are supposed to be an 'o' with a macron. Unfortunately, each one comes through as a ^Z in the ASCII file.
All the vowels can have macrons in Maori, and there is a new policy that all government departments must add macrons. If all software were Unicode-aware then this might work.
Some programs handle this and read the whole file regardless of the ^Z, but many stop at it, presumably because ^Z (byte 0x1A) is the traditional DOS end-of-file marker. The FME CSV2 reader stops. Oddly, the FME Textfile reader does handle them with the encoding set to DOS-Latin-1 (ibm-850).
What can I do?
The simplest idea is to translate the pair of characters ^Zo back to a plain ASCII o.
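For what it's worth, here is a minimal Python sketch of that idea. It assumes the ^Z really is a single byte 0x1A sitting in front of the vowel, which I have not confirmed with a hex dump. The function names are my own, and it removes every ^Z rather than only the ones before an 'o', so it would also cover the other vowels.

```python
SUB = b"\x1a"  # Ctrl-Z, the ASCII SUB control character (assumed to be the stray byte)


def strip_sub(data: bytes) -> bytes:
    # Drop every SUB byte, so the pair ^Z + 'o' collapses to a plain 'o'.
    return data.replace(SUB, b"")


def clean_file(src: str, dst: str) -> None:
    # Read and write in binary mode so nothing treats ^Z as end-of-file
    # and no decoding step gets a chance to mangle the bytes.
    with open(src, "rb") as fin:
        data = fin.read()
    with open(dst, "wb") as fout:
        fout.write(strip_sub(data))
```

Something like clean_file("sample.asc", "sample_clean.asc") would then write a copy that the CSV reader should be able to read to the end, at the cost of losing the macrons.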
Surely tr could just strip the ^Z...nope.
I have tried the UTF-8 encoding parameter on the CSV reader, and other tricks with tr, without success.
I have attached a test file, sample.asc, containing two records.
Unix wc -l returns a count of 1, which is not a good start since I can cat two records.
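Since wc -l counts newline characters rather than records, a count of 1 may only mean that the second record has no trailing newline. A quick way to check what is actually in the file is to count the relevant bytes. This is only a diagnostic sketch, and byte_counts is a name I made up.

```python
def byte_counts(data: bytes) -> dict:
    # Count line terminators and stray ^Z bytes in the raw file contents.
    return {
        "LF": data.count(b"\n"),     # what wc -l counts
        "CR": data.count(b"\r"),     # present if the file has DOS or Mac line endings
        "SUB": data.count(b"\x1a"),  # the ^Z bytes
    }


def file_byte_counts(path: str) -> dict:
    # Binary mode, so the whole file is read regardless of any ^Z.
    with open(path, "rb") as f:
        return byte_counts(f.read())
```

Running file_byte_counts("sample.asc") should show whether the records end in LF, CR or both, and how many ^Z bytes there are.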