Solved

I have a folder full of HTML files encoded in windows-1252. I need to change the encoding of all the files to UTF-8. Is that possible in FME?

  • 3 November 2021
  • 9 replies
  • 31 views

Converting them one by one would take a lot of time.

Best answer by daveatsafe 8 November 2021, 17:29

Hi @lily,

Yes, it's possible with a simple Text File to Text File conversion. Set the encoding on the Text File reader to Latin-1 (windows-1252) and the encoding on the Text File writer to Unicode 8-bit (utf-8). This will create an output file identical to the input, except for the encoding.

However, if any tags within the HTML identify the encoding, these will need to be changed as well. You can do this by adding a StringReplacer to the workspace to replace the declared charset string (for example 'iso-8859-1') with 'UTF-8'.
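For anyone who wants to see what those two steps amount to outside of Workbench, here is a minimal Python sketch. It is an illustration only, not what FME does internally. The function name `reencode_html` and the regular expression are made up for this example, and it assumes the charset declaration names either 'iso-8859-1' or 'windows-1252'.

```python
import re

# Hypothetical pattern for this sketch: a charset declaration that names
# iso-8859-1 or windows-1252, matched case-insensitively.
_CHARSET_RE = re.compile(
    r"(charset\s*=\s*[\"']?)(?:iso-8859-1|windows-1252)", re.IGNORECASE
)


def reencode_html(raw: bytes) -> bytes:
    """Decode windows-1252 bytes, update the declared charset, return UTF-8 bytes."""
    # Step 1: interpret the input bytes as windows-1252 (Python calls it cp1252).
    text = raw.decode("cp1252")
    # Step 2: the StringReplacer equivalent, pointing the declaration at UTF-8.
    text = _CHARSET_RE.sub(r"\1UTF-8", text)
    # Step 3: write the same characters back out as UTF-8.
    return text.encode("utf-8")
```

Note that Python's cp1252 codec raises an error on the few byte values that windows-1252 leaves undefined, so real files may need an `errors=` argument on the decode call.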

@daveatsafe has the correct solution here. To illustrate it, I made this one of my question-of-the-week topics and covered it in a video: https://youtu.be/uyF7MEuBdK0

Thank you Dave and Mark! I will try Dave's solution and reply as soon as I can!

Hi @daveatsafe,

Thank you for your solution!

I have tried it, and it works with one file at a time.

Then I tried using a zip file as the source, since I want to process all the files with the encoding workspace. But I ended up with one big HTML file instead of several HTML files (there should be as many output files as there are input files).

So I tried batch processing with the "Directory and File Pathnames" reader, but now the destination folder option is not available. Instead, it writes everything to a single file too.

Any tips?

Thank you @mark2atsafe! I have seen your YouTube video and it helps a lot! =)

Hi @lily​,

You can use the Dataset Fanout to distinguish the output files:

  • Open the input Text File feature type properties, pick the Format Attribute tab, then check the box beside fme_basename, if it is not already checked.
  • In the Navigator pane of Workbench, expand the parameters for the Text File writer, then double-click Fanout Dataset.
  • Set the Destination Fanout Directory to the output zip file (zip files are considered folders by FME).
  • Set the Fanout Expression to '@Value(fme_basename).html'.

This should write each input file to a separate output file in the output zip file.
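The idea behind the fanout (one output file per input file, named from the input's base name, all collected in one zip) can also be sketched in plain Python. This is a hedged illustration, not an FME feature. The function name `convert_folder_to_zip` and its parameters are hypothetical, and it assumes every `.html` file in the source folder is windows-1252.

```python
import zipfile
from pathlib import Path
from typing import List


def convert_folder_to_zip(src_dir: str, zip_path: str) -> List[str]:
    """Write a UTF-8 copy of every .html file in src_dir into zip_path, one entry per input."""
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for src in sorted(Path(src_dir).glob("*.html")):
            # Assumption: each input file is encoded as windows-1252.
            text = src.read_bytes().decode("cp1252")
            # Same naming idea as the fanout expression: base name plus ".html".
            entry_name = f"{src.stem}.html"
            archive.writestr(entry_name, text.encode("utf-8"))
            written.append(entry_name)
    return written
```

Calling `convert_folder_to_zip("input_folder", "output.zip")` with your own paths returns the list of entry names written, so you can check that the count matches the number of input files.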

Thank you @daveatsafe!

I will give feedback as soon as I can! BeSafe =)

It works perfectly!! Now I can move on to my next assignment =)

Thank you!! @daveatsafe​ @mark2atsafe​ 
