Hi @linhgg2, are you trying to read .doc files?
I have FME Desktop 2017 installed and I didn't see a Reader for Microsoft Word.
There is no MS Word Reader available in FME 2017 yet.
So, I do not see any other option than converting to Text.
If your document is a .docx type file, it is actually a zip file containing several XML files etc. that you can read with FME. You can see this by opening the .docx in 7-Zip: the main document text is stored in word/document.xml.
But I agree that unless you feel adventurous, it is probably easier to convert it to text first.
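For anyone who does feel adventurous, here is a rough Python sketch of the zip-plus-XML idea above. It only uses the standard library; the function names list_docx_parts and read_docx_tables are made up for this example, and it assumes a simple document without nested tables. Something like this could probably be adapted for a PythonCaller, but I have not tried that.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def list_docx_parts(path):
    """Return the names of the files packed inside a .docx archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def read_docx_tables(path):
    """Return each table as a list of rows, each row a list of cell strings.

    Assumption: tables are not nested. A nested table's rows would be
    mixed into the rows of its parent table by this simple traversal.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    tables = []
    for tbl in root.iter(W_NS + "tbl"):
        rows = []
        for tr in tbl.iter(W_NS + "tr"):
            cells = []
            for tc in tr.iter(W_NS + "tc"):
                # A cell's text is spread over one or more <w:t> runs
                cells.append("".join(t.text or "" for t in tc.iter(W_NS + "t")))
            rows.append(cells)
        tables.append(rows)
    return tables
```

Calling list_docx_parts("some_file.docx") shows the same file list you would see in 7-Zip, and read_docx_tables keeps the row/column structure that a plain text export tends to lose.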
I have the same problem with the Ordnance Survey Local Custodians table:
https://www.ordnancesurvey.co.uk/docs/product-schemas/addressbase-products-local-custodian-codes.zip
I tried using the XML Reader but it won't open the .docx.
Converting the file to text in Word seems to result in the loss of the table structure.
I found saving the Word doc as HTML worked quite well (although still a manual step). Once it is in that format, FME will read it using the HTML Table Reader, and even strips out the title and text above the table.
I did find that the column headings got treated as a data row. Maybe the headings are formatted as HTML TD tags rather than TH ones. A handy update to the Reader would be to have something similar to CSV where you can specify whether there's a header row. A workaround is to tell the HTML Table Reader to start at feature 2.
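If you end up post-processing the saved HTML outside the Reader, the header-row issue can be handled the same way in a few lines of Python. This is only a sketch using the standard library html.parser; the class TableRows and the function rows_to_records are names I made up, and it assumes a single, simple table whose first row holds the column headings (whether they are TD or TH cells).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (titles, paragraphs above the table) is ignored
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse the line breaks and runs of spaces inside the cell
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def rows_to_records(html):
    """Treat the first table row as the header, the rest as data rows."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, body = parser.rows[0], parser.rows[1:]
    return [dict(zip(header, row)) for row in body]
```

This does the equivalent of starting at feature 2, except the first row is kept as the attribute names instead of being thrown away.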
I had some data handed to me in Word not long ago.
I just put it in a txt/csv file and proceeded from there.
If it is formatted somehow in Word, I basically used VariableSetters and VariableRetrievers and a lot of regexp in StringSearchers etc.
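To give an idea of what that regexp approach amounts to, here is the same pattern as a small Python sketch. The text layout is entirely hypothetical (a "Section: name" line followed by "code description" lines), and parse_lines is just a name for this example. The current_section variable does the job that the VariableSetter/VariableRetriever pair does in a workspace: it remembers a value from an earlier line so later lines can use it.

```python
import re

# Hypothetical layout: "Section: <name>" lines, each followed by
# "<numeric code> <description>" lines. Adjust the patterns to your text.
SECTION_RE = re.compile(r"^Section:\s*(?P<name>.+)$")
ROW_RE = re.compile(r"^(?P<code>\d+)\s+(?P<description>.+)$")


def parse_lines(lines):
    """Turn loosely formatted text lines into a list of record dicts."""
    current_section = None  # remembered across lines, like a stored variable
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = SECTION_RE.match(line)
        if match:
            current_section = match.group("name")
            continue
        match = ROW_RE.match(line)
        if match:
            records.append({
                "section": current_section,
                "code": match.group("code"),
                "description": match.group("description"),
            })
        # Lines matching neither pattern are skipped
    return records
```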