Question

Loading and processing multiple unstructured text files

  • 21 April 2015
  • 2 replies
  • 2 views

I'm trying to set up an FME Workbench process that will run daily, read a series of text files from a network directory, and load them into a database.

 

 

The header of the text file contains information about the sensor, including a georeference. The body of the file contains readings for every hour of the day. My aim is to create one database table containing the sensor information, and another table containing the readings, with the primary key being the sensor id.
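As a rough illustration of the two-table layout described above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical, SQLite stands in for whatever database is actually the target, and it assumes GEOREF holds a datum followed by latitude and longitude and that INTERVAL is in minutes. The readings table uses the sensor id plus the reading time as its key, since the sensor id alone would not be unique there.

```python
import sqlite3

# Hypothetical schema: names and types are assumptions, not taken from the post.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sensor (
    sensor_id        TEXT PRIMARY KEY,   -- e.g. '55.22'
    description      TEXT,
    datum            TEXT,               -- assumed first part of GEOREF
    latitude         REAL,               -- assumed second part of GEOREF
    longitude        REAL,               -- assumed third part of GEOREF
    interval_minutes INTEGER             -- assumed meaning of INTERVAL
);
CREATE TABLE IF NOT EXISTS reading (
    sensor_id    TEXT NOT NULL REFERENCES sensor(sensor_id),
    reading_time TEXT NOT NULL,          -- ISO 8601 timestamp
    value        INTEGER NOT NULL,
    PRIMARY KEY (sensor_id, reading_time)
);
"""


def create_schema(conn):
    """Create both tables on an open SQLite connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
```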

 

 

 

SAMPLE FILE

 

Filename: 55_22_20130626_2359.txt

 

Sensor Id: 55.22 
[SEQUENCE]
REVISION=1.0
[DESCRIPTION]
ID=55.22
DESCRIPTION=Bypass Loop, Central Tunnel
GEOREF=WGS84,-33.865977,151.205619
INTERVAL=60
QUANTUM=READINGS
[55.22]:STARTS
20130626,00:00,40
20130626,01:00,37
20130626,02:00,27
20130626,03:00,19
20130626,04:00,16
20130626,05:00,29
20130626,06:00,30
20130626,07:00,19
20130626,08:00,24
20130626,09:00,26
20130626,10:00,33
20130626,11:00,34
20130626,12:00,42
20130626,13:00,49
20130626,14:30,84
20130626,15:00,61
20130626,16:00,86
20130626,17:00,140
20130626,18:00,211
20130626,19:00,223
20130626,20:00,283
20130626,21:00,294
20130626,22:00,226
20130626,23:00,184
[55.22]:ENDS
 

 

From what I've read, I can use the WorkspaceRunner to iterate through all the files in the directory, but I'm unsure how to read and process both the header and the body of each text file, given that their structures are quite different.
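For comparison, the "iterate through all the files" step could be sketched outside FME in plain Python. This is not the WorkspaceRunner, just an illustration of the same idea. The function names are hypothetical, and the filename convention (sensor id split by an underscore, then date, then time) is an assumption based on the single sample filename.

```python
from datetime import datetime
from pathlib import Path


def parse_filename(name):
    """Split a name like '55_22_20130626_2359.txt' into (sensor_id, timestamp).

    Assumes the pattern <major>_<minor>_<YYYYMMDD>_<HHMM>.txt.
    """
    major, minor, date_part, time_part = Path(name).stem.split("_")
    stamp = datetime.strptime(date_part + time_part, "%Y%m%d%H%M")
    return f"{major}.{minor}", stamp


def list_sensor_files(directory):
    """Return the .txt files directly inside a directory, sorted by name."""
    return sorted(Path(directory).glob("*.txt"))
```

Each path returned by list_sensor_files could then be handed to whatever does the actual parsing and loading.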

 

 

Any ideas?

 

 

Cheers

 

MK

2 replies

Hi,

 

 

Sounds like a perfect candidate for a Python custom reader, if you're comfortable with Python.

 

 

Look in the folder <FME_HOME>\pluginbuilder for documentation and samples.

 

 

Some info is also available here: https://knowledge.safe.com/articles/How_To/Developing-a-new-format-reader-writer-with-the-FME-Plug-in-SDK

 

 

David

 

 
The difference in structure doesn't really matter.

 

 

Use a txt_reader, as it reads the file row by row.

 

It's all about matching/searching the rows. If the structure of the files is fixed, just expose "rownumber" and assign attributes until the reader gets to "STARTS".

 

Then read in the data.

 

Use regexes or, if the attributes are fixed, their names.
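To make the row-by-row idea concrete, here is a minimal sketch of the same logic in plain Python rather than in FME transformers. It assumes each header item and each reading sits on its own row, that header rows look like KEY=value, and that reading rows look like YYYYMMDD,HH:MM,value. The function and pattern names are hypothetical.

```python
import re
from datetime import datetime

HEADER_RE = re.compile(r"^(\w+)=(.*)$")                     # e.g. ID=55.22
READING_RE = re.compile(r"^(\d{8}),(\d{2}:\d{2}),(-?\d+)$")  # e.g. 20130626,00:00,40


def parse_sensor_file(lines):
    """Collect header attributes until ':STARTS', then readings until ':ENDS'."""
    header, readings = {}, []
    in_body = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":STARTS"):
            in_body = True
        elif line.endswith(":ENDS"):
            in_body = False
        elif in_body:
            m = READING_RE.match(line)
            if m:
                date_part, time_part, value = m.groups()
                stamp = datetime.strptime(f"{date_part} {time_part}", "%Y%m%d %H:%M")
                readings.append((stamp, int(value)))
        else:
            m = HEADER_RE.match(line)
            if m:  # section markers such as [DESCRIPTION] do not match and are skipped
                header[m.group(1)] = m.group(2).strip()
    return header, readings


sample = """[DESCRIPTION]
ID=55.22
GEOREF=WGS84,-33.865977,151.205619
[55.22]:STARTS
20130626,00:00,40
20130626,01:00,37
[55.22]:ENDS""".splitlines()

header, readings = parse_sensor_file(sample)
```

The header dictionary would feed the sensor table and the list of readings would feed the readings table.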

 

 

If the text files contain more than one ID, you can use a VariableSetter/VariableRetriever combo to read them.
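As I read it, the setter/retriever combo amounts to remembering the most recent ID while the rows stream past, then stamping it onto each data row. A rough Python equivalent, assuming every block is wrapped in [id]:STARTS and [id]:ENDS rows as in the sample, might look like this (the function name is hypothetical):

```python
import re

START_RE = re.compile(r"^\[(.+)\]:STARTS$")
END_RE = re.compile(r"^\[(.+)\]:ENDS$")


def tag_rows_with_sensor(lines):
    """Yield (sensor_id, row) for every non-empty row inside a STARTS/ENDS block."""
    current_id = None  # plays the role of the stored variable
    for raw in lines:
        line = raw.strip()
        start = START_RE.match(line)
        if start:
            current_id = start.group(1)   # "set" the variable
        elif END_RE.match(line):
            current_id = None             # block finished
        elif current_id is not None and line:
            yield current_id, line        # "retrieve" it for each data row
```

Rows outside any block, such as the header, are ignored here and would need their own handling.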

 

 

No need for any scripting at all.
