FME How to read multiple CSV files from many folders and write to GDB tables by folder

Question

I have 13000+ CSV files, each containing around 13000 rows. These CSV files are arranged in 146 different folders. All CSV files have the same schema, with ; as the separator. Combined, they contain about 175 million rows of data. The folders are named as seen in the picture below (for example ...\CSV_folders\5785xxx)

CSV files are named like these:

I already tried copying all the CSV files into the same directory and loading them into one GDB with FME. This went well, but it resulted in an ESRI GDB that is too heavy and takes forever to search through (175 million rows, around 20GB).

Now I need to do things a bit differently. I would like to use FME to create one ESRI Geodatabase containing tables named after the folders that hold the CSV files. Each table in the GDB should contain the data from all the CSV files inside the corresponding original folder. This would split my 175 million rows into 146 tables, making them a bit faster to search through with ArcMap.

So my problem is how to make FME read the CSV files from all of the folders and write a GDB with tables named after the folders, each table containing the combined contents of the CSV files in the original folder. I'm not familiar with PostGIS etc. (and it would not work with ArcGIS), so this needs to be done with some kind of ESRI GDB workaround.


Best answer by gifupack 16 November 2016, 07:28


gifupack · Accepted Answer

I got an answer to my question from nickname Fezter at the GIS Stack Exchange website. I copy-pasted it here, because I think the answer was great, with a clear explanation.

This is a pretty straightforward exercise in FME. You'll need to do the following:

Load your CSVs using a dynamic reader (Single Merged Feature Type) and point to the whole folder where your CSVs are stored. Use the advanced browser to select the whole folder:

Ensure you select search subfolders:

In the reader, expose the fme_dataset attribute by right clicking on the reader and clicking on Properties. Then go to the Format Attributes tab:

In your workbench, add a FilenamePartExtractor transformer and point it to the fme_dataset attribute. The field you need is _dirname.

Finally, in your writer, you'll want to set your Table name to an expression which includes your _dirname from the FilenamePartExtractor.

The reason I did this was because your folder names start with a number and feature classes in a file geodatabase cannot begin with a number. Also note that if you're using the ArcObjects File Geodatabase writer, you will set your geometry to geodb_table. If you're using the API writer then you will set your geometry to geodb_no_geom:

You can see that I had folders within the CSV folder called 1, 2 and 3. They wrote to the GeoDatabase as tables called Table_1, Table_2, and Table_3:
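For readers who want to check the folder-to-table mapping before building the workspace, the naming logic described above can be sketched outside FME in plain Python. This is not FME code and does not use any FME API: the function names table_name_for and plan_tables are made up for illustration, and the "Table_" prefix is taken from the Table_1, Table_2, Table_3 example in the answer. It only mimics what the FilenamePartExtractor's _dirname plus a prefix expression would produce.

```python
from collections import defaultdict
from pathlib import Path


def table_name_for(csv_path, prefix="Table_"):
    """Return prefix + name of the folder that directly contains csv_path.

    Hypothetical helper: stands in for _dirname from the FilenamePartExtractor
    combined with a prefix, since table names cannot begin with a number.
    """
    dirname = Path(csv_path).parent.name
    return f"{prefix}{dirname}"


def plan_tables(root, prefix="Table_"):
    """Map each target table name to the CSV files found under root.

    Searches subfolders recursively. The "*.csv" pattern is an assumption;
    on case-sensitive file systems it will not match files ending in ".CSV".
    """
    plan = defaultdict(list)
    for csv_path in sorted(Path(root).rglob("*.csv")):
        plan[table_name_for(csv_path, prefix)].append(csv_path)
    return dict(plan)
```

Calling table_name_for("CSV_folders/1/a.csv") gives "Table_1". Running plan_tables on the top-level CSV folder gives one entry per folder, which should match the number of tables expected in the geodatabase.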

itay · Answer

Hi @gifupack, I would look into using the FME format attributes (fme_dataset, fme_feature_type) to define the GDB output tables, and into using them in a fanout on the GDB writer (to produce multiple tables based on an attribute value or a combination of attributes).

https://knowledge.safe.com/articles/565/fanout-1.html

https://knowledge.safe.com/articles/192/whats-the-difference-between-user-attributes-and-f.html
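As a rough illustration of the fanout idea (one output table per distinct attribute value), here is a minimal sketch in plain Python. It is not FME code: read_features, fan_out and the source_dataset key are hypothetical names, with source_dataset standing in for a value derived from fme_dataset. The ; delimiter matches the CSV files described in the question.

```python
import csv
import io
from collections import defaultdict


def read_features(text, dataset):
    """Parse ';'-separated CSV text into dict rows tagged with their source."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    for row in reader:
        # Hypothetical stand-in for a folder name derived from fme_dataset.
        row["source_dataset"] = dataset
        yield row


def fan_out(features, attribute):
    """Group features into one bucket per distinct value of `attribute`."""
    tables = defaultdict(list)
    for feature in features:
        tables[feature[attribute]].append(feature)
    return dict(tables)
```

Each key of the dictionary returned by fan_out corresponds to one output table, and the list under that key holds the rows that would be written to it.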

A tip on reading the CSV data: have a look at the Point Cloud XYZ Reader.

Hope this helps.
