If you keep the daily CSVs in a separate folder, you can add a Python shutdown script to delete all files in that folder when the workspace finishes. Alternatively, you can use a SystemCaller to do the same thing via the command line.
That way you can be sure no daily file is processed more than once.
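The shutdown-script idea can be sketched in plain Python. The function and parameter names here are illustrative, not part of FME; inside an actual FME shutdown script you would typically take the folder path from a published parameter (e.g. via fme.macroValues):

```python
import glob
import os

def clear_daily_csvs(folder):
    """Delete every .csv file in `folder` and return the paths removed.

    In an FME workspace this logic could live in a Python shutdown
    script, with `folder` supplied by a published parameter.
    """
    removed = []
    for path in glob.glob(os.path.join(folder, "*.csv")):
        os.remove(path)
        removed.append(path)
    return removed
```

Non-CSV files in the folder are left untouched, so a log or readme alongside the daily files would survive the cleanup.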
Thank you for your quick response!
Unfortunately, the data belongs to a different department, so I won't be able to delete anything or amend the file structure.
I was wondering if there was a way to check the names of the files that have already been read and saved to the merged file (if I add fme_basename) and get it to only read the ones that haven't been. I could read the output file and retrieve the basenames that exist (getting unique values through something like a DuplicateFilter). However, I don't know how to tell a reader to read only certain files (the ones whose basename doesn't appear in the DuplicateFilter output) - is there a way to pass a list of files to the reader?
Thank you for your help
Well, that makes things a bit more complicated. You can keep a log of your own of the files you've already processed, use a File/Directory Path reader to get a list of files first, compare that list to your log, and then only process the ones you haven't done yet. Instead of a CSV reader in your workspace you'll have to use a FeatureReader, so you can still do the whole thing in a single workspace.
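The log-comparison step can be sketched in Python. The log layout (one processed filename per line) and the function name are assumptions for illustration, not FME specifics:

```python
import os

def unprocessed_files(folder, log_path):
    """Return the CSV paths in `folder` that are not yet listed in the log.

    The log is assumed to hold one already-processed filename per line;
    after processing a file, append its name to the log so it is skipped
    on the next run.
    """
    done = set()
    if os.path.exists(log_path):
        with open(log_path) as f:
            done = {line.strip() for line in f}
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.endswith(".csv") and name not in done
    )
```

In a workspace, the same comparison would be a File/Directory Path reader joined against the log, with only the unmatched filenames routed into the FeatureReader.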
Create a SQL table, bring it into FME, bring in the CSV, and run them both through a ChangeDetector. From there you will have:
updated - records that have changed
insert - new records
deleted - deleted records
unchanged - existing records that are unchanged
After that, add 2 SQL writers:
1st writer: a delete operation to handle the updated records
2nd writer: an insert operation for both the updated and inserted records
Order matters in the Navigator, so put the delete writer highest by right-clicking it and choosing 'Move Up' until it is above the insert.
The deleted and unchanged records shouldn't be needed.
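The delete-then-insert pattern those two writers implement can be illustrated with sqlite3 as a stand-in database. The table name, columns, and sample records are made up for the example; the point is only the ordering of the operations:

```python
import sqlite3

# Stand-in target table with one record that will be updated (id 1)
# and one that stays unchanged (id 2).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO records VALUES (1, 'old'), (2, 'keep')")

# Mimic the ChangeDetector output ports.
updated = [(1, "new")]     # records whose value changed
inserted = [(3, "added")]  # brand-new records

# 1st writer equivalent: delete the old versions of updated records first.
conn.executemany("DELETE FROM records WHERE id = ?", [(r[0],) for r in updated])
# 2nd writer equivalent: insert both the updated and the new records.
conn.executemany("INSERT INTO records VALUES (?, ?)", updated + inserted)

rows = sorted(conn.execute("SELECT * FROM records"))
# rows -> [(1, 'new'), (2, 'keep'), (3, 'added')]
```

Running the insert before the delete would remove the freshly written rows, which is why the delete writer has to sit above the insert writer in the Navigator.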
Amazing - this is exactly what I'm looking for, thank you. However, I'm new to the FeatureReader and am having an issue. It's reading the features correctly but it's not actually bringing in any data - when I view what is on the Generic port, it shows the correct number of records, but it says No Schema. Any idea what I am doing wrong?
Thanks
I forgot, the FeatureReader can be a bit tricky to set up if you're feeding it filenames from an attribute. Basically, you'll need to specify an output port and name it what the feature type name would be if you were using a regular reader, so in this case that's CSV.
Then if you click OK you'll get a popup window asking about generating output ports. Select one of the original CSV files there.
Then it should route all output through the CSV port and it'll keep the schema.
Brilliant 😊 . I had played around with a lot of the settings in the FeatureReader, but not this one. It's now working perfectly and doing exactly what I needed it to. Thank you so much for your help.