Question

CSV Reader skips duplicate file name

  • December 5, 2019

Hi, good FME people!

First post here, let's see if we can solve this.

I am using a CSV reader to read all the files within a folder [...\\rawdata\\*], regardless of file extension.

In the reader parameter menu I am using the standard dataset parameter, with the feature type name taken from the file name.

I am noticing that not all files are getting through the reader. Checking the unique fme_feature_type values against what is actually in the folder, it seems that files with duplicate names are skipped.

The problem might be that there are duplicate file names, although Windows Explorer does not consider them duplicates because they have different extensions, e.g. "1204NA.SND" and "1204NA.cpt".

Is there a good solution for this?

 

Thank you in advance

Victor

5 replies

redgeographics
Celebrity

This is due to the CSV reader using the file name (without its extension) as the feature type name. I guess the best way to work around this is to use a Directory/File reader and either scan for duplicates and rename them before reading them as CSVs, or go directly to a FeatureReader (but that may cause issues later on if the structure of the files differs).
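
To illustrate the "scan for duplicates" step outside of FME, here is a minimal Python sketch. It is not FME API; the function name `find_rootname_collisions` is made up for this example, and it assumes a collision means two files sharing the same name once the extension is dropped, compared case-insensitively.

```python
from collections import defaultdict
from pathlib import Path


def find_rootname_collisions(filenames):
    """Group file names by root name (name without extension, lower-cased)
    and return only the groups that contain more than one file."""
    groups = defaultdict(list)
    for name in filenames:
        groups[Path(name).stem.lower()].append(name)
    return {stem: sorted(names) for stem, names in groups.items() if len(names) > 1}


# The first two names come from the question; the third is a made-up extra file.
names = ["1204NA.SND", "1204NA.cpt", "1305NB.SND"]
print(find_rootname_collisions(names))
# {'1204na': ['1204NA.SND', '1204NA.cpt']}
```

To run it against a real folder you could pass in something like `[p.name for p in Path(folder).iterdir() if p.is_file()]`, where `folder` is your own path.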


erik_jan
Contributor
  • December 5, 2019

Another option would be to use multiple CSV readers (one per extension) and, instead of using *, use *.snd, *.cpt and so on as the file filter in each reader.

It will be harder to maintain, but it will get all the files read.
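
A rough Python sketch of why splitting by extension helps, assuming the skipped files are the ones whose root names clash: once the files are grouped per extension, each group on its own has unique root names. This is plain Python rather than FME, and both function names are invented for the example.

```python
from collections import defaultdict
from pathlib import Path


def group_by_extension(filenames):
    """Group file names by lower-cased extension, like one reader per file filter."""
    groups = defaultdict(list)
    for name in filenames:
        groups[Path(name).suffix.lower()].append(name)
    return dict(groups)


def rootnames_are_unique(filenames):
    """True if no two files in the list share a root name (case-insensitive)."""
    stems = [Path(name).stem.lower() for name in filenames]
    return len(stems) == len(set(stems))


# The first two names come from the question; the third is a made-up extra file.
names = ["1204NA.SND", "1204NA.cpt", "1305NB.SND"]
print(rootnames_are_unique(names))  # False: the whole folder has a clash
for extension, files in group_by_extension(names).items():
    print(extension, files, rootnames_are_unique(files))
```

Every new extension in the folder means one more group, which matches the maintenance cost mentioned above.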


  • Author
  • December 6, 2019

Thought about that, that's a viable option.

Is it possible to have several readers (with different extensions) but just one reference to the path? Or do I need to change the path on every single one?


  • Author
  • December 6, 2019

Good idea, I really want to look into the Directory/File reader, but haven't really figured it out. Is there a good tutorial on its use somewhere? Looked for that a couple of months ago but gave up!


jovitaatsafe
Safer

Hi @virre,

I wasn't able to find any article examples, but our help documentation has been much improved recently, so the Directory and Filenames Reader Parameters page actually includes a lot of examples if you expand the sections.

I took another look for examples and Takashi's answer to a file copy question has a great relevant example. Some changes you might consider are perhaps to add an AttributeValidator to check for unique rootnames, and maybe a Counter for duplicate names, and then you can set the new value to something like <@Value(path_rootname)@Value(_count)> to create new names for these duplicates. Lots of room here for creativity to get this working the way you want it.
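
The renaming idea can be sketched in plain Python to show the intended result. This is only an illustration of the logic, not FME code: `path_rootname` and `_count` are the attribute names mentioned above, while the function name, the choice to leave the first occurrence unchanged, and the numbering starting at 1 for later duplicates are assumptions made for this example.

```python
from collections import Counter
from pathlib import Path


def unique_feature_type_names(filenames):
    """Map each file name to a unique name built from its root name.

    The first file with a given root name keeps it; later files with the
    same root name (case-insensitive) get the running count appended,
    similar in spirit to @Value(path_rootname)@Value(_count).
    """
    seen = Counter()
    result = {}
    for name in filenames:
        root = Path(name).stem       # plays the role of path_rootname
        count = seen[root.lower()]   # plays the role of _count
        result[name] = root if count == 0 else f"{root}{count}"
        seen[root.lower()] += 1
    return result


# The first two names come from the question; the third is a made-up extra file.
names = ["1204NA.SND", "1204NA.cpt", "1305NB.SND"]
print(unique_feature_type_names(names))
# {'1204NA.SND': '1204NA', '1204NA.cpt': '1204NA1', '1305NB.SND': '1305NB'}
```

One caveat with plain concatenation: a generated name such as 1204NA1 could itself clash with a real file called 1204NA1, so adding a separator or the extension to the new name may be safer.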