It is possible that CSV files in different zip files share the same name, which would cause their data to be lumped together when FME reads them.
1) You could try setting a reader to Directory and File Pathnames, set Path Filter to *.* and expose path_windows. This should read all the files submitted by the user.
2) Then add a FeatureReader after that one to read the CSV data, with path_windows as the dataset path. (You will need a separate reader for any file extension other than .csv.)
3) In the CSV FeatureReader, under Parameters > Schema Attributes to Expose, select fme_dataset and fme_basename.
4) Under Output > Single Output Port > Attribute and Geometry Handling > <Generic> Port, enter fme_dataset and fme_basename.
5) Connect a Matcher to the <Generic> port. In the Matcher, uncheck "Check Geometry", and under Check Attributes choose Match Selected Attributes and select fme_dataset.
6) Expose the CSV attributes and start your processing.
This should let you read all the CSVs regardless of their schemas, whether or not files in different zip files share the same name. _match_id lets you group the CSVs more easily, so you can handle each group differently.
I hope I understood your problem correctly. Happy FMEing!
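To illustrate the idea behind the steps above outside of FME, here is a minimal Python sketch. It is not how FME works internally: it only shows why keying features on both the source dataset and the file basename keeps same-named CSVs from different zip files apart. The function names and the archive labels are made up for the example, and UTF-8 CSVs with a header row are assumed.

```python
import csv
import io
import posixpath
import zipfile


def read_csv_rows(archives):
    """Yield (dataset, basename, row) for every CSV row in every archive.

    `archives` maps a label (playing the role of fme_dataset) to a zip path
    or file-like object. Hypothetical helper, for illustration only.
    """
    for dataset, source in archives.items():
        with zipfile.ZipFile(source) as zf:
            for member in zf.namelist():
                if not member.lower().endswith(".csv"):
                    continue  # other formats would need their own reader
                basename = posixpath.splitext(posixpath.basename(member))[0]
                with zf.open(member) as fh:
                    # Assumption: UTF-8 text with a header line.
                    text = io.TextIOWrapper(fh, encoding="utf-8", newline="")
                    for row in csv.DictReader(text):
                        yield dataset, basename, row


def group_rows(archives):
    """Group rows by (dataset, basename), similar in spirit to _match_id."""
    groups = {}
    for dataset, basename, row in read_csv_rows(archives):
        groups.setdefault((dataset, basename), []).append(row)
    return groups
```

Because the key includes the archive, a roads.csv in one zip file and a roads.csv in another end up in separate groups, each with its own columns.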
Thanks for your suggestion but no luck so far I'm afraid.
Basically what I want to do is allow the user to upload two or more zipfiles with this content:
The CSVs all have a different schema (and that's important, because there's a lot of logic happening) and their names *may* be prefixed, although I think we can make a good case for not allowing that. I also want to keep it as simple for the user as possible, so only a single upload.
I currently have it set up with a "Files" type parameter:
Which then gets put into a Generic reader:
The \* is to force the Generic reader to look inside the zipfile. I've added a CSV reader as a resource and imported the feature types, using wildcard matching to make sure the right data ends up in the right place.
This works fine when I only process one zipfile at a time. But when I try two, the $ZIPFILE parameter looks like this: ""C:\Temp\File1.zip" "C:\Temp\File2.zip"", and the Generic reader then only processes the last one. That's why I decided to try the same approach with a FeatureReader, splitting the parameter into its parts first and using each as a separate initiator. But then I can't seem to get it to separate the CSVs out 😕 In fact, I can't even expose the original filename as an attribute so I can filter them manually.
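For reference, splitting a multi-file parameter value into its parts could look roughly like the sketch below, for example in a scripted parameter or a PythonCaller. It assumes the value is a space-separated list of double-quoted paths, as in the example above; the exact format FME passes may differ, and `split_paths` is a made-up helper name.

```python
import re


def split_paths(value):
    """Split a space-separated list of optionally double-quoted paths.

    Assumption: paths containing spaces are wrapped in double quotes and
    no path contains a double quote itself.
    """
    tokens = re.findall(r'"([^"]+)"|(\S+)', value)
    # Each match fills exactly one of the two groups; keep the non-empty one.
    return [quoted or bare for quoted, bare in tokens]
```

Each returned path can then be used as the dataset of a separate initiator feature.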
Hi @Hans van der Maarel,
Could the FilePathExtractor from FME Hub help you?
This transformer extracts multiple zip files and outputs features with attributes containing the extracted file paths and so on, much like the PATH reader.
Good morning @Hans van der Maarel. The only difference in our User Parameter setup is that I have Path Selection = Multiple Paths and an extension filter of *.zip (leaving it as * and having the FeatureReader recurse into subfolders should achieve the same thing), so I don't think that was the issue.
I created some bogus CSV files, separated them into 2 different zip files and tried it. I was able to read all the CSV files, and the feature information on SingleMatched showed different schemas.
I've also attached the workspace so you can see whether it works for you.
Please let me know how this goes!
Thanks, that does seem to work for the CSVs (although I need to manually expose the attributes), but there are mid/mif files inside the zip as well 😅
Oh yeah, I forgot about those. If it's always only CSV and mid/mif in those zipfiles, I think you can expose path_extension in the FeatureReader, use a TestFilter to separate the features by extension, and then send each group to the designated reader (CSV, MapInfo, or Generic (any format)). I've never tried Generic (any format), so I'm not sure whether that would work.
And yes, that's one downside of the Generic port: it doesn't expose any of the attributes, so we have to do that as the process goes, lol. An AttributeExposer populated from the feature cache or from a dataset might help you, but sadly it's not very dynamic.
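The routing-by-extension idea can be sketched in plain Python as follows. This only mimics what a TestFilter on path_extension would do; the reader labels in the mapping are illustrative names for the example, not FME format identifiers, and `route_by_extension` is a made-up helper.

```python
from pathlib import PureWindowsPath

# Illustrative labels only (assumption): which reader each extension goes to.
READER_BY_EXTENSION = {"csv": "CSV", "mif": "MapInfo", "mid": "MapInfo"}


def route_by_extension(paths, fallback="Generic"):
    """Group file paths by the reader label that should handle them."""
    routed = {}
    for path in paths:
        # Compare extensions case-insensitively and without the leading dot.
        extension = PureWindowsPath(path).suffix.lstrip(".").lower()
        reader = READER_BY_EXTENSION.get(extension, fallback)
        routed.setdefault(reader, []).append(path)
    return routed
```

Anything with an unlisted extension falls through to the fallback group, which corresponds to the untested Generic (any format) branch mentioned above.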