
I'm using an HTTPCaller to download a ZIP file from an API. After that I use the ZipExtractor transformer to extract a number of CSV files from the ZIP file to a folder, and then a FeatureReader reads the files and so on. The only problem I'm having is that the CSV file names change from time to time: they all start with the time of the export, followed by the actual file name. So every time that happens I have to go back to the FeatureReader, select the files again and reconnect them to their appropriate transformers.

 

Is there a way to rename the files after the ZipExtractor? I have found some topics that use the Directory and File Pathnames reader with the File Copy writer, but I cannot get that to work. Any help would be greatly appreciated.

There are multiple ways to do this. I use the path_windows attribute from the ZipExtractor as input for the FeatureReader's Dataset field. This results in output from the Generic output port with unexposed attributes, but you can solve that with an AttributeExposer: use its import option to expose the attributes from the feature cache.



Hi Niels,

 

Thanks for getting back to me. There are 11 CSV files inside the ZIP file:

[image]

When I connect the FeatureReader to the File output, select CSV as the format and the path_windows attribute as the dataset, I get a popup asking me the following:

[image]

When I click the three dots on the right, an Explorer window opens where I can select a file. Since I don't want to select the file manually, I hit Cancel. After running the FeatureReader I connect an AttributeExposer and import from the feature cache. However, since a lot of the CSV files share the same attribute names, the list is incomplete and has a lot of missing values. I could use a single FeatureReader per file, but is there a faster way than that?

 



 

In the FeatureReader parameters, under Output Ports, if you choose "Single Output Port", you won't get the "Generating Output Ports" screen.

 

If you let the FeatureReader read all the files from the ZipExtractor, do you get a complete list of attributes?

 

If the files have different schemas, I would use one FeatureReader and split the different files by filename. You can get the filename from path_windows using the FilenamePartExtractor. If you merge the _filename attribute (FeatureReader parameters, Attribute and Geometry Handling, Merge Initiator and Result), you can use it to split the files with an AttributeFilter or TestFilter. Then add an AttributeExposer for each of the different files / schemas.
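To make the split by filename a bit more concrete, here is a minimal Python sketch of the matching logic only, outside FME. It is not FME's API. It assumes the export time is a leading run of digits (possibly with `-`, `_` or `T` separators) followed by an underscore; the thread does not show the real prefix format, so the pattern and the function names `stable_name` and `group_by_stable_name` are hypothetical and need adjusting to the actual files.

```python
import re
from collections import defaultdict
from pathlib import PureWindowsPath

# Assumption: names look like "<export time>_<name>.csv", where the export
# time starts with a digit and contains only digits, '-', '_' or 'T'.
_EXPORT_PREFIX = re.compile(r"^\d[\dT_-]*_(?=\D)")


def stable_name(path_windows: str) -> str:
    """Return the file name without extension and without the export-time prefix."""
    stem = PureWindowsPath(path_windows).stem
    return _EXPORT_PREFIX.sub("", stem).lower()


def group_by_stable_name(paths):
    """Group full paths by their prefix-free name, like one filter port per file."""
    groups = defaultdict(list)
    for path in paths:
        groups[stable_name(path)].append(path)
    return dict(groups)


if __name__ == "__main__":
    # Made-up example paths, only to show the grouping.
    extracted = [
        r"C:\temp\unzip\20240115T093000_customers.csv",
        r"C:\temp\unzip\20240115T093000_orders.csv",
    ]
    for name, files in group_by_stable_name(extracted).items():
        print(name, files)
```

In a workspace the same idea would be a test on the end of the filename (for example "ends with customers") rather than on the full name, so the changing export time no longer matters.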


What I ended up doing was using a Tester per file after the ZipExtractor, with a FeatureReader after each one. It took me a little bit of time to set up, but it works now :)
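For anyone who still prefers the renaming route from the original question, a rough Python sketch of that idea follows. It copies the extracted CSV files to a second folder under names without the export time, so downstream readers can point at fixed file names. The prefix pattern is an assumption (leading digits with optional `-`, `_` or `T`, then an underscore), and `copy_with_stable_names` and the folder paths are made-up names, not anything from FME.

```python
import re
import shutil
from pathlib import Path

# Assumption: the export time is a leading run of digits and separators
# ending in '_'. Adjust the pattern to the real file names.
EXPORT_PREFIX = re.compile(r"^\d[\dT_-]*_(?=\D)")


def copy_with_stable_names(src_dir, dst_dir):
    """Copy every *.csv in src_dir to dst_dir without the export-time prefix.

    Returns a dict mapping each original file name to its new name.
    If several exports sit in src_dir, later names (sorted) overwrite earlier ones.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = {}
    for src in sorted(Path(src_dir).glob("*.csv")):
        target = dst / EXPORT_PREFIX.sub("", src.name)
        shutil.copy2(src, target)  # copy2 also keeps the file timestamps
        copied[src.name] = target.name
    return copied


if __name__ == "__main__":
    # Hypothetical folders: the ZipExtractor output and a folder with fixed names.
    print(copy_with_stable_names(r"C:\temp\unzip", r"C:\temp\stable"))
```

Copying instead of renaming in place leaves the extracted files untouched, which makes it easier to check what came out of the ZIP if something goes wrong.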

