Solved

Extracting CSV files from large zip file



I'm simply trying to get FME to use the File Copy writer to copy the CSV files inside a large zip file (about 4 GB) to a network folder. I only used a reader with the format set to CSV, the dataset set to the path of the zip file, and the workflow option set to Individual, connected to the File Copy writer. Are there any issues with FME and large zip files? Is there a better method for extracting a specific file type from a zip than the one I used? Thanks for any help.

The latest error states: "failed to open file. failed to get any schema from reader". I am able to do this manually with FileZilla or 7-Zip.


8 replies

danilo_fme
Evangelist
  • March 27, 2018

Hi @jvickrey656,

Could you share with us your template file and logfile?

Thanks,

Danilo


takashi
Influencer
  • Best Answer
  • March 27, 2018

Hi @jvickrey656, if you need to extract a CSV file from the Zip archive and save it to a specific directory, the ZipExtractor custom transformer from FME Hub might help you.

If you just need to copy the Zip to a specific directory without extracting, the File Copy writer does that.

Anyway you don't need to read data from the CSV file using the CSV reader.


takashi
Influencer
  • March 27, 2018
By the way, what do you mean by "network folder" here? If you intend to upload the extracted CSV file to a directory on an FTP server, you need to save it temporarily to a folder on your local machine and then upload it to the server with the FTPCaller.

 


Hi takashi, if I use the ZipExtractor, how do I specify that I want only the CSV files out of the zip? I'll attach my fmw and log. What I meant by "network folder" is just a shared folder on our office network (server). Based on your explanation, it sounds like I need to use the ZipExtractor, but I need the correct syntax to tell that transformer to grab every CSV file within the zip and copy all of them to a directory.

 

 

csv22filecopyheader.fmw

 

csv22filecopyheader.txt

 

 


takashi
Influencer
  • March 27, 2018
Basically, just set the "Source Zip File" parameter to the zip file path and the "Destination Root Folder" parameter to the destination folder path. However, I found the error message "BadZipfile: zipfiles that span multiple disks are not supported" in the log you attached.

 

The ZipExtractor contains a Python script that uses the Python standard "zipfile" module, and unfortunately the module doesn't support extracting files from one disk to another.
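To illustrate how the standard "zipfile" module can pick out only the CSV files, here is a minimal sketch in plain Python. It is not the ZipExtractor's actual script. The function name `extract_csvs` and its two arguments are hypothetical, chosen for the example. As far as I can tell, zipfile raises the "span multiple disks" error based on the archive's own metadata, so this sketch may fail the same way on such a file.

```python
import zipfile


def extract_csvs(zip_path, dest_dir):
    """Extract only the .csv members of zip_path into dest_dir.

    Hypothetical helper for illustration. Returns the extracted file paths.
    Subfolders inside the archive are recreated under dest_dir.
    """
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directory entries and anything that is not a CSV file.
            if info.is_dir() or not info.filename.lower().endswith(".csv"):
                continue
            # extract() creates missing folders and returns the written path.
            extracted.append(archive.extract(info, dest_dir))
    return extracted
```

The extension check is case-insensitive, so members named `.CSV` are extracted as well.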

 

A workaround I can think of is:

 

  1. Copy the source zip file to the local disk (FeatureWriter with the File Copy writer).
  2. Extract the CSV files from the zip file and save them temporarily on the same disk (ZipExtractor).
  3. Read the paths of the CSV files (Directory and File Pathnames reader), then copy them to the destination folder (File Copy writer).

In addition, the TempPathnameCreator is a convenient way to create a temporary folder or file path. FME will automatically remove all files saved in the temporary path after the translation has completed.
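The three steps above can also be sketched outside FME in plain Python, which may help to see the data flow before building the workspace. This is only an illustration under assumptions: the function name `copy_csvs_via_local_temp` is hypothetical, the system temporary folder is assumed to be on a local disk, and `tempfile.TemporaryDirectory` stands in for the TempPathnameCreator's automatic cleanup.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def copy_csvs_via_local_temp(source_zip, dest_dir):
    """Stage the zip locally, extract its CSVs there, then copy them to dest_dir.

    Hypothetical helper for illustration. Returns the copied file paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # The temporary folder and everything in it is deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmp:
        # Step 1: copy the source zip to the (assumed local) temporary folder.
        local_zip = Path(shutil.copy2(source_zip, tmp))
        # Step 2: extract only the CSV members into a subfolder next to it.
        extract_root = Path(tmp) / "extracted"
        extract_root.mkdir()
        with zipfile.ZipFile(local_zip) as archive:
            names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
            archive.extractall(extract_root, members=names)
        # Step 3: copy each extracted CSV to the destination folder.
        # Subfolders are flattened, so files with the same name overwrite each other.
        for csv_path in sorted(extract_root.rglob("*")):
            if csv_path.is_file():
                copied.append(Path(shutil.copy2(csv_path, dest / csv_path.name)))
    return copied
```

If the source zip is already on a local disk, step 1 is unnecessary and the archive can be opened directly.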

 

 


Hi takashi, do you have an example fmw I can look at to see what you mean by your workaround? Did you mean FeatureReader to File Copy? What you are saying makes sense, but I'm having some trouble implementing the workaround.

 

 


takashi
Influencer
  • March 27, 2018
This workflow will probably work. The Directory and File Pathnames (PATH) reader turned out not to be essential.

If the source zip file is saved on the local disk, the first TempPathnameCreator and the FeatureWriter are not necessary. You can remove them and then set the Source Zip File parameter in the ZipExtractor to the source zip file path.

 

 

 


Thanks takashi
