We have a need to read some spatial data that is zipped up, but several of the zips are then nested inside another zip, just to make things fun. Like this:

c:\temp\top.zip\1.zip\1.gml

c:\temp\top.zip\2.zip\2.gml

FME can read, say, GML from inside a Zip, but not GML from a Zip inside a Zip. Has anyone got any magic to unpack the top-level Zip? I suspect the answer is Python, perhaps as either a PythonCaller or a Startup script, and then I could use a FeatureReader to get at the data at the next level.

It seems a bit contrived, but it's a genuine requirement as it's the way a product is shipped. One zip file was clearly not enough!

Thanks

Have you tried c:\temp\top.zip\**\1.gml?


Yep, tried that first and a few variations in different versions of FME including the Beta... parsing error when trying to get inside the nested Zip.


David,

You can use the following:

Create a workspace reading the GML.

Create a master workspace using the Directory and Filepath reader like in the picture.

I tried this and it works:


Hi @1spatialdave, I also sometimes come across a nested zip. I have posted an Idea before: Add ability to read features from Archived Dataset within Nested Zip File

As you mentioned, Python can do that. e.g.

# Script Example for PythonCreator
# Extract the zip files nested in a zip file, then create one feature per
# extracted zip file, with an attribute that stores the extracted zip file path.
# Not recursive. Applicable to just one level of nesting.
import fmeobjects, zipfile, os
class NestedZipUnpacker(object):
    def close(self):
        # Destination folder for the extracted files (user parameter).
        folder = FME_MacroValues['OUTPUT_FOLDER_PATH']
        try:
            # Extract all files archived in the specified zip file.
            with zipfile.ZipFile(FME_MacroValues['ZIPFILE_PATH']) as z:
                z.extractall(folder)
            # Create features for each extracted zip file path.
            for path in [os.path.join(folder, fname) for fname in os.listdir(folder)]:
                if zipfile.is_zipfile(path):
                    feature = fmeobjects.FMEFeature()
                    feature.setAttribute('_zip_path', path)
                    self.pyoutput(feature)
        except Exception as ex:
            # Report any failure (missing file, bad archive, ...) in the FME log.
            logger = fmeobjects.FMELogFile()
            logger.logMessageString('%s' % ex, fmeobjects.FME_ERROR)

Assume that these two user parameters are defined in the workspace.

  • OUTPUT_FOLDER_PATH: existing folder path into which the extracted zip files will be saved
  • ZIPFILE_PATH: the top level zip file path

FYI.
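
The script above handles one level of nesting only. If the archives are nested more deeply, the same idea can be applied recursively. The following is a minimal sketch in plain Python (standard library only, no FME objects). The function name `extract_nested_zips` and the `_extracted` folder suffix are illustrative assumptions, not part of FME or of any published transformer.

```python
import os
import zipfile


def extract_nested_zips(zip_path, out_dir):
    """Extract zip_path into out_dir, then unpack any zip found inside it.

    Returns a sorted list of the paths of the extracted non-zip files.
    """
    extracted = []
    os.makedirs(out_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as z:
        z.extractall(out_dir)
        names = z.namelist()
    for name in names:
        path = os.path.join(out_dir, name)
        if not os.path.isfile(path):
            # Skip directory entries.
            continue
        if zipfile.is_zipfile(path):
            # The test looks at the file content, not the extension.
            # Unpack the inner archive into a sibling folder (hypothetical
            # naming scheme) and keep descending.
            extracted.extend(extract_nested_zips(path, path + "_extracted"))
        else:
            extracted.append(path)
    return sorted(extracted)
```

Each returned path could then be set as an attribute on a feature, in the same way as `_zip_path` above, and passed on to a FeatureReader.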


Inspired by this discussion, I published a custom transformer named ZipExtractor in the FME Store.


@takashi, thanks so much for going to the trouble to create that Transformer. It does the job perfectly. All the best, David.


@1spatialdave Did you ever get it to work with the Directory and Filepath reader approach? I used it to process over 300K GML files in nested zips, and it works without unzipping.
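
For comparison, plain Python can also reach files in a nested zip without writing anything to disk. This is a different technique from the Directory and Filepath reader approach, shown only as a rough sketch. The generator name `iter_nested_members` is hypothetical, and each member is loaded fully into memory, so it assumes the individual files are reasonably small.

```python
import io
import zipfile


def iter_nested_members(zip_source, prefix=""):
    """Yield (display_path, data) for every non-zip member of zip_source.

    zip_source may be a file path or a binary file-like object. Inner zips
    are opened in memory and their members are yielded with the inner zip
    name as a path prefix.
    """
    with zipfile.ZipFile(zip_source) as z:
        for info in z.infolist():
            if info.is_dir():
                continue
            data = z.read(info)
            display = prefix + info.filename
            if zipfile.is_zipfile(io.BytesIO(data)):
                # Descend into the inner archive without extracting it.
                yield from iter_nested_members(io.BytesIO(data), display + "/")
            else:
                yield display, data
```

Calling `iter_nested_members(r"c:\temp\top.zip")` on the layout from the question would be expected to yield entries such as `1.zip/1.gml` together with the GML bytes.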


@1spatialdave, good to hear! It's my pleasure. Takashi


Thanks a lot for this custom transformer. It works perfectly!


Hi @takashi, I made some mods to the ZipExtractor and added some comments to the item on the Hub. If you like, I can post you all the changes.

ben