Solved

Download multiple file formats from SFTP - Iterate through subfolders


galigis
Enthusiast

Hi Community,

I’ve developed a workflow that enables me to connect to an external SFTP server and download all contents from a specified folder. I followed the guidance provided in the following post:

The workflow functions well for a single folder. However, the SFTP directory structure includes multiple nested subfolders. I now need to enhance the automation to recursively traverse these subfolders. Please see the structure example below.

 

 

Additionally, the SFTP contains files in various formats. To handle this, I plan to use multiple StringSearcher transformers, one for each file type (e.g., .tiff, .asci, .las). My challenge is ensuring that FME iterates through all subfolders to identify and process the relevant files.

Could anyone advise on how to configure FME to recursively search through subfolders and apply the appropriate filters?

Many thanks in advance!

Best answer by david_r

Unfortunately, the FTPCaller does not support recursing over folders, as far as I know. You could implement the functionality yourself using e.g. a looping custom transformer, but it’s going to be a bit of work to cover all edge cases.

If you can live with a Python solution that uses the third-party library paramiko (to install, see here), here’s code that you can paste into a PythonCaller. It returns one feature for each file found in all folders on a given SFTP server, which makes it easy to use e.g. a Tester to check for names or file types before using the FTPCaller to download the files.

from typing import List, Optional
import os
import stat

import fme
import fmeobjects

import paramiko

def get_all_files_from_sftp(
    hostname: str,
    username: str,
    password: Optional[str] = None,
    private_key_path: Optional[str] = None,
    port: int = 22,
    root_path: str = "/",
) -> List[str]:
    """
    Recursively iterate over all files and folders on an SFTP server.

    Args:
        hostname: SFTP server hostname or IP address
        username: Username for authentication
        password: Password for authentication (optional if using key)
        private_key_path: Path to private key file (optional if using password)
        port: SFTP server port (default: 22)
        root_path: Starting directory path (default: "/")

    Returns:
        List of full file paths found on the server

    Raises:
        paramiko.AuthenticationException: If authentication fails
        paramiko.SSHException: If SSH connection fails
        FileNotFoundError: If private key file not found
    """
    all_files = []

    # Create SSH client
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

    try:
        # Connect to server
        if private_key_path:
            private_key = paramiko.RSAKey.from_private_key_file(private_key_path)
            ssh.connect(hostname, port=port, username=username, pkey=private_key)
        else:
            ssh.connect(hostname, port=port, username=username, password=password)

        # Open SFTP session
        sftp = ssh.open_sftp()

        def _recursive_list(path: str):
            """Recursively list all files in the given path."""
            try:
                # Get list of items in current directory
                items = sftp.listdir_attr(path)

                for item in items:
                    # Construct full path; SFTP paths always use forward slashes
                    full_path = f"{path.rstrip('/')}/{item.filename}"

                    # Check if item is a directory
                    if stat.S_ISDIR(item.st_mode):
                        # Recursively process subdirectory
                        _recursive_list(full_path)
                    elif stat.S_ISREG(item.st_mode):
                        # Add regular file to results
                        all_files.append(full_path)

            except PermissionError:
                print(f"Permission denied: {path}")
            except Exception as e:
                print(f"Error accessing {path}: {str(e)}")

        # Start recursive listing from root path
        _recursive_list(root_path)

    finally:
        # Clean up connections
        if "sftp" in locals():
            sftp.close()
        ssh.close()

    return all_files


class RecursiveListSFTP():

    def __init__(self):
        """Base constructor for class members."""
        self._log = fmeobjects.FMELogFile()

    def input(self, feature: fmeobjects.FMEFeature):
        """This method is called for each feature which enters the PythonCaller."""
        try:
            files = get_all_files_from_sftp(
                hostname="TODO",
                username="TODO",
                password="TODO",
                root_path="/",
            )

            self._log.logMessageString(f"Found {len(files)} files on SFTP server")
            for file_path in files:
                f = feature.clone()
                f.setAttribute('sftp_filename', file_path)
                self.pyoutput(f, output_tag="PYOUTPUT")

        except Exception as e:
            print(f"Error: {e}")


    def close(self):
        """This method is called once all the FME Features have been
        processed from input().
        """
        pass

    def process_group(self):
        """This method is called by FME for each group when group
        processing mode is enabled.
        """
        pass

    def reject_feature(self, feature: fmeobjects.FMEFeature, code: str, message: str):
        """This method can be used to output a feature to the <Rejected> port."""
        feature.setAttribute("fme_rejection_code", code)
        feature.setAttribute("fme_rejection_message", message)
        self.pyoutput(feature, output_tag="<Rejected>")

    def has_support_for(self, support_type: int) -> bool:
        """This method is called by FME to determine if the PythonCaller supports
        Bulk mode, which allows for significant performance gains when processing
        large numbers of features.
        """
        return support_type == fmeobjects.FME_SUPPORT_FEATURE_TABLE_SHIM

Settings in the PythonCaller:

  • Class to process features: “RecursiveListSFTP”
  • Attributes to expose: “sftp_filename”
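
As a rough illustration of the file-type check that a Tester (or a set of StringSearchers) would perform on sftp_filename, the sketch below groups paths by extension in plain Python. The helper name group_by_extension and the sample paths are made up for this example; the extensions are the ones mentioned in the question.

```python
import posixpath
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_extension(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group remote file paths by lower-cased extension (hypothetical helper)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        # SFTP paths use forward slashes, so posixpath is used on every OS.
        _, ext = posixpath.splitext(path)
        groups[ext.lower()].append(path)
    return dict(groups)


# Sample paths invented for illustration only.
sample = [
    "/survey/a/tile1.TIFF",
    "/survey/a/points.las",
    "/survey/b/grid.asci",
    "/survey/README",
]
grouped = group_by_extension(sample)
```

Lower-casing the extension means that .TIFF and .tiff end up in the same group, which a case-sensitive string test would miss.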

In the call to get_all_files_from_sftp() inside the input() method, you’ll need to set the correct values for hostname, username and password. If you don’t want to recurse from the root folder of the SFTP server, change the root_path value as well.
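
Rather than hard-coding the connection values in the script, they could probably be read from published parameters. The sketch below assumes that the parameters are available inside a PythonCaller through the fme.macroValues dictionary, and the parameter names SFTP_HOST, SFTP_USER, SFTP_PASSWORD and SFTP_ROOT are hypothetical. A plain dict stands in for fme.macroValues here so the helper can be tried outside FME.

```python
from typing import Any, Dict, Mapping


def sftp_settings(params: Mapping[str, str]) -> Dict[str, Any]:
    """Build keyword arguments for get_all_files_from_sftp from a parameter mapping."""
    # Host and user are required; fail early with a readable message.
    missing = [name for name in ("SFTP_HOST", "SFTP_USER") if not params.get(name)]
    if missing:
        raise ValueError(f"Missing parameters: {', '.join(missing)}")
    return {
        "hostname": params["SFTP_HOST"],
        "username": params["SFTP_USER"],
        # Empty strings are treated as "not set".
        "password": params.get("SFTP_PASSWORD") or None,
        "root_path": params.get("SFTP_ROOT") or "/",
    }


# Stand-in for fme.macroValues, with invented values.
example = sftp_settings({"SFTP_HOST": "sftp.example.com", "SFTP_USER": "demo"})
```

Under that assumption, the call inside input() could then be written as get_all_files_from_sftp(**sftp_settings(fme.macroValues)).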

To install paramiko into FME, open a command line window in the root folder of your FME installation and type:

.\fme python -m pip install paramiko

If this doesn’t work, refer to the FME documentation linked at the top of this post.
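
To check whether the installation succeeded, one option is to ask the Python interpreter used by FME whether it can locate the module. This is a minimal sketch using only the standard library; the helper name is_module_available is made up for illustration, and it would need to be run from within FME (for example in a PythonCaller) to test the right interpreter.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Return True if this interpreter can find the named top-level module."""
    return importlib.util.find_spec(name) is not None


print("paramiko available:", is_module_available("paramiko"))
```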


2 replies

david_r
Celebrity
  • Best Answer
  • June 16, 2025



galigis
Enthusiast
  • Author
  • Enthusiast
  • June 16, 2025

That’s brilliant ​@david_r . Your suggestion helped a lot! :)

