Question

How can we automate downloading a specific file type from Azure Blob Storage containers? Additionally, how can we create an alert when files/containers are uploaded/downloaded/deleted?

  • 8 March 2022
  • 1 reply
  • 35 views

Hi everyone,

I would be happy if somebody could share some thoughts on automating this in FME. Currently, I am processing .csv files in FME to create spatial data (points/lines). The problem is that the .csv files are located inside Azure Blob Storage containers along with several other datasets. At the moment I manually find the latest containers, download the .csv files to a local directory, and then process them using FME workbenches. I have already added an Azure Blob Storage connection in FME and set up the connection parameters successfully, but I can't figure out how to automatically download all the .csv files from all containers and store them in a local or temp directory for further processing.

My Flow is --

Creator > AzureBlobStorageConnector

Additionally, I am trying to find a way to create a mail or other alert system, so that whenever new data is uploaded/downloaded/deleted in Azure, it shows up in FME.


1 reply


I haven't used Azure Blob Storage, but I would approach this by using the AzureBlobStorageConnector to list the contents of the top-level container where the latest containers are being added (make sure to include subfolders). That should give you the contents of every container, including every .csv file.
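Outside of FME, the equivalent listing step can be sketched with Azure's Python SDK. This is a minimal sketch, assuming the `azure-storage-blob` package and a standard connection string; the names here are illustrative and not what the AzureBlobStorageConnector does internally:

```python
from azure.storage.blob import BlobServiceClient

# Assumption: credentials come from a standard Azure Storage connection string.
CONNECTION_STRING = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net"

def list_all_csv_blobs(connection_string: str) -> list[str]:
    """Return 'container/blob' paths for every .csv blob in the account."""
    service = BlobServiceClient.from_connection_string(connection_string)
    csv_paths = []
    for container in service.list_containers():
        container_client = service.get_container_client(container.name)
        # list_blobs() walks the full virtual folder hierarchy,
        # so nested "subfolders" are included automatically.
        for blob in container_client.list_blobs():
            if blob.name.lower().endswith(".csv"):
                csv_paths.append(f"{container.name}/{blob.name}")
    return csv_paths
```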

Keep a record somewhere of all the .csv files that have been processed previously, so that you can compare what's on Azure against that list and know what still needs to be processed. Then pass each Azure .csv file path into a second AzureBlobStorageConnector to download those files, process them as usual, and update the list. A sketch of that compare-and-download logic is below.
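As a rough Python sketch of the compare-and-download step (assuming the listing helper above, and a simple JSON file as the "already processed" record; the file names and target directory are hypothetical):

```python
import json
import os

from azure.storage.blob import BlobServiceClient

STATE_FILE = "processed_csvs.json"   # hypothetical local record of processed files
DOWNLOAD_DIR = "downloads"           # hypothetical local/temp target directory

def load_processed() -> set[str]:
    """Read the set of 'container/blob' paths processed on previous runs."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return set(json.load(f))
    return set()

def download_new_csvs(connection_string: str) -> list[str]:
    """Download only the .csv blobs not seen before; return their local paths."""
    service = BlobServiceClient.from_connection_string(connection_string)
    processed = load_processed()
    downloaded = []
    os.makedirs(DOWNLOAD_DIR, exist_ok=True)
    for path in list_all_csv_blobs(connection_string):  # helper from the sketch above
        if path in processed:
            continue  # already handled on a previous run
        container_name, blob_name = path.split("/", 1)
        blob_client = service.get_blob_client(container_name, blob_name)
        local_path = os.path.join(DOWNLOAD_DIR, blob_name.replace("/", "_"))
        with open(local_path, "wb") as f:
            f.write(blob_client.download_blob().readall())
        processed.add(path)
        downloaded.append(local_path)
    with open(STATE_FILE, "w") as f:
        json.dump(sorted(processed), f)
    return downloaded
```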

 

For an alert system, you could (I assume) set something up on the Azure side to notify FME Server when new data has appeared. I don't know exactly what that would look like; it might be done with webhooks.
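On the Azure side, Blob Storage can publish BlobCreated/BlobDeleted events through Azure Event Grid to a webhook endpoint. Whether an FME Server webhook URL can complete Event Grid's validation handshake directly, I can't say, so here is a minimal relay sketch (assuming Flask; `notify_fme` is a hypothetical placeholder, not an FME API):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/eventgrid", methods=["POST"])
def handle_events():
    for event in request.get_json():
        # Event Grid sends a one-time validation event when the subscription
        # is created; it expects the validation code echoed back.
        if event["eventType"] == "Microsoft.EventGrid.SubscriptionValidationEvent":
            return jsonify({"validationResponse": event["data"]["validationCode"]})
        if event["eventType"] in ("Microsoft.Storage.BlobCreated",
                                  "Microsoft.Storage.BlobDeleted"):
            notify_fme(event["eventType"], event["data"]["url"])
    return "", 200

def notify_fme(event_type: str, blob_url: str) -> None:
    # Hypothetical placeholder: send a mail alert or call FME Server here.
    print(f"{event_type}: {blob_url}")
```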

Or, have FME Server run your workspace on a frequent schedule. It depends on whether you need to process the files immediately, or whether checking once every 10 or 60 minutes is sufficient. If the workspace is set up to get the list of .csv files from Azure and only proceed if there's new data, then most of the time it won't find anything new and will finish after only a few seconds.
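That early-exit check could look like this (a sketch reusing the hypothetical helpers above; inside FME itself this would more likely be a Tester after the listing step):

```python
import sys

all_csvs = set(list_all_csv_blobs(CONNECTION_STRING))  # helpers from the sketches above
new_csvs = all_csvs - load_processed()
if not new_csvs:
    sys.exit(0)  # nothing new; the scheduled run finishes in seconds
# ...otherwise download and process new_csvs as usual
```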

 
