Question

Using Directory Watch in FME Server, not triggering workspace with each new file.

  • 27 November 2015
  • 12 replies
  • 23 views

Badge

I have a workspace that watches a specific folder for CSV files to be added to it.

The workspace then converts the CSV file to a shapefile.

If I copy just one CSV file into the folder, it works perfectly.

If I copy more than one CSV file, Server kicks off an instance of the workspace for each CSV file added, but every instance reads the first CSV file in the folder and ignores the rest.

Also, if the folder already contains a CSV file and a second is added, Server kicks off the workspace and runs it using the CSV file that was originally in the folder, not the new one.

Any ideas what I may be doing wrong?

@aaronkoning


Userlevel 5

Hi

Have you checked the JSON block sent to your workspace? Could it be that more than one trigger file is listed in it?

David

Badge

I am fairly new to FME Server and not sure how I would check that.

Userlevel 5

You can use the Topic monitor in the FME Server GUI; see section 3 here.

Badge

OK, you should always check your workspace first. I figured out why I was only getting the first file written out: a Dissolver was the cause of that problem. I still have another problem, though.

If 3 CSV files are put in the folder, it invokes 3 engines all running the same workspace, each of which takes in all 3 CSV files and outputs 3 shapefiles. So for the 3 CSV files it produces 9 shapefiles. Granted, each run will overwrite the last, but that's not what I want, as I am also writing out to Oracle, and in the database I will get 3 copies of each feature in the end table.

Any help gratefully received.

Badge

I have looked at that; they each only come in once. See my comment on the original post; the problem has evolved. Thanks.

Userlevel 4
Badge +25

Edit: Actually - I wouldn't do this. I'd try the other idea first. Your notification should just be 1 per file which you should feed to the Reader.

----------------

The only thing I can think of is that you tie that workspace to a particular engine. That way the 2nd and 3rd runs will be queued up and not run immediately.

Then, if you make the workspace delete the source data after reading, the 2nd and 3rd runs will error out because the source data is missing.

Not particularly elegant - and I look forward to a better answer - but it should work.

Userlevel 4
Badge +25

Or... what content does the directory watch notification return? It should be something like:

{
  "dirwatch_publisher_content": "ENTRY_CREATE C:\\apps\\FMEServer\\Temp\\sample_file.txt\n",
  "fns_type": "dirwatch_publisher"
}

In which case you should get three notifications, each (I expect) mentioning a single file. Can you get your workspace to read just that file, instead of reading all files in the folder (which it sounds like you are doing)?
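To illustrate what "read just that file" means outside of FME, here is a minimal Python sketch that pulls the event name and file path out of a notification shaped like the one above. The function name `parse_dirwatch` is hypothetical, and the sketch assumes each line of `dirwatch_publisher_content` is an event name, a space, then the path. Inside a workspace you would do the equivalent with transformers rather than Python.

```python
import json
from typing import List, Tuple


def parse_dirwatch(message: str) -> List[Tuple[str, str]]:
    """Return (event, path) pairs from a Directory Watch JSON message.

    Assumes the layout shown in this thread: one "EVENT path" entry per line.
    """
    payload = json.loads(message)
    events = []
    for line in payload.get("dirwatch_publisher_content", "").splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only, so paths containing spaces survive.
        event, _, path = line.partition(" ")
        events.append((event, path))
    return events


# Build a sample message equivalent to the one quoted above.
sample = json.dumps({
    "dirwatch_publisher_content":
        "ENTRY_CREATE C:\\apps\\FMEServer\\Temp\\sample_file.txt\n",
    "fns_type": "dirwatch_publisher",
})

for event, path in parse_dirwatch(sample):
    print(event, path)
```

The path that comes out is the single file the run should read, rather than a wildcard over the whole folder.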

Userlevel 4
Badge +25

It's like I mentioned above. Does dirwatch_publisher_content list one file or three? And how is your Reader set up? I would hope that the notification lists just one file which the workspace would read. Don't have the workspace set up to read all files in the folder.

Badge

I paste in 3 CSV files and get 6 of those notifications returned, 2 for each CSV file.

This in turn triggers 6 workspace runs, each reading all 3 CSV files and processing them.

If I could get only 3 notifications, and then either 1 workspace run that processes them all or 3 runs that each process a different file, that would be great.

I still haven't found a way, though.

I'm still very new to Server and its workings.

Userlevel 4
Badge +25

I've asked our server guys to chime in on this one. I suspect you are getting both create and modify notifications for each file. I think there's a timeout value that you may need to increase (otherwise the file is flagged as modified while it is still being copied into the folder). Your workspace will need a TextFile reader to accept the JSON notification. Then use a FeatureReader to read the file mentioned in the JSON. That should help, but if I'm wrong I hope one of our server folk will speak up.
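If the doubled notifications really are a create followed by a modify, one way to picture the fix is to act only on the create events and ignore repeats for the same file. The Python sketch below shows that logic in isolation. The event name `ENTRY_MODIFY` is an assumption (only `ENTRY_CREATE` appears in this thread), and `files_to_process` is a hypothetical helper, not part of FME.

```python
from typing import Iterable, List, Set, Tuple


def files_to_process(events: Iterable[Tuple[str, str]]) -> List[str]:
    """Keep one path per file, taken from its first ENTRY_CREATE event."""
    seen: Set[str] = set()
    result: List[str] = []
    for event, path in events:
        if event != "ENTRY_CREATE":
            continue  # skip modify (and any other) events
        if path in seen:
            continue  # already queued this file
        seen.add(path)
        result.append(path)
    return result


# Six notifications for three files, as described above.
events = [
    ("ENTRY_CREATE", "a.csv"), ("ENTRY_MODIFY", "a.csv"),
    ("ENTRY_CREATE", "b.csv"), ("ENTRY_MODIFY", "b.csv"),
    ("ENTRY_CREATE", "c.csv"), ("ENTRY_MODIFY", "c.csv"),
]
print(files_to_process(events))
```

One caveat with this approach: a create event can arrive before a large file has finished copying, so a run triggered on create alone may need to wait or retry before reading.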

Badge +9

Hi @baznewman07,

This is a tricky problem, and my hope is that we can improve the process to make it easier to use. A key part of using Directory Watch is parsing the incoming notifications. My guess here is that you may be seeing a 'Create' followed by a 'Modify' for each file.

There is a workspace within this tutorial that should help: https://knowledge.safe.com/articles/1439/fme-serve...

It will show you how to take the notification and parse it to grab out the filename. Then you can use a FeatureReader or another workspace to process the data. Give it a try and see if it helps your situation.

Note: if you wanted to process all the CSV files at a single time, there is an advanced workflow within that tutorial. It is similar to the basic process described above but instead of processing the files right away, a database is used to store every notification. Then a separate scheduled job will read the database and process the correct number of files. Have a look here:

https://knowledge.safe.com/articles/1400/directory...
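As a rough picture of the "store every notification, process on a schedule" idea, here is a small Python sketch using SQLite as a stand-in database. The table name `pending_files`, its columns, and the function names are all hypothetical; the tutorial's actual schema and workspaces may look quite different.

```python
import sqlite3
from typing import List


def open_queue(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the queue database and create the table if needed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending_files ("
        "path TEXT PRIMARY KEY, processed INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record_notification(conn: sqlite3.Connection, path: str) -> None:
    """Called once per notification; duplicates for a path are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO pending_files (path) VALUES (?)", (path,)
    )
    conn.commit()


def claim_pending(conn: sqlite3.Connection) -> List[str]:
    """Called by the scheduled job; returns each unprocessed path once."""
    rows = conn.execute(
        "SELECT path FROM pending_files WHERE processed = 0 ORDER BY path"
    ).fetchall()
    conn.execute("UPDATE pending_files SET processed = 1 WHERE processed = 0")
    conn.commit()
    return [row[0] for row in rows]


conn = open_queue()
for path in ["a.csv", "a.csv", "b.csv"]:  # a.csv notified twice
    record_notification(conn, path)
print(claim_pending(conn))  # each file listed once
print(claim_pending(conn))  # nothing left on the next scheduled run
```

Because the path is the primary key, a create and a modify for the same file collapse into one row, which is what keeps each feature from being written to the end table more than once.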

Badge +1

Have you set your FME script as a subscriber to the topic that receives the Directory Watch message? If so, you can use a Directory and File Pathnames reader in your FME script to read the contents of the directory. This happens whenever the directory changes and the FME script executes.

Next, you can test for the filenames that you are interested in using the path_filename attribute of the reader. If one or more match, use a DateFormatter to format the path_modified_date attribute of the feature as a timestamp (that is, %s).

Next, use a TimeStamper to get the timestamp for the current date (that is, ^s). Now you can compare the two timestamps for a given file and determine whether it has been changed within a reasonable period of time, say, 10 seconds. If so, you know which file has been changed and can run the rest of your FME script. Otherwise, you just exit.
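The timestamp comparison described above can be sketched in plain Python as follows. This is only an illustration of the logic, not the FME transformers themselves; `recently_changed` and its parameters are hypothetical names, and the 10-second window is the example value from this post.

```python
import os
import time
from typing import List, Optional


def recently_changed(directory: str,
                     suffix: str = ".csv",
                     window_seconds: float = 10.0,
                     now: Optional[float] = None) -> List[str]:
    """List files with the given suffix modified within the last window."""
    now = time.time() if now is None else now
    changed = []
    for entry in os.scandir(directory):
        # Only consider regular files with the suffix of interest.
        if not entry.is_file() or not entry.name.lower().endswith(suffix):
            continue
        # Compare the file's modification time with the current time.
        age = now - entry.stat().st_mtime
        if age <= window_seconds:
            changed.append(entry.path)
    return sorted(changed)
```

Calling `recently_changed(folder)` from a run triggered by the notification returns only the files touched in the last 10 seconds; an empty list corresponds to the "just exit" case.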
