Question

Merging updates into a directory

  • 5 December 2013
  • 3 replies

I'm sure this is something FME should be able to do.

 

I have a master directory of JPGs named with grid square and date thus:

 

SD0907_2008_16_13.jpg

 

and a directory of updated JPGs named thus:

 

SD0907_2011_08_23.jpg.

 

I want to merge the two directories into a third directory containing the latest JPG for each area, retaining the date in the file name for the next iteration. Not all tiles are updated, and I can't trust the file date, only the date embedded in the filename.

 

My workspace so far reads the master directory and pulls in each JPG in turn. I parse the grid square and the date into attributes using the AttributeSplitter and AttributeRenamer. I create an attribute of the form *SD0907* and use it as the Initiator in a FeatureReader which is looking at the Updates directory. I then expect to get a match on the area some of the time (not every tile is updated each cycle), to parse the date of the update, and to write either the original file or the updated file using a couple of TestFilters. I've set the FeatureReader to keep the Result rather than the Initiator in cases of conflict, which seems the right way to do it.

 

However, I can't seem to find the name of the updated file (fme_basename doesn't change), so I can't get the date from it. I may be going about this in completely the wrong way, but before I go off to learn Python and do it programmatically I thought I'd ask the question here.

 

Any suggestions?

 

Chris BB

 


3 replies

Hi,

 

 

if you only need to copy the raster files without any further processing, consider using the "Directory and File Pathnames" reader in conjunction with the "File Copy" writer.

 

 

It is much quicker as FME doesn't have to parse the rasters into memory, only the file references. In other words, I'd avoid the FeatureReader and the jpg reader/writer.

 

 

David
Hi Chris,

 

 

I agree with David on using the Directory and File Pathnames reader and the File Copy writer. A workspace like this might do the required job.

 

  1. Add a Directory and File Pathnames reader; specify the master directory and the updated directory as Dataset, "*.jpg" as Path Filter and "FILE" as Allowed Path Type. The reader creates a feature for every "*.jpg" file in the specified directories. Each feature has several attributes named "path_***"; the ones you can use in this case are:

       path_rootname: file name without extension
       path_filename: file name
       path_windows (or path_unix): full path

  2. Add an AttributeSplitter to split "path_rootname" at the underscore:

       _list{0}  Area Name (e.g. SD0907)
       _list{1}  Year (YYYY)
       _list{2}  Month (mm)
       _list{3}  Day (dd)

  3. Sort the features in descending order by Year (_list{1}), Month (_list{2}) and Day (_list{3}).

  4. Select the first feature for each Area Name (_list{0}) using a DuplicateRemover.

       Key Attributes: _list{0}

  5. Connect an AttributeRenamer to the UNIQUE port of the DuplicateRemover, and rename these attributes:

       path_windows (or path_unix) --> filecopy_source_dataset
       path_filename --> filecopy_dest_filename

     "filecopy_***" are format attribute names of the File Copy writer. The writer will copy the file given in "filecopy_source_dataset" to a file named "filecopy_dest_filename" in the specified directory.

  6. Add a File Copy writer; specify the third directory as Dataset, and connect it to the AttributeRenamer.

  7. Run the workspace.
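
For comparison, since Python was mentioned in the question, the selection logic of the steps above (split the root name, then keep the newest feature per area) can be sketched outside FME roughly as follows. This is only an illustration, not part of the workspace: the function names and the third sample tile are made up, and the date parts are compared as plain integers rather than validated as calendar dates.

```python
from pathlib import PurePath


def parse_tile(path):
    """Split a name like 'SD0907_2011_08_23.jpg' into (area, (year, month, day))."""
    area, year, month, day = PurePath(path).stem.split("_")
    return area, (int(year), int(month), int(day))


def latest_per_area(paths):
    """Return {area: path}, keeping the path with the newest embedded date."""
    best = {}
    for path in paths:
        area, date = parse_tile(path)
        # Tuples compare element by element: year first, then month, then day.
        if area not in best or date > best[area][0]:
            best[area] = (date, path)
    return {area: path for area, (_, path) in best.items()}


files = [
    "master/SD0907_2008_16_13.jpg",
    "updates/SD0907_2011_08_23.jpg",
    "master/SD0908_2008_06_13.jpg",  # hypothetical tile with no update
]
selected = latest_per_area(files)
for area, path in selected.items():
    print(area, path)
```

The dictionary plays the role of the Sorter-plus-DuplicateRemover pair: one entry per area, overwritten only when a newer date turns up.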

 

Takashi
That's great. Very many thanks. I had to change the final filecopy_dest_filename and filecopy_source_dataset to wildcards to copy over the supporting tab and jgw files, but it's all working smoothly and quickly.
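
For anyone scripting the same idea instead, a rough Python equivalent of the wildcard step might look like the sketch below. It assumes the supporting files share the JPG's root name and sit in the same directory; `copy_with_sidecars` is a made-up helper name, and this is not a description of how the File Copy writer works internally.

```python
import shutil
import tempfile
from pathlib import Path


def copy_with_sidecars(jpg_path, dest_dir):
    """Copy a tile plus every file sharing its root name (e.g. .tab, .jgw)."""
    src = Path(jpg_path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # "<rootname>.*" matches the JPG itself and its supporting files.
    for sibling in sorted(src.parent.glob(src.stem + ".*")):
        shutil.copy2(sibling, dest / sibling.name)
        copied.append(sibling.name)
    return copied


# Small demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    updates = Path(tmp) / "updates"
    updates.mkdir()
    for ext in (".jpg", ".jgw", ".tab"):
        (updates / ("SD0907_2011_08_23" + ext)).write_bytes(b"")
    copied = copy_with_sidecars(updates / "SD0907_2011_08_23.jpg", Path(tmp) / "merged")
    print(copied)
```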

 

Chris

 
