
Hi everyone. I have an XLSX file with a field containing the name of a PDF. I also have a folder containing the PDFs. I want to check whether every name in the XLSX file has a matching PDF in that folder. Thanks for your help :)

 

There are multiple PythonCaller-based custom transformers that can do this; how you implement them is up to you.
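For reference, a minimal sketch of what such a PythonCaller class could look like, assuming the Excel path is stored in an attribute called pdf_path. That name and the pdf_exists result attribute are hypothetical, so adjust them to your workspace. The pyoutput method is supplied by FME at run time and is not defined here.

```python
import os


class FeatureProcessor(object):
    """PythonCaller-style class: flags each feature with whether its PDF exists."""

    # Hypothetical attribute names; change them to match your workspace.
    PATH_ATTR = "pdf_path"      # Excel attribute holding the full PDF path
    RESULT_ATTR = "pdf_exists"  # attribute written by this sketch

    def input(self, feature):
        path = feature.getAttribute(self.PATH_ATTR)
        # A missing or empty attribute counts as "file not found".
        exists = bool(path) and os.path.isfile(str(path))
        feature.setAttribute(self.RESULT_ATTR, "yes" if exists else "no")
        # pyoutput is provided by FME when the class runs inside a PythonCaller.
        self.pyoutput(feature)

    def close(self):
        pass
```

A Tester on pdf_exists after the PythonCaller would then separate the features that have a PDF from those that do not.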

Here is an idea based exclusively on FME transformers:

Read the Excel file and send the features to a FilenamePartExtractor where you set your Path attribute (from Excel) in the Source Filename parameter of the transformer.

Send this to a FeatureReader with Format set to Directory and File Pathnames. Set its Dataset to the _dirpath attribute and its Path Filter to the _filename attribute (both output by the FilenamePartExtractor). You can find Path Filter by clicking the Parameters button.

Use the output of the <Initiator> port to test for _matched_records = 1 if the file exists or _matched_records = 0 if it does not. You can use a Tester, a TestFilter, an AttributeManager with a conditional value, etc.

If you need the original attributes from Excel, set Accumulation Mode to Merge Initiator and Result in the Attribute and Geometry Handling section of the FeatureReader.

I have attached a sample workspace that can get you started.
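If it helps to see the logic outside of FME, the steps above can be sketched in plain Python. This is only an illustration, not what the transformers do internally: the folder part plays the role of _dirpath, the file name plays the role of _filename, and the returned count plays the role of _matched_records. The function name count_matches is made up for this example, and the sketch compares file names exactly.

```python
import os


def count_matches(full_path):
    """Split a path into folder and file name, then count matching files.

    The folder stands in for _dirpath, the file name for _filename,
    and the return value for _matched_records.
    """
    dirpath, filename = os.path.split(full_path)
    if not os.path.isdir(dirpath):
        # Folder missing (or no folder given): nothing can match.
        return 0
    return sum(
        1
        for entry in os.listdir(dirpath)
        if entry == filename and os.path.isfile(os.path.join(dirpath, entry))
    )
```

A result of 1 means the PDF exists and 0 means it does not, which is the same test described for the <Initiator> port.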


Hi @hugues,

 

I think you could use a FeatureMerger transformer for this. I would run the Excel Reader into the Requestor port and a Directory and File Pathnames Reader into the Supplier port. In the transformer, use your Excel field with the PDF link as the Requestor join attribute, and the path_windows attribute from the Directory Reader as the Supplier join attribute. Anything that comes out of the Merged port has a PDF in your folder, and anything that comes out of the UnmergedRequestor port does not.

 

Note: This method assumes that the data in your Excel field exactly matches the actual file and folder path. If it does not, you may need to manipulate the data (change case, etc.) before running it into the FeatureMerger. Hope that helps. :)
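To illustrate why the exact-match caveat matters, here is a small Python sketch of the same join with a normalization step applied to both sides first. The normalization rules (trim whitespace, unify slashes, lower-case) are assumptions chosen for the example, and the function names are hypothetical. In FME you would apply the equivalent clean-up to both join attributes before the FeatureMerger.

```python
import posixpath


def normalize(path):
    """Make two spellings of the same path compare equal (assumed rules)."""
    unified = path.strip().replace("\\", "/")
    return posixpath.normpath(unified).lower()


def split_by_match(excel_paths, folder_paths):
    """Return (merged, unmerged), like the Merged / UnmergedRequestor ports."""
    available = {normalize(p) for p in folder_paths}
    merged, unmerged = [], []
    for p in excel_paths:
        # Original spelling is kept in the output; only the comparison is normalized.
        (merged if normalize(p) in available else unmerged).append(p)
    return merged, unmerged
```

Lower-casing is reasonable for Windows paths, where file names are normally case-insensitive; drop it if your files live on a case-sensitive file system.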

