Question

directory and pathname reader performance lag

2 years ago
December 1, 2022
9 replies
51 views

gisbradokla
Enthusiast
132 replies

I have been using this transformer for quite some time and have been living with the pain.

I have done some more testing with it and discovered that if I don't use recurse, there is a significant performance gain, but with a giant loss of functionality!

It can take anywhere from 5 minutes to HOURS to get a response from the initial read in Desktop. It is slightly better on Server; however, I am usually not waiting on it at that point (if I need it to work, I publish to Server and walk away). It may be tomorrow before it completes.

Are there any workarounds? I have considered using a SystemCaller to just run a dir command, get really bad data back, and rebuild it from that.
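The "run dir and rebuild from that" idea could look roughly like the sketch below. It assumes the listing comes from `dir /s /b /a-d`, which prints one full path per line for files only, with no size or date. The function names and the record fields are hypothetical, and the `cmd` call only works on Windows.

```python
import subprocess
from pathlib import PureWindowsPath


def parse_bare_dir_output(text):
    """Turn `dir /s /b` output (one full path per line) into simple records."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines in the listing
        p = PureWindowsPath(line)
        records.append({
            "path": str(p),
            "directory": str(p.parent),
            "name": p.name,
            "extension": p.suffix.lower(),
        })
    return records


def list_files_with_dir(root):
    """Windows only: run `dir` recursively on root and parse the result."""
    result = subprocess.run(
        ["cmd", "/c", "dir", "/s", "/b", "/a-d", root],
        capture_output=True,
        text=True,
    )
    return parse_bare_dir_output(result.stdout)
```

Because the bare listing carries only paths, any file properties would still have to be fetched separately for the files that are actually needed.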

dustin
Influencer
625 replies
2 years ago
December 1, 2022

Could you post your workspace, or at least the portion where you are using the reader? I've never had any performance issues, but I wasn't reading large directories either.

gisbradokla
Author
Enthusiast
132 replies
2 years ago
December 1, 2022

Yes, but it doesn't matter how large the directory is. I am ultimately reading large datasets and using a file filter as well. I realize both of these are going to add overhead, as will another requirement: it runs on Server in its final form, so it is a network drive and I am searching on the UNC path, not the mapped drive.

gisbradokla
Author
Enthusiast
132 replies
2 years ago
December 1, 2022

This reads the UNC path, converts it to a mapped drive letter, creates some more attributes from the path, and then writes out to SQL (I deleted that connection).
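For readers without the workspace, the UNC-to-drive-letter step might be sketched as follows. The share name `\\server\share`, the letter `G:`, and the function name are made-up placeholders, not values from the workspace.

```python
from pathlib import PureWindowsPath

# Hypothetical mapping of UNC share roots to mapped drive letters.
DRIVE_MAP = {r"\\server\share": "G:"}


def unc_to_mapped(path, drive_map=DRIVE_MAP):
    """Replace a known UNC share root with its mapped drive letter."""
    p = PureWindowsPath(path)
    for unc_root, letter in drive_map.items():
        # For a UNC path, p.drive is the '\\server\share' part.
        if p.drive.lower() == unc_root.lower():
            # p.parts[0] is the share root; keep everything after it.
            return str(PureWindowsPath(letter + "\\", *p.parts[1:]))
    return str(p)  # unknown share: leave the path unchanged
```

Using `PureWindowsPath` keeps this a pure string operation, so it does not touch the network drive at all.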

david_r
8349 replies
2 years ago
December 2, 2022

Did you perhaps enable Retrieve File Properties in the reader parameters? That can make a substantial difference if there are a lot of files. Also make sure to leverage the Path Filter, if you can, rather than using e.g. a Tester in the workspace.
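The reason filtering at the source helps can be illustrated outside FME with a small Python sketch. It is only an analogy, not how the reader's Path Filter is implemented: names are matched against a pattern while the tree is walked, so non-matching files are never touched again. The function name and the `*.shp` default are placeholders.

```python
import fnmatch
import os


def walk_filtered(root, pattern="*.shp"):
    """Yield full paths under root whose file name matches pattern."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # Match on names only; no per-file property lookup happens here.
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)
```

Note that `fnmatch.filter` follows the operating system's case rules, so the match is case-insensitive on Windows and case-sensitive elsewhere.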

gisbradokla
Author
Enthusiast
132 replies
2 years ago
December 2, 2022

david_r wrote:

Did you perhaps enable Retrieve File Properties in the reader parameters? That can make a substantial difference if there are a lot of files. Also make sure to leverage the Path Filter, if you can, rather than using e.g. a Tester in the workspace.

I do require the file properties. However, I can watch the difference between recurse yes and recurse no: yes takes several hours, while no runs immediately.

david_r
8349 replies
2 years ago
December 2, 2022

gisbradokla wrote:

I do require the file properties. However, I can watch the difference between recurse yes and recurse no: yes takes several hours, while no runs immediately.

I'm not sure that it's the recurse=yes/no setting that is causing the issue on its own, rather that the number of files whose properties have to be queried is much larger that way. On Windows, retrieving file properties is relatively slow, and when you multiply that by a large number of files it makes a difference.

I haven't retried with the most recent versions of FME, but some years ago I found that iterating over large directories using Python (os.walk) was substantially faster than FME, even when requesting the file properties.
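A minimal sketch of that `os.walk` approach, assuming only size and modification time are needed. The function name and the record fields are illustrative choices, and no timings are claimed for it here.

```python
import os
from datetime import datetime, timezone


def walk_with_properties(root):
    """Yield one record per file under root, including basic properties."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # skip files that vanished or cannot be read
            yield {
                "path": full,
                "size_bytes": st.st_size,
                "modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
            }
```

In FME this kind of function would typically be called from a Python-based transformer, with each record turned into a feature; that wiring is left out here.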

gisbradokla
Author
Enthusiast
132 replies
2 years ago
December 2, 2022

gisbradokla wrote:

I do require the file properties. However, I can watch the difference between recurse yes and recurse no: yes takes several hours, while no runs immediately.

If I set the number of features to 1 and set recurse to yes, it takes hours. If I set the number of features to 1 and set recurse to no, it takes seconds.

david_r
8349 replies
2 years ago
December 2, 2022

gisbradokla wrote:

I do require the file properties. However, I can watch the difference between recurse yes and recurse no: yes takes several hours, while no runs immediately.

The bottleneck is when reading the filesystem, which I believe happens before FME limits the number of features.
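Whether FME really scans the whole tree before applying the feature limit is a belief stated above, not something confirmed in this thread. For contrast, a lazy walk in Python stops as soon as the requested number of paths has been produced. The helper names below are hypothetical.

```python
import os
from itertools import islice


def iter_files(root):
    """Lazily yield file paths under root, one directory at a time."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def first_n_files(root, n):
    """Return at most n paths; the walk is abandoned once n are found."""
    return list(islice(iter_files(root), n))
```

With `n` set to 1, only the directories visited before the first file is found are read, so a limit applied this way would cost seconds rather than hours on a deep tree.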

gisbradokla
Author
Enthusiast
132 replies
2 years ago
December 2, 2022

gisbradokla wrote:

I do require the file properties. However, I can watch the difference between recurse yes and recurse no: yes takes several hours, while no runs immediately.

I understand there is a bottleneck. Will the bottleneck ever be fixed?

There is a tool there. Obviously, if I learned to use Python (as suggested), a dir command, or one of many other methods, I would not encounter the bottleneck that the directory and pathnames reader has. If it is not a pathnames reader for bulk data, then name it the single pathname reader.

EDIT: After 5 days, no response... Just left hanging. Not sure where to turn.

[Screenshot: scan using pathname reader]

Current time is 2022-12-7 9:04, still waiting.

[Screenshot: scan using pathname reader2]
