Question

How to improve the performance of the Directory and File Pathnames reader when reading the names of all Excel files in a folder and its subfolders


spiderman
Contributor

I would like to read the names of all Excel files in a folder and its subfolders, but it takes a long time. Any ideas, please?

9 replies

drc43
Contributor
  • August 9, 2021

Are you using a Path Filter to read in only excel files? (.xls, .xlsx)

https://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_ReadersWriters/path/Set_Path_Filter.htm


hkingsbury
Celebrity
  • August 9, 2021

A lot of things could affect the speed of this. As @drc43 mentioned, are you filtering by Excel files? Also, how deep are you searching (in the folder hierarchy), how many files/folders are there to check, is the data on an HDD or SSD, is it local or remote, and if it's remote, what's your network connection like?

 

I've run similar processes before (looking for .xlsx) over a remote network drive that held well in excess of 20 TB of data. It took a couple of days to complete.


spiderman
Contributor
  • Author
  • August 9, 2021
drc43 wrote:

Are you using a Path Filter to read in only excel files? (.xls, .xlsx)

https://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_ReadersWriters/path/Set_Path_Filter.htm

I have set the path filter to *.xlsx, but I also want to read .xls files in case there are files from an old version of Excel.

What should that look like? I tried *.xls|*.xlsx in the path filter and got nothing, and I also tried .xls, .xlsx.

I only get output when I use *.xlsx.

What should I write in the path filter?

 

 


spiderman
Contributor
  • Author
  • August 9, 2021
hkingsbury wrote:

A lot of things could affect the speed of this. As @drc43 mentioned, are you filtering by Excel files? Also, how deep are you searching (in the folder hierarchy), how many files/folders are there to check, is the data on an HDD or SSD, is it local or remote, and if it's remote, what's your network connection like?

 

I've run similar processes before (looking for .xlsx) over a remote network drive that held well in excess of 20 TB of data. It took a couple of days to complete.

The data is on an HDD, accessed remotely.

Is the path filter the only thing I can use, or is there anything more I could do to increase performance? I have already set the path filter, but it is still slow, and the data is around 5 GB.


hkingsbury
Celebrity
  • August 9, 2021
spiderman wrote:

The data is on an HDD, accessed remotely.

Is the path filter the only thing I can use, or is there anything more I could do to increase performance? I have already set the path filter, but it is still slow, and the data is around 5 GB.

What's your definition of slow?

If it's on a remote spinning disk (HDD), then I would expect poor performance. What is the speed like when browsing the disk using Windows Explorer? Roughly how many files make up the 5 GB?
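To put a number on "slow" and on the file count asked about above, a rough timing outside FME can help. The following is only a minimal sketch in Python, not part of FME; the function name `count_excel_files` and the `C:\temp` path are hypothetical placeholders.

```python
import os
import time


def count_excel_files(root, extensions=(".xls", ".xlsx")):
    """Walk root recursively; return (matching_files, total_files, seconds)."""
    start = time.perf_counter()
    matching = 0
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += 1
            # Compare in lower case so .XLS and .xlsx are both counted.
            if name.lower().endswith(extensions):
                matching += 1
    return matching, total, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical path; replace it with the folder the reader points at.
    matching, total, seconds = count_excel_files(r"C:\temp")
    print(f"{matching} Excel files out of {total} files in {seconds:.1f} s")
```

If this plain directory walk is already slow on the remote share, the bottleneck is likely the disk or the network rather than the workspace.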


markatsafe

@spiderman The documentation that @drc43 referenced probably gives the hint you need, under the Path Filter section. Using the Path Filter parameter rather than a wildcard source path such as c:/temp/**/*.xlsx makes quite a difference.


spiderman
Contributor
  • Author
  • August 9, 2021
markatsafe wrote:

@spiderman The documentation that @drc43 referenced probably gives the hint you need, under the Path Filter section. Using the Path Filter parameter rather than a wildcard source path such as c:/temp/**/*.xlsx makes quite a difference.

How can I also include the .xls extension, not only .xlsx?


david_r
Celebrity
  • August 10, 2021

As mentioned, a remote disk is not ideal for performance. Additionally, make sure that you haven't activated "Retrieve file properties" on the reader; that can add quite a lot of time, especially for remote partitions with large numbers of files.
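As a loose analogy for why retrieving file properties costs extra, the sketch below contrasts listing names only with additionally querying metadata for every file. It is plain Python and does not reflect how the FME reader is implemented; both function names are hypothetical.

```python
import os


def list_names(root):
    """Yield file paths only; no per-file metadata is requested."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def list_names_with_properties(root):
    """Yield (path, size, modified_time) using one extra os.stat call per file."""
    for path in list_names(root):
        info = os.stat(path)  # extra metadata lookup for each file
        yield path, info.st_size, info.st_mtime
```

On a local SSD the difference may be small; on a remote share each extra metadata lookup can mean another network round trip, which adds up over many files.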


jkr_wrk
Influencer
  • August 10, 2021
spiderman wrote:

I have set the path filter to *.xlsx, but I also want to read .xls files in case there are files from an old version of Excel.

What should that look like? I tried *.xls|*.xlsx in the path filter and got nothing, and I also tried .xls, .xlsx.

I only get output when I use *.xlsx.

What should I write in the path filter?

 

 

According to the documentation you need to use: {*.xls,*.xlsx} or *.xls*

 

http://docs.safe.com/fme/2019.1/html/FME_Desktop_Documentation/FME_ReadersWriters/path/PATH_reader.htm

 

You can view the documentation by clicking Help in the Parameters dialog.

 

Or by clicking "See Directory and File Pathnames Reader Parameters for additional information" and then Path Filter.
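To see how the two suggested patterns differ, here is a small sketch using generic glob semantics in Python. It assumes the Path Filter behaves like an ordinary glob with brace alternatives, which may not match FME exactly; `expand_braces` and `matches` are hypothetical helpers, and the matching here is case-sensitive.

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand one non-nested {a,b} group into a list of plain glob patterns."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    return [head + option + tail for option in match.group(1).split(",")]


def matches(name, pattern):
    """Return True if name matches any alternative of the pattern."""
    return any(fnmatch.fnmatchcase(name, p) for p in expand_braces(pattern))


names = ["a.xls", "b.xlsx", "c.xlsm", "d.csv"]
print([n for n in names if matches(n, "{*.xls,*.xlsx}")])  # ['a.xls', 'b.xlsx']
print([n for n in names if matches(n, "*.xls*")])          # ['a.xls', 'b.xlsx', 'c.xlsm']
print([n for n in names if matches(n, "*.xls|*.xlsx")])    # []
```

Under these assumptions, `*.xls*` is shorter but also picks up other extensions that start with .xls (such as .xlsm), while the brace form matches only the two listed extensions. The `|` character is treated as a literal, which is consistent with the empty result reported earlier in the thread.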

