Question

How to improve the performance of the Directory and File Pathnames reader when reading the names of all Excel files in a folder and its subfolders

  • August 9, 2021
  • 9 replies
  • 61 views

spiderman
Contributor

I would like to read the names of all Excel files in a folder and its subfolders, but it takes a long time. Any ideas, please?

9 replies

drc43
Contributor
  • 83 replies
  • August 9, 2021

Are you using a Path Filter to read in only Excel files (.xls, .xlsx)?

https://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_ReadersWriters/path/Set_Path_Filter.htm


hkingsbury
Celebrity
  • 1625 replies
  • August 9, 2021

A lot of things could affect the speed of this. As @drc43 mentioned, are you filtering by Excel files? Also, how deep are you searching in the folder hierarchy? How many files and folders are there to check? Is the data on an HDD or an SSD? Is it local or remote, and if it's remote, what's your network connection like?

I've run similar processes before (looking for .xlsx) over a remote network drive that had well in excess of 20 TB of data. It took a couple of days to complete.


spiderman
Contributor
  • Author
  • Contributor
  • 64 replies
  • August 9, 2021

I have set the Path Filter to *.xlsx, but I also want to read .xls files in case there are files from an older version of Excel.

What should the filter look like? I tried *.xls|*.xlsx in the Path Filter and got nothing, and I also tried .xls, .xlsx.

I only get output when I use *.xlsx.

What should I write in the Path Filter?


spiderman
Contributor
  • Author
  • Contributor
  • 64 replies
  • August 9, 2021

The data is on an HDD, but it is accessed remotely.

Is the Path Filter the only option, or is there anything more I could do to increase performance? I have already set the Path Filter, but it is still slow, and the data is around 5 GB.


hkingsbury
Celebrity
  • 1625 replies
  • August 9, 2021

What's your definition of slow?

If it's on a remote spinning disk (HDD), then I would expect poor performance. What is the speed like when browsing the disk using Windows Explorer? Roughly how many files make up the 5 GB?
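One way to answer both questions outside of FME is to time a plain recursive scan of the same folder. The following is only a rough sketch in Python, not anything from the FME documentation; the function name `count_excel_files` and the share path are made up for illustration. If this scan is also slow, the bottleneck is probably the disk or network rather than the reader.

```python
import os
import time


def count_excel_files(root, extensions=(".xls", ".xlsx")):
    """Walk root recursively; return (total file count, Excel file count)."""
    total_files = 0
    excel_files = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total_files += 1
            # str.endswith accepts a tuple of suffixes
            if name.lower().endswith(extensions):
                excel_files += 1
    return total_files, excel_files


if __name__ == "__main__":
    # Hypothetical path; replace it with the folder you point the reader at.
    root = r"\\server\share\data"
    start = time.perf_counter()
    total, excel = count_excel_files(root)
    elapsed = time.perf_counter() - start
    print(f"{excel} Excel files out of {total} files in {elapsed:.1f} s")
```

Running it twice is worthwhile, since the second pass may be faster if the operating system has cached the directory listings.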


  • 1891 replies
  • August 9, 2021

@spiderman The documentation that @drc43 referenced probably gives the hint you need under the Path Filter section. Using the Path Filter, rather than a source path such as c:/temp/**/*.xlsx, makes quite a difference.


spiderman
Contributor
  • Author
  • Contributor
  • 64 replies
  • August 9, 2021

How could I also include the .xls extension, not only .xlsx?


david_r
Celebrity
  • 8392 replies
  • August 10, 2021

As mentioned, a remote disk is not ideal for performance. Additionally, make sure that you haven't activated "Retrieve file properties" on the reader; that can add quite a lot of time, especially for remote partitions with large numbers of files.
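To get a feel for why fetching properties can cost so much, here is a small Python sketch that compares listing names only with listing names plus size and modification time. It assumes that retrieving properties means roughly one extra metadata lookup per file, which is my guess and not something the FME documentation states; the helper names are invented for this example.

```python
import os
import time


def list_names(root):
    """Collect file paths from directory listings only."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in filenames)
    return paths


def list_with_properties(root):
    """Collect (path, size, mtime), making one os.stat call per file."""
    rows = []
    for path in list_names(root):
        info = os.stat(path)
        rows.append((path, info.st_size, info.st_mtime))
    return rows


def timed(func, root):
    """Run func(root) and return (number of results, elapsed seconds)."""
    start = time.perf_counter()
    result = func(root)
    return len(result), time.perf_counter() - start
```

Calling `timed(list_names, root)` and then `timed(list_with_properties, root)` on the remote folder gives a rough comparison. The second call may benefit from caching done during the first, so the real difference could be larger than what you measure.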


jkr_wrk
Influencer
  • 424 replies
  • August 10, 2021

According to the documentation, you need to use {*.xls,*.xlsx} or *.xls*

http://docs.safe.com/fme/2019.1/html/FME_Desktop_Documentation/FME_ReadersWriters/path/PATH_reader.htm

You can view the documentation by clicking Help in the Parameters dialog, or by clicking "See Directory and File Pathnames Reader Parameters for additional information" and then Path Filter.
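The two patterns are not quite equivalent, and a quick experiment shows the difference. The sketch below is plain Python, not FME: it assumes the Path Filter follows ordinary glob rules, and because Python's `fnmatch` module has no brace support, the `expand_braces` helper (a name made up here) expands a single `{a,b}` group by hand.

```python
from fnmatch import fnmatchcase


def expand_braces(pattern):
    """Expand one {a,b,...} group into separate patterns; return [pattern] if none."""
    start = pattern.find("{")
    end = pattern.find("}", start)
    if start == -1 or end == -1:
        return [pattern]
    head, body, tail = pattern[:start], pattern[start + 1:end], pattern[end + 1:]
    return [head + option + tail for option in body.split(",")]


def matches(filename, pattern):
    """True if filename matches any alternative of the pattern, ignoring case."""
    return any(
        fnmatchcase(filename.lower(), alternative.lower())
        for alternative in expand_braces(pattern)
    )


if __name__ == "__main__":
    names = ["report.xls", "report.xlsx", "macro.xlsm", "notes.txt"]
    for pattern in ["{*.xls,*.xlsx}", "*.xls*", "*.xls|*.xlsx"]:
        print(pattern, [n for n in names if matches(n, pattern)])
```

Under these assumptions, {*.xls,*.xlsx} selects only the two extensions asked for, while *.xls* also picks up anything else whose extension starts with .xls, such as .xlsm. The *.xls|*.xlsx attempt matches none of the sample names here, because in ordinary glob syntax the | character is a literal, not an "or".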