Question

How to improve the performance of the Directory and File Pathnames reader when listing all Excel files in a folder and its subfolders

  • 9 August 2021
  • 9 replies
  • 6 views

Badge +3

I would like to read the names of all Excel files in a folder and its subfolders, but it takes a long time. Any ideas, please?


9 replies

Badge +10

Are you using a Path Filter to read in only Excel files (.xls, .xlsx)?

https://docs.safe.com/fme/html/FME_Desktop_Documentation/FME_ReadersWriters/path/Set_Path_Filter.htm

Userlevel 5
Badge +29

A lot of things could affect the speed of this. As @drc43 mentioned, are you filtering by Excel files? Also, how deep are you searching (in the folder hierarchy), how many files and folders are there to check, is the data on an HDD or SSD, and is it local or remote? If it's remote, what's your network connection like?

I've run similar processes before (looking for .xlsx) over a remote network drive that held well in excess of 20 TB of data. It took a couple of days to complete.

Badge +3

I have set the Path Filter to *.xlsx, but I also want to read .xls files in case there are files from an older version of Excel.

What should that look like? I tried *.xls|*.xlsx in the Path Filter but got nothing, and I also tried .xls, .xlsx.

I only get output when I use *.xlsx.

What should I write in the Path Filter?

Badge +3

The data is on an HDD, but it is accessed remotely.

Is the Path Filter the only thing I can use, or is there anything more I can do to increase performance? I have already set the Path Filter, but it is still slow, and the data is around 5 GB.

Userlevel 5
Badge +29

What's your definition of slow?

If it's on a remote spinning disk (HDD), then I would expect poor performance. What is the speed like when browsing the disk in Windows Explorer? Roughly how many files make up the 5 GB?
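
One way to answer both questions outside FME is to time a plain directory walk. The snippet below is only a rough sketch in Python, not part of FME: the function name `count_excel_files` is made up for illustration, and it assumes the remote folder is reachable as an ordinary path from the machine running the script.

```python
import os
import time

# Extensions treated as Excel files for this check (assumption: .xls and .xlsx only).
EXCEL_EXTENSIONS = (".xls", ".xlsx")


def count_excel_files(root):
    """Walk root recursively and return (excel_count, total_files, seconds)."""
    start = time.perf_counter()
    excel_count = 0
    total_files = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total_files += 1
            # Compare in lower case so REPORT.XLSX is counted as well.
            if name.lower().endswith(EXCEL_EXTENSIONS):
                excel_count += 1
    elapsed = time.perf_counter() - start
    return excel_count, total_files, elapsed
```

Calling `count_excel_files(r"\\server\share\folder")` (a placeholder path) gives the file counts and the time the walk took. If the plain walk is already slow, the disk or network is the likely bottleneck rather than the reader settings.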

Badge +2

@spiderman The documentation that @drc43 referenced probably gives the hint you need under the Path Filter section. Using the Path Filter rather than a wildcard path such as c:/temp/**/*.xlsx makes quite a difference.

Badge +3

How can I also include the .xls extension, not only .xlsx?

Userlevel 4

As mentioned, a remote disk is not ideal for performance. Additionally, make sure that you haven't activated "Retrieve file properties" on the reader. That can add quite a lot of time, especially for remote partitions with large numbers of files.

Userlevel 3
Badge +17

According to the documentation you need to use: {*.xls,*.xlsx} or *.xls*

 

http://docs.safe.com/fme/2019.1/html/FME_Desktop_Documentation/FME_ReadersWriters/path/PATH_reader.htm
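
To see what the two suggested patterns would select, here is a small Python illustration of ordinary glob matching with brace alternatives. It is only a model and not FME's own matcher: `expand_braces` and `matches` are hypothetical helpers, and the case-insensitive comparison is an assumption.

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand {a,b,...} groups into separate glob patterns (no nesting)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def matches(filename, pattern):
    """True if filename matches any alternative of pattern."""
    # Assumption: compare case-insensitively, as Windows file names usually are.
    name = filename.lower()
    return any(fnmatch.fnmatchcase(name, p.lower()) for p in expand_braces(pattern))


names = ["report.xls", "report.xlsx", "macro.xlsm", "notes.txt"]
for pattern in ("{*.xls,*.xlsx}", "*.xls*", "*.xls|*.xlsx"):
    print(pattern, [n for n in names if matches(n, pattern)])
# {*.xls,*.xlsx} ['report.xls', 'report.xlsx']
# *.xls* ['report.xls', 'report.xlsx', 'macro.xlsm']
# *.xls|*.xlsx []
```

In this model, *.xls* also picks up other extensions that start with .xls, such as .xlsm, so {*.xls,*.xlsx} is the stricter choice. The | character is treated as a literal here, so *.xls|*.xlsx matches nothing.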

 

You can view the documentation by clicking Help in the Parameters dialog, or by clicking "See Directory and File Pathnames Reader Parameters for additional information" and then Path Filter.
