Solved

How to extract certain parts from a path name?

  • 22 August 2017
  • 9 replies
  • 34 views

Userlevel 1
Badge +22

Hi,

I'm looking for a best practice to extract parts of a path name.

If I have a path: S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf

I need to extract the part starting with "VandGraf", i.e. "VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf", into one attribute, and the file name into another.

What is the simplest way to accomplish this (without custom Python coding)?

Cheers.

Best answer by ebygomm 22 August 2017, 14:34

9 replies

Userlevel 1
Badge +21

A StringSearcher with a regex could work, depending on the exact rules you want to follow. For example, do you always need everything starting with VandGraf, or everything from that folder level down, where the folder might sometimes not be called VandGraf?

If the latter, taking the substring after the 6th backslash might be more straightforward.
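The "after the 6th backslash" idea can be written as a regular expression. Here is a minimal sketch in Python's `re` syntax, assuming the wanted folder always sits six levels below the drive letter. The function name `after_nth_backslash` is made up for illustration, and whether the same pattern can be pasted unchanged into a StringSearcher is an assumption, not something confirmed in this thread.

```python
import re

def after_nth_backslash(path, n=6):
    """Return everything after the n-th backslash, or None if there are fewer."""
    # (?:[^\\]*\\){n} consumes n segments, each ending in a backslash;
    # the capture group then takes the rest of the path.
    match = re.match(r"(?:[^\\]*\\){%d}(.+)" % n, path)
    return match.group(1) if match else None

path = (
    r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
    r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf"
)
print(after_nth_backslash(path))
```

This only works while the folder depth is fixed; a path with fewer than six backslashes gives None.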

Badge +2

Also have a look at the FilenamePartExtractor transformer.

Badge

You can also use an AttributeSplitter, splitting on '\', but the part you want has to be at the same position in the list every time.
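To see why the position matters, here is a rough Python analogue of splitting on the backslash and taking everything from a fixed list position. It is only a sketch: `tail_from_index` and the two short paths are hypothetical, and it does not reproduce how the AttributeSplitter itself stores its list.

```python
def tail_from_index(path, index):
    """Split on backslashes and rejoin everything from position `index` on."""
    parts = path.split("\\")
    # Fewer parts than expected means the fixed position does not exist.
    if len(parts) <= index:
        return None
    return "\\".join(parts[index:])

# Hypothetical paths: the same index gives a different folder level
# as soon as the path has one folder fewer.
print(tail_from_index(r"S:\a\b\VandGraf\x\report.pdf", 3))  # VandGraf\x\report.pdf
print(tail_from_index(r"S:\a\VandGraf\x\report.pdf", 3))    # x\report.pdf
```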

Userlevel 1
Badge +22

A StringSearcher with a regex could work, depending on the exact rules you want to follow. For example, do you always need everything starting with VandGraf, or everything from that folder level down, where the folder might sometimes not be called VandGraf?

If the latter, taking the substring after the 6th backslash might be more straightforward.

Actually, I wanted the former: everything starting with "VandGraf", or NULL if not found.
Userlevel 1
Badge +22

You can also use an AttributeSplitter, splitting on '\', but the part you want has to be at the same position in the list every time.

Alas, I can't be sure that the paths are that well-structured.
Userlevel 1
Badge +22

Also have a look at the FilenamePartExtractor transformer.

I did, but it only gives me the "standard" parts of a path name, not a custom, optional part.
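For comparison, the "standard" parts of a path (including the file name asked for in the question) look roughly like this in plain Python. This is a sketch using `pathlib.PureWindowsPath` from the standard library, not a description of what the FilenamePartExtractor does internally; the variable names are chosen here for illustration.

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses backslash-separated paths on any operating system.
doc = PureWindowsPath(
    r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
    r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf"
)

file_name = doc.name          # file name with extension
root_name = doc.stem          # file name without extension
extension = doc.suffix        # extension including the leading dot
directory = str(doc.parent)   # everything before the file name

print(file_name, root_name, extension)
```

None of these parts corresponds to "everything from VandGraf onward", which is why a custom rule is still needed for that attribute.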
Userlevel 1
Badge +22

I ended up solving it with a PythonCaller:

def processFeature(ft):
    doc_name = ft.getAttribute("Documentname")
    if doc_name is not None:
        try:
            # index() raises ValueError when the substring is absent
            p = doc_name.upper().index("VANDGRAF")
        except ValueError:
            return  # no match: leave RelDocName unset
        # p == 0 is a valid match too (path starts with "VandGraf")
        ft.setAttribute("RelDocName", doc_name[p:])

The try-except is necessary because index() raises an exception (ValueError) if the substring isn't found.
Userlevel 1
Badge +21

Actually, I wanted the former: everything starting with "VandGraf", or NULL if not found.

Even easier then: use the regex VandGraf.+ in the StringSearcher.
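The behaviour of that pattern can be checked outside FME. The following is a minimal sketch with Python's `re` module; `rel_doc_name` is a hypothetical helper name, and how the StringSearcher exposes the matched text is not shown here.

```python
import re

def rel_doc_name(path):
    """Return the path from "VandGraf" onward, or None if it does not occur."""
    # .+ requires at least one character after "VandGraf" and runs to the
    # end of the line, so the file name is included in the match.
    match = re.search(r"VandGraf.+", path)
    return match.group(0) if match else None

path = (
    r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
    r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf"
)
print(rel_doc_name(path))
```

As written the match is case-sensitive. Passing `re.IGNORECASE` as the third argument to `re.search` would be closer to the `upper()` comparison in the PythonCaller version.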