Hi,

I'm looking for a best practice for extracting parts of a path name.

If I have a path: S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf

I need to extract the part starting with "VandGraf", i.e. "VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf", into one attribute, and the file name into another.

What is the simplest way to accomplish this (without custom Python coding)?

Cheers.

Use a StringSearcher with a regex; the pattern depends on the exact rules you want to follow. For example, do you always need everything starting with VandGraf, or everything at that folder level, which might sometimes be in a folder not called VandGraf?

If the latter, taking the substring after the 6th backslash might be more straightforward.
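
To illustrate the "after the 6th backslash" idea outside FME, here is a small Python sketch. The function name `after_nth_backslash` and the default of 6 are assumptions made for this example, and it returns None when the path has fewer separators than expected.

```python
def after_nth_backslash(path, n=6):
    """Return the part of `path` after its n-th backslash, or None."""
    # Split at most n times so the remainder stays in one piece
    parts = path.split("\\", n)
    if len(parts) <= n:
        return None  # fewer than n backslashes in the path
    return parts[n]


path = (r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
        r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf")
print(after_nth_backslash(path))
```

This only works while the folder of interest always sits at the same depth.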


Have a look at the FilenamePartExtractor transformer also


You can also use an AttributeSplitter, splitting on '\', but the part you want has to be at the same position in the list every time.
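
A rough Python equivalent of splitting on the backslash shows why the position matters. The helper `part_at` and the second, shorter path are hypothetical and only there for illustration.

```python
def part_at(path, index):
    """Return the path component at `index` after splitting on backslashes, or None."""
    parts = path.split("\\")
    return parts[index] if 0 <= index < len(parts) else None


deep = (r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
        r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf")
# Hypothetical path where VandGraf sits at a different depth
shallow = r"S:\Arkiv\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf"

print(part_at(deep, 6))     # VandGraf
print(part_at(shallow, 6))  # None, the list is too short here
print(part_at(shallow, 2))  # VandGraf, but at a different index
```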


Use a StringSearcher with a regex; the pattern depends on the exact rules you want to follow. For example, do you always need everything starting with VandGraf, or everything at that folder level, which might sometimes be in a folder not called VandGraf?

If the latter, taking the substring after the 6th backslash might be more straightforward.

Actually, I wanted the former: everything starting with "Vandgraf", or NULL if not found.


You can also use an AttributeSplitter, splitting on '\', but the part you want has to be at the same position in the list every time.

Alas, I can't be sure that the paths are that well-structured.


Have a look at the FilenamePartExtractor transformer also

I did, but it just gives me the "standard" parts of a path name, not a custom and optional part.


I ended up solving it with a PythonCaller:

def processFeature(ft):
    doc_name = ft.getAttribute("Documentname")
    if doc_name is not None:
        try:
            # str.index raises ValueError if "VANDGRAF" is not found
            p = doc_name.upper().index("VANDGRAF")
            # Keep everything from the match onward, in its original case
            ft.setAttribute("RelDocName", doc_name[p:])
        except ValueError:
            # No match: leave RelDocName unset
            pass


The try-except was necessary, as it apparently throws an exception if the substring isn't found.
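
For reference, `str.index` does raise `ValueError` when the substring is missing, whereas `str.find` returns -1 instead, so the same lookup can be written without try-except. A minimal sketch in plain Python, outside FME; the helper name `rel_doc_name` is made up for illustration.

```python
def rel_doc_name(doc_name, marker="VANDGRAF"):
    """Return doc_name from the first case-insensitive match of marker, else None."""
    if doc_name is None:
        return None
    p = doc_name.upper().find(marker)  # -1 when the marker is absent
    return doc_name[p:] if p >= 0 else None
```

Inside a PythonCaller, the result could then be passed to `setAttribute` only when it is not None.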


Actually, I wanted the former: everything starting with "Vandgraf", or NULL if not found.

Even easier then: use the regex VandGraf.+ in the StringSearcher.
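
To see what that pattern picks up, here is a quick check with Python's `re` module. This is only a sketch: the StringSearcher's regex engine may differ in details, the case-insensitive flag is an assumption (the folder is spelled both "VandGraf" and "Vandgraf" in this thread), and the function name `extract` is hypothetical. It also pulls out the file name, which was the second part of the question.

```python
import ntpath
import re

# IGNORECASE is an assumption, to tolerate "Vandgraf" as well as "VandGraf"
PATTERN = re.compile(r"VandGraf.+", re.IGNORECASE)


def extract(path):
    """Return (part of path from VandGraf onward or None, file name)."""
    match = PATTERN.search(path)
    rel = match.group(0) if match else None
    # ntpath splits on backslashes regardless of the OS running the code
    return rel, ntpath.basename(path)


full = (r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
        r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf")
print(extract(full))
```

When the pattern does not match, the first element is None, which corresponds to the "NULL if not found" requirement.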

