Solved

How to extract certain parts from a path name?

  • August 22, 2017
  • 9 replies
  • 310 views

lifalin2016
Supporter

Hi,

I'm looking for a best practice to extract parts of a path name.

If I have a path: S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf

I need to extract the part starting with "VandGraf", i.e. "VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf", into one attribute, and the file name into another.

What is the simplest way to accomplish this (without custom Python coding)?

Cheers.

9 replies

ebygomm
Influencer
  • Influencer
  • 3434 replies
  • Best Answer
  • August 22, 2017

A StringSearcher with a regex would probably do it, depending on the exact rules you want to follow. For example, do you always need everything starting with VandGraf, or everything at that level, which might sometimes be in a folder not called VandGraf?

If the latter, taking the substring after the 6th backslash might be more straightforward, e.g.
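
This is not the poster's own example, just a minimal Python sketch of the "substring after the 6th backslash" idea outside FME. The function name `after_nth_backslash` is made up for illustration, and returning `None` for shallower paths is an assumption.

```python
def after_nth_backslash(path, n=6):
    """Return the text after the n-th backslash, or None if there are fewer than n."""
    # Split at most n times, so parts[n] holds the untouched remainder.
    parts = path.split("\\", n)
    return parts[n] if len(parts) > n else None


path = (r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
        r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf")
print(after_nth_backslash(path))
```

This only works when the wanted folder always sits at the same depth.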


  • 325 replies
  • August 22, 2017

Also have a look at the FilenamePartExtractor transformer.
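
For comparison, the "standard" parts of a path (file name, extension, parent folder) can be read in plain Python with `pathlib.PureWindowsPath`. This is only a rough analogy and does not claim to mirror the transformer's output attributes. The variable names are illustrative.

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses backslash-separated paths on any operating system.
p = PureWindowsPath(
    r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
    r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf"
)

file_name = p.name       # last path component, extension included
stem = p.stem            # file name without the extension
ext = p.suffix           # extension, including the leading dot
folder = p.parent.name   # name of the folder that contains the file

print(file_name, stem, ext, folder)
```

This covers the file-name half of the question, but not the custom "everything from VandGraf onward" part.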


robert_punt
Contributor
  • Contributor
  • 23 replies
  • August 22, 2017

You can also use an AttributeSplitter, splitting on the '\', but the part you want has to be at the same position in the list every time.
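
A small Python sketch of why the position matters when splitting on the backslash. The second, shallower path is hypothetical and made up for illustration, as is the helper name `split_path`.

```python
def split_path(path):
    """Split a Windows-style path on backslashes into a list of parts."""
    return path.split("\\")


deep = split_path(
    r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
    r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf"
)
# Hypothetical shallower path: same folder name, fewer parent folders.
shallow = split_path(
    r"S:\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf"
)

# The list position of "VandGraf" differs between the two paths,
# so a fixed list index only works when every path has the same depth.
print(deep.index("VandGraf"), shallow.index("VandGraf"))
```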


lifalin2016
Supporter
  • Author
  • Supporter
  • 592 replies
  • September 1, 2017

String searcher with some regex possibly depending on the exact rules you want to follow, e.g. do you need everything always starting with VandGraf or everything at that level which sometimes might be in a folder not called VandGraf?

If the latter, finding the substring after every 6th backslash might be more straightforward e.g.

Actually, I wanted the former: everything starting with "Vandgraf", or NULL if not found.

 

 


lifalin2016
Supporter
  • Author
  • Supporter
  • 592 replies
  • September 1, 2017

You can also use an attributesplitter. Splitting on the '\\' but it has to be in the same place in the list.

Alas, I can't be sure that the paths are that well-structured.

 

 


lifalin2016
Supporter
  • Author
  • Supporter
  • 592 replies
  • September 1, 2017

Have a look at the FilenamePartExtractor transformer also

I did, but it just gives me the "standard" parts of a path name, not a custom and optional part.

 

 


lifalin2016
Supporter
  • Author
  • Supporter
  • 592 replies
  • September 1, 2017

I ended up solving it with a PythonCaller:

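def processFeature(ft):
    # Full document path; None when the attribute is missing.
    doc_name = ft.getAttribute("Documentname")
    if doc_name is not None:
        try:
            # Case-insensitive search; index() raises ValueError if absent.
            p = doc_name.upper().index("VANDGRAF")
        except ValueError:
            # No match: leave RelDocName unset.
            return
        # p >= 0 here, so a path that starts with "VandGraf" is kept too.
        # unicode() exists only in Python 2; the slice is already a string.
        ft.setAttribute("RelDocName", doc_name[p:])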


The try-except is necessary because index() raises an exception (ValueError) when the substring isn't found.
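
As a possible alternative, `str.find()` returns -1 instead of raising, which avoids the try-except. A minimal sketch outside FME follows. The function name `rel_doc_name` and the `marker` parameter are made up for illustration.

```python
def rel_doc_name(doc_name, marker="VANDGRAF"):
    """Return doc_name from the first case-insensitive match of marker, else None."""
    if doc_name is None:
        return None
    # find() returns -1 when the marker does not occur, so no exception handling.
    p = doc_name.upper().find(marker)
    return doc_name[p:] if p >= 0 else None


print(rel_doc_name(r"S:\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf"))
print(rel_doc_name(r"S:\Some other folder\Arbejdsrapport.pdf"))
```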

 

 


ebygomm
Influencer
  • Influencer
  • 3434 replies
  • September 1, 2017

Actually, I wanted the former: everything starting with "Vandgraf", or NULL if not found.

Even easier, then: use the regex VandGraf.+ in the StringSearcher.
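
To see what that pattern matches, here is a minimal sketch with Python's `re` module rather than FME. The `re.IGNORECASE` flag is an assumption, added because the question spells the folder both "VandGraf" and "Vandgraf". The function name `from_vandgraf` is made up for illustration.

```python
import re

# "VandGraf" followed by at least one more character, to the end of the line.
PATTERN = re.compile(r"VandGraf.+", re.IGNORECASE)


def from_vandgraf(path):
    """Return the part of path starting at VandGraf, or None if there is no match."""
    m = PATTERN.search(path)
    return m.group(0) if m else None


path = (r"S:\Miljø og teknik\Svendborg Vand\Anlæg vand\Microstation"
        r"\Dokgraf dokumenter\VandGraf\vand_knude\ULB00004\Arbejdsrapport.pdf")
print(from_vandgraf(path))
print(from_vandgraf(r"S:\Some other folder\Arbejdsrapport.pdf"))
```

Because `.+` needs at least one character after the folder name, a path that ends exactly at "VandGraf" gives no match.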