Solved

RegEx to clean up data

  • December 17, 2020

fmenco
Contributor

I have a CSV file with attribute data that's structured roughly like this:

 

apple man woman 1234-567\\1

apple man woman 1234-567\\1B

apple man woman 1234-567\\12

apple man woman 1234-567 child

890.1234-567

apple A3 1234-567

 

I'm only interested in the 1234-567 number sequence, since this is an ID I need to match further along. The rest of the attribute data can be deleted. The number sequence seems to always have 4 digits before the hyphen and 3 digits after. The hyphen isn't essential and can be removed. The numbers are what matter most.

 

How do I go about it? Do I need one StringReplacer, or more than one?

I tried this with a StringReplacer, mode = regular expression, text to change set to: (?<!\\\\)[0-9]+

But what do I fill in as the replacement text? If I enter a regular expression in that field, the output is the literal regular expression!

 

I'm using FME Desktop 2019.

 

 


2 replies

ebygomm
Influencer
  • Best Answer
  • December 17, 2020

It's easier to use a StringSearcher and search for the regex. The result will then end up in the match attribute.

\d{4}\-\d{3}
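
For anyone who wants to check the pattern outside FME first, here is a minimal Python sketch of the same search. It assumes the StringSearcher behaves like a first-match regex search; `samples`, `ID_PATTERN` and `extract_id` are made-up names for illustration, and the sample rows are copied from the question as displayed.

```python
import re

# Sample rows copied from the question; raw strings keep the backslashes as typed.
samples = [
    r"apple man woman 1234-567\\1",
    r"apple man woman 1234-567\\1B",
    r"apple man woman 1234-567\\12",
    "apple man woman 1234-567 child",
    "890.1234-567",
    "apple A3 1234-567",
]

# Four digits, a hyphen, three digits.
# (The hyphen needs no escape outside a character class.)
ID_PATTERN = re.compile(r"\d{4}-\d{3}")


def extract_id(value, keep_hyphen=True):
    """Return the first ID found in value, or None when there is no match."""
    match = ID_PATTERN.search(value)
    if match is None:
        return None
    found = match.group(0)
    # The question says the hyphen is optional, so it can be dropped on request.
    return found if keep_hyphen else found.replace("-", "")


ids = [extract_id(row) for row in samples]
print(ids)
```

Calling `extract_id(row, keep_hyphen=False)` would give the digits only, which may be handy if the ID is matched against a field stored without the hyphen.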

 


gio
Contributor
  • December 18, 2020

@fmenco​ 

 

A StringReplacer is more direct:

Mode: regex replace

Text to search: .*(\d{4}\-\d{3}).*

Replacement text: \1
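
The capture-group idea can also be tried outside FME. This is a rough Python sketch, assuming the replacer's backreference works like `\1` in Python's `re.sub`; `WHOLE_LINE` and `keep_only_id` are hypothetical names used only here.

```python
import re

# Match the whole value, capturing only the ID in group 1.
# The leading .* is greedy, so if a value held several IDs the last one would be captured.
WHOLE_LINE = re.compile(r".*(\d{4}-\d{3}).*")


def keep_only_id(value):
    """Replace the entire value with the captured ID; leave non-matching values as they are."""
    return WHOLE_LINE.sub(r"\1", value)


cleaned = keep_only_id("apple man woman 1234-567 child")
unchanged = keep_only_id("apple man woman")
print(cleaned)
print(unchanged)
```

One thing to watch with this approach: a value that contains no ID is passed through untouched rather than flagged, so rows without a match may need a separate check.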

 

