Solved

RegEx to clean up data

4 years ago
December 17, 2020
2 replies
41 views

fmenco
Contributor
86 replies

I have a CSV file with attribute data that's structured roughly like this:

apple man woman 1234-567\\1

apple man woman 1234-567\\1B

apple man woman 1234-567\\12

apple man woman 1234-567 child

890.1234-567

apple A3 1234-567

I'm only interested in the 1234-567 number sequence, since this is an ID I need to match further along. The rest of the attribute data can be deleted. The number sequence seems to always have 4 digits before the hyphen and 3 digits after. The hyphen isn't essential and can be removed. The numbers are the most important part.

How do I go about it? Do I need one StringReplacer or more than one?

I tried this with a StringReplacer, mode = regular expression, text to change set to: (?<!\\\\)[0-9]+

But what do I fill in as the replacement text? If I fill in a regular expression in that field, it returns the regular expression itself!

I'm using FME Desktop 2019.


ebygomm
Influencer
3313 replies
Best Answer
4 years ago
December 17, 2020

It's easier to use a StringSearcher and search for the regex. The ID will then end up in the match attribute:

\d{4}\-\d{3}
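
For anyone who wants to try the pattern outside FME first, here is a minimal Python sketch of the same "search and keep the match" idea. It uses Python's `re` module rather than FME's regex engine, so treat it as an illustration only; the helper name `extract_id` is made up for this example, and the sample values are copied from the question.

```python
import re

# Four digits, a hyphen, three digits: the ID format described in the question.
ID_PATTERN = re.compile(r"\d{4}-\d{3}")


def extract_id(value):
    """Return the first ID found in value, or None if there is no match."""
    match = ID_PATTERN.search(value)
    return match.group(0) if match else None


# Sample values copied from the question (raw strings keep the backslashes as-is).
samples = [
    r"apple man woman 1234-567\\1",
    r"apple man woman 1234-567\\1B",
    "apple man woman 1234-567 child",
    "890.1234-567",
    "apple A3 1234-567",
]

for value in samples:
    print(extract_id(value))
```

Searching (instead of replacing) means there is no need to describe the surrounding text at all; values without an ID simply produce no match.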


gio
Contributor
2252 replies
4 years ago
December 18, 2020

@fmenco

StringReplacer is more direct:

regex replace

to search: .*(\d{4}\-\d{3}).*

to replace: \1
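
The capture-group approach can be sketched in Python like this. Again, this is Python's `re` module and not FME itself, so it only shows how the backreference works; `keep_id` and `keep_digits` are hypothetical helper names. The second helper is a variation that is not in the answer above: it drops the hyphen, which the question says is optional.

```python
import re


def keep_id(value):
    # Match the whole value, capture only the ID, and replace everything with group 1.
    return re.sub(r".*(\d{4}-\d{3}).*", r"\1", value)


def keep_digits(value):
    # Variation (an assumption, not from the thread): capture the two digit
    # groups separately and join them, so the hyphen is removed as well.
    return re.sub(r".*(\d{4})-(\d{3}).*", r"\1\2", value)


print(keep_id("apple man woman 1234-567 child"))
print(keep_digits("890.1234-567"))
```

One thing to keep in mind with this approach in Python: a value that contains no ID does not match the pattern, so it is returned unchanged rather than emptied.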
