Solved

RegEx to clean up data

  • December 17, 2020
  • 2 replies
  • 51 views

fmenco
Contributor

I have a CSV file with attribute data that's structured roughly like this:

 

apple man woman 1234-567\1

apple man woman 1234-567\1B

apple man woman 1234-567\12

apple man woman 1234-567 child

890.1234-567

apple A3 1234-567

 

I'm only interested in the 1234-567 number sequence, since this is an ID I need to match on further along. The rest of the attribute data can be deleted. The sequence always seems to have 4 digits before the hyphen and 3 digits after it. The hyphen isn't essential and can be removed. The digits are what matter most.

 

How do I go about it? Do I need one StringReplacer or more than one?

I tried this with a StringReplacer, mode = regular expression, with the text to change set to: (?<!\\)[0-9]+

But what do I fill in as the replacement text? If I put a regular expression in that field, it returns the regular expression itself as literal text!

 

I'm using FME Desktop 2019.

 

 



2 replies

ebygomm
Influencer
  • Influencer
  • Best Answer
  • December 17, 2020

It's easier to use a StringSearcher and search for the regex. The matched text will then end up in the match attribute.

\d{4}\-\d{3}
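To sanity-check the pattern outside FME, here is a minimal Python sketch that runs the same search against the sample rows from the question. The function name `extract_id` and the `samples` list are made up for illustration, and the rows assume a single backslash before the suffix. This only shows what the regex matches; it is not how the StringSearcher works internally.

```python
import re

# Same pattern as above; the hyphen needs no escaping outside a character class.
ID_PATTERN = re.compile(r"\d{4}-\d{3}")

def extract_id(text):
    """Return the first 4-digit, hyphen, 3-digit sequence in text, or None."""
    match = ID_PATTERN.search(text)
    return match.group(0) if match else None

# Sample rows copied from the question (hypothetical variable name).
samples = [
    r"apple man woman 1234-567\1",
    r"apple man woman 1234-567\1B",
    r"apple man woman 1234-567\12",
    "apple man woman 1234-567 child",
    "890.1234-567",
    "apple A3 1234-567",
]

for row in samples:
    print(row, "->", extract_id(row))
```

Every sample row yields 1234-567, including the 890.1234-567 row, because the search simply skips ahead to the first place where four digits are followed by a hyphen and three digits.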

 


gio
Contributor
  • Contributor
  • December 18, 2020

@fmenco

 

StringReplacer is more direct:

regex replace

to search: .*(\d{4}\-\d{3}).*

to replace: \1
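The capture-group idea can be tried out in plain Python as well. This is only a sketch of the regex behaviour, not of the StringReplacer itself, and the function names `keep_id` and `keep_id_digits` are hypothetical. The second variant assumes you also want the hyphen dropped, as mentioned in the question, so it captures the two digit runs separately.

```python
import re

def keep_id(text):
    """Replace the whole string with the captured ID (group 1)."""
    return re.sub(r".*(\d{4}-\d{3}).*", r"\1", text)

def keep_id_digits(text):
    """Same idea, but capture the digit runs separately to drop the hyphen."""
    return re.sub(r".*(\d{4})-(\d{3}).*", r"\1\2", text)

print(keep_id(r"apple man woman 1234-567\1B"))
print(keep_id_digits("890.1234-567"))
# A value with no ID is left untouched, because the pattern never matches.
print(keep_id("apple man woman"))
```

One thing to watch with the replace approach: a value that contains no ID is passed through unchanged rather than emptied, so it may be worth checking for leftovers afterwards.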