Skip to main content
Solved

Replace value with a null


sdepriest
Contributor
Forum|alt.badge.img+7
This should be simple...

 

I'm using FME 2013 SP1.

 

 

I'm reading values from Excel.  When I encounter any alphabet character, I want to replace the full value with a null.

 

 

I'm fiddling with a StringReplacer and using a regular expression [a-zA-Z], but not sure what I can put in the Replacement Text.

 

 

My source data in column looks like this:

 

 

2013-05-31 00:00:00

 

Today

 

Day 0

 

Day 1

 

Day 2

 

Day 3

 

Day 4

 

Day 5

 

Day 6

 

 

I want the date string to remain in the column.  I want the "Today" and "Day x" values to be converted to null values...

 

 

Sarah

Best answer by takashi

A good idea flashed into my mind. The job in the orange bookmark could be done through only one StringReplacer with these settings: Text to Find: ^([0-9]{4}-[0-9]{2}-[0-9]{2}\\s[0-9]{2}:[0-9]{2}:[0-9]{2})?(.*)$ Replacement Text: \\1 Use Regular Expressions: yes

 

 

This may be the most condensed solution.

 

Takashi
View original
Did this help you find an answer to your question?
This post is closed to further activity.
It may be a question with a best answer, an implemented idea, or just a post needing no comment.
If you have a follow-up or related question, please post a new question or idea.
If there is a genuine update to be made, please contact us and request that the post is reopened.

22 replies

david_r
Evangelist
  • June 3, 2013
Hi,

 

 

you could use a StringSearcher with the regexp "[\\p{L}]" (without the quotes, of course). Then add a Tester, if _matched_characters = 0 then you have an attribute without any letters in it.

 

 

For an explanation of the slightly unusual regexp, see here. The idea is that it will match any letter, not just just a-z, including such as àüåñ, etc.

 

 

David

sdepriest
Contributor
Forum|alt.badge.img+7
  • Author
  • Contributor
  • June 3, 2013
I entered [\\p{L}] in a stringSearcher, but got this message.

 

 

Illegal operator `@Tcl2 -- failed to evaluate expression `StringSearcher_e18efcc7_db38_416d_9340_34593e766f8d3_doGrep {<openbracket><backslash>p<opencurly>L<closecurly><closebracket>} {_matched_characters} {_matched_parts}' -- couldn't compile regular expression pattern: invalid escape \\ sequence' supplied for use in TestFactory -- operator must be one of < > <= >= != = ==

 

 

I'll keep searching through the RegEx stuff...

 


sdepriest
Contributor
Forum|alt.badge.img+7
  • Author
  • Contributor
  • June 3, 2013
For clarification...I don't want to split the rows with numbers and those with letters apart.  They need to be kept in sequential order.  Using the StringSearcher forces me to split them apart into matched and non-matched.

 

 

I want to read the values, evaluate if it contains a letter, and set to null if it does have a letter.

 

 

Sarah

david_r
Evangelist
  • June 3, 2013
Looks like the Tcl-based StringSearcher doesn't support Unicode extensions. Too bad.

 

 

If you're positive about never encounting any other charachter than the letters A-Z, you could use the regexp [a-zA-Z] as you first suggested.

 

 

To preserve the order of the features, you could start with a counter, then order the data based on the counter in the end (Sorter). Not super efficient, but it shouldn't cause any problems for small-ish amounts of data.

 

 

David

sdepriest
Contributor
Forum|alt.badge.img+7
  • Author
  • Contributor
  • June 3, 2013
Hi David,

 

 

What transformer should be used to set the value to null?

 

 

AttributeRenamer?
  • Old Attribute = attribute name that has letters
  • Default Value = ?  do you literally type the word "null" ?

Sarah


david_r
Evangelist
  • June 3, 2013
Hi Sarah,

 

 

if you want to write a NULL value to a database, use an AttributeRemover to delete the attribute completely. Otherwise you risk writing an empty string, which isn't the same.

 

 

Hopefully FME2014 will have a more logical way of representing NULL values :-)

 

 

David

fmelizard
Contributor
Forum|alt.badge.img+17
  • Contributor
  • June 3, 2013
Hi,

 

 

You can also try the string replacer with: ^(Today|Day \\d{1})$, text to replace ' '.

 

Hope this helps.

 

 


sdepriest
Contributor
Forum|alt.badge.img+7
  • Author
  • Contributor
  • June 3, 2013
I ended up with a solution.  Thanks for all the suggestions.  Yes I hope Safe will come up with a better way to set a NULL value.

 

 

Yes, I want to write a null value to the database for these rows with letters, but keep the date value in the same field.

 

 

I used David's suggestion of adding a counter ahead of the StringSearcher.  In the StringSearcher I used the [a-zA-Z] regular expression which split the data into matched and not_matched.  Off the Matched port, I added an AttributeRenamer and deleted the attribute then added a new attribute with the same name.  After that, I added a Sorter and joined both the Matched & Not_Matched pipelines to the Input on the Sorter.  The joined the Sorter to the Destination (SQL Server table).  The data comes out in the original sort order and the values are null in the appropriate places.

 

 

Not very elegant, but does the trick.

 

 

Sarah

 

 


takashi
Supporter
  • June 4, 2013
Hi Sarah,

 

 

an alternative approach: if you need to write finally the attribute into SQL Server table as a datetime value, the string should be formatted sooner or later. So, you can format the string to YYYYmmddHHMMSS (14 digits) by a DateFormatter first and then test it by an AttributeClassifier (Classification To Test = Digit) instead of the StringSearcher.

 

 

Takashi

takashi
Supporter
  • June 4, 2013
About using a pair of Counter and Sorter.

 

This is a good way often used to keep the order of features. However, I think the order of features is not so important in the case to write features into a database table, because the physical row order in a database table could be arbitrary (i.e. does not always match with the writing order) according to the optimization by the DBMS.

 

Takashi

sdepriest
Contributor
Forum|alt.badge.img+7
  • Author
  • Contributor
  • June 4, 2013
Takashi,

 

Thanks for the suggestion, but the AttributeClassifier is not passing the rows through, failing them all when I choose the Digit in the Classification to Test.  I tried a few other values in the Classification to Test and none have gone through the passed port.

 

 

All, I'm ultimately cleansing the data with this translation.  The user has now requested I convert a field with 92 spaces in it to a null value.  And this 92 space value exists in the same field as I've been working with in this post. So I believe with what I've learned, I can handle this in the same manner.

 

 

Thanks,

 

Sarah

fmelizard
Contributor
Forum|alt.badge.img+17
  • Contributor
  • June 4, 2013
Hi Sarah,

 

 

Just out of curiosity did you try the NullAttributeRemover from the FME store?

 

 

Itay

sdepriest
Contributor
Forum|alt.badge.img+7
  • Author
  • Contributor
  • June 4, 2013
Itay,

 

No I hadn't tried it.  I downloaded and looked at the transformer description and realized it might work for the 92 spaces value...but it didn't.  The spaces still appeared in the Inspector Table View when I tested this.

 

 

The help says the NullAttributeRemover "checks all the selected attributes and removes them if their contents are 0 or blank"...what do they mean by "blank"...null?  or empty string?  or spaces?  Doesn't seem to mean spaces with my testing...

 

 

by the title of the transformer I initially thought it would delete or remove attributes with a null value...and I don't want to do that.

 

 

Sarah

fmelizard
Contributor
Forum|alt.badge.img+17
  • Contributor
  • June 4, 2013
Sarah,

 

 

For the spaces you can use the string replacer with a regex expression :

 

(' '{92}), have a look at example 2 in the documentation, the replacement text can be left empty. Possibly the nullattributeremover can than be of help.

 

 

Itay

 

 


sdepriest
Contributor
Forum|alt.badge.img+7
  • Author
  • Contributor
  • June 4, 2013
I keep running into trouble...

 

 

Now I'm experiencing this:

 

http://fmepedia.safe.com/articles/Error_Unexpected_Behavior/Regular-expressions-that-work-in-one-parameter-fail-in-another

 

 

rather than have a series of StringSearchers  (one for letters [a-zA-Z] and one for the 92 spaces ^\\s), I thought I'd be smart and use that fancy-dancy TestFilter.  And those same regular expressions do not pass the darn thing.  So I see the KB article I referenced above.  Geez!

 

 

I must have some freaky data.

 

 

Here is what I got so far...the part in the orange bookmark works fine...just trying to make it more condensed with a TestFilter.

 

 

 


takashi
Supporter
  • June 5, 2013
Hi Sarah,

 

 

changing the way of thinking... A StringSeacher with the following regular expression passes only strings formatted to "nnnn-nn-nn<space>nn:nn:nn" ('n' indicates a digit character): ^[0-9]{4}-[0-9]{2}-[0-9]{2}\\s[0-9]{2}:[0-9]{2}:[0-9]{2}$   This is a simplified example, other stricter expressions for the datetime format can be considered, if necessary.   Alternatively, a StringReplacer with the following settings replaces a string - starts with a non-digit character including a space - with an empty string (blank): Text to Find: ^[^0-9].*$ Replacemant Text: <not set i.e. blank> Use Regular Expressions: yes

 

 

Takashi

takashi
Supporter
  • June 5, 2013
> ...just trying to make it more condensed with a TestFilter.

 

 

"[a-zA-Z]" matches with just one alphabetical character, "\\^s" matches with just one white space. Try these expressions in the TestFilter: "^[a-zA-Z].*$" for Letter (begins with an alphabetial character)  "^\\s.*$" for 92_spaces (begins with a white space) "^([a-zA-Z]|\\s).*$" for both

 

Takashi

takashi
Supporter
  • Best Answer
  • June 5, 2013
A good idea flashed into my mind. The job in the orange bookmark could be done through only one StringReplacer with these settings: Text to Find: ^([0-9]{4}-[0-9]{2}-[0-9]{2}\\s[0-9]{2}:[0-9]{2}:[0-9]{2})?(.*)$ Replacement Text: \\1 Use Regular Expressions: yes

 

 

This may be the most condensed solution.

 

Takashi

sdepriest
Contributor
Forum|alt.badge.img+7
  • Author
  • Contributor
  • June 5, 2013
Takashi!  You're full of great ideas!  I'll give some of these a whirl and report back.

 

 

Thanks.

 

Sarah

takashi
Supporter
  • June 5, 2013
Hi Sarah,

 

 

if you need to remove attributes containing non-datetime value, a Tester - Operator: Matches Regex - Right Value: ^[0-9]{4}-[0-9]{2}-[0-9]{2}\\s[0-9]{2}:[0-9]{2}:[0-9]{2}$ and an AttributeRemover (for failed features from the Tester) would be effective. Anyway, I think the point for condensing the workflow is a regular expression for matching with a datetime value rather than a non-datetime value. Good luck!

 

Takashi

sdepriest
Contributor
Forum|alt.badge.img+7
  • Author
  • Contributor
  • June 5, 2013
Takashi, Your "most condensed" solution worked just slick!  I replaced the whole orange bookmark with a single StringReplacer with the regular expression you suggested.

 

 

There are various websites around that list the syntax for regular expressions, which do you use?

 

 

Sarah

takashi
Supporter
  • June 5, 2013
> There are various websites around that list the syntax for regular expressions, which do you use?

 

 

I often refer to Regular-Expressions.info. This site is linked from some pages of FME documentations e.g. the description about the StringSearcher.

 

 

Takashi

Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings