Skip to main content
This should be simple...

 

I'm using FME 2013 SP1.

 

 

I'm reading values from Excel.  When I encounter any alphabet character, I want to replace the full value with a null.

 

 

I'm fiddling with a StringReplacer and using a regular expression ra-zA-Z], but not sure what I can put in the Replacement Text.

 

 

My source data in column looks like this:

 

 

2013-05-31 00:00:00

 

Today

 

Day 0

 

Day 1

 

Day 2

 

Day 3

 

Day 4

 

Day 5

 

Day 6

 

 

I want the date string to remain in the column.  I want the "Today" and "Day x" values to be converted to null values...

 

 

Sarah
Hi,

 

 

you could use a StringSearcher with the regexp "t\\p{L}]" (without the quotes, of course). Then add a Tester, if _matched_characters = 0 then you have an attribute without any letters in it.

 

 

For an explanation of the slightly unusual regexp, see here. The idea is that it will match any letter, not just just a-z, including such as àüåñ, etc.

 

 

David
I entered [\\p{L}] in a stringSearcher, but got this message.

 

 

Illegal operator `@Tcl2 -- failed to evaluate expression `StringSearcher_e18efcc7_db38_416d_9340_34593e766f8d3_doGrep {<openbracket><backslash>p<opencurly>L<closecurly><closebracket>} {_matched_characters} {_matched_parts}' -- couldn't compile regular expression pattern: invalid escape \\ sequence' supplied for use in TestFactory -- operator must be one of < > <= >= != = ==

 

 

I'll keep searching through the RegEx stuff...

 


For clarification...I don't want to split the rows with numbers and those with letters apart.  They need to be kept in sequential order.  Using the StringSearcher forces me to split them apart into matched and non-matched.

 

 

I want to read the values, evaluate if it contains a letter, and set to null if it does have a letter.

 

 

Sarah
Looks like the Tcl-based StringSearcher doesn't support Unicode extensions. Too bad.

 

 

If you're positive about never encounting any other charachter than the letters A-Z, you could use the regexp pa-zA-Z] as you first suggested.

 

 

To preserve the order of the features, you could start with a counter, then order the data based on the counter in the end (Sorter). Not super efficient, but it shouldn't cause any problems for small-ish amounts of data.

 

 

David
Hi David,

 

 

What transformer should be used to set the value to null?

 

 

AttributeRenamer?
  • Old Attribute = attribute name that has letters
  • Default Value = ?  do you literally type the word "null" ?

Sarah


Hi Sarah,

 

 

if you want to write a NULL value to a database, use an AttributeRemover to delete the attribute completely. Otherwise you risk writing an empty string, which isn't the same.

 

 

Hopefully FME2014 will have a more logical way of representing NULL values :-)

 

 

David
Hi,

 

 

You can also try the string replacer with: ^(Today|Day \\d{1})$, text to replace ' '.

 

Hope this helps.

 

 


I ended up with a solution.  Thanks for all the suggestions.  Yes I hope Safe will come up with a better way to set a NULL value.

 

 

Yes, I want to write a null value to the database for these rows with letters, but keep the date value in the same field.

 

 

I used David's suggestion of adding a counter ahead of the StringSearcher.  In the StringSearcher I used the ea-zA-Z] regular expression which split the data into matched and not_matched.  Off the Matched port, I added an AttributeRenamer and deleted the attribute then added a new attribute with the same name.  After that, I added a Sorter and joined both the Matched & Not_Matched pipelines to the Input on the Sorter.  The joined the Sorter to the Destination (SQL Server table).  The data comes out in the original sort order and the values are null in the appropriate places.

 

 

Not very elegant, but does the trick.

 

 

Sarah

 

 


Hi Sarah,

 

 

an alternative approach: if you need to write finally the attribute into SQL Server table as a datetime value, the string should be formatted sooner or later. So, you can format the string to YYYYmmddHHMMSS (14 digits) by a DateFormatter first and then test it by an AttributeClassifier (Classification To Test = Digit) instead of the StringSearcher.

 

 

Takashi
About using a pair of Counter and Sorter.

 

This is a good way often used to keep the order of features. However, I think the order of features is not so important in the case to write features into a database table, because the physical row order in a database table could be arbitrary (i.e. does not always match with the writing order) according to the optimization by the DBMS.

 

Takashi
Takashi,

 

Thanks for the suggestion, but the AttributeClassifier is not passing the rows through, failing them all when I choose the Digit in the Classification to Test.  I tried a few other values in the Classification to Test and none have gone through the passed port.

 

 

All, I'm ultimately cleansing the data with this translation.  The user has now requested I convert a field with 92 spaces in it to a null value.  And this 92 space value exists in the same field as I've been working with in this post. So I believe with what I've learned, I can handle this in the same manner.

 

 

Thanks,

 

Sarah
Hi Sarah,

 

 

Just out of curiosity did you try the NullAttributeRemover from the FME store?

 

 

Itay
Itay,

 

No I hadn't tried it.  I downloaded and looked at the transformer description and realized it might work for the 92 spaces value...but it didn't.  The spaces still appeared in the Inspector Table View when I tested this.

 

 

The help says the NullAttributeRemover "checks all the selected attributes and removes them if their contents are 0 or blank"...what do they mean by "blank"...null?  or empty string?  or spaces?  Doesn't seem to mean spaces with my testing...

 

 

by the title of the transformer I initially thought it would delete or remove attributes with a null value...and I don't want to do that.

 

 

Sarah
Sarah,

 

 

For the spaces you can use the string replacer with a regex expression :

 

(' '{92}), have a look at example 2 in the documentation, the replacement text can be left empty. Possibly the nullattributeremover can than be of help.

 

 

Itay

 

 


I keep running into trouble...

 

 

Now I'm experiencing this:

 

http://fmepedia.safe.com/articles/Error_Unexpected_Behavior/Regular-expressions-that-work-in-one-parameter-fail-in-another

 

 

rather than have a series of StringSearchers  (one for letters ta-zA-Z] and one for the 92 spaces ^\\s), I thought I'd be smart and use that fancy-dancy TestFilter.  And those same regular expressions do not pass the darn thing.  So I see the KB article I referenced above.  Geez!

 

 

I must have some freaky data.

 

 

Here is what I got so far...the part in the orange bookmark works fine...just trying to make it more condensed with a TestFilter.

 

 

 


Hi Sarah,

 

 

changing the way of thinking... A StringSeacher with the following regular expression passes only strings formatted to "nnnn-nn-nn<space>nn:nn:nn" ('n' indicates a digit character): ^ 0-9]{4}-}0-9]{2}-}0-9]{2}\\s\0-9]{2}:}0-9]{2}:}0-9]{2}$   This is a simplified example, other stricter expressions for the datetime format can be considered, if necessary.   Alternatively, a StringReplacer with the following settings replaces a string - starts with a non-digit character including a space - with an empty string (blank): Text to Find: ^d^0-9].*$ Replacemant Text: <not set i.e. blank> Use Regular Expressions: yes

 

 

Takashi
> ...just trying to make it more condensed with a TestFilter.

 

 

"ta-zA-Z]" matches with just one alphabetical character, "\\^s" matches with just one white space. Try these expressions in the TestFilter: "^;a-zA-Z].*$" for Letter (begins with an alphabetial character)  "^\\s.*$" for 92_spaces (begins with a white space) "^(;a-zA-Z]|\\s).*$" for both

 

Takashi
A good idea flashed into my mind. The job in the orange bookmark could be done through only one StringReplacer with these settings: Text to Find: ^([0-9]{4}-[0-9]{2}-[0-9]{2}\\s[0-9]{2}:[0-9]{2}:[0-9]{2})?(.*)$ Replacement Text: \\1 Use Regular Expressions: yes

 

 

This may be the most condensed solution.

 

Takashi
Takashi!  You're full of great ideas!  I'll give some of these a whirl and report back.

 

 

Thanks.

 

Sarah
Hi Sarah,

 

 

if you need to remove attributes containing non-datetime value, a Tester - Operator: Matches Regex - Right Value: ^ 0-9]{4}-}0-9]{2}-}0-9]{2}\\s\0-9]{2}:}0-9]{2}:}0-9]{2}$ and an AttributeRemover (for failed features from the Tester) would be effective. Anyway, I think the point for condensing the workflow is a regular expression for matching with a datetime value rather than a non-datetime value. Good luck!

 

Takashi
Takashi, Your "most condensed" solution worked just slick!  I replaced the whole orange bookmark with a single StringReplacer with the regular expression you suggested.

 

 

There are various websites around that list the syntax for regular expressions, which do you use?

 

 

Sarah
> There are various websites around that list the syntax for regular expressions, which do you use?

 

 

I often refer to Regular-Expressions.info. This site is linked from some pages of FME documentations e.g. the description about the StringSearcher.

 

 

Takashi

Reply