Question

How to remove empty rows in text files? StringReplace

  • 26 November 2013
  • 10 replies
  • 74 views

First off, I am a beginner with FME, so bear that in mind.

 

I have a .txt (Notepad) file that contains useful data, but first I need to remove anything superfluous and redundant, such as extra spaces and empty rows. I am using the StringReplacer transformer with the text (advanced) editor, and I am running FME 2013. Replacing extra spaces was easy, but I cannot work out how to replace an empty row (or several consecutive empty rows) with the commands available in the FME text editor. (FYI: the rows are completely empty, with no characters.) I used to do the same thing in MS Word by replacing ^P^P with ^P, but I cannot find a working equivalent in the FME text editor. I have searched the documentation, forums and elsewhere, but cannot find out how to do it. Do those commands just not work, or am I doing something wrong? Thanks.

 


10 replies

Hi,

 

 

How about using the Tester and testing for "Attribute is empty"?

 

 

If you need to use the StringReplacer you can use the regular expression ^$ to denote an empty string. (^ = start of line, $ = end of line)

 

 

You might also want to use the AttributeTrimmer first, just in case there are some spaces in your otherwise empty string.
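To illustrate the two suggestions above outside FME, here is a minimal Python sketch. It is only an analogy: Python's `re` module is not the regex engine FME uses, and the names `is_empty_row` and `rows` are made up for the example. It shows that ^$ matches only a string with no characters, and why trimming first matters for rows that contain nothing but spaces.

```python
import re

# ^ anchors the start and $ the end, so ^$ only matches a string
# with nothing in between.
EMPTY = re.compile(r"^$")

def is_empty_row(row: str, trim: bool = True) -> bool:
    """Return True if the row is empty (optionally after trimming spaces)."""
    text = row.strip() if trim else row
    return EMPTY.search(text) is not None

# Hypothetical sample rows: one truly empty, one containing only spaces.
rows = ["alpha", "", "   ", "beta"]

# Keep only the rows that are not empty after trimming.
kept = [r for r in rows if not is_empty_row(r)]
print(kept)
```

Without the trim step, the row of three spaces would not count as empty, which is the situation the AttributeTrimmer is meant to guard against.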

 

 

David

 

 

Hi,

 

 

A text or CSV reader works serially, one row at a time.

 

Each row's content is stored in text_line_data.

 

You can check whether a row is empty and skip or discard it.

 

 

Just add a StringSearcher with the regular expression ^$ on text_line_data.

Thank you both for your info. The Tester (Attribute is empty) worked great; it removed the empty rows. But I cannot get the StringReplacer 'Text to find' regular expression ^$ to do anything. I will need to use the StringReplacer for other variants. FYI: I am assuming that leaving the 'Replacement text' box blank will remove the row once it is found by the StringReplacer.

Hi,

 

 

There is an important distinction between the StringReplacer and the Tester:

 

 

The Tester works on an entire feature, whereas the StringReplacer works on one or more attributes.

 

 

If you use the TextLineReader, each line is a feature. This means that you cannot filter lines with the StringReplacer: you can modify their contents, but you cannot skip them, e.g. by setting the replacement string to blank.
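The difference between modifying and filtering can be sketched in plain Python. This is not FME code; the list `lines` simply stands in for a stream of one-feature-per-line records, and all names are hypothetical.

```python
import re

# Hypothetical input: three "features", the middle one is an empty line.
lines = ["first", "", "second"]

# Replacing (what a string replace does): every line is still emitted.
# Swapping an empty match for an empty replacement changes nothing.
replaced = [re.sub(r"^$", "", line) for line in lines]

# Filtering (what a test/route step does): empty lines are dropped.
filtered = [line for line in lines if line != ""]

print(len(replaced), replaced)
print(len(filtered), filtered)
```

The replaced list still has three entries, while the filtered list has two, which is why a replace step alone cannot remove rows.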

 

 

David

Hi again.

 

 

Use a StringSearcher with ^$ to find the empty rows. The "not matched" port then outputs the rows that are not empty.

 

 

The StringReplacer won't filter anything out, as David said. Leaving the replacement text blank just keeps those rows blank/empty.

 

OK, just FYI: the regular expression ^$ in the StringReplacer does not work for finding empty rows (text_line_data completely empty) in a .txt file. I found that the regular expression ^.$ does!

Actually, ^$ does work.

 

 

What ^.$ finds is not an empty string; a . represents any single character.

 

So this means there is something there; in other words, the line is not empty.
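A quick way to see the difference between the two patterns is to run them over a few sample strings. The sketch below uses Python's `re` module purely as an illustration (FME's engine is a different one), and the sample values are invented.

```python
import re

# Hypothetical samples: empty, a lone period, a lone letter, two letters.
samples = ["", ".", "a", "ab"]

# ^.$ requires exactly one character between start and end.
single_char = [s for s in samples if re.search(r"^.$", s)]

# ^$ requires nothing at all between start and end.
empty = [s for s in samples if re.search(r"^$", s)]

print(single_char)
print(empty)
```

Only the one-character strings match ^.$, and only the empty string matches ^$, so a match on ^.$ means the row holds exactly one character.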

 

 

Go look up the regular expression syntax. There are a lot of Tcl sites.
Yes, I jumped the gun on ^.$: those lines had some periods. It does not find empty rows.

 

So I keep trying ^$ with no effect. (Yes, the Tester does work, but I need this to work in the StringReplacer as well.)

 

I have found that if you just type ^, the replacement text is added to the beginning of every text_line_data value that is not null. If you just type $, the replacement text is added to the end of every text_line_data value that is not null.

 

So neither expression acknowledges a blank or null row. I guess that is why ^$ does not work (for text files only?). There must be something I am doing wrong. I've looked through all the regex docs and still can't find an explanation; it's probably a "duh" thing. Any more suggestions? Thx

NullAttributeReplacer works!
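One possible reading of why this helped, offered as an assumption rather than a confirmed description of FME's behaviour: if the empty lines arrive as a null or missing value instead of an empty string, a regular expression never gets a string to match against. The Python sketch below models that idea with `None`; the function names are hypothetical.

```python
import re
from typing import Optional

def matches_empty(value: Optional[str]) -> bool:
    """Try ^$ against a value that may be missing (None)."""
    # Assumption: a missing value is not a string, so no pattern,
    # not even ^$, is ever tried against it.
    if value is None:
        return False
    return re.search(r"^$", value) is not None

def null_to_empty(value: Optional[str]) -> str:
    """Stand-in for a 'replace null with a default' step."""
    return "" if value is None else value

print(matches_empty(None))                 # missing value: no match
print(matches_empty(null_to_empty(None)))  # after replacement: ^$ matches
```

Under that assumption, turning nulls into real empty strings first gives the later regex step something to work on.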

 

 

 

Hi,

 

I was working with XML files, and my input had some empty rows. So I tried both suggestions: the Tester with 'Attribute is Empty', as suggested by David R., and the StringSearcher, as suggested by Gio. The basic difference is:

 

The Tester did not remove the parsing of the other XML lines, while the StringSearcher removed the parsing of the lines that were not taken into consideration.

 

Thus, the Tester and the StringSearcher are both correct in their own ways, depending on how they are used (:

 

 
