Solved

Is there an FME Reader that can read ~'Fortran inspired text datasets' (i.e. columns separated by fixed widths)?

3 years ago
March 1, 2022
4 replies
44 views

+10

thijsknapen
Contributor
154 replies

Last week a colleague asked me to process a text-based dataset that reminded me of something I had seen in the past with Fortran-based datasets.

In particular, the dataset looks somewhat like a CSV dataset, except that the columns are defined by fixed widths rather than by a delimiter. See below for an example of how it looked:

I was wondering whether there is already some type of reader that can deal with this type of data?

Of course, if you get such a dataset, you can use a text_line reader, cut/split each text line into separate values based on the known widths (e.g. using a '#s#s#s' format string in an AttributeSplitter, see the third example on the documentation page), and then use e.g. an AttributeTrimmer to remove excess whitespace.
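Outside FME, the same split-then-trim idea can be sketched in a few lines of Python. This is only an illustration of the technique, not how the AttributeSplitter is implemented; the function name `split_fixed_width` and the sample widths (10, 6, 8) are made up for the example.

```python
def split_fixed_width(line, widths):
    """Cut `line` into consecutive fields of the given widths and trim each one."""
    fields = []
    start = 0
    for width in widths:
        # Slice out exactly `width` characters, then strip the padding.
        fields.append(line[start:start + width].strip())
        start += width
    return fields


# Hypothetical sample line, padded to column widths of 10, 6 and 8 characters.
sample = "Alice".ljust(10) + "42".ljust(6) + "Delft".ljust(8)
print(split_fixed_width(sample, [10, 6, 8]))
```

As in the transformer-based approach, the widths have to be known up front and supplied for each dataset.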

However, this method needs to be configured for each individual dataset, so I was wondering whether there is already a tailor-made reader that can do this (by first automatically detecting the column widths).

Out of curiosity, I created a workspace myself that deals with files like these a bit more dynamically, using the header line to detect the widths of the columns. It is not that clean, though, and requires manually exposing the attributes at the end. It also assumes that the header names don't contain whitespace characters, and in this case it requires manually removing the line that separates the header from the data (no such line was present in the dataset I encountered earlier, so I treated this as a manual step). Nevertheless, see the attached workspace.
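For readers without the workspace, the header-based detection can be sketched roughly as follows in Python. This is not the attached workspace, just a minimal illustration under the same assumptions: header names contain no whitespace, each column starts where its header name starts, and any header/data separator line has already been removed. The names `detect_columns` and `parse_fixed_width` and the sample data are hypothetical.

```python
import re


def detect_columns(header):
    """Return (name, start, end) per column; end is None for the last column."""
    matches = list(re.finditer(r"\S+", header))
    columns = []
    for i, match in enumerate(matches):
        # A column runs from the start of its header name to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else None
        columns.append((match.group(), match.start(), end))
    return columns


def parse_fixed_width(lines):
    """Parse lines whose first entry is the header line into a list of dicts."""
    columns = detect_columns(lines[0])
    records = []
    for line in lines[1:]:
        if not line.strip():
            continue  # skip blank lines
        records.append({name: line[start:end].strip() for name, start, end in columns})
    return records


# Hypothetical sample data; note the two spaces inside the first value.
lines = [
    "name".ljust(18) + "age".ljust(5) + "city",
    "Daniel  RadCliffe".ljust(18) + "32".ljust(5) + "London",
]
print(parse_fixed_width(lines))
```

Because the column boundaries come from the header only, whitespace inside a data value does not affect the split.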

+50

redgeographics
Celebrity
3643 replies
3 years ago
March 1, 2022

One alternative solution is to use a StringReplacer to replace multiple occurrences of a whitespace character with a new character, which you can then use as the separator character for the AttributeSplitter.

+39

ebygomm
Influencer
3313 replies
Best Answer
3 years ago
March 1, 2022

The Column Aligned Text (CAT) reader will read fixed widths, but requires input to specify the widths. I don't think you can set the width dynamically.

+10

thijsknapen
Author
Contributor
154 replies
3 years ago
March 1, 2022

redgeographics wrote:

One alternative solution is to use a StringReplacer to replace multiple occurrences of a whitespace character with a new character, which you can then use as the separator character for the AttributeSplitter.

Hi @Hans van der Maarel,

Thanks for your input. My main question was to find out if there is already a reader that can deal with such datasets, and I see that @ebygomm just responded below that the Column Aligned Text (CAT) reader is probably what I am looking for.

I understand that there are many alternative approaches, and it is nice to hear yours. However, it would introduce the assumption that the data in a column never contains several consecutive whitespace characters (e.g. 'Daniel RadCliffe'), which probably won't happen, but yeah... you know what they say about 'when you assume ...' ;)

Therefore I personally prefer to impose as few additional assumptions on the data as possible. The main known fact here is that the columns are defined by a fixed width, so I think it is most robust to split the data on a fixed width.
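To make the difference concrete, here is a small Python sketch comparing the two approaches on a made-up line. The sample value with two consecutive spaces and the column widths (20, 5) are assumptions chosen purely for illustration.

```python
import re


def split_on_whitespace_runs(line):
    """Split wherever two or more whitespace characters occur in a row."""
    return re.split(r"\s{2,}", line.strip())


def split_on_widths(line, widths):
    """Split on fixed widths; the last column takes the rest of the line."""
    fields, start = [], 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    fields.append(line[start:].strip())
    return fields


# Hypothetical line: a 20-wide name column, a 5-wide age column, then the city.
line = "Daniel  RadCliffe".ljust(20) + "32".ljust(5) + "London"

print(split_on_whitespace_runs(line))   # the name is cut in two, giving four fields
print(split_on_widths(line, [20, 5]))   # three fields, name kept intact
```

The whitespace-based split also breaks in the opposite direction: when a value fills its column almost completely, leaving a single space before the next column, the two neighbouring values are merged into one field.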

+10

thijsknapen
Author
Contributor
154 replies
3 years ago
March 1, 2022

ebygomm wrote:

The Column Aligned Text (CAT) reader will read fixed widths, but requires input to specify the widths. I don't think you can set the width dynamically.

Thanks, that's exactly what I was looking for!

I vaguely remember having heard of that reader/format before, but too vaguely to recognise it as an option here :).

Too bad that the CAT reader requires user input to specify the widths and can't determine them automatically. That said, from an architectural point of view I can understand such a choice.
