Question

Converting to required format

4 years ago
16 September 2019
3 replies
1 view

asabhinavgreats
15 replies

Hi All,

I have my data in the form given below. The data from 1507 to 2451 is a header and the three rows highlighted in green are one set of data and another set is appearing below it with differing values.

Now i want to have the data in the format:

where A to J are my columns names from the header (A-J may change in number like sometimes A-F or A-I or A-J as well). Value is coming from first row of the set of data and text from the third row of the set of data.

Sometimes the data may be missing also. e.g the second row may be missing and only two rows are there. so for that value we have to write nothing in the main table (shown above)

I have attached the excel file for reference

Kindly help.

3 replies

+22

jdh
Contributor
1959 replies
4 years ago
16 September 2019

Is the original data in excel, or is it a text file?

The header with the column names, in the xlsx file, some of them end with a period ( A.) whereas others end with a letter (H) do both occur in the actual data, or is this a typo?

D

Hi @asabhinavgreats!

I hope the attached screenshot explains the steps I tried. This fixes most of your problems. I'm sure someone will now give you three lines of python code to do the same and more, but anyway... :)

This does not address the problem with row 2 sometimes missing and also you will get empty columns at the end when you have less than the maximum number of values. That's something to keep working on.

Userlevel 4

+25

That's a fairly (actually very) tough proposition. I think the answer from @danullen is about as good as you can get. The only thing I'd mention is that the fields A,B,C,D, etc are hard-coded in his example. If you wished to go about this dynamically (i.e. you don't know in advance) then you could use this method to generate the output schema. Instead of having _list{0}, etc you would have attribute{0}.name, attribute{0}.fme_data_type, etc.

I'd also suggest that the CAT reader (Column-Aligned Text) would be useful for reading the first few header lines; so another solution would be to basically read those lines, read the actual data with a textline reader, and basically reformat it all to create a new dataset that is simpler to carry out the actual task (ie convert it to an intermediate "format"/structure, to make the real work easier). Just a thought.

Converting to required format

1 Attachment

3 replies

Reply

Community Stats

1 Attachment

Reply

Community Stats

Sign up

An FME Account is required to contribute

Login to the community

An FME Account is required to contribute

Scanning file for viruses.

This file cannot be downloaded