Question

I am creating a job to read structured text files and test the values in particular columns against a set of allowed terms. This involves establishing lists of the data columns, however the column names and positions are never the same.

  • 18 November 2021
  • 3 replies
  • 5 views

I am trying to use the Attribute filter to only see the columns i'm interested in (those containing the text 'lith') and using it's _element_index to filter the columns (lists). Its not working and I'd like some help or advice on better ways to solve this problem. The job will read in a database table of values to check against, but i'm not there yet as I first need to find / isolate the relevant columns of interest.

I attach the text file & workbook for reference

Thanks


3 replies

Userlevel 4
Badge +36

The most easy way may be to do some preprocessing: get rid of the superfluous header lines, and next read in the file as CSV (with Tab as separator).

 

Otherwise you can read the file, remove the header files, write to a temporary file, and then read the temporary file as a CSV file.

FME will then hide all attributes. The attributes can be exported while hidden, when you manually enter their names in the Attribute Definition of the Exporter, and set the default value (to Null or another value).

Image of Workspace DL4_codesAlso see the attached workspace.

The most easy way may be to do some preprocessing: get rid of the superfluous header lines, and next read in the file as CSV (with Tab as separator).

 

Otherwise you can read the file, remove the header files, write to a temporary file, and then read the temporary file as a CSV file.

FME will then hide all attributes. The attributes can be exported while hidden, when you manually enter their names in the Attribute Definition of the Exporter, and set the default value (to Null or another value).

Image of Workspace DL4_codesAlso see the attached workspace.

Thanks for your reply. It is a different approach and i can see the benefit of writing out only the relevant data into a new file. However i still don't see how to limit the columns to only those that have 'lith' in the header? Creating lists of the columns is easy but i don't know how to select only the columns that contain 'lith'. Most of the list transformers require entering an explicit list index by which to process, eg List indexer. However i need it be a variable or such as this number will always be different across the files as the the lith column positions & names are always different. Or is there a way to select columns using a the header line and looking for text string such as Lith ?

Thanks in advance

Userlevel 4
Badge +36

The most easy way may be to do some preprocessing: get rid of the superfluous header lines, and next read in the file as CSV (with Tab as separator).

 

Otherwise you can read the file, remove the header files, write to a temporary file, and then read the temporary file as a CSV file.

FME will then hide all attributes. The attributes can be exported while hidden, when you manually enter their names in the Attribute Definition of the Exporter, and set the default value (to Null or another value).

Image of Workspace DL4_codesAlso see the attached workspace.

You can use the BulkAttributeKeeper from FME Hub to keep only the columns starting with 'Lith'.

In this CSV the attribute names (H1000) are on line 26, and the data lines (D) start at line 28. If this is always the case, you can tweak the CSV Reader to read the field names from line 26 and the data starting with line 28.

Visualization of Workspace DL4_codes_v2

Reply