Hello @smithgk, hmm.. unfortunately, I haven't been able to find a way to isolate/extract just Colorado records from the entire dataset😔 I think subsetting the Colorado records from the entire file will require a lot of AttributeSplitters/ListExploding/joining/etc - not impossible though!
If you're able to copy and paste the Colorado records (or other records of interest) into a separate text file, the CAT Reader works flawlessly!
Hope this helps, Kailin.
It is not too difficult to extract the Colorado data. Read the entire text file at once. Use a StringSearcher with this regular expression (all * need to be escaped) to extract all Colorado data:
.+(\*\*\*\*\* Colorado \*\*\*\*\*[^\*]+)
Use the Subexpression Matches List Name to save the results in a list. Write the results to a text file.
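For anyone who wants to test the pattern outside FME first, here is a rough Python sketch of the same idea. The sample text is made up to mimic the layout described in this thread, and the leading `.+` is left out because Python's `re.search` already scans forward to the header. Note that the match stops at the first asterisk after the header, so it assumes the Colorado data itself contains no `*`.

```python
import re

# Hypothetical sample mimicking the layout described in the thread:
# each state's block starts with a "***** <State> *****" header line.
SAMPLE = (
    "***** Arizona *****\n"
    "AZ001  Phoenix   12\n"
    "***** Colorado *****\n"
    "CO001  Denver    34\n"
    "CO002  Boulder   56\n"
    "***** Utah *****\n"
    "UT001  Provo     78\n"
)

# Match the Colorado header, then everything up to (not including)
# the next asterisk, which is where the next state's header begins.
PATTERN = re.compile(r"(\*\*\*\*\* Colorado \*\*\*\*\*[^*]+)")


def extract_colorado(text):
    """Return the Colorado block (header plus data lines), or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


block = extract_colorado(SAMPLE)
# Rough equivalent of splitting on newline and exploding the list.
lines = block.strip().split("\n")
print(lines)
```

The last two statements mirror the follow-up step of splitting the captured text on newlines so that each record can be handled on its own.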
Or continue processing the resulting data, probably starting with an AttributeSplitter on <Newline \n>, and a ListExploder. Something like this:
Hello @smithgk, jumping back into the conversation here - hope you don't mind! Great idea with the StringSearcher @geomancer! After another attempt, I managed to extract the state information using the Adjacent Features function! Here is another potential solution for you!
After reading the text file, the AttributeCreator gets the state name (using regex) from any line that starts with ***** <text> ***** and assigns every subsequent feature (presumed to be data) the state value from the previous feature. An Aggregator then combines the lines back into a single attribute, which is temporarily written out as a text file. Because that file is not created until runtime, you can point the FeatureReader at a similar dataset with the same schema/formatting (in my case, I used the colorado.txt file I showed in the first comment). Let me know if you have any questions at all or have issues running the workspace. Happy to help, Kailin.
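In case it helps to see the "carry the state forward" logic outside of a workspace, here is a small Python sketch of it. It is not the workspace itself, just the same idea under my own assumptions: the header regex, the function name `group_by_state`, and the sample lines are all hypothetical.

```python
import re
from collections import defaultdict

# Assumed header shape: five asterisks, a space, the state name,
# a space, five asterisks (as described in the thread).
HEADER = re.compile(r"^\*{5} (.+?) \*{5}\s*$")


def group_by_state(lines):
    """Tag each data line with the most recent state header seen."""
    groups = defaultdict(list)
    state = None
    for line in lines:
        match = HEADER.match(line)
        if match:
            # New header: remember the state for the lines that follow.
            state = match.group(1)
        elif state is not None and line.strip():
            # Data line: inherits the state from the previous header.
            groups[state].append(line)
    return dict(groups)


# Hypothetical sample lines for illustration only.
sample_lines = [
    "***** Arizona *****",
    "AZ001  Phoenix   12",
    "***** Colorado *****",
    "CO001  Denver    34",
    "CO002  Boulder   56",
]
by_state = group_by_state(sample_lines)
print(by_state["Colorado"])
```

Grouping by state this way means any state can be pulled out later, not only Colorado.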
Thanks geomancer! This is a good idea, I'll see about inputting into the workflow. I appreciate it!
Thank you Kailin! This looks like a nice and agile workflow that I'll definitely try out. Over the last few days I did come up with a solution too, using the AttributeSplitter transformer, though it took some trial and error. I first read the text file into Excel and used the Text to Columns function. I used the 'fixed width' option and wrote down the exact number of characters that separate the attributes. Everything displayed perfectly in Excel using that method.
I then went to FME and read the text data using a Text Reader; I used the 'Number of Lines to Skip' and 'Number of Footer Lines to Skip' parameters to grab just the Colorado records. Using the AttributeSplitter transformer, I then entered the exact number of spaces that separate the attributes using the 'Format String' option (see screenshot). The transformer split the data out into list values, which I then renamed with an AttributeManager transformer.
I know this workflow may not be as agile and flexible as previous solutions, but it does get the data to where it needs to be since the format/layout of the text file will never change (column values will only change). That said, I'll definitely be trying out the other workflows to replicate the process and see if I can get something that's more adaptable to possible future database changes. Thank you Kailin!
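For reference, the skip-lines-then-split-by-width approach can be sketched in Python roughly as follows. The column widths, skip counts, and sample lines are placeholders rather than the real layout of the file, and `read_fixed_width` is a hypothetical helper name.

```python
def read_fixed_width(lines, skip_header, skip_footer, widths):
    """Drop leading/trailing lines, then cut each line into fixed widths."""
    end = len(lines) - skip_footer
    records = []
    for line in lines[skip_header:end]:
        fields = []
        pos = 0
        for width in widths:
            # Take the next `width` characters and trim the padding.
            fields.append(line[pos:pos + width].strip())
            pos += width
        records.append(fields)
    return records


# Placeholder data: one line to skip at each end, widths of 6, 10 and 2.
sample = [
    "HEADER",
    "CO001 Denver    34",
    "CO002 Boulder   56",
    "FOOTER",
]
records = read_fixed_width(sample, skip_header=1, skip_footer=1, widths=[6, 10, 2])
print(records)
```

As noted above, this only holds up while the file layout stays fixed, since both the skip counts and the widths are hard-coded.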
Hi, I see I forgot to add the workspace. I've attached it below.
The AttributeManager reads all the different attributes, with Attribute Values like
@Trim(@Substring(@Value(Data),71,6))
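To make that expression concrete: assuming @Substring takes a 0-based start index followed by a character count, it reads 6 characters starting at offset 71 and trims the whitespace. A Python equivalent, with a made-up record and a hypothetical helper name, would look something like this:

```python
def trim_substring(value, start, length):
    """Take `length` characters from 0-based offset `start`, then trim."""
    return value[start:start + length].strip()


# Made-up record: 71 filler characters, then a padded 6-character field.
record = " " * 71 + " 42.5 " + "rest"
field = trim_substring(record, 71, 6)
print(repr(field))
```

Each attribute in the AttributeManager would use its own offset and length in the same way.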