I'm trying to work out how to parse the SAMPLES attribute from the Trace component of the SEG-Y reader. The reader outputs the samples as encoded FME Binary, which is shown as HEX in the attribute viewer. The samples are a byte array of 246 consecutive UINT16 readings.

I want to parse the samples into a list of UINT16 values so I can then do some analysis on them.

I have tried a whole range of approaches to process the string into a list of numbers but nothing seems to work and I suspect I'm just missing something.

Anyone have any ideas?

I think @daveatsafe​ should be able to help with this one.

If not, the PythonCaller might be your next option.
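For the PythonCaller route, the decoding step itself could look something like the sketch below. It only covers turning a byte string into a list of numbers; how the SAMPLES attribute is fetched from the feature inside the PythonCaller is not shown. The function name `decode_uint16_samples` is made up for illustration, and big-endian byte order is an assumption (it is the usual SEG-Y convention, but check your file).

```python
import struct

def decode_uint16_samples(raw, big_endian=True):
    """Unpack a byte string into a list of unsigned 16-bit integers.

    Assumption: every sample is exactly 2 bytes and all samples share
    one byte order (big-endian unless big_endian=False).
    """
    if len(raw) % 2:
        raise ValueError("expected an even number of bytes")
    count = len(raw) // 2
    order = ">" if big_endian else "<"
    # "H" is struct's format code for an unsigned 16-bit integer
    return list(struct.unpack(f"{order}{count}H", raw))

# Three made-up samples: 0x0001, 0x00FF, 0x1234
raw = bytes.fromhex("000100ff1234")
print(decode_uint16_samples(raw))  # [1, 255, 4660]
```

If the decoded numbers look implausible, trying `big_endian=False` is a quick way to check whether the byte order assumption is wrong.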


Try the BinaryEncoder transformer. It converts a binary field to a text representation such as HEX.


Hi @tim.rastall1​ ,

 

As @mark2atsafe​ mentioned, you can use the BinaryEncoder to convert the binary data to HEX.

 

Next, divide the HEX string into one chunk per UINT16 value. Each HEX digit represents 4 bits, so a 16-bit value takes 4 HEX characters and the string needs to be split into 4-character chunks. This can be done with a StringSearcher, with the Contains Regular Expression parameter set to '.{4}' and Advanced - All Matches List Name set to 'values'.

 

Now we want to convert the list values from HEX to the numerical value. To do this we will need to explode the list into individual parts using the ListExploder, on the values{} list. Then use a BaseConverter to convert the 'match' attribute from base 16 to base 10.

 

If you want to rebuild the list, you can use an Aggregator, with Group By set to the SegY record number.
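For anyone who wants to check the logic outside Workbench, the chunk-and-convert steps above can be sketched in plain Python. This is only an illustration of the same idea, not what the transformers do internally; the function name `hex_to_uint16_list` is hypothetical, and reading each chunk as a single base-16 number assumes the most significant byte comes first.

```python
import re

def hex_to_uint16_list(hex_string):
    """Split a HEX string into 4-character chunks and convert each to base 10."""
    # Same pattern as the StringSearcher step: any 4 characters per match.
    # Leftover characters at the end (fewer than 4) are silently dropped.
    chunks = re.findall(r".{4}", hex_string)
    # Same idea as the BaseConverter step: base 16 to base 10.
    return [int(chunk, 16) for chunk in chunks]

print(hex_to_uint16_list("000100FF1234"))  # [1, 255, 4660]
```

Because the function returns the values as one list, there is no separate explode and rebuild step in this version.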

 




Hi @daveatsafe

This is perfect. I finally managed to get something working with a similar approach, using the string splitter and a long row of '4s4s4s4s4s....', but the StringSearcher is much more efficient!

Side note - I've noticed that the SEG-Y reader doesn't apply the coordinate scalar value that occupies the 2 bytes before the x coordinate. Without the scalar, the coordinates it outputs can be wildly wrong 🙂 - in my case the scalar is always 0.01, so it's easy to fix, but maybe worth including in the reader? Also, basic sample value interpretation seems like a useful addition to the SEG-Y reader, as it would avoid the process you've suggested. What's the best practice for requesting these features?
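In case it helps anyone else, here is a rough sketch of the scalar fix. It relies on my understanding of the SEG-Y convention, which is an assumption worth checking against the standard: the scalar is stored as a signed integer, a positive value is a multiplier and a negative value is a divisor, so an effective 0.01 would be stored as -100. The function name `apply_coordinate_scalar` is made up for illustration.

```python
def apply_coordinate_scalar(raw_coord, scalar):
    """Apply a SEG-Y style coordinate scalar to a raw header coordinate.

    Assumed convention: positive scalar multiplies, negative scalar
    divides, and 0 means "no scaling".
    """
    if scalar > 0:
        return raw_coord * scalar
    if scalar < 0:
        return raw_coord / abs(scalar)
    return raw_coord

# A made-up raw coordinate with a stored scalar of -100 (effective 0.01)
print(apply_coordinate_scalar(12345678, -100))  # 123456.78
```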

Greatly appreciate your (and the community's) help here!

