I am trying to populate multiple fields (codes) by parsing the data in one incoming field (NAICS_CODE), using an AttributeManager transformer and SubstringRegularExpression.
The field containing the incoming data is called NAICS_CODE and can contain data for up to 30 separate codes. For example, in one record the field NAICS_CODE has 10 separate codes.
Using an online regex editor, I came up with the following expressions, one per parsed field, where the repetition count in braces runs from 0 to 9.
^\s*(?:\S+\s+){0}(\S+) THROUGH ^\s*(?:\S+\s+){9}(\S+)
And when I test this code against the following data with the online REGEX editor:
236210-DBE/MBE/SBE 236220-DBE/MBE/SBE 238210-DBE/MBE/SBE 238990-DBE/MBE/SBE 335311-DBE/MBE/SBE 335312-DBE/MBE/SBE 335313-DBE/MBE/SBE 335314-DBE/MBE/SBE 335999-DBE/MBE/SBE 423610-DBE/MBE/SBE
The results seem to be what I am looking for, returning the correct code from the field for each indexed value.
^\s*(?:\S+\s+){0}(\S+) returns 236210-DBE/MBE/SBE
THROUGH
^\s*(?:\S+\s+){9}(\S+) returns 423610-DBE/MBE/SBE
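To make the online test reproducible, here is a minimal Python sketch of the same check. It uses Python's re module rather than FME, so it only confirms the pattern itself; the helper name nth_code is made up for this illustration.

```python
import re

# Sample NAICS_CODE value from the record above (ten codes).
SAMPLE = (
    "236210-DBE/MBE/SBE 236220-DBE/MBE/SBE 238210-DBE/MBE/SBE "
    "238990-DBE/MBE/SBE 335311-DBE/MBE/SBE 335312-DBE/MBE/SBE "
    "335313-DBE/MBE/SBE 335314-DBE/MBE/SBE 335999-DBE/MBE/SBE "
    "423610-DBE/MBE/SBE"
)


def nth_code(text, n):
    """Return the n-th (0-based) whitespace-separated code, or None.

    Hypothetical helper: it skips n "code + whitespace" pairs and
    captures the next run of non-whitespace characters.
    """
    pattern = r"^\s*(?:\S+\s+){%d}(\S+)" % n
    match = re.search(pattern, text)
    return match.group(1) if match else None


codes = [nth_code(SAMPLE, i) for i in range(10)]
print(codes[0])  # first code in the field
print(codes[9])  # tenth code in the field
```

Because the separator is matched with \s+, the same pattern handles one or two spaces between codes, and an index past the last code gives no match at all.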
However, when I try to set the value of each parsed field using the AttributeManager transformer, something in the expression must not be set up correctly. I read the documentation, and I am setting startIdx to 1, captureNum to 0 and matchNum to 0 for each parsed field.
All the alpha characters are uppercase. Two blank spaces (‘ ‘) separate each code, except the first which is the start of the line.
When I run the workspace up to the point of conversion, the resulting field is <null>, not the string I am expecting.
I have only set up the first two fields, and they come up as <null>, while the other eight fields seem to be blank.
The expressions I am using for the first two fields (0, 1) are:
@SubstringRegularExpression(@Value(NAICS_CODE),^\s*(?:\S+\s+){0}(\S+),1,0,0,caseSensitive=TRUE)
@SubstringRegularExpression(@Value(NAICS_CODE),^\s*(?:\S+\s+){1}(\S+),1,0,0,caseSensitive=TRUE)
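For comparison, this is how a general-purpose regex engine numbers captures, sketched in Python's re module rather than FME. There, group 0 is the entire match and group 1 is the first parenthesised capture. Whether FME's captureNum argument follows the same numbering is an assumption I have not confirmed, so this only shows what the two index values would mean under that convention.

```python
import re

# First three codes from the sample record, single-space separated.
text = "236210-DBE/MBE/SBE 236220-DBE/MBE/SBE 238210-DBE/MBE/SBE"

# Same pattern as the expression for the second field (index 1).
match = re.search(r"^\s*(?:\S+\s+){1}(\S+)", text)

whole_match = match.group(0)  # everything the pattern consumed
first_group = match.group(1)  # only the text captured by (\S+)

print(whole_match)  # spans the skipped code plus the captured one
print(first_group)  # just the second code
```

In Python, only group 1 isolates the single code I am after; group 0 also includes the codes that were skipped over.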
I am open to any advice.
Thanks in advance.