Question

Using wildcard in StringReplacer to match but not change text

  • 4 November 2021
  • 2 replies
  • 18 views

I'm trying to find and replace a string in a text file where the first part is constant, the second part could be anything, and the last part is constant. I want to change some of the 'last part' text while keeping the first part and the second part (which could be anything) the same. I hope the highly simplified example below illustrates what I mean:

 

Input text:

This is constant Line 1

*This is anything Line 2*

This is constant again and will change, Line 3

Output result:

This is constant Line 1

This is anything Line 2

This Line 3 has now changed

 

Does anyone know how this could be accomplished? The files being fed in have hundreds of occurrences of the last part, so searching on the first part + second part (which could be anything, a wildcard perhaps?) + last part should ensure only one result.

I hope this makes sense and thanks in advance!
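
One way to picture the "match but don't change" idea is a regular expression with a capture group: the first and second parts are captured and written back untouched, and only the last part is replaced. Below is a minimal Python sketch using the simplified example from the question; the variable names are made up for illustration. The same idea should carry over to a StringReplacer in regular-expression mode, though the exact backreference syntax there is worth checking in the transformer documentation.

```python
import re

text = (
    "This is constant Line 1\n"
    "This is anything Line 2\n"
    "This is constant again and will change, Line 3\n"
)

# Group 1 captures the constant first line plus one arbitrary line after it.
# Without the DOTALL flag, "." does not cross a newline, so ".*\n" is exactly one line.
# The third line is matched outside the group, so it is the only part replaced.
pattern = re.compile(
    r"(This is constant Line 1\n.*\n)This is constant again and will change, Line 3"
)

# \g<1> writes the captured text back unchanged; count=1 limits it to the first hit.
result = pattern.sub(r"\g<1>This Line 3 has now changed", text, count=1)
print(result)
```

Because the constant first line is part of the pattern, other occurrences of the last line elsewhere in the file would not match.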


2 replies

Hi @timboberoosky, if I understood correctly, this sounds like a regex type of solution. If the first and third lines are constant, you could look for what sits in between. For example, \((.*?)\) matches whatever is between a closed pair of parentheses. Feel free to share a small example of what you're working with here!
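
To show what the lazy quantifier in that pattern does, here is a small Python sketch comparing it with the greedy form. The sample string is invented for illustration.

```python
import re

sample = "keep (first) and (second)"

# Lazy: ".*?" stops at the first closing parenthesis it can.
lazy = re.findall(r"\((.*?)\)", sample)

# Greedy: ".*" runs on to the last closing parenthesis in the string.
greedy = re.findall(r"\((.*)\)", sample)

print(lazy)    # ['first', 'second']
print(greedy)  # ['first) and (second']
```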

Thanks @jennaatsafe.

I think I explained my scenario quite poorly! I'll try to do better here. I have uploaded an example text file (XML formatted, but we're just using a .txt extension for this) and the workspace I have, just as an example (I know it doesn't work!). In the file, there are multiple <WatermarkDefinition> tags with differing <Name> tags underneath. I'm looking specifically for the entire strings using this search/replace methodology:

  • Where <WatermarkDefinition> has <Name>Default</Name> tag underneath
    • Where the same <WatermarkDefinition> above has <Overlay> --> <SignatureContent> with any value (i.e. ignore this value in the search but keep it in the output file)
      • In <WatermarkDefinition> --> <Overlay> , change value of the <ImagePositionXY> tag (and NOT the <PositionXY> tag)
      • In <WatermarkDefinition> --> <ESignature> , change value of the <PositionXY> tag (and NOT the <ImagePositionXY> tag)

 

My thought is to:

  • Use StringExtractor with RegEx to extract the entire <WatermarkDefinition> (<Name>Default</Name>) tag as a new attribute (see my dilemma on this one below)
  • Use StringReplacers with RegEx to modify <Overlay> --> <ImagePositionXY> and <ESignature> --> <PositionXY> in that extracted text
  • Put the entire modified <WatermarkDefinition> (<Name>Default</Name>) tag back into the original text_line_data (searching again for the original <WatermarkDefinition> (<Name>Default</Name>) tag) for the output text file
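
The three steps above can be sketched in Python as follows. This is only an illustration under assumptions: the sample XML is invented from the description (the real attachment is not reproduced here), `<WatermarkDefinition>` is assumed to have no attributes, and the helper names `set_tag` and `update_default` are hypothetical. A real XML parser would likely be more robust than regex for this kind of file.

```python
import re

# Hypothetical sample built from the description; the real file will differ.
xml_text = """<Watermarks>
<WatermarkDefinition>
<Name>Other</Name>
<Overlay><SignatureContent>abc</SignatureContent><ImagePositionXY>1,1</ImagePositionXY><PositionXY>2,2</PositionXY></Overlay>
<ESignature><PositionXY>3,3</PositionXY><ImagePositionXY>4,4</ImagePositionXY></ESignature>
</WatermarkDefinition>
<WatermarkDefinition>
<Name>Default</Name>
<Overlay><SignatureContent>xyz</SignatureContent><ImagePositionXY>1,1</ImagePositionXY><PositionXY>2,2</PositionXY></Overlay>
<ESignature><PositionXY>3,3</PositionXY><ImagePositionXY>4,4</ImagePositionXY></ESignature>
</WatermarkDefinition>
</Watermarks>
"""

# Step 1: one whole <WatermarkDefinition> block containing <Name>Default</Name>.
# The (?:(?!</WatermarkDefinition>).)* part refuses to cross a closing tag, so the
# match cannot start in an earlier definition or run past the end of this one.
BLOCK = re.compile(
    r"<WatermarkDefinition>(?:(?!</WatermarkDefinition>).)*?<Name>Default</Name>"
    r"(?:(?!</WatermarkDefinition>).)*</WatermarkDefinition>",
    re.DOTALL | re.IGNORECASE,
)


def set_tag(section, tag, new_value, block):
    """Step 2: replace the value of <tag> inside <section>, leaving other tags alone."""
    pattern = re.compile(
        rf"(<{section}>(?:(?!</{section}>).)*?<{tag}>)[^<]*(</{tag}>)",
        re.DOTALL,
    )
    # A function replacement avoids any backslash handling in new_value.
    return pattern.sub(lambda m: m.group(1) + new_value + m.group(2), block, count=1)


def update_default(text, overlay_xy, esig_xy):
    """Step 3: rewrite the matched block and splice it back into the full text."""
    def rewrite(match):
        block = match.group(0)
        block = set_tag("Overlay", "ImagePositionXY", overlay_xy, block)
        block = set_tag("ESignature", "PositionXY", esig_xy, block)
        return block
    return BLOCK.sub(rewrite, text, count=1)


updated = update_default(xml_text, "10,20", "30,40")
print(updated)
```

Doing the replacement inside a callback means the modified block lands exactly where the original was found, so there is no need to search for the original block a second time.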

 

So far I've had some successes and failures. The main issue I'm facing now is getting the first RegEx to return ONLY one of the <WatermarkDefinition> (<Name>Default</Name>) tags to modify, as opposed to all of them (the attachment only has a few but the actual file has 200+). The RegEx I've tried is:

((?i)<Name>Default</Name>)\s*\w*+.*</WatermarkDefinition>

But this includes everything up to the last </WatermarkDefinition> in the entire file, whereas I want it to stop at the FIRST occurrence of the end tag </WatermarkDefinition>.

 

Does this make any sense?

Thanks!
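
The behaviour described here looks like a greedy `.*`, which runs to the last possible end tag; making it lazy (`.*?`) stops at the first one. A minimal Python sketch on invented sample text follows. In Python the `s` flag is needed so that `.` spans line breaks; whether an equivalent is needed in FME depends on its regex engine.

```python
import re

# Invented two-block sample; only the first block is the "Default" one.
text = (
    "<WatermarkDefinition>\n<Name>Default</Name>\n<Overlay>A</Overlay>\n</WatermarkDefinition>\n"
    "<WatermarkDefinition>\n<Name>Other</Name>\n<Overlay>B</Overlay>\n</WatermarkDefinition>\n"
)

# Greedy: ".*" extends to the LAST </WatermarkDefinition> in the text.
greedy = re.search(r"(?is)<Name>Default</Name>.*</WatermarkDefinition>", text)

# Lazy: ".*?" stops at the FIRST </WatermarkDefinition> after the Name tag.
lazy = re.search(r"(?is)<Name>Default</Name>.*?</WatermarkDefinition>", text)

print("Other" in greedy.group(0))
print("Other" in lazy.group(0))
print(lazy.group(0))
```

Note that this match starts at the <Name> tag rather than at the opening <WatermarkDefinition>, the same as the pattern it is based on.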
