Question

I need to remove text from an XML file, but haven't had any luck with XMLUpdater.

  • 26 January 2022
  • 7 replies
  • 40 views

This is the block I'm trying to remove:

    <Remarks>

 "<span><span>xxxxxxxxx </span><span class="xxxxxxxxxx">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</span>

<span>xxxxxx </span><span class="xxxxxxxx">xxxxxxxxxxxxxxxx</span></span>" : 2022-01-24T08:23:34   

 "<span><span>xxxxxxxxxxxxxxxxxxxxx (&lt;span class="xxxxxxxx"&gt;xxxxxxxxxxxxxxxxxxxxxxxx&lt;/span&gt;)</span>

<span>Suffix: T (&lt;span class="xxxxxxxx"&gt;xxxxxxxxxxxx&lt;/span&gt;)</span>

<span>xxxxxxxxxxxxxxxxx</span>

<br />

<span>---- </span><span class="xxxxxxxxxxx">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</span>

<span>---- </span><span class="xxxxxxxxxx">xxxxxxxxxxxxxxxxxxxxxx</span>

<br /></span>" : 2022-01-24T08:23:39   

 "<span><span>---- </span><span class="xxxxxxxxxxxxx">xxxxxxxxxxxxxxxxxxxxxx</span>

<span>---- </span><span class="xxxxxxxxxxxxxx">xxxxxxxxxxxxxxxxxxxxxxxxxxx</span>: <span class="xxxxxxxxxxxxxxxxx"> 3</span>

<span>---- </span><span class="xxxxxxxxxxxxx">xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx</span>: <span class="xxxxxxxxxxxxxxxx"> 3</span>

<span>---- </span><span class="xxxxxxxxxxxx">xxxxxxxxxxxxxxxxxxxxx</span>

<span>---- </span><span>xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.</span>

<span>---- </span><span>xxxxx</span>: <span class="xxxxxxx"> 2</span>

<span>---- </span><span class="xxxxxxxxx">xxxxxxxxxxxxxxxxx</span>

<br /></span>" : 2022-01-24T08:24:07   

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   

    </Remarks>

I'm using the XML reader to feed the transformer, but when I get to the text attribute selection I can only select <span>, and it removes only one span. The documentation doesn't show much. I'm new to both FME and XML. Originally I was asked to upload XML files to a vendor through SFTP.

 

Thanks.


7 replies

@davbistrath​ You could just use a regular expression:

<Remarks>.*?</Remarks>

in the StringReplacer. The ? makes the .* match non-greedy, so each <Remarks> block is matched and removed on its own, rather than one match running from the first <Remarks> to the last </Remarks>.

So the pattern in FME would be:

Textline reader - set the parameter Read Whole File At Once: Yes

StringReplacer - Mode: Replace Regular Expression

I've attached an example. (FME 2021.2)
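
To illustrate the greedy versus non-greedy difference outside FME, here is a minimal Python sketch. It uses Python's re module, not the StringReplacer's regex engine, and the sample string is made up for the demonstration. In Python the dot only matches newlines when re.DOTALL is set; whether the same holds in the StringReplacer is not stated in this thread, so treat that flag as an assumption to check.

```python
import re

# Hypothetical sample; the first <Remarks> block spans several lines.
xml_text = (
    "<Features><Feature>aaa</Feature>"
    "<Remarks>\nbbb\n</Remarks>"
    "<Feature>ccc</Feature>"
    "<Remarks>ddd</Remarks></Features>"
)

# Non-greedy: each <Remarks>...</Remarks> block is a separate match.
lazy = re.compile(r"<Remarks>.*?</Remarks>", re.DOTALL)

# Greedy: a single match from the first <Remarks> to the last </Remarks>,
# which also swallows the <Feature>ccc</Feature> element in between.
greedy = re.compile(r"<Remarks>.*</Remarks>", re.DOTALL)

cleaned = lazy.sub("", xml_text)
over_removed = greedy.sub("", xml_text)

print(cleaned)       # <Features><Feature>aaa</Feature><Feature>ccc</Feature></Features>
print(over_removed)  # <Features><Feature>aaa</Feature></Features>
```

The greedy result shows why the ? matters: without it, the element sitting between the two <Remarks> blocks is deleted as well.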

Thank you very much, I will incorporate this.

Hi @davbistrath, I believe XMLUpdater works as expected if you need to remove all the <Remarks> elements from an XML document, like this:

 

Before

<Features><Feature>aaa</Feature><Remarks>bbb</Remarks><Feature>ccc</Feature><Remarks>ddd</Remarks>
</Features>

After

<Features><Feature>aaa</Feature><Feature>ccc</Feature>
</Features>

XMLUpdater parameters settings should be:

[Screenshot: XMLUpdater parameters]

Just be aware that XMLUpdater always requires input from the Update port even if you don't need to use any attributes from the input feature. In this case, you can just send a feature created by the Creator.

[Screenshot: example XMLUpdater workflow]

[Addendum] If you need to remove only the contents of the <Remarks> elements, this setting is available:

[Screenshot: XMLUpdater parameters for deleting contents]
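
Since the parameter screenshots did not survive, here is a rough Python equivalent of the two operations described above: deleting the <Remarks> elements and deleting only their contents. It is a sketch using the standard library's xml.etree.ElementTree, not a description of how XMLUpdater works internally. The function names are made up for illustration, and the sample is the Before document on a single line.

```python
import xml.etree.ElementTree as ET

source = (
    "<Features><Feature>aaa</Feature><Remarks>bbb</Remarks>"
    "<Feature>ccc</Feature><Remarks>ddd</Remarks></Features>"
)


def remove_remarks(xml_text: str) -> str:
    """Delete every <Remarks> element, like the Before/After example."""
    root = ET.fromstring(xml_text)
    # Snapshot parents and children so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "Remarks":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


def clear_remarks(xml_text: str) -> str:
    """Keep the <Remarks> elements but delete their contents."""
    root = ET.fromstring(xml_text)
    for remarks in list(root.iter("Remarks")):
        tail = remarks.tail   # clear() also resets the tail text
        remarks.clear()       # drops text, attributes, and child elements
        remarks.tail = tail
    return ET.tostring(root, encoding="unicode")


removed = remove_remarks(source)
cleared = clear_remarks(source)
print(removed)
print(cleared)
```

Unlike the regular-expression approach, this works on the parsed element tree, so it only succeeds when the input is well-formed XML.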

That's great, I see where I went wrong. Thanks!

Thanks for asking this great question. I made this our Question of the Week and posted a video about it on YouTube. Of course, Takashi has made a great answer already!

So another question: I'm scanning a folder and want to modify all the files in it. I've set up a Folder and File Pathnames reader and have set my XMLUpdater parameters as such:

[Screenshot: XMLUpdater parameters]

It's giving me an error that it can't find the files. Is this the correct way to handle this?

 

Thanks

I suspect that the user parameter "SourceDataset_PATH" holds a folder path rather than an XML file path. The PATH reader outputs features with an attribute called "path_unix" (or "path_windows") that stores the path of each file. Try setting the XML File parameter to that attribute.
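
The same folder-versus-file distinction can be sketched in plain Python, as an analogy rather than an FME workflow. An XML parser needs the path of one file at a time, so the folder is listed first and each file path is handed over individually, much as the PATH reader emits one feature per file. The function names and the *.xml filter are assumptions made for this example.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def remove_remarks_from_file(xml_file: Path) -> int:
    """Drop every <Remarks> element from one file; return how many were removed."""
    tree = ET.parse(xml_file)  # expects a file path; a folder path fails here
    removed = 0
    for parent in list(tree.getroot().iter()):
        for child in list(parent):
            if child.tag == "Remarks":
                parent.remove(child)
                removed += 1
    tree.write(xml_file, encoding="utf-8")  # overwrite the file in place
    return removed


def process_folder(folder: Path) -> dict:
    """Visit each *.xml file in the folder, one file path at a time."""
    return {p.name: remove_remarks_from_file(p) for p in sorted(folder.glob("*.xml"))}


if __name__ == "__main__":
    # Throwaway folder with one made-up file, so the sketch runs as-is.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.xml"
        sample.write_text(
            "<Features><Feature>aaa</Feature><Remarks>bbb</Remarks></Features>",
            encoding="utf-8",
        )
        print(process_folder(Path(tmp)))
```

Passing the folder itself to ET.parse raises an error, which is the Python counterpart of the "can't find the files" message.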
