Solved

Extracting irregular Href

  • January 29, 2019
  • 1 reply
  • 8 views

Hi

I am extracting multiple hrefs. Some irregular CSS selectors are being rejected. I have fixed the spaces with a StringReplacer, but there are a couple I cannot figure out, one being: Unsupported or invalid CSS selector: "democratic-republic-of-congo-(kinshasa)"

I need to remove the parentheses. There are also some with the word "the" at the end of the CSS href.

I only have about 12 of these, so is there a way to replace an entire href with another? I would just create 12 transformers.

Any ideas?

1 reply

daveatsafe
  • Best Answer
  • January 30, 2019

Hi @tmtech,

The StringReplacer can also remove characters from the source string - just leave Replacement Text blank. To remove several different characters in one pass, set Mode to Replace Regular Expression. To remove the parentheses, and 'the' at the end of a line, use Text to Replace = \(|\)|the$.
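
To see what that pattern does outside FME, here is a minimal Python sketch that applies the same regular expression with an empty replacement. It assumes the pattern behaves the same in Python's `re` module as in the StringReplacer, which should hold for something this simple. The function names and the hrefs other than the Congo one are hypothetical, made up for illustration. The second pattern, which also drops the hyphen in front of a trailing "the", is a variant that is not part of the answer above.

```python
import re

# Pattern from the answer: an opening parenthesis, a closing parenthesis,
# or the letters "the" at the very end of the string.
PATTERN = re.compile(r"\(|\)|the$")

# Variant (an assumption, not from the answer): also remove the hyphen
# that would otherwise be left dangling before a trailing "the".
PATTERN_WITH_HYPHEN = re.compile(r"\(|\)|-?the$")


def clean_href(href: str) -> str:
    """Delete every match of PATTERN, like a blank Replacement Text."""
    return PATTERN.sub("", href)


def clean_href_with_hyphen(href: str) -> str:
    """Same as clean_href, but also strips the hyphen before a final 'the'."""
    return PATTERN_WITH_HYPHEN.sub("", href)


if __name__ == "__main__":
    # The first value comes from the question; the others are hypothetical.
    for href in (
        "democratic-republic-of-congo-(kinshasa)",
        "bahamas-the",
        "netherlands-the",
    ):
        print(href, "->", clean_href(href), "|", clean_href_with_hyphen(href))
```

Note that the `$` anchor means only a "the" at the end is removed, so the "the" inside "netherlands" is left alone. With the pattern exactly as given, a value such as "bahamas-the" becomes "bahamas-" with a trailing hyphen, so it may be worth checking whether the target hrefs expect that hyphen before choosing between the two patterns.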