Question

How to use the CSS selector in HTML extractor

  • October 25, 2022
  • 4 replies
  • 149 views

checcosisani
Contributor

Hi

I would like to extract some information from a website. The text is inside p tags within a div with class = paragraphs_item__body_paragraph_bundle

The website is:

https://www.padovanet.it/notizia/20221025/strade-chiuse

 

See picture below

thx for support

 

[Image: webscraping]

Francesco

 

4 replies

geomancer
Evangelist
  • October 26, 2022
.paragraphs_item__body_paragraph_bundle p

This selects all p elements that are descendants of any element with the class paragraphs_item__body_paragraph_bundle (see the HTMLExtractor documentation and the CSS Selector Reference).
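To illustrate what this descendant selector matches outside FME, here is a minimal Python sketch. It is not how the HTMLExtractor works internally. It uses the XPath subset in the standard library's `xml.etree.ElementTree` as a stand-in for the CSS selector, and it assumes a small, well-formed, made-up sample rather than the real page. The function name `select_paragraphs` and the sample text are hypothetical. Note one difference: the XPath predicate below matches the class attribute exactly, whereas the CSS class selector also matches elements that carry additional classes.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample; the real page will differ.
SAMPLE = """
<html><body>
  <div class="paragraphs_item__body_paragraph_bundle">
    <p>First paragraph.</p>
    <p>Second <strong>paragraph</strong>.</p>
  </div>
  <div class="footer"><p>Not selected.</p></div>
</body></html>
"""

TARGET_CLASS = "paragraphs_item__body_paragraph_bundle"


def select_paragraphs(markup: str) -> list[str]:
    """Return the text of every <p> nested at any depth inside an element
    whose class attribute equals TARGET_CLASS (hypothetical helper)."""
    root = ET.fromstring(markup.strip())
    # Rough XPath counterpart of the CSS selector ".<class> p":
    # any element with that class, then any <p> descendant.
    path = f".//*[@class='{TARGET_CLASS}']//p"
    return ["".join(p.itertext()).strip() for p in root.findall(path)]


if __name__ == "__main__":
    for text in select_paragraphs(SAMPLE):
        print(text)
```

The paragraph in the second div is skipped because it has no ancestor with the target class, which is the same reason the CSS selector ignores it.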

HTMLExtractor_strade_chiuse


checcosisani
Contributor
  • Author
  • October 26, 2022

thx !


checcosisani
Contributor
  • Author
  • September 28, 2024

Hi

 

Do you know if there is any way to extract the info inside a br tag?

I use this selector:

table > tbody > tr:nth-child(-n+10) > td:nth-child(2) > strong:nth-child(5)

but I can’t expose the info inside the br tag.

 

 

This is the website:

 

https://cloud.urbi.it/urbi/progs/urp/ur1ME001.sto?DB_NAME=wt00038560&w3cbt=S&StwEvent=9100030

 

thx

 

Francesco


geomancer
Evangelist
  • September 30, 2024

You can just use an HTTPCaller, a few HTMLExtractors and ListExploders, and an AttributeSplitter.

Note that there is no ‘inside a <br> tag’: <br> is an empty element with no corresponding </br> closing tag. It signifies a line break (a new line starts after <br>), so the text you want comes after it, not inside it. FME turns <br> into <br/> (I found this out by testing).
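The splitting step can be sketched outside FME as follows. This is a rough Python illustration of the idea behind splitting the extracted cell content on the line-break tag, not a description of how the AttributeSplitter is implemented. The function name `split_on_br` and the sample cell content are made up for the example.

```python
import re
from html import unescape


def split_on_br(fragment: str) -> list[str]:
    """Split an HTML fragment into the text pieces separated by line breaks
    (hypothetical helper). Accepts <br>, <br/> and <br />, in any letter case."""
    parts = re.split(r"<br\s*/?>", fragment, flags=re.IGNORECASE)
    pieces = []
    for part in parts:
        # Drop any remaining tags (for example <strong>) and decode entities.
        text = unescape(re.sub(r"<[^>]+>", "", part)).strip()
        if text:
            pieces.append(text)
    return pieces


if __name__ == "__main__":
    # Made-up cell content; the real table cells will differ.
    sample_cell = "<strong>Subject</strong><br>First line<br/>Second line<BR />"
    print(split_on_br(sample_cell))
```

Because the pattern accepts both `<br>` and `<br/>`, the same split works whether or not the markup has been normalized to the self-closing form.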

 

