
Hi,

I'm trying to build a web scraper for Italian websites using CSS selectors (no API is available). Since each website has a different structure, I store the CSS selectors in a PostgreSQL table and then pass them to the HTMLExtractor as parameters.

The problem occurs when an attribute is missing on a website.

E.g. normally I extract data_inizio and data_fine, but in some cases data_fine isn't present.

I suspect that the HTMLExtractor stops when the CSS selector for a specific parameter is missing. I tried using a conditional value, but the only option it offers is to stop.


So the question is: how can I tell the HTMLExtractor to proceed when the CSS selector in a parameter is missing?

 

Hope the explanation is clear

 

Thanks for your support

 

Francesco

So the issue is that a missing attribute value causes the HTMLExtractor to fail and reject features?

 

If you use a NullAttributeMapper to map empty, null and missing attributes to a dummy value that should never match anything, the HTMLExtractor will then just return a blank for that CSS selector.
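
To make the idea concrete, here is a rough sketch in plain Python of the substitution the NullAttributeMapper would perform on each row of selectors. It is not FME code: the function name `fill_missing_selectors`, the sentinel `DUMMY_SELECTOR` and the example selector strings are all made up for illustration, and only data_inizio and data_fine come from the question.

```python
# Hypothetical sentinel: syntactically valid as a CSS type selector,
# but assumed never to match any element on the scraped pages.
DUMMY_SELECTOR = "no-such-element-xyz"


def fill_missing_selectors(row, expected):
    """Return a copy of `row` in which every expected selector that is
    missing, None, or blank is replaced by DUMMY_SELECTOR."""
    filled = dict(row)
    for name in expected:
        value = filled.get(name)
        # Treat an absent key, None, and whitespace-only text the same way.
        if value is None or not str(value).strip():
            filled[name] = DUMMY_SELECTOR
    return filled


# Example row as it might come from the selector table:
# this site has a start date but no end date.
row = {"data_inizio": "span.start-date", "data_fine": None}
print(fill_missing_selectors(row, ["data_inizio", "data_fine"]))
```

With every parameter now holding some selector, the extraction step always receives a usable value, and the dummy one simply matches nothing.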

