Solved

HTML parsing



Hello,

Can you please help me with HTML parsing? Which settings (CSS selector, etc.) should I use in the HTMLExtractor so that I get only the value of the data-title attribute?

Regards, Pavel.


8 replies

takashi
  • Contributor
  • Best Answer
  • May 16, 2017

Hi @pavelpostnov, try this setting. See the help on the HTMLExtractor to learn more.
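
The general idea behind this kind of extraction is to match the elements that carry the attribute and then read the attribute value rather than the element text. A minimal sketch of that idea outside FME, using only the Python standard library, might look like the following. It is an illustration under assumptions, not the HTMLExtractor's implementation: the sample markup, the `item` class, and the `DataTitleCollector` name are all hypothetical.

```python
from html.parser import HTMLParser


class DataTitleCollector(HTMLParser):
    """Collect the value of every data-title attribute in a document."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; attribute names are lowercased
        for name, value in attrs:
            if name == "data-title" and value is not None:
                self.titles.append(value)


# Hypothetical sample markup, not taken from the page discussed in this thread
sample_html = """
<ul>
  <li class="item" data-title="First item">ignored text</li>
  <li class="item" data-title="Second item">ignored text</li>
  <li class="item">no attribute here</li>
</ul>
"""

collector = DataTitleCollector()
collector.feed(sample_html)
data_titles = collector.titles
print(data_titles)
```

Only the attribute values are collected; the text inside each element is ignored, and elements without the attribute contribute nothing.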



Hi @takashi, thanks a lot for your help!



Hi @takashi,

If you can, please recommend which CSS selector I can use to select only the ID values from this string (see picture):

Regards, Pavel.


takashi
  • Contributor
  • May 18, 2017

It seems to be a JSON object inside JavaScript code that is embedded as the text of a <script> element. I don't think the "id" value can be extracted directly with a CSS Selector. However, you can probably extract the entire text of the <script> element by setting "script" as the CSS Selector, and then parse that text to extract the "id" value. The StringSearcher and/or the JSON transformers might help you.
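
The two steps described above (take the whole text of the <script> element, then parse it for the "id" value) can be sketched outside FME as follows. This is only an illustration in Python with the standard library, not what the FME transformers do internally. The sample page, the `product` variable, and the `ScriptTextCollector` name are hypothetical, and the simple regular expression assumes each script holds a single JSON object literal.

```python
import json
import re
from html.parser import HTMLParser


class ScriptTextCollector(HTMLParser):
    """Collect the raw text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script text may arrive in several chunks, so append to the last entry
        if self.in_script:
            self.scripts[-1] += data


# Hypothetical sample page, not the page from the screenshot
sample_html = """
<html><body>
<script>
  var product = {"id": 12345, "name": "example"};
</script>
</body></html>
"""

collector = ScriptTextCollector()
collector.feed(sample_html)

ids = []
for script_text in collector.scripts:
    # Step 1: cut the JSON object out of the surrounding JavaScript
    match = re.search(r"\{.*\}", script_text, re.DOTALL)
    if match is None:
        continue
    # Step 2: parse the JSON and read the "id" value
    try:
        obj = json.loads(match.group(0))
    except json.JSONDecodeError:
        continue
    if "id" in obj:
        ids.append(obj["id"])

print(ids)
```

The regular expression step plays roughly the role a string-searching step would play in a workspace, and the `json.loads` step the role of a JSON-parsing step. A real page may embed several objects or non-JSON JavaScript literals, in which case the pattern would need to be more specific.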


@takashi, thanks a lot for the information.


  • March 5, 2020

Hi @takashi, sorry for bringing up an old topic, but can you please help me? I want to get these values

(circled in red)

from this link: https://www.infoclimat.fr/observations-meteo/archives/27/novembre/2019/tarbes-ossun-lourdes/07621.html

I tried to use the HTMLExtractor, but I am stuck on the CSS Selector. If you can help me, thank you.
