Solved

HTML parsing

  • May 16, 2017
  • 8 replies
  • 80 views


Hello,

Could you please help me with HTML parsing? Which settings (CSS selector, etc.) should I use in the HTMLExtractor to get only the value of the data-title attribute?

Regards, Pavel.



8 replies

takashi
Celebrity
  • 7843 replies
  • Best Answer
  • May 16, 2017

Hi @pavelpostnov, try this setting. See the help on the HTMLExtractor to learn more.
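As a rough illustration of what reading only the `data-title` attribute involves outside FME, here is a minimal Python sketch using the standard library's `html.parser`. It is not the HTMLExtractor setting itself; the sample HTML and the names `DataTitleCollector` and `extract_data_titles` are hypothetical and only serve the example.

```python
from html.parser import HTMLParser


class DataTitleCollector(HTMLParser):
    """Collects the value of every data-title attribute in a document."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name == "data-title" and value is not None:
                self.titles.append(value)


def extract_data_titles(html):
    """Return the data-title values in document order."""
    collector = DataTitleCollector()
    collector.feed(html)
    collector.close()
    return collector.titles


# Hypothetical sample page, assumed only for illustration.
SAMPLE_HTML = """
<ul>
  <li class="item" data-title="First item">A</li>
  <li class="item">B</li>
  <li class="item" data-title="Second item">C</li>
</ul>
"""

if __name__ == "__main__":
    print(extract_data_titles(SAMPLE_HTML))
```

The same idea corresponds to selecting elements with an attribute selector such as `[data-title]` and then reading the attribute rather than the element text.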


  • Author
  • 6 replies
  • May 17, 2017

Hi @takashi, thanks a lot for your help!


  • Author
  • 6 replies
  • May 18, 2017

Hi @takashi,

If you can, please recommend which CSS selector I can use to select only the ID values from this string (see picture):

Regards, Pavel.


takashi
Celebrity
  • 7843 replies
  • May 18, 2017

It seems to be a JSON object inside a JavaScript script that is embedded as the text value of a <script> element. I don't think the "id" value can be extracted directly with a CSS selector. However, you can probably extract the entire text of the <script> element by setting "script" as the CSS Selector, and then parse that text to extract the "id" value. The StringSearcher and/or the JSON transformers might help you.
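The two-step idea (grab the whole `<script>` text, then parse it for the "id" value) can be sketched outside FME in Python with only the standard library. This is an illustration under assumptions, not the FME workflow: the sample page is hypothetical, the JSON is assumed to be a plain object literal assigned to a variable, and the names `ScriptTextCollector`, `first_json_object` and `extract_ids` are made up for the example.

```python
import json
from html.parser import HTMLParser


class ScriptTextCollector(HTMLParser):
    """Collects the raw text of every <script> element."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def first_json_object(text):
    """Return the first valid JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and
            # ignores whatever JavaScript follows it.
            obj, _end = decoder.raw_decode(text, pos)
            return obj
        except json.JSONDecodeError:
            pos = text.find("{", pos + 1)
    return None


def extract_ids(html):
    """Return the "id" value of the first JSON object in each script."""
    collector = ScriptTextCollector()
    collector.feed(html)
    collector.close()
    ids = []
    for script_text in collector.scripts:
        obj = first_json_object(script_text)
        if obj is not None and "id" in obj:
            ids.append(obj["id"])
    return ids


# Hypothetical page, assumed only for illustration.
SAMPLE_HTML = (
    "<html><head>"
    '<script>var item = {"id": 12345, "name": "Sample"};</script>'
    "</head><body>"
    '<script>console.log("no json here");</script>'
    "</body></html>"
)

if __name__ == "__main__":
    print(extract_ids(SAMPLE_HTML))
```

If the real script holds several objects or nested ones, the search would need to look deeper than the first object; a regular expression on the "id" key is another option, closer in spirit to the StringSearcher.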



  • Author
  • 6 replies
  • May 18, 2017

@takashi, thanks a lot for the information.


  • 17 replies
  • March 5, 2020

Hi @takashi, sorry for bringing up an old topic, but could you please help me? I want to get these values

(in red circles)

from this link: https://www.infoclimat.fr/observations-meteo/archives/27/novembre/2019/tarbes-ossun-lourdes/07621.html

I am trying to use the HTMLExtractor, but I am stuck on the CSS Selector. Any help would be appreciated, thanks.
