Question

HTML ancestors

7 years ago
June 12, 2018
2 replies
6 views

+28

jdh
Contributor
1984 replies

I have some HTML data with the following structure:

<h2>Status</h2>
<h3>Place</h3>
<p><a name="1">Name</a></p> <p><a name="2">Name<blockquote><p>Description</p></blockquote></a></p> <h3>Place2</h3>
<p><a name="3">Name<blockquote><p>Description</p></blockquote></a></p>

but the line feeds are entirely erratic.

I need to have one feature per name anchor (which is easily enough done with the HTMLExtractor) but I also need to have the corresponding contents of the h2|h3 tags stored as attributes.

Normally I would read in the data line by line and use a TestFilter and variables to do so, but since the line breaks don't match the data structure in any way, I'm not sure of the best way to proceed.

takashi
7722 replies
7 years ago
June 13, 2018

Hi @jdh, I think it's hard to accomplish that with CSS Selectors.

A workaround I can think of: collect all the elements you're interested in with a StringSearcher and save them into a list attribute, explode the list, and then parse the resulting elements one by one, in order. If the elements of interest are <h2>, <h3>, and <a>, this regex matches them, for example:

<h2.+?</h2>|<h3.+?</h3>|<a.+?</a>
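Outside FME, the same collect-then-walk idea can be sketched in a few lines of Python. This is only an illustration, not the workspace itself: the sample string is modelled on the structure in the question, the function and key names (`features_with_headings`, `anchor`, `h2`, `h3`) are made up for the example, and `re.DOTALL` is an assumption added so that `.` can cross the erratic line feeds.

```python
import re

# Sample modelled on the question's structure, with an extra line feed
# inside the last anchor to mimic the erratic line breaks.
html = """<h2>Status</h2>
<h3>Place</h3>
<p><a name="1">Name</a></p> <p><a name="2">Name<blockquote><p>Description</p></blockquote></a></p> <h3>Place2</h3>
<p><a name="3">Name<blockquote><p>Description</p>
</blockquote></a></p>"""

# Same alternation as the regex above. re.DOTALL (an assumption, not part of
# the original suggestion) lets "." match line feeds as well.
ELEMENT = re.compile(r"<h2.+?</h2>|<h3.+?</h3>|<a.+?</a>", re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def strip_tags(fragment):
    """Drop all tags from a fragment and trim surrounding whitespace."""
    return TAG.sub("", fragment).strip()


def features_with_headings(text):
    """Return one dict per anchor, carrying the most recent h2 and h3 text."""
    h2 = h3 = None
    features = []
    for element in ELEMENT.findall(text):  # matches come back in document order
        if element.startswith("<h2"):
            h2, h3 = strip_tags(element), None  # a new h2 resets the h3
        elif element.startswith("<h3"):
            h3 = strip_tags(element)
        else:
            name = re.search(r'name="([^"]*)"', element)
            features.append(
                {"anchor": name.group(1) if name else None, "h2": h2, "h3": h3}
            )
    return features


for feature in features_with_headings(html):
    print(feature)
```

Because the matches are returned in document order, each anchor simply inherits whichever headings were seen last, which mirrors what the exploded list plus variables would do in the workspace.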

+28

jdh
Author
Contributor
1984 replies
7 years ago
June 13, 2018

takashi wrote:

Hi @jdh, I think it's hard to accomplish that with CSS Selectors.

<h2.+?</h2>|<h3.+?</h3>|<a.+?</a>

That's definitely more elegant than solutions I was considering.

I did need to modify the regex to allow for closing tags split across multiple lines. (I may have mentioned the line feeds were erratic.)
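The modified regex isn't shown here. One plausible reading is that an element's closing tag can land on a later line than its opening tag, which the original pattern misses because `.` does not match a line feed by default. The Python sketch below illustrates that behaviour and two common ways around it; the fragment is a made-up example, and the regex flavour used elsewhere may behave differently.

```python
import re

# Hypothetical fragment: the </a> closing tag is on a later line than <a ...>.
fragment = (
    '<h3>Place2</h3>\n'
    '<p><a name="3">Name<blockquote><p>Description</p>\n'
    '</blockquote></a></p>'
)

pattern = r"<h2.+?</h2>|<h3.+?</h3>|<a.+?</a>"

# Default behaviour: "." stops at the line feed, so the anchor is not matched.
single_line = re.findall(pattern, fragment)

# Option 1: turn on "dot matches newline" (re.DOTALL, or the inline (?s) flag).
multi_line = re.findall(pattern, fragment, re.DOTALL)

# Option 2: replace "." with [\s\S], which matches any character without a flag.
portable = re.findall(r"<h2[\s\S]+?</h2>|<h3[\s\S]+?</h3>|<a[\s\S]+?</a>", fragment)

print(len(single_line), len(multi_line), len(portable))
```

With the default settings only the `<h3>` element is found; both alternatives also pick up the anchor that spans two lines.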
