
I have an “Extras” attribute that is XML-like, with a varying number of tagged attributes, in varying order, within the attribute value.

 

When I use the XMLXQueryExtractor and configure a query, it only returns rows that have a single tag, <PAGE_TITLE>, like below.

 

 

The query returns its result in the list attribute. The rest of the rows are rejected as invalid XML queries.

The rows that fail have multiple tags, as shown below.

What I’d like to do is split the tags into field::values regardless of number.

This attribute value:

<PAGE_LINKS>https://p3tips.com/community/</PAGE_LINKS><PAGE_PRECISEPUBTIMESTAMP>20240905151800</PAGE_PRECISEPUBTIMESTAMP><PAGE_TITLE>Lynchburg Police searching for suspect after armed robbery at Family Dollar</PAGE_TITLE>

 

Would look like below, with all the attributes in the Extras value split into however many tagged attributes there are.

PAGE_LINKS | PAGE_PRECISEPUBTIMESTAMP | PAGE_TITLE
https://p3tips.com/community/ | 20240905151800 | Lynchburg Police searching for suspect after armed robbery at Family Dollar
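
For anyone hitting the same thing: a likely reason the multi-tag rows get rejected is that a value with several top-level tags is not a well-formed XML document, since XML allows only one root element. Below is a minimal Python sketch of the split I was after, outside of any FME transformer. It assumes the tags are flat (not nested) and that the text contains no unescaped characters such as a bare `&`; the `split_extras` function and the `root` wrapper element are names made up for illustration.

```python
import xml.etree.ElementTree as ET


def split_extras(extras: str) -> dict:
    """Split an XML-like fragment into a {tag: text} mapping."""
    # Wrap the fragment in a synthetic root so that several
    # top-level tags parse as one well-formed document.
    root = ET.fromstring(f"<root>{extras}</root>")
    # One entry per direct child, however many there are.
    return {child.tag: (child.text or "") for child in root}


extras = (
    "<PAGE_LINKS>https://p3tips.com/community/</PAGE_LINKS>"
    "<PAGE_PRECISEPUBTIMESTAMP>20240905151800</PAGE_PRECISEPUBTIMESTAMP>"
    "<PAGE_TITLE>Lynchburg Police searching for suspect after armed "
    "robbery at Family Dollar</PAGE_TITLE>"
)

fields = split_extras(extras)
for name, value in fields.items():
    print(f"{name}: {value}")
```

If the real values contain unescaped characters, a strict XML parser will still reject them, and a more lenient HTML-style parser may be the better fit.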

 

 

Figured it out using an HTMLExtractor.

 

 

 

