Question

How to extract data from website using HTTPCaller? (SRU) -> return is XML

8 years ago
June 27, 2017
16 replies
218 views

edhere
60 replies

Hi all,

FME beginner here.

I'm trying to process data from a Dutch government website.

One can request data by putting search keys in the URL (SRU, I believe?); the response is XML.

e.g.

https://zoek.officielebekendmakingen.nl/sru/Search?version=1.2&operation=searchRetrieve&x-connection=oep&startRecord=1&maximumRecords=10&query=title=%rotonde%

I'm trying to put in multiple search strings and process the results / output in FME.
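For readers who want to see the request outside FME, here is a minimal Python sketch of building one request URL per search string. The parameter names and values are copied from the example URL above. The helper name `build_sru_url` and the second search term are made up for illustration. Percent-encoding the `%` wildcards as `%25` is an assumption about what the service accepts and has not been checked against the live endpoint.

```python
from urllib.parse import urlencode

BASE_URL = "https://zoek.officielebekendmakingen.nl/sru/Search"


def build_sru_url(search_term, start_record=1, maximum_records=10):
    """Build one searchRetrieve request URL for a single title search term."""
    params = {
        "version": "1.2",
        "operation": "searchRetrieve",
        "x-connection": "oep",
        "startRecord": start_record,
        "maximumRecords": maximum_records,
        # Same query shape as the example URL: title=%<term>%
        "query": f"title=%{search_term}%",
    }
    # urlencode percent-encodes reserved characters ("%" -> "%25", "=" -> "%3D").
    return f"{BASE_URL}?{urlencode(params)}"


# Hypothetical search strings, standing in for the rows of the Excel file.
search_terms = ["rotonde", "fietspad"]
urls = [build_sru_url(term) for term in search_terms]
```

In FME the same job is done by the Request URL parameter of the HTTPCaller, with the search string supplied per feature.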

What I have now:

Excel file with search strings connected to HTTPCaller.

HTTPCaller setup:

The Request URL contains "@Value(Search string)", which refers to the search string from the input Excel file.

Output / errors:

*Edit

When inspecting the _response_body it seems I do have some XML data.

My next question: how do I process this data? Which transformers should I use next?

Many thanks,

danilo_fme
Evangelist
2059 replies
8 years ago
June 27, 2017

Hi @edhere,

I tried to open this URL in my browser, but it seems to be wrong.

danilo_fme
Evangelist
2059 replies
8 years ago
June 27, 2017

Hi @edhere, what kind of data would you like to download?

Danilo

larry
173 replies
8 years ago
June 27, 2017

For me, it's working in both the browser and the HTTPCaller (the _response_body attribute contains the returned XML).

mark2atsafe
Safer
2522 replies
8 years ago
June 27, 2017

When you say "no luck", what happens? Is there a crash? An error message? Or just a feature is output with no data? Can you post a screenshot of the transformer parameters, so that we can see what settings you are using? Thanks!

danilo_fme
Evangelist
2059 replies
8 years ago
June 27, 2017

danilo_fme wrote:

Hi @edhere,

I tried to open this URL in my browser, but it seems to be wrong.

I tried it again on my machine just now and it works. :)

edhere
Author
60 replies
8 years ago
June 28, 2017

Hi all,

Thanks for your responses. I have updated the start post with more info.

Hope this makes sense.

Thanks,

takashi
7723 replies
8 years ago
June 28, 2017

Hi @edhere,

> What transformers should I use next?

Generally you can use the XMLFragmenter and/or the XMLFlattener to extract some values contained by an XML document as feature attributes. In some cases, the XMLXQueryExploder or the XMLXQueryExtractor could also be helpful. The concrete solution depends on how you need to interpret the XML document.

mygis
Supporter
307 replies
8 years ago
June 28, 2017

Hello @edhere, could you tell us specifically which data you are looking for in the XML? If it is a single value, you could extract it with a regular expression; if it is more complex, it is better to treat the response as an XML document and use the XML-handling transformers that @takashi mentioned.
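As a rough illustration of the regular-expression option for a single value, here is a minimal Python sketch. The sample fragment is hypothetical and only mirrors the element names mentioned in this thread. It assumes the element carries no attributes and uses the `dcterms` prefix exactly as shown.

```python
import re

# Hypothetical fragment shaped like the elements mentioned in this thread.
SAMPLE = (
    "<record><dcterms:title> Example title </dcterms:title>"
    "<url>https://example.org/1</url></record>"
)

# Non-greedy match so only the first title element is captured.
TITLE_PATTERN = re.compile(r"<dcterms:title>(.*?)</dcterms:title>", re.DOTALL)


def first_title(xml_text):
    """Return the text of the first dcterms:title element, or None if absent."""
    match = TITLE_PATTERN.search(xml_text)
    return match.group(1).strip() if match else None


title = first_title(SAMPLE)
```

A pattern like this breaks as soon as the prefix changes or the element gains attributes, which is why XML-aware tooling is the safer choice for anything beyond one value.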

edhere
Author
60 replies
8 years ago
June 28, 2017

Hi, gisinnovationsb

I've checked the XML contained in the _response_body, let's start with:

<dcterms:title>*randomtext*</dcterms:title>

 <url>*randomurl*</url>

How would I extract the data in title and url?

Many thanks,

takashi
7723 replies
8 years ago
June 28, 2017

takashi wrote:

Hi @edhere,

> What transformers should I use next?

If you need to extract the values of the descendant elements (e.g. <title>, <url>) of the <record> element for each record, the XMLFragmenter with this setting might help you.

Just be aware the transformer would also extract unexposed attributes other than title and url. You can use FME Data Inspector (Feature Information Window) to check all the attributes that the resulting feature contains.
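To make the per-record idea concrete outside FME, here is a minimal Python sketch that splits a response on its record elements and keeps only title and url. The sample response is hypothetical: its nesting follows the paths mentioned in this thread, but the real response may carry more namespaces and fields. Elements are matched by local name so the sketch does not depend on the exact namespace URIs.

```python
import xml.etree.ElementTree as ET

# Hypothetical response; the real one may differ in namespaces and fields.
SAMPLE_RESPONSE = """<searchRetrieveResponse xmlns:dcterms="http://purl.org/dc/terms/">
  <records>
    <record><recordData>
      <dcterms:title>First title</dcterms:title>
      <url>https://example.org/1</url>
    </recordData></record>
    <record><recordData>
      <dcterms:title>Second title</dcterms:title>
      <url>https://example.org/2</url>
    </recordData></record>
  </records>
</searchRetrieveResponse>"""


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_records(xml_text, wanted=("title", "url")):
    """Return one dict per record element, holding the wanted descendant values."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.iter():
        if local_name(element.tag) != "record":
            continue
        row = {}
        for descendant in element.iter():
            name = local_name(descendant.tag)
            # Keep the first occurrence of each wanted element per record.
            if name in wanted and name not in row:
                row[name] = (descendant.text or "").strip()
        rows.append(row)
    return rows


rows = extract_records(SAMPLE_RESPONSE)
```

Each dict in `rows` plays the role of one output feature, with `title` and `url` as its attributes.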

mygis
Supporter
307 replies
8 years ago
June 28, 2017

mygis wrote:

hi @edhere,

Is this correct?

mygis
Supporter
307 replies
8 years ago
June 28, 2017

mygis wrote:

hi @edhere,

Is this correct?

The idea is to read the URL with an XML reader instead of the HTTPCaller. I am using FME 2016. The workspace is attached.

When you click the Parameters button, you can select whichever nodes of the XML file you want to access.

mygis
Supporter
307 replies
8 years ago
June 28, 2017

mygis wrote:

hi @edhere,

Is this correct?

xml2none.fmw

mygis
Supporter
307 replies
8 years ago
June 28, 2017

edhere wrote:

Hi, gisinnovationsb

I've checked the XML contained in the _response_body, let's start with:

<dcterms:title>*randomtext*</dcterms:title>

 <url>*randomurl*</url>

How would I extract the data in title and url?

Many thanks,

Answered above

markatsafe
1891 replies
8 years ago
June 28, 2017

@edhere Ed - the approach you take really does depend on what data you want to extract. But the general steps are:

- Use the approach you already have to read your query from Excel.
- Determine the XML node on which you want to split your records. It looks like it would be either:
  - searchRetrieveResponse/records/record or
  - searchRetrieveResponse/records/record/recordData

  Tip: if you don't know the XML very well, add the XML reader and use the tree view of its Elements to Match parameter to browse the XML and find the appropriate tag. Copy and paste the Selected Items. Once you have the selected item, cancel everything (i.e. don't actually add the XML reader to the workspace).
- Use either HTTPCaller (with XMLFragmenter) or the FeatureReader. I think I'd suggest FeatureReader.
  - FeatureReader:
    - add the XML reader, Dataset: <attribute with URL>,
    - Parameters: Elements to Match: <selected items>, i.e. searchRetrieveResponse/records/record,
    - Flatten Options: Enable Flattening
  - HTTPCaller & XMLFragmenter will be more or less the same.

Example Workspace attached: xmlreader.fmw
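The steps above can be mimicked in a few lines of Python, which may help when checking what a given Elements to Match value would produce. This is only a sketch of the idea and not how FME implements it. The function names, the `source_url` key and the dotted attribute names are illustrative choices. The `fetch` argument is injectable so the logic can be tried on saved XML without calling the live service.

```python
import urllib.request
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def fetch_bytes(url):
    """Download the raw response body for one request URL."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def flatten(element, prefix=""):
    """Turn the leaf elements below `element` into {dotted.path: text} pairs.

    Repeated sibling names overwrite each other in this simplified version.
    """
    name = prefix + local_name(element.tag)
    children = list(element)
    if not children:
        return {name: (element.text or "").strip()}
    flat = {}
    for child in children:
        flat.update(flatten(child, name + "."))
    return flat


def read_features(urls, element_to_match="record", fetch=fetch_bytes):
    """Return one flattened dict per matched element, for every URL given."""
    features = []
    for url in urls:
        root = ET.fromstring(fetch(url))
        for element in root.iter():
            if local_name(element.tag) == element_to_match:
                feature = {"source_url": url}
                feature.update(flatten(element))
                features.append(feature)
    return features
```

Calling `read_features(urls)` with the request URLs would download each response and return one dict per record. Passing `element_to_match="recordData"` corresponds to choosing the deeper node from the list above.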

There's a pretty good XML Tutorial on the KnowledgeCentre that covers many of these topics.
