
Hi all,

I've tried to follow the solutions from other questions here, but with no luck. Maybe the method depends somewhat on the particular website being queried.

Here is the website: Link

It belongs to the Spanish Cadastre, from which I'd like to extract some data by passing a Cadastral ID, since there is no API or other batch process for this purpose.

My idea is to pass a string into that form (box):

and then submit it using the DATOS button.

If we inspect the webpage:

BOX:

DATOS button:

If I'm not mistaken, I should pass the values from the highlighted boxes to the HTTPCaller, like this:

Within the Multipart Upload, I'm passing the values using an attribute "ref_cat", plus the value DATOS for the button.

The "_response_body" is not the same as what I get when I fill in the form manually with the same data. So, do you think I'm missing something here?

Any help would be appreciated.

I've followed some tips from Takashi that worked elsewhere, but they are not working in this particular case.

BTW, I'm using FME Desktop 2017.1.

Thanks in advance,
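
One possible cause, offered as an assumption rather than a diagnosis: the results URL quoted later in this thread ends in .aspx, and ASP.NET pages usually carry hidden form fields (for example `__VIEWSTATE`) that a browser posts along with the visible inputs. If the request sends only "ref_cat" and the button value, the server may answer differently. A minimal Python sketch for listing the hidden inputs of a page follows; the HTML fragment and its field values are invented for illustration and are not taken from the real site.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in the HTML text."""
    parser = HiddenInputCollector()
    parser.feed(html)
    return parser.fields


# Hypothetical form, NOT copied from the cadastre website.
SAMPLE_FORM = """
<form method="post" action="search.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="text" name="ref_cat" />
  <input type="submit" name="btn" value="DATOS" />
</form>
"""

print(hidden_fields(SAMPLE_FORM))
```

Any field reported this way would have to be sent back together with "ref_cat" for the request to resemble what the browser submits.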

Try the following template and see if that works for you. The parcel ID comes from the attribute "RC"; change it as you like in the Creator.

es-cadastre-scraper.fmwt

It will extract the first two address lines into the attributes "Address1" and "Address2", e.g.

Result, from the Data Inspector:



Hi David,

Thanks for taking the time and providing a solution!

Let me show you how I've readjusted your values:

The requested URL worked fine for the particular Cadastral ID it was built for, but once I replaced it with another one I got no results.

So I changed the URL to something like this:

https://www1.sedecatastro.gob.es/CYCBienInmueble/OVCListaBienes.aspx?RC1=@Value(RC1)&RC2=@Value(RC2)&RC3=&RC4=&esBice=&RCBice1=&RCBice2=&DenoBice=&pest=rc&final=&RCCompleta=@Value(RC)&from=OVCBusqueda&tipoCarto=nuevo

where RC1 and RC2 are substrings extracted from the RC value in use.
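
For anyone who wants to check such a URL outside FME, here is a minimal Python sketch of the same idea. It assumes that RC1 and RC2 are the first and second 7-character halves of the 14-character reference, which the post does not state explicitly; the function name is made up for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://www1.sedecatastro.gob.es/CYCBienInmueble/OVCListaBienes.aspx"


def build_listing_url(rc):
    """Build the listing URL for a 14-character cadastral reference."""
    rc = rc.strip().upper()
    if len(rc) != 14:
        raise ValueError("expected a 14-character cadastral reference")
    # Assumption: RC1 = characters 1-7, RC2 = characters 8-14.
    params = [
        ("RC1", rc[:7]),
        ("RC2", rc[7:14]),
        ("RC3", ""),
        ("RC4", ""),
        ("esBice", ""),
        ("RCBice1", ""),
        ("RCBice2", ""),
        ("DenoBice", ""),
        ("pest", "rc"),
        ("final", ""),
        ("RCCompleta", rc),
        ("from", "OVCBusqueda"),
        ("tipoCarto", "nuevo"),
    ]
    # urlencode joins name=value pairs with "&" and percent-encodes values.
    return BASE_URL + "?" + urlencode(params)


print(build_listing_url("0729806VG3802N"))
```

The query string is simply the list of name=value pairs after the "?", joined by "&"; building it from separate parameters avoids hand-editing one long URL for each new reference.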

 

 

What I actually care about now are the other attributes of the building (or whatever the reference points to), such as AREA, TYPE, ..., not the addresses. But no worries, I'll try to extract the ones I need by following the sample you made with Address1 and Address2.

Here is an example of the data to be extracted:

Looking at the source, here is the HTML table:

Could you tell me how to apply the CSS selector you used in your example here? I cannot find any documentation, and I'm finding it a little hard to get to the solution.

I'd also like to ask about the Query String Parameters, which I hadn't used and which seem to have solved the problem. How do they work?

Many thanks David,



You can find an OK tutorial for CSS selectors here:

https://www.w3schools.com/cssref/css_selectors.asp

To give you a more detailed reply for the HTMLExtractor, could you please post a complete URL to a listing page I can use for testing?
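
In the meantime, here is a rough Python illustration of what selecting table rows and then reading their cells amounts to. The HTML fragment, its labels and its values are invented for the example (the real page structure is not shown in this thread), and the standard-library HTMLParser is used here in place of a CSS selector engine.

```python
from html.parser import HTMLParser


class TableRowCollector(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_dict(html):
    """Map the first cell of each row (label) to the second cell (value)."""
    parser = TableRowCollector()
    parser.feed(html)
    return {row[0]: row[1] for row in parser.rows if len(row) >= 2}


# Hypothetical table, NOT copied from the cadastre website.
SAMPLE = """
<table>
  <tr><th>Use</th><td>Residential</td></tr>
  <tr><th>Area</th><td> 120 m2 </td></tr>
</table>
"""

print(table_to_dict(SAMPLE))
```

In CSS selector terms, `table tr` matches each row and `tr > td:nth-child(2)` matches the second cell of each row, which is the value cell in a layout like the sample above.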

Yes, of course. Thanks for the reference, I'll check it out.

Let me get the output once I've run the process and I'll post it back here, because a Cadastral Reference may cover several horizontal-property dwellings, and it is getting a bit complicated. Thanks,


Hmm... I'll need to leave this for another round tomorrow.

What if I use a 14-character RC value such as "0729806VG3802N"? Since it has a single "owner", so to speak, the site takes me straight to its full 20-character reference, 0729806VG3802N0001IB, where the page looks like this:

whose source is: Link

Otherwise, if I use a 14-character RC value with several owners, like this one: "9616903VG2891N"

it takes me to this page:

whose source is: Link

which is a full table listing RCs of 20 characters.

So the topic is getting complicated.

I'll keep this topic updated once I get any results, if I succeed with FME Desktop.

Thanks,
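
The two kinds of reference described above can be told apart by their length. Here is a small Python sketch, assuming that a 20-character reference is the 14-character parcel reference followed by a 4-character unit number and 2 control characters; that layout is consistent with the example in the post but is not stated in this thread, and the function and key names are made up.

```python
def split_rc(rc):
    """Split a cadastral reference into its assumed components."""
    rc = rc.strip().upper()
    if len(rc) == 14:
        # Parcel-level reference: may resolve to one property or to a list.
        return {"parcel": rc, "unit": None, "control": None}
    if len(rc) == 20:
        # Assumed layout: 14 (parcel) + 4 (unit number) + 2 (control).
        return {"parcel": rc[:14], "unit": rc[14:18], "control": rc[18:]}
    raise ValueError("expected a 14- or 20-character cadastral reference")


print(split_rc("0729806VG3802N0001IB"))
print(split_rc("9616903VG2891N"))
```

A workflow could use such a check to decide whether a response should be read as a single property page or as a list of 20-character references to follow one by one.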

 

 


Hi @juanmahere, did you manage in the end to solve this via web scraping with the HTMLExtractor and CSS selectors? By the way, very good idea, @david_r!

I ran into this same problem with this very same website recently, and I ended up using this other web service, to which I can easily send HTTP GET requests to get the same data: https://ovc.catastro.meh.es/ovcservweb/OVCSWLocalizacionRC/OVCCallejero.asmx?op=Consulta_DNPRC

The only drawback is that every request takes about 4 seconds... I have seen faster services ;-)
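
A hedged Python sketch of such a GET request follows. The endpoint form (`.asmx/Consulta_DNPRC`) and the parameter names `Provincia`, `Municipio` and `RC` are assumptions to verify against the service description page linked above; the function names are made up, and the response format is not shown in this thread, so the body is returned as raw text.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed HTTP GET endpoint for the Consulta_DNPRC operation.
SERVICE_URL = (
    "https://ovc.catastro.meh.es/ovcservweb/OVCSWLocalizacionRC/"
    "OVCCallejero.asmx/Consulta_DNPRC"
)


def build_request_url(rc, provincia="", municipio=""):
    """Build the GET URL; the parameter names are assumptions."""
    params = {"Provincia": provincia, "Municipio": municipio, "RC": rc}
    return SERVICE_URL + "?" + urlencode(params)


def fetch_response(rc, timeout=30):
    """Send the request and return the raw response body as text."""
    with urlopen(build_request_url(rc), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Only the URL is built here; call fetch_response() to actually query.
print(build_request_url("0729806VG3802N"))
```

With requests taking about 4 seconds each, a generous timeout and sequential batch processing seem a reasonable starting point.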

