
Hi,

I'm now working at Stedin, one of the Dutch electric and gas grid operators.

I'm trying to analyse land rezoning using the Dutch national development plan. Basically, I want to check whether agricultural land has changed to industrial or residential use. The data is available to download from PDOK; however, the GML file (Enkelbestemming.gml) is 6 GB.

I'm able to read and write 1.2 million of the polygon features with FME Desktop 2016.1 but then I get the following error:

Error in input dataset:'file:///C:/Apps/Temp/Bestemmingdata/Enkelbestemming.gml' line:43002043 column:24505 message:input ended before all started tags were ended; last tag started is 'gml:posList''

Is there anything I can do to fix this file or do I need to contact the data owner which is the Kadaster? I don't have a file editor that opens the file.

Perhaps there is a better way to achieve this task?

Thanks,

Annette

Hi @annette2,

I don't think there is any way of forcing the GML to be read. Contacting the Kadaster might help, but that usually takes a long time. What about using the TOP10NL instead of the Ruimtelijke plannen GML?

Cheers,

Itay



Hi Annette,

I checked, and of the cases that reported this error, some were caused by corrupt data files, so that is certainly the first thing to check. I'd suggest opening it in a text editor, but at 6 GB that isn't practical. What you could do is use the TextFile reader in FME and set it to read from the bottom up. Have it read only the first 10 features, which will then come from the very end of the file. Then you'll be able to see if there is an obvious problem at the end of the file.
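
If FME isn't at hand, the same end-of-file check can be sketched in Python. This is only a minimal sketch: the function names `read_tail` and `looks_truncated` are made up for illustration, and it assumes the file is UTF-8 encoded. It reads just the last few kilobytes, so the file size is not a problem.

```python
import os


def read_tail(path, num_bytes=4096):
    """Return roughly the last num_bytes of a file as text, without reading the rest."""
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        fh.seek(max(0, size - num_bytes))  # jump close to the end of the file
        data = fh.read()
    # Assumption: UTF-8. Undecodable bytes are replaced rather than raising,
    # because the seek may land in the middle of a multi-byte character.
    return data.decode("utf-8", errors="replace")


def looks_truncated(tail):
    """Heuristic only: a complete XML document ends with a closing tag, i.e. '>'."""
    return not tail.rstrip().endswith(">")
```

Printing `read_tail(r"C:\Apps\Temp\Bestemmingdata\Enkelbestemming.gml")` would show whether the file stops in the middle of a `gml:posList`, which is what the error message suggests.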

The other issue might be encoding. I'm guessing there might be "non-English" characters in there? Again with a 6GB file I don't know what to suggest about checking that out (opening it and removing those characters for a simple test is out of the question). I think you'll have to contact our support team (safe.com/support) and see what they say.
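
A file this size can still be scanned for encoding problems without opening it in an editor, by decoding it chunk by chunk. The following is a rough sketch, not an FME feature: `find_invalid_utf8` is a hypothetical helper, and it assumes the file is meant to be UTF-8 (the `encoding` in the XML declaration on the first line says what the file claims to be).

```python
import codecs


def find_invalid_utf8(path, chunk_size=1 << 20):
    """Scan a file in chunks and report where UTF-8 decoding first fails.

    Returns None if the whole file decodes cleanly. Otherwise returns the byte
    offset of the start of the chunk in which the failure was detected, so the
    position is approximate (accurate to within about one chunk).
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    offset = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            try:
                # final=True on the empty last read also catches a multi-byte
                # character that was cut off at the very end of the file.
                decoder.decode(chunk, final=not chunk)
            except UnicodeDecodeError:
                return offset
            if not chunk:
                return None
            offset += len(chunk)
```

A result of `None` means every byte decoded cleanly, which would point away from encoding and back towards a truncated file.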

Regards

Mark


Hi again, I tried an old trick: reading the GML with the XML reader (just make sure you select Yes on the continue on geometry error parameter) in combination with the GeometryReplacer. It seems to work just fine.
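
For comparison, a similar approach outside FME is to stream-parse the file and handle each feature as soon as its closing tag arrives, so a truncated end only costs the last, incomplete feature. The sketch below uses Python's standard `xml.etree.ElementTree.iterparse`; it is not what the XML reader and GeometryReplacer do internally, and `count_complete_features` and its `local_name` argument are illustrative names, since the actual feature element name in Enkelbestemming.gml isn't shown in this thread.

```python
import xml.etree.ElementTree as ET


def count_complete_features(source, local_name):
    """Stream-parse XML and count finished elements whose local name matches.

    source is a file path or a binary file object. Returns (count, error):
    error is None for a well-formed document, otherwise the ParseError raised
    at the point where parsing had to stop (for example a truncated file).
    """
    count = 0
    try:
        for _event, elem in ET.iterparse(source, events=("end",)):
            # Tags look like '{namespace-uri}name'; compare the part after '}'.
            if elem.tag.rsplit("}", 1)[-1] == local_name:
                count += 1
                elem.clear()  # drop the feature's children to limit memory use
    except ET.ParseError as exc:
        return count, exc
    return count, None
```

The same loop could pull out the `gml:posList` text of each feature instead of counting. Cleared elements stay attached to the root element, so memory still grows slowly over millions of features.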

Hope this helps,

Itay


Upvote from me. I didn't know about that trick. Nice one.

Thanks Itay - Just finished processing another large dataset so I will test this on Monday. Much appreciated!

Thanks for the tips Mark :)


Sure, no problem. The downside of this approach is that you will need to deaggregate the geometries to get the parcels you need, since they are created as an aggregate (parcel + bounding box). And by the way, it was 15 (!!) GB of data...

Hi Itay, how did you expose xml_fragment?

Thanks, Annette


Hi @annette2, the xml_fragment gets created automatically by the XML reader.