Hi FME gurus,

For an engineer, I would like to parse OSM XML data containing line and polygon features. Point features parse without problems, but the OSM XML reader does not bring the geometry of the line and polygon features through correctly.

I can parse the OSM XML with a combination of FME transformers. However, this approach is rather slow because of the number of vertices that have to be parsed.

I have therefore attached an example OSM XML file with highways. Can anyone configure the OSM XML reader so that the geometries of the line and polygon features are parsed correctly?

Is there a reason you are parsing the OSM file manually instead of using the built-in OSM reader?



@jdh, do you mean the built-in OSM XML reader? I parse the file manually because the geometry does not come through with that reader. If you can get it to work, that would be great.


Actually, it is the output of an Overpass API call with HTTPCaller. For example:

http://overpass-api.de/api/interpreter?data=[bbox];way[highway];out%20geom;&bbox=5.817123,51.764072,6.244217,51.894772
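
For reference, a request like the one above could also be assembled in code. This is only a minimal Python sketch, not part of the HTTPCaller setup: the function name `build_overpass_url` and its parameters are hypothetical, and the west, south, east, north ordering of the bbox is inferred from the example values.

```python
from urllib.parse import quote

OVERPASS_ENDPOINT = "http://overpass-api.de/api/interpreter"

def build_overpass_url(west, south, east, north, key="highway"):
    """Build a URL requesting all ways that carry `key`, with inline geometry."""
    # "out geom" asks Overpass to put lat/lon on every <nd> of a way.
    query = f"[bbox];way[{key}];out geom;"
    bbox = ",".join(str(v) for v in (west, south, east, north))
    # Leave brackets and semicolons as they are; only the space becomes %20.
    return f"{OVERPASS_ENDPOINT}?data={quote(query, safe='[];')}&bbox={bbox}"

url = build_overpass_url(5.817123, 51.764072, 6.244217, 51.894772)
print(url)
```

The printed URL is the same string as the example request, so it can be pasted into the HTTPCaller unchanged.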

In fact, if I could get the nodes into FME attributes from the reader, that alone would already give a huge performance increase:

<way id="7053757">
  <bounds minlat="52.0564849" minlon="5.0814822" maxlat="52.0567981" maxlon="5.0816927"/>
  <nd ref="45053752" lat="52.0567981" lon="5.0816927"/>
  <nd ref="364545599" lat="52.0566355" lon="5.0816688"/>
  <nd ref="45053346" lat="52.0565550" lon="5.0815876"/>
  <nd ref="45052894" lat="52.0564849" lon="5.0814822"/>
  <tag k="bicycle" v="no"/>
  <tag k="foot" v="no"/>
  <tag k="highway" v="residential"/>
  <tag k="maxspeed" v="50"/>
  <tag k="mofa" v="no"/>
  <tag k="name" v="Simon Vestdijkhove"/>
  <tag k="oneway" v="yes"/>
  <tag k="surface" v="asphalt"/>
</way>

Maybe changing the fme_map_features_config.xml helps?


Unfortunately, fme_map_features_config.xml only controls which fme_feature_type a feature is assigned to, based on its tags.


You could treat it as generic XML. Instead of fragmenting the nodes and rebuilding the lines, you can restructure the XML into a geometry that FME understands. Workspace: overpassways.fmw
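
Outside FME, the same idea (read each way's coordinates straight from its own nd elements instead of joining against separate nodes) can be sketched in a few lines of Python. This is an illustration only, not what overpassways.fmw does internally. The function names are hypothetical, the sample is a trimmed copy of the way posted earlier in the thread wrapped in an assumed osm root element, and treating a closed way as a polygon is a simplifying assumption (in OSM, whether a closed way is an area depends on its tags).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the way from the thread, as returned by "out geom".
SAMPLE = """<osm>
  <way id="7053757">
    <bounds minlat="52.0564849" minlon="5.0814822" maxlat="52.0567981" maxlon="5.0816927"/>
    <nd ref="45053752" lat="52.0567981" lon="5.0816927"/>
    <nd ref="364545599" lat="52.0566355" lon="5.0816688"/>
    <nd ref="45053346" lat="52.0565550" lon="5.0815876"/>
    <nd ref="45052894" lat="52.0564849" lon="5.0814822"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>"""

def way_to_geometry(way):
    """Turn one <way> element into a GeoJSON-style geometry dict."""
    # Each <nd> already carries lat/lon, so no node lookup is needed.
    coords = [[float(nd.get("lon")), float(nd.get("lat"))]
              for nd in way.findall("nd")]
    # Assumption: a closed ring of at least 4 vertices is an area.
    if len(coords) >= 4 and coords[0] == coords[-1]:
        return {"type": "Polygon", "coordinates": [coords]}
    return {"type": "LineString", "coordinates": coords}

def parse_ways(xml_text):
    """Map way id -> geometry for every <way> in the document."""
    root = ET.fromstring(xml_text)
    return {way.get("id"): way_to_geometry(way) for way in root.iter("way")}

geoms = parse_ways(SAMPLE)
print(geoms["7053757"]["type"], len(geoms["7053757"]["coordinates"]))
```

Because every vertex is read in a single pass per way, the cost grows with the number of vertices only, which is the point of avoiding the fragment-and-rebuild approach.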



Nice, thanks! This way the geometry is parsed more efficiently than before. The final step is to write the tags as attributes, so I want to turn the list of tags (k and v) dynamically into FME attribute names and values. The desired output format is GeoJSON and would look like:

{
    "type" : "FeatureCollection",
    "name" : "highway",
    "features" : [
        {
            "type" : "Feature",
            "geometry" : {
                "type" : "LineString",
                "coordinates" : [
                    [ 5.1051919, 52.0738589 ],
                    [ 5.1053088, 52.0737963 ],
                    [ 5.1053437, 52.0736926 ],
                    [ 5.105369, 52.0736188 ],
                    [ 5.10565, 52.0728625 ],
                    [ 5.1059033, 52.0719212 ],
                    [ 5.1061436, 52.0710064 ],
                    [ 5.106168, 52.0708859 ],
                    [ 5.1064257, 52.0699046 ],
                    [ 5.1064969, 52.0696266 ],
                    [ 5.106523, 52.069536 ]
                ]
            },
            "properties" : {
                "id" : "4342251",
                "timestamp" : "",
                "user" : "",
                "visible" : "",
                "uid" : "",
                "version" : "",
                "changeset" : "",
                "lat" : "52.0738589",
                "lon" : "5.1051919",
                "ref" : "250150107",
                "source" : "survey",
                "highway" : "cycleway"
            }
        }
    ]
}

For this last part I used the OSM XML reader and wrote dynamically to GeoJSON. However, this is too slow.

XMLXQueryUpdater expression:

  let $x := fme:get-xml-attribute("xml_fragment")
  let $members := {
      for $a in $x//attribuut
      return '"'||string($a/@naam)||'":"'||$a/text()||'"'
  }
  return '{'||fn:string-join($members, ',')||'}'

After that I use a JSONFlattener to flatten the attributes. However, I don't fully understand how the attributes I created end up in the GeoJSON output.



I don't have time to experiment, but I wonder if the most efficient solution might be to generate the attributes from the list (see the ListKeyValuePairExtractor custom transformer), then rename the tag list to create a schema list (https://knowledge.safe.com/articles/1051/index.html), and then send the data to a GeoJSON writer dynamically.
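
To make the data flow concrete, here is a rough Python sketch of the same three steps outside FME: turn each tag's k/v pair into a property, collect the union of keys as a kind of schema, and emit a GeoJSON FeatureCollection. It does not reproduce how ListKeyValuePairExtractor or a dynamic writer work internally. The function names are hypothetical, and the sample is a trimmed copy of the way posted earlier in the thread, wrapped in an assumed osm root element.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the way from the thread.
SAMPLE = """<osm>
  <way id="7053757">
    <nd ref="45053752" lat="52.0567981" lon="5.0816927"/>
    <nd ref="364545599" lat="52.0566355" lon="5.0816688"/>
    <tag k="highway" v="residential"/>
    <tag k="maxspeed" v="50"/>
    <tag k="name" v="Simon Vestdijkhove"/>
  </way>
</osm>"""

def way_to_feature(way):
    """Build one GeoJSON Feature; every <tag k v> becomes a property."""
    properties = {"id": way.get("id")}
    # Note: a tag whose key is literally "id" would overwrite the way id.
    properties.update({t.get("k"): t.get("v") for t in way.findall("tag")})
    coords = [[float(nd.get("lon")), float(nd.get("lat"))]
              for nd in way.findall("nd")]
    return {
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": coords},
        "properties": properties,
    }

def to_feature_collection(xml_text, name="highway"):
    """Convert all ways in the XML text to a FeatureCollection dict."""
    root = ET.fromstring(xml_text)
    features = [way_to_feature(w) for w in root.iter("way")]
    return {"type": "FeatureCollection", "name": name, "features": features}

collection = to_feature_collection(SAMPLE)
# Union of all property names, comparable to a schema list for a dynamic writer.
schema = sorted({k for f in collection["features"] for k in f["properties"]})
print(json.dumps(collection, indent=2))
```

Using `json.dumps` rather than string concatenation also takes care of escaping quotes and backslashes that may occur in tag values.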



Thanks @jdh! This helped a lot. Attached is my final workspace:

OSMXML_to_GeoJSON.fmw

