Hello,

I have a very large set of points within a country (more than 33 million) representing addresses [one CSV file larger than 2 GB with lat/lon information],
and I also have a set of polygon areas (more than 5,000) covering a number of locations [multiple .tab files].

I need to get the number of points within each polygon area.


Here's the workflow:
1. Added a reader to read the polygon layer from the MapInfo files (topmost in the Navigator).
2. Added a reader to read the 33 million points (split into 2 CSV files).
3. Added an AttributeManager to manage the attributes from the polygon reader.
4. Created point geometries through the CSV reader parameter settings or the VertexCreator.
5. Used a PointOnAreaOverlayer transformer, sending the polygons to the Area port and the points to the Point port ("Yes" for the Area First parameter).
6. Added an ExpressionEvaluator to evaluate the point-on-area function.
7. Connected the output from the Point port, which has the "_overlaps" attribute that stores the number of inside points.

So, I ran the processing with the PointOnAreaOverlayer, but the problem is that the number of points within the polygons in the output file differs from the result of the previous method, by a large margin of 600 points.
Is there a reason why I am getting this kind of discrepancy?
Am I missing anything important?


Thanks in advance,

Kind Regards,
Mridu

@mridu_prakash

For performance reasons I would create the points from the CSV and write out to FFS (the internal format of FME).

Then I would create a workspace to read the areas and use the FeatureReader to read the FFS files back.

Then use the StatisticsCalculator on the points (group by the area ID or name) to calculate the number of points per area.
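
To make the grouping step concrete outside FME, here is a minimal Python sketch of counting points per area. It assumes each point record has already been tagged with the ID of the area it fell in; the field names `id` and `area_id` and the sample records are hypothetical, not taken from the workspace.

```python
from collections import Counter

# Hypothetical records: each point is already tagged with the ID of the area
# it fell in (None means the point matched no area).
points = [
    {"id": 1, "area_id": "A"},
    {"id": 2, "area_id": "A"},
    {"id": 3, "area_id": "B"},
    {"id": 4, "area_id": None},
]

def count_points_per_area(records):
    """Return a Counter mapping area_id to number of points, ignoring unmatched points."""
    return Counter(r["area_id"] for r in records if r["area_id"] is not None)

counts = count_points_per_area(points)
print(dict(counts))
```

Unmatched points are left out of the counts here, so summing the counts and comparing against the total number of input points is a quick way to see how many points matched no area at all.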


Hi @mridu_prakash, the polygon features output from the Area port of the PointOnAreaOverlayer should have the overlap count attribute (called "_overlaps" by default) which stores the number of points which are within it.


What was your previous method? Maybe these points lie on the boundary of the polygon and are not contained within?

You should be able to just use the output polygon port to get the count of points within each polygon. The _overlaps attribute is also added to the polygon with a count of points contained.
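
To show how boundary points can shift a count, here is a small pure-Python sketch that classifies points as inside, on the boundary, or outside a simple polygon. It is only an illustration of the general idea, not a description of how FME or the previous method evaluates points; the function names, the tolerance `eps` and the sample square are all assumptions.

```python
def on_segment(p, a, b, eps=1e-12):
    """True if point p lies on the segment a-b (within tolerance eps)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > eps:
        return False  # p is not collinear with a and b
    return (min(ax, bx) - eps <= px <= max(ax, bx) + eps
            and min(ay, by) - eps <= py <= max(ay, by) + eps)

def classify(p, ring):
    """Classify p against a simple polygon given as a list of (x, y) vertices."""
    n = len(ring)
    inside = False
    for i in range(n):
        a, b = ring[i], ring[(i + 1) % n]
        if on_segment(p, a, b):
            return "boundary"
        (ax, ay), (bx, by) = a, b
        # Count crossings of a horizontal ray going right from p.
        if (ay > p[1]) != (by > p[1]):
            x_cross = ax + (p[1] - ay) * (bx - ax) / (by - ay)
            if x_cross > p[0]:
                inside = not inside
    return "inside" if inside else "outside"

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
samples = [(5, 5), (10, 5), (0, 0), (11, 5)]
labels = [classify(p, square) for p in samples]

strict = labels.count("inside")                   # boundary points excluded
inclusive = strict + labels.count("boundary")     # boundary points included
print(labels, strict, inclusive)
```

With these four sample points, a strict "within" rule counts one point and an inclusive rule counts three, so two methods that treat the boundary differently can disagree even when both are working as designed.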


Hi,

The ExpressionEvaluator is not needed, since the _overlaps attribute will hold the number of points per area.

If you are writing to CSV again, dropping the geometry with the GeometryRemover is advisable.

Possibly some of these articles might be helpful:

https://knowledge.safe.com/articles/27998/tutorial-common-gis-operations.html

Things to consider are duplicate points or duplicate polygons, either of which might lead to unexpected results.
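
As a rough way to check for duplicate points before the overlay, here is a minimal Python sketch that reports coordinates occurring more than once. The rounding precision `ndigits` and the sample coordinates are assumptions chosen for illustration; the right precision depends on your data.

```python
from collections import Counter

def find_duplicates(coords, ndigits=7):
    """Return {rounded (lat, lon): count} for coordinates seen more than once."""
    seen = Counter((round(lat, ndigits), round(lon, ndigits)) for lat, lon in coords)
    return {xy: n for xy, n in seen.items() if n > 1}

# Hypothetical sample: the first two addresses share the same coordinates.
coords = [(28.6139, 77.2090), (28.6139, 77.2090), (19.0760, 72.8777)]
dups = find_duplicates(coords)
print(dups)
```

If one method removes duplicates and the other does not, the per-polygon counts will differ by exactly the number of extra copies.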

Hope this helps.


Hi erik_jan, thanks for your suggestion, but unfortunately the FFS method didn't work. The output (the number of points in polygon) remains the same.

Hi Takashi, thanks for your suggestions, but the output still remains the same. Please suggest some other options.

Hi, the previous method also used a within query. The _overlaps attribute of the polygons also contains the same counts.

Your methodology looks correct, so I think you'll have to take one (or several) of those points that FME doesn't consider overlaps and look closely at them to figure out why. If you can post a couple of those points - and the polygon you think they should belong to - then we can investigate further. But it's pretty difficult to figure out why they might be uncounted just from a description I'm afraid.
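
One way to narrow down which polygons to look at is to compare the per-polygon counts from the two methods. Here is a minimal Python sketch of that comparison; the polygon IDs and counts are made up for illustration, and in practice the two dictionaries would be loaded from the two output files.

```python
def diff_counts(counts_a, counts_b):
    """Return {polygon_id: (count_a, count_b)} for polygons whose counts differ."""
    ids = set(counts_a) | set(counts_b)
    return {
        pid: (counts_a.get(pid, 0), counts_b.get(pid, 0))
        for pid in sorted(ids)
        if counts_a.get(pid, 0) != counts_b.get(pid, 0)
    }

# Hypothetical per-polygon counts from the previous method and from the overlay.
previous = {"P1": 120, "P2": 75, "P3": 0}
overlayer = {"P1": 120, "P2": 73, "P4": 5}
mismatches = diff_counts(previous, overlayer)
print(mismatches)
```

A polygon missing from one result is treated as having a count of zero, so only polygons with a real difference are reported. Those are the ones worth extracting, together with their points, for a closer look.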

