You can try it.
Take a point cloud and clip it with some feature.
Then use a PointCloudCoercer to turn your "clips" into points... yup, the entire point cloud comes out.
Now try a PointCloudConsumer before the coercer... that's more like it!
Hi Gio,
Thanks for the answer, but I got the same result whether I used the PCConsumer or not! I completed a test as you described; however, regardless of whether I used a PointCloudConsumer before the PCCoercer, I still got the same number of points. Please see the images included.
Attachments: test-workbench.png, output-001.png, output-002.png
The test workbench reads in a LAS file, a clipping boundary is created (an MBR scaled to half size), and the LAS is then clipped. From the Inside port I took one stream and connected it into a PCCoercer; this gave me 11,856 individual points. I took a parallel stream from the Inside port, connected it first to a PCConsumer and then into a different PCCoercer, and I still got 11,856 individual points. To my mind the PCConsumer made no difference whatsoever. Have I missed something?
Image 1 - test workspace
Image 2 - original LAS file and the clip box to be used
Image 3 - original LAS file in the background with outputs from the non-PCConsumed stream (blue) and the PCConsumed stream (orange), both with the same point count
Regards,
Rob
OK, I think I know what the consumer does. There is a RasterConsumer too, and I think I can base my answer on what I think that does!
So, here goes... with vector data in FME, each transformer processes the data, which is then sent to a Writer. So the data is processed step by step.
With raster, each transformer doesn't process the data! Instead it tags the data with its operation (in the form of some complex mathematical matrix), which is then sent to the Writer. The Writer then forces the processing to occur. But! Because it knows what operations are going to accumulate, it can do this more efficiently.
For example, if you have a Reprojector and then a Clipper, vector data would be reprojected and then clipped. But with raster the Writer can say, "oh, I don't need to reproject all the raster data, because only a part of it will come out of the Clipper anyway". So it creates the known clip area, then reprojects just that small area. It's just more efficient.
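If it helps, here's a toy sketch of that idea in plain Python (my own illustration, not FME's actual engine): transformers record operations instead of running them, and the writer can reorder the recorded chain so the cheap clip runs before the expensive reprojection.

    # Toy illustration only: operations are recorded, then the "writer"
    # chooses an order that reprojects just the cells the clip keeps.
    def expensive_reproject(cells):
        return [c * 2 for c in cells]   # stand-in for real coordinate math

    def cheap_clip(cells):
        return cells[:10]               # keep only a small window

    recorded = [expensive_reproject, cheap_clip]   # order as authored
    raster = list(range(1_000_000))

    # Naive order: reproject a million cells, then throw most away.
    # Optimised order: clip first, so only the 10 surviving cells are
    # ever reprojected. The output is the same either way here.
    result = expensive_reproject(cheap_clip(raster))
    print(result)   # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]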
So, what I think the RasterConsumer does is this. It lets you define tile boundaries. Nothing happens immediately. But when the Writer starts working, it divides the data into chunks based on these tile boundaries. Then it can (for example) clip and reproject small pieces of data, which is more efficient than trying to process an entire set of data at once (e.g. if the data can be treated as tiles, half of the tiles can simply be discarded, rather than clipping one big raster and throwing half of it away).
You won't get tiles in the output, because you've not used the Tiler. You've just said FME can divide up the data for more efficient processing. So the result will be the same, but there are possible performance improvements (the amount of which is dependent on the other transformers used).
As for raster, so for point cloud, except that it works in 3D chunks rather than 2D tiles.
Why might you not want to do this? When each cell/point might affect the results of another. For example, if you resampled a raster in tiles using nearest neighbor resampling, you might get different results around the tile edges, because the nearest neighbor is now in a different tile.
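To make that edge effect concrete, here's a tiny Python demo (using a simple 3-cell max filter instead of true nearest neighbor resampling, just to keep it short): any operation that looks at neighboring cells gives different answers at a tile seam, because each tile can't see the cells next door.

    # Compare a neighborhood operation run over the whole row vs. run
    # over two tiles independently; values near the seam differ.
    def max_filter(row):
        out = []
        for i in range(len(row)):
            window = row[max(0, i - 1): i + 2]  # cell plus its neighbors
            out.append(max(window))
        return out

    row = [1, 5, 2, 9, 3, 7, 4, 8]

    whole = max_filter(row)                            # one pass over all
    tiled = max_filter(row[:4]) + max_filter(row[4:])  # two blind tiles

    print(whole)  # [5, 5, 9, 9, 9, 7, 8, 8]
    print(tiled)  # [5, 5, 9, 9, 7, 7, 8, 8]  <- differs at the seam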
I hope this helps. And I hope I'm correct! If you do have further questions on exactly how this works, do contact our support team (safe.com/support), who are better placed to talk to the developers and experts than I am.
Regards
Mark
@Mark2AtSafe
Hi Mark,
Thanks for the reply (needless to say, this has generated more questions/thoughts; if you or anyone else has ideas relating to them, I would be happy to hear them).
As I said, I was a bit stumped as to what the PCConsumer actually did, and the transformer description was non-descript (to me at least). Therefore, I had wondered whether it related to processing/performance, or was perhaps used to limit the number of point cloud points pushed down the pipe at any one time.
(As an aside, I had exploded a LAS file to see how many points it contained, and to understand whether the process I had built would work with that volume of data (or run out of memory). So I sat there watching the LAS file being coerced into points (I had already watched the paint dry!), with the auto count increasing on the link line. When the count reached 2 million, a message was displayed in the translation log window saying that points would no longer be written to the FFS store. However, points were still being coerced from the point cloud and the auto count was still increasing; eventually all the points had been generated, over 5 million. I wondered about that message regarding the exceeded 2 million point limit on the FFS store. Does it mean that points from 2,000,001 onwards will still be pushed downstream and any further processes will still be applied to them? Or does it mean that ONLY a maximum of 2 million points will be processed in total, even though points after the 2 million threshold are still coerced? (The latter would seem crazy: why still coerce them if they cannot be used?) So I had wondered whether the PCConsumer was meant as a valve, pushing a set number of points downstream to be processed as a chunk before the next lot were sent (but eventually all would be pushed through), i.e. a method to circumvent the 2 million limit that would also mean memory is not an issue.) (Sorry for the rather long aside.)
Can I quickly ask a specific question about your answer? You describe what the RasterConsumer does and say that it chunks the data into subsets; I presume I am right in thinking that it effectively creates splits based on an AREA or BOUNDARY defined by rows and columns. So my question is, in the case of the PCConsumer and its BLOCK SIZE parameter, does the block size mean that a fixed number of points will be processed at a time (i.e. if set to 10,000, then 10,000 points will be passed, then another 10,000, etc.)? Or is block size a distance unit that chunks the point cloud into cubes of that side length? If it is the latter, then presumably you might have a single point in one cube but 1,000,000 in another?
Thanks,
Rob
The block size is the number of points, not a unit of distance. Perhaps I should have that added to the transformer GUI (I'll also suggest we improve the documentation for the consumer transformers).
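In other words, something like this minimal pure-Python sketch of the idea (not FME's internal code): the cloud is streamed through in fixed-size batches of points, regardless of where those points sit in space.

    def blocks(points, block_size=10000):
        """Yield successive batches of at most block_size points."""
        for i in range(0, len(points), block_size):
            yield points[i:i + block_size]

    cloud = [(x * 0.1, x * 0.2, x * 0.05) for x in range(25000)]  # fake XYZ
    for n, block in enumerate(blocks(cloud, 10000)):
        print(f"block {n}: {len(block)} points")
    # block 0: 10000 points
    # block 1: 10000 points
    # block 2: 5000 points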
For me it did work.
I clipped a road network to get height data on the roads, then consumed it, then coerced it, and it works neatly. (And of course I set a value for block size to suit my needs.)
FME 2015
I filed PR#68554 (for documentation improvements) and PR#68555 (for updating the GUI prompt).
@Mark2AtSafe
@Gio
Mark, thanks very much for the further explanation regarding the 'dimension' of block size, and thanks for following up with the documentation improvement.
Gio, thanks also for running another test. I'm not quite sure why the PCConsumer 'appears' to make no difference in my test; I will recheck what I did. Thanks again.
Regards,
Rob
@rob14
Hi, here is an example of a functioning consumer in FME 2015:
Clipper:
Part of the workbench:
Result after coercion:
These are 2.7 million points.
Also, having sufficient memory is pretty handy. This is on a system with 16 GB of memory. The process takes 12+ minutes with a 9 GB peak memory usage.
I never even see the total number of points in the processed point cloud in the workbench.
To extract height centerlines:
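The idea boils down to something like this (my own plain-Python reconstruction of the concept, not the actual workbench): clip the cloud to a corridor around the road, then take the nearest point's elevation for each road vertex.

    import math

    def nearest_z(vertex, points, max_dist=2.0):
        """Elevation of the closest cloud point within max_dist, else None."""
        best_z, best_d = None, max_dist
        for (x, y, z) in points:
            d = math.hypot(x - vertex[0], y - vertex[1])
            if d < best_d:
                best_z, best_d = z, d
        return best_z

    cloud = [(0.0, 0.1, 12.3), (1.0, 0.0, 12.8), (2.1, -0.1, 13.1)]
    road = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    centerline_3d = [(x, y, nearest_z((x, y), cloud)) for (x, y) in road]
    print(centerline_3d)  # [(0.0, 0.0, 12.3), (1.0, 0.0, 12.8), (2.0, 0.0, 13.1)]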
@gio
Hi Gio,
Thanks for the further screenshots; this looks like some interesting work with a nice output.
I will spend some more time looking at this within the context of the workbench I created.
Regards,
Rob
So, my thoughts were not quite correct. What the consumer does is force FME to read the data at that point (presumably tiling it at the same time). Normally the reading of data would be part of the writing process, I believe (for the performance reasons mentioned above: there's no point reading data you know you don't need). The consumer forces FME to read the data mid-translation. So... I don't know that it makes much difference in terms of outcome. Basically, it's not something you're likely to ever need, and I don't think it would affect the output. I could make a good case that we should remove or hide this transformer from users, to avoid this sort of confusion.
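As a rough analogy of that "force evaluation mid-translation" behaviour (mine, not FME internals): transformers build up a deferred pipeline, and a consumer simply forces the pending operations to run at that spot instead of leaving it all to the writer.

    class LazyCloud:
        def __init__(self, points):
            self.points = points
            self.pending = []          # operations recorded, not yet run

        def clip(self, predicate):
            self.pending.append(lambda pts: [p for p in pts if predicate(p)])
            return self

        def consume(self):
            """Force all pending operations to run now, mid-pipeline."""
            for op in self.pending:
                self.points = op(self.points)
            self.pending = []
            return self

    cloud = LazyCloud([(0, 0), (5, 5), (10, 10)])
    cloud.clip(lambda p: p[0] <= 5)    # nothing evaluated yet
    cloud.consume()                    # evaluation forced here
    print(cloud.points)                # [(0, 0), (5, 5)]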