
Dear all,

I am currently trying to extend an existing ASCII file with a single column. I have an ASCII file with X, Y, Z, R, G, B and an Intensity (Scalar field). My second ASCII file has the same X, Y, Z and Intensity, but instead of R, G and B it has one single column, named Scalar field #2.

I have imported both files with CSV readers and connected them to an Aggregator. Now I get an ASCII file which has everything I want: X, Y, Z, R, G, B, Intensity and Intensity 2. My problem is that this task takes more than 40 hours, since I'm dealing with over 40,000,000 data points in each file.

 

 

Is there a chance to just copy X, Y, Z, R, G, B and Scalar field from ASCII 1 to a file and extend it with Scalar field #2 from ASCII 2? Do I really have to Aggregate the whole process?

Thank you in advance!

I believe the FeatureMerger is faster than the Aggregator (especially if you read the Supplier first, i.e. put it first in the list of readers).

But if you have an option to read both CSV files as tables in a database (temporarily), you could use indexes to speed up the joining process and then write the join back to CSV.
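As a rough illustration of the database route, here is a minimal Python sketch using the standard-library `sqlite3` module. Everything specific in it is an assumption, not something taken from the actual files: the function name `join_via_sqlite`, the header names (`X`, `Y`, `Z`, `R`, `G`, `B`, `Scalar field`, `Scalar field #2`), comma-separated input with a header line, and coordinates that are written identically in both files so they can be matched as text.

```python
import csv
import sqlite3


def join_via_sqlite(path_a, path_b, path_out, db_path=":memory:"):
    """Join two point files on X, Y, Z through an indexed SQLite table.

    Column names are assumptions; adjust them to the real headers.
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE a (x TEXT, y TEXT, z TEXT, r TEXT, g TEXT, b TEXT, sf1 TEXT)"
    )
    con.execute("CREATE TABLE b (x TEXT, y TEXT, z TEXT, sf2 TEXT)")

    # Load file 1 (X, Y, Z, R, G, B, Scalar field).
    with open(path_a, newline="") as fa:
        reader = csv.DictReader(fa)
        con.executemany(
            "INSERT INTO a VALUES (?,?,?,?,?,?,?)",
            (
                (r["X"], r["Y"], r["Z"], r["R"], r["G"], r["B"], r["Scalar field"])
                for r in reader
            ),
        )

    # Load only the key columns and the extra column from file 2.
    with open(path_b, newline="") as fb:
        reader = csv.DictReader(fb)
        con.executemany(
            "INSERT INTO b VALUES (?,?,?,?)",
            ((r["X"], r["Y"], r["Z"], r["Scalar field #2"]) for r in reader),
        )

    # The index on the join key is what makes the lookup fast.
    con.execute("CREATE INDEX idx_b_xyz ON b (x, y, z)")
    con.commit()

    with open(path_out, "w", newline="") as fo:
        writer = csv.writer(fo)
        writer.writerow(
            ["X", "Y", "Z", "R", "G", "B", "Scalar field", "Scalar field #2"]
        )
        cursor = con.execute(
            "SELECT a.x, a.y, a.z, a.r, a.g, a.b, a.sf1, b.sf2 "
            "FROM a JOIN b ON a.x = b.x AND a.y = b.y AND a.z = b.z"
        )
        writer.writerows(cursor)  # rows are streamed from the cursor

    con.close()
```

For 40 million points you would probably pass a file path as `db_path` instead of `:memory:`. Note that the output row order is whatever the database returns, not necessarily the input order.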


  1. FeatureMerger - with Suppliers first
  2. Remove the CSV writer and replace it with a FeatureWriter so the features can be written as they arrive - that way you don't need to keep all the data in memory. You should be able to start writing the output CSV as soon as all the Suppliers are read!

Consider replacing the CSV reader/writer with the Point Cloud XYZ format - it should be way, way faster. Have a look at this for more info: https://knowledge.safe.com/articles/1337/reading-point-clouds.html


Also - have the SMALLEST dataset as the SUPPLIER, because all Suppliers have to be held in memory (it seems to be the bottom CSV file, the one with fewer attributes), and remove all unneeded attributes.
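Outside of FME, the same idea (small Supplier in memory, big file streamed and written right away) can be sketched in plain Python. This is only an illustration of the principle, not how the FeatureMerger is implemented. The function name `merge_streaming`, the header names and the exact-text match on X, Y, Z are all assumptions.

```python
import csv


def merge_streaming(requestor_path, supplier_path, out_path,
                    key=("X", "Y", "Z"), extra="Scalar field #2"):
    """Append one column from the supplier file to the requestor file.

    Only the key and the extra column of the supplier are kept in memory;
    the requestor is read and written one row at a time.
    Returns the number of merged rows.
    """
    # Step 1: read the (smaller) supplier completely, dropping every
    # attribute that is not needed for the join.
    lookup = {}
    with open(supplier_path, newline="") as fs:
        for row in csv.DictReader(fs):
            lookup[tuple(row[k] for k in key)] = row[extra]

    # Step 2: stream the requestor and write each merged row immediately.
    merged = 0
    with open(requestor_path, newline="") as fr, \
            open(out_path, "w", newline="") as fo:
        reader = csv.DictReader(fr)
        writer = csv.DictWriter(fo, fieldnames=list(reader.fieldnames) + [extra])
        writer.writeheader()
        for row in reader:
            value = lookup.get(tuple(row[k] for k in key))
            if value is None:
                continue  # no matching point in the supplier, skip it
            row[extra] = value
            writer.writerow(row)
            merged += 1
    return merged
```

Keep in mind that a Python dictionary with 40 million entries needs a lot of RAM, so treat this as a sketch of the technique rather than a drop-in solution for files of that size.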

 

 


Good morning,

hey, you guys are awesome! It didn't even take 24 hours to get an answer to my questions, thank you very much! I will try your suggestions and write the outcome into this post.

Have a nice Christmas!

Greetings Sebastian

