Question

many xyz to gdb

  • 17 June 2016
  • 7 replies
  • 21 views

Badge

I have many (about 20,000) XYZ files with the schema x (coordinate), y (coordinate), z (relative height) and no header.

My output should be a gdb with geodb_points geometry.

So far I use a CSV reader in my workbench, followed by a GeometryFilter transformer. To read all those thousands of files I use a WorkspaceRunner in a different workbench with the PATH reader. The result is a gdb with 1000 feature classes. By now I have transformed almost 2/3 of my data (I stopped when I saw the huge disk space usage, in order to improve the workbench), but I noticed that FME imported tables instead of feature classes (I guess because of "Wait for Job to Complete" = No on the WorkspaceRunner).

To save disk space I decided to delete attributes (keeping only one: height) and to work with the 2DForcer.

My question now is: how can I improve the processing time? By using a different reader? Mark Ireland wrote here:

http://gis.stackexchange.com/questions/54558/huge-... that the XYZ reader is much faster than the CSV reader. If I use the XYZ point cloud reader I cannot use the x, y and z values as separate attributes.

Is it faster to transform all of the XYZ files again than to use two different workbenches (one for xyz to gdb and one for the already transformed gdb to gdb)? For the latter I'm currently using a schema writer with a FeatureReader transformer.

Thanks for any information on this, because I'm new to FME!


7 replies

Userlevel 2
Badge +17

Hi @nhaz, usual batch processing (including use of the WorkspaceRunner) incurs a time overhead to launch the FME engine for each run. I think the overhead cannot be ignored if you run the workspace once per source file (20,000 runs).

Depending on how the destination feature class is determined, it might be possible to implement the entire process with a single workspace, without using the WorkspaceRunner.

Alternatively, the FME Command File method might be effective if each file is small. However, you may have to create another workspace to generate the command file.

See these articles to learn more about FME Command File.

If you feel that launching the FME engine for each run takes a long time, I think it is worth considering improving this point.
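For illustration, here is a minimal Python sketch of the batch idea: it lists the .xyz files and writes one fme command line per file into a script, so all runs can be started without opening Workbench each time (the per-run engine start-up cost still applies, and the actual FME Command File syntax is described in the linked articles above). The workspace path, folders and published parameter names (SourceDataset_CSV, DestDataset_GEODATABASE_FILE, FEATURE_CLASS) are placeholders only and would have to match your own workbench.

    # Sketch only: writes a .bat file with one fme command line per XYZ file.
    # The parameter names below are placeholders for the published parameters
    # of your own workspace, not guaranteed names.
    import glob
    import os

    WORKSPACE = r"C:\fme\xyz_to_gdb.fmw"   # hypothetical workspace path
    SRC_DIR = r"D:\data\xyz"               # folder containing the .xyz files
    OUT_GDB = r"D:\data\out\points.gdb"    # destination geodatabase

    with open("run_all.bat", "w") as bat:
        for path in sorted(glob.glob(os.path.join(SRC_DIR, "*.xyz"))):
            fc = os.path.splitext(os.path.basename(path))[0]
            bat.write('fme "{0}" --SourceDataset_CSV "{1}" '
                      '--DestDataset_GEODATABASE_FILE "{2}" '
                      '--FEATURE_CLASS "{3}"\n'.format(WORKSPACE, path, OUT_GDB, fc))

Running the resulting run_all.bat then processes the files one after another; the same list of commands could also be turned into a proper FME command file following the articles above.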

Although it might not help right now, FME2017 introduces greatly improved performance for the CSV reader, so it will be as fast as the XYZ reader. FME2017 can be downloaded from our website as a beta, but I don't know if you could or would want to use a beta version for this project. It's still early in the 2017 development cycle, so the beta is nowhere near a finished product and I would want to check my output carefully if I were using it.

Badge

Hi @takashi, thanks for the information and for explaining some points.

So do you mean that it's faster to work without the WorkspaceRunner in this case?

My source XYZ files are about 30 MB each and the gdbs are about 42 MB each (so there will be about 1 TB in the end).

And would you prefer to do xyz -> gdb for all of the data, or xyz -> gdb + gdb -> gdb?

Thanks for your input!

Badge

Although it might not help right now, FME2017 introduces greatly improved performance for the CSV reader, so it will be as fast as the XYZ reader. FME2017 can be downloaded from our website as a beta, but I don't know if you could or would want to use a beta version for this project. It's still early in the 2017 development cycle, so the beta is nowhere near a finished product and I would want to check my output carefully if I were using it.

Hi @mark2catsafe, thanks for letting me know about the new version of FME.

But I don't think I can use the beta right now. I will keep it in mind for future work :-)

Userlevel 2
Badge +17

Hi @takashi, thanks for the information and for explaining some points.

So do you mean that it's faster to work without the WorkspaceRunner in this case?

My source XYZ files are about 30 MB each and the gdbs are about 42 MB each (so there will be about 1 TB in the end).

And would you prefer to do xyz -> gdb for all of the data, or xyz -> gdb + gdb -> gdb?

Thanks for your input!

I think the Command File method would be one of the options to consider, but I cannot guarantee that it will definitely be faster. Anyway, to think of a better solution, we need to understand the requirements exactly.

  • Do you need to create a single geodatabase which consists of 1000 feature classes?
  • How do you determine the destination feature class for each xyz feature?
  • What kind of transformation do you need to perform? Just read xyz (3D points) and write them into gdb?
  • I think a one-step workflow (xyz -> gdb) would be better in general. Is there any reason to consider the two-step approach (xyz -> gdb + gdb -> gdb)?

Userlevel 2
Badge +17

Hi @takashi, thanks for the information and for explaining some points.

So do you mean that it's faster to work without the WorkspaceRunner in this case?

My source XYZ files are about 30 MB each and the gdbs are about 42 MB each (so there will be about 1 TB in the end).

And would you prefer to do xyz -> gdb for all of the data, or xyz -> gdb + gdb -> gdb?

Thanks for your input!

In addition, these articles may be helpful.

Badge

Hi @takashi, thanks again for your input!

I'll try to answer your questions:

1) In the end I should have one gdb (or several gdbs) with different feature classes.

2) In my gdb writer I use fme_basename for the feature class name (in the main workbench), but I think I need to give this some further thought (see the sketch below my answers).

3) Yes, I need to read xyz and write a gdb with point (2.5D) feature classes, with the height attribute as the single attribute.

4) I am considering two steps because I have already done xyz -> 3D gdb for 2/3 of the data (which is not the desired result; it takes too many GB) and still have 1/3 left as xyz. So I'm wondering which way is faster: one "new" workbench doing xyz -> 2.5D gdb for everything, or two workflows, gdb -> gdb and xyz -> gdb.
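Regarding 2), what I have in mind is roughly the sketch below (the attribute name fc_name and the cleanup rules are just my assumptions): a PythonCaller that turns fme_basename into a name that should be safe to use for the feature class fanout on the gdb writer, since geodatabase feature class names generally allow only letters, digits and underscores and must not start with a digit.

    # Sketch of a PythonCaller that derives a feature class name from
    # fme_basename; "fc_name" and the naming rules are my assumptions.
    import re
    import fmeobjects

    class FeatureClassNamer(object):
        def input(self, feature):
            base = feature.getAttribute('fme_basename') or 'points'
            # keep only letters, digits and underscores
            name = re.sub(r'[^A-Za-z0-9_]', '_', base)
            if name[0].isdigit():
                name = 'fc_' + name    # names must not start with a digit
            feature.setAttribute('fc_name', name)
            self.pyoutput(feature)

        def close(self):
            pass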

Thank you!
