Question

What is the Standard Process in Loading Big Data to a Database

  • February 28, 2020
  • 5 replies
  • 341 views

Hi guys, I am currently trying to automate the handling of numerous bits of GIS data we get from various sources. So far I have managed to do the following.

 

The current process, in simple terms:

 

Excel/Shapefiles/Geodatabases ---> Data Validation via FME ---> FME/Python Script run via a scheduler --> Loads into Enterprise GeoDatabase

 

I am currently trying to decide on the best way to improve this process so that it is more accurate and so that numerous users can also load their data successfully.

For Excel files and shapefiles, should users place their files in a shared folder, possibly broken down into category subfolders, or would you suggest SharePoint? I would be interested to hear others' views on this and how they handle it within their business.

Also, which scheduling tools would you suggest? I am looking at FME Server as a possible option. We are also using a SQL Server geodatabase.

5 replies

redgeographics

You could do FME Server and set up an automation with a directory watch as a trigger. So any new file placed in a folder will trigger the workspace to run. You're probably going to have to add some logic to the workspace to deal with unexpected input though. I.e. if a user uploads data in the wrong format or the wrong schema you want it to fail nicely rather than crash and/or corrupt your database.

Scheduling can be handled with FME Server as well.

The benefit of FME Server is that you can serve a larger group of users and through automations you can build some pretty complex workflows.
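To illustrate the "fail nicely" idea outside of FME, here is a minimal Python sketch of a pre-load check that rejects uploads in the wrong format before they get anywhere near the database. The accepted extensions, the `rejected_dir` folder, and the function names `check_upload` and `route_upload` are assumptions made for this example, not part of FME Server. Inside a workspace the same check would normally be built from transformers instead.

```python
from pathlib import Path
import shutil

# Assumption: the formats the loader accepts; adjust to your own pipeline.
ALLOWED_SUFFIXES = {".shp", ".xlsx", ".xls", ".gdb"}
# A shapefile is only usable if its mandatory companion files are present.
SHAPEFILE_SIDECARS = (".shx", ".dbf")


def check_upload(path: Path) -> list[str]:
    """Return a list of problems; an empty list means the upload may be loaded."""
    problems = []
    suffix = path.suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    elif suffix == ".shp":
        # Note: this assumes the companion files use lower-case extensions.
        for sidecar in SHAPEFILE_SIDECARS:
            if not path.with_suffix(sidecar).exists():
                problems.append(f"missing shapefile component: {sidecar}")
    return problems


def route_upload(path: Path, rejected_dir: Path) -> bool:
    """Move a bad upload aside and report whether it passed the checks."""
    problems = check_upload(path)
    if problems:
        rejected_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(rejected_dir / path.name))
        # Leave a note next to the rejected upload so the user can see why.
        note = rejected_dir / (path.name + ".reason.txt")
        note.write_text("\n".join(problems))
        return False
    return True
```

A scheduled script could call `route_upload` on each new file and only hand the ones that return `True` to the loading step. A schema check (expected column names and types) would slot into `check_upload` in the same way.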


virtualcitymatt

The amount of automation and the possibilities that come with FME Server are amazing. If you have the budget, I would definitely recommend it for automating these tasks (and many more). Schedules, emails with attachments, and directory watching are all popular methods for triggering a process.
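For anyone who wants to see what a directory-watch trigger boils down to, or who needs a stopgap before FME Server is in place, here is a rough polling sketch in plain Python. It is not how FME Server implements its trigger. The function names `new_files` and `watch`, the `handle` callback, and the 30-second interval are all assumptions made for this example.

```python
import time
from pathlib import Path


def new_files(folder: Path, seen: set[str]) -> list[Path]:
    """Return files in folder not reported before, remembering them in seen."""
    found = []
    for entry in sorted(folder.iterdir()):
        if entry.is_file() and entry.name not in seen:
            seen.add(entry.name)
            found.append(entry)
    return found


def watch(folder: Path, handle, interval_s: float = 30.0) -> None:
    """Poll folder forever, calling handle(path) once for each new file."""
    seen: set[str] = set()
    while True:
        for path in new_files(folder, seen):
            handle(path)
        time.sleep(interval_s)
```

Polling like this is simple, but it can pick up a file that is still being copied, so a real setup would also want to wait until the file size stops changing before processing it.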


  • Author
  • March 1, 2020

Great, thanks for the feedback. Would you suggest putting the files in a folder structure for FME Server to pick up, or are there alternative methods that could be used?

I do like the folder structure approach, since FME Server can simply pick the files up. Other options I have seen include web pages where people can upload files, with an FTP server the files go to for FME to pick up. SharePoint is another option.


  • Author
  • March 1, 2020

Where do you usually store the files before FME picks them up? A shared folder area, or are there other options?


virtualcitymatt

FME Server is flexible; however, a shared folder would make sense. A backed-up filesystem would be my recommendation. FME Server comes with its own folder structure for logs etc., as well as places for data which you can use. In addition, you can set up shared folders and even connect to S3 buckets. If it were me, I'd probably set up one shared connection per project to keep the data separated on your network share. If working in the cloud, then go with an S3 bucket.
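As a small illustration of the one-area-per-project idea, the sketch below creates a consistent set of subfolders for a project on the share. The stage names (`incoming`, `processed`, `rejected`) and the function name `create_project_folders` are hypothetical and only meant to show the layout. They are not something FME Server prescribes.

```python
from pathlib import Path

# Hypothetical stage names; rename to match your own workflow.
STAGES = ("incoming", "processed", "rejected")


def create_project_folders(share_root: Path, project: str) -> dict[str, Path]:
    """Create <share_root>/<project>/<stage> for each stage and return the paths."""
    folders = {stage: share_root / project / stage for stage in STAGES}
    for folder in folders.values():
        # exist_ok makes the call safe to repeat for an existing project.
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Users would drop files into a project's `incoming` folder, and the load process would move each file to `processed` or `rejected` afterwards, so every project's data stays separated on the network share.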