
The idea is to have an FME function that crawls database connections to extract the schema of the data.

For example, when you create a database connection there would be an option to create a crawler. When chosen we crawl the database (or specified parts of it) at intervals and store the table list and schemas locally. That way, when a user adds a reader and requests a list of tables, the response can be instantaneous and FME doesn't need to fetch that info from the database.

The crawler could be set to run at specified intervals, or on demand. The information it returns would include any schema information such as table names, field names, indexing, etc. The information would be held in a form of catalog that FME readers/writers/transformers can access. Maybe it could even be held in a format (like sqlite?) that users could read from too?
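The catalog idea above could be sketched roughly like this. This is a minimal illustration, not anything FME actually implements: the `store_catalog` function and the catalog table layout are hypothetical, and a real crawler would populate `tables` from the source database's system catalogs.

```python
import sqlite3

def store_catalog(catalog_path, tables):
    """Store crawled schema info in a local SQLite catalog.

    `tables` maps table name -> list of (column, type) pairs; in a real
    crawler this would come from the source database's system catalogs.
    """
    con = sqlite3.connect(catalog_path)
    con.execute("""CREATE TABLE IF NOT EXISTS catalog (
                       table_name TEXT, column_name TEXT, column_type TEXT)""")
    con.execute("DELETE FROM catalog")  # full refresh on each crawl run
    for table, columns in tables.items():
        con.executemany(
            "INSERT INTO catalog VALUES (?, ?, ?)",
            [(table, col, typ) for col, typ in columns])
    con.commit()
    return con

# A reader could then list tables instantly, without touching the database:
con = store_catalog(":memory:", {"roads": [("id", "INTEGER"), ("name", "TEXT")]})
print([r[0] for r in con.execute("SELECT DISTINCT table_name FROM catalog")])
```

Because the catalog is plain SQLite, users could also open it with any SQLite client, as suggested above.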

Hi, I was wondering if there has been any progress with this idea? I would find this function very useful in my day-to-day work.

Thanks


Not at this point, no.


The risk with this idea is that the privileges on schema objects (tables, etc.) can differ depending on the user you connect as. So the schema crawler would be user dependent.


Hmm, why not simply use an SQLExecutor and query the database's system catalogs? We use that solution: we export the information to an Excel table, and use a Python Jinja2 templater or an XSLT transformer to publish it as a web page.
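The system-catalog query the poster describes can be demonstrated without FME. This sketch uses SQLite's `sqlite_master` and `PRAGMA table_info` as a stand-in for, e.g., PostgreSQL's `information_schema`; the `parcels` table is invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, owner TEXT)")

# Table list from the system catalog -- the kind of query an
# SQLExecutor would run against the database.
tables = [r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(tables)  # ['parcels']

# Column names and types for each table.
for table in tables:
    cols = con.execute(f"PRAGMA table_info({table})").fetchall()
    print([(c[1], c[2]) for c in cols])  # [('id', 'INTEGER'), ('owner', 'TEXT')]
```

The result rows could then be fed to a Jinja2 template or written to Excel, as described above.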

 

It really depends on the database type, and there can be special (custom) types, such as JSON-structured columns.

@erik_jan gave a good answer on privileges, but that case is just one element where you would need to throw an error...

 

This kind of inspection is specific to each database and to each type.

The content of the result poses the same problem. There is plenty of material on database introspection on the web; look at Stack Overflow.

 

You can use a Python ORM for this, e.g. SQLAlchemy's reflection: https://docs.sqlalchemy.org/en/14/core/reflection.html

For example, geometry is a user-defined type, and if you want the details of its contents (geometry, type, projection) you need special treatment for that type.
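A minimal reflection sketch with SQLAlchemy, the library linked above. The in-memory SQLite database and the `roads` table are stand-ins; on PostGIS, a geometry column would reflect as a user-defined type and need the extra handling described above.

```python
from sqlalchemy import create_engine, inspect, text

engine = create_engine("sqlite://")  # in-memory stand-in for a real database
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE roads (id INTEGER PRIMARY KEY, geom BLOB)"))

insp = inspect(engine)
print(insp.get_table_names())  # ['roads']
for col in insp.get_columns("roads"):
    # A PostGIS geometry column would need specific treatment here
    # (geometry type, SRID, ...); we just print name and reflected type.
    print(col["name"], col["type"])
```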

 

You can also create multiple SQLExecutors to fetch a part of the contents and act on that specific part. So rather than a single transformer, a project built with this solution would be a good approach: the base set of elements can be included or not, and the rendering can be customized.


Something like caching the data dictionary would be really useful.

And I think more generally: e.g. when the data source is computed (an API, etc.), it would be useful to cache the schema in the reader/feature reader so they wouldn't attempt to connect to the data source when unrelated things change (e.g. outputting incoming attributes).
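The caching idea can be sketched as a small schema cache with a time-to-live, so a stale schema is eventually refreshed. Everything here is hypothetical: the class name, the TTL default, and the `fetch` callable standing in for a slow connection to an API or database.

```python
import time

class SchemaCache:
    """Cache schemas fetched from a slow source (API, database, ...)."""

    def __init__(self, fetch_schema, ttl_seconds=300.0):
        self._fetch = fetch_schema   # callable: source name -> schema dict
        self._ttl = ttl_seconds
        self._cache = {}             # source -> (timestamp, schema)

    def get(self, source):
        entry = self._cache.get(source)
        if entry and time.monotonic() - entry[0] < self._ttl:
            return entry[1]          # fresh enough: no connection needed
        schema = self._fetch(source)  # slow path: contact the real source
        self._cache[source] = (time.monotonic(), schema)
        return schema

calls = []
def fetch(source):
    calls.append(source)
    return {"id": "INTEGER", "name": "TEXT"}

cache = SchemaCache(fetch)
cache.get("roads")
cache.get("roads")   # served from cache; the source is contacted only once
print(len(calls))  # 1
```

This addresses the concern in the next reply too: a TTL of zero (or an explicit refresh) would force a fresh fetch when the latest schema is required.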


What problem is this solving? If I update a (feature)reader/writer I’d want to be absolutely sure I get the latest schema, so I wouldn’t want to fetch that from a cache.


The product team has decided not to pursue this idea - the previously mentioned methods with SQLExecutor, PythonCaller, or the newer Schema transformers make this possible.
 