Skip to main content

Hi,

I need to compare data from an Oracle Spatial database and from an ArcSDE Geodatabase (with an underlying Oracle database). I carry out the comparison of the data from these two sources in FME using the ChangeDetector. Since FME treats dates between the reader and writer as strings, I have the following issues in ChangeDetector:

DateOracle SpatialArcSDE31/12/2011 00:00:00311220110000003112201131/12/2011 03:27:303112201103273031122011032730

In the table above you see that when the time is 00:00:00, the date from ArcSDE is a string without the time. The Oracle Spatial reader, however, builds a string that includes the time. When the time is something other than 00:00:00, there is no difference between Oracle Spatial and ArcSDE. As a consequence, ChangeDetector will report changes even for identical data because the date-strings can be different.

Since the Workspace is generic and used for a lot of different feature classes, a DateFormatter does not seem to be the solution for me. If I want to modify the date-attributes before the ChangeDetector reads them, I will probably have to use PythonCaller and detect the concerned attributes through their name ("contains 'DATE'", not really reliable) or their data type (by using a FeatureReader and merge the schema through FeatureMerger).

However, as the PythonCaller-code could slow down the processing significantly, I wanted to ask whether there is another "FME standard way" of dealing with or avoiding this date-issue.

Greetings,

 

André

As I could not find another solution that worked with generic Workspaces, I nevertheless used an approach based on FeatureReader and PythonCaller. The latter builds a list of all attribute names having originally a DATE-data type by using the schema object of the FeatureReader. This variable is then passed to another PythonCaller that checks all incoming data features and modifies the values of DATE-attributes that fulfill certain conditions (e.g. length of 8 -> no time Information). Maybe this helps if someone else encounters the same issue.

Greetings,

André


As I could not find another solution that worked with generic Workspaces, I nevertheless used an approach based on FeatureReader and PythonCaller. The latter builds a list of all attribute names having originally a DATE-data type by using the schema object of the FeatureReader. This variable is then passed to another PythonCaller that checks all incoming data features and modifies the values of DATE-attributes that fulfill certain conditions (e.g. length of 8 -> no time Information). Maybe this helps if someone else encounters the same issue.

Greetings,

André

Would the NullAttributeMapper be of use here, using the attribute value matches regex function and mapping to a new value which adds in the missing zeros to the ArcSDE value

 

 


As I could not find another solution that worked with generic Workspaces, I nevertheless used an approach based on FeatureReader and PythonCaller. The latter builds a list of all attribute names having originally a DATE-data type by using the schema object of the FeatureReader. This variable is then passed to another PythonCaller that checks all incoming data features and modifies the values of DATE-attributes that fulfill certain conditions (e.g. length of 8 -> no time Information). Maybe this helps if someone else encounters the same issue.

Greetings,

André

Possibly the StringFormatter can do the job too.

 

 


As I could not find another solution that worked with generic Workspaces, I nevertheless used an approach based on FeatureReader and PythonCaller. The latter builds a list of all attribute names having originally a DATE-data type by using the schema object of the FeatureReader. This variable is then passed to another PythonCaller that checks all incoming data features and modifies the values of DATE-attributes that fulfill certain conditions (e.g. length of 8 -> no time Information). Maybe this helps if someone else encounters the same issue.

Greetings,

André

Thanks for your answers. The Problem with both the NullAttributeMapper and the StringFormatter is that they need to know the attribute names to consider in advance. However, the attributes containing date-values to be modified vary between the feature classes read by the Workspace.

 


Hi @andr_, if the native data type of those attributes was definitely "date" or "datetime", I think the schema features can be referred to identify the datetime attributes, and it's possible to create a Python script to format those attributes with a common datetime format.

However, if the native data type was not "date" nor "datetime", there might not be a reliable way to resolve the issue. If you check all attribute values, it might be possible to detect DDMMYYYY-format values and change them to DDMMYYYY000000, but I'm afraid that it could also change non-date value that matches DDMMYYYY-format coincidentally.


Thanks for your answers. The Problem with both the NullAttributeMapper and the StringFormatter is that they need to know the attribute names to consider in advance. However, the attributes containing date-values to be modified vary between the feature classes read by the Workspace.

 

You can send all attributes to the NullAttributeMapper so don't need to know names in advance

 

 


Hi @andr_, if the native data type of those attributes was definitely "date" or "datetime", I think the schema features can be referred to identify the datetime attributes, and it's possible to create a Python script to format those attributes with a common datetime format.

However, if the native data type was not "date" nor "datetime", there might not be a reliable way to resolve the issue. If you check all attribute values, it might be possible to detect DDMMYYYY-format values and change them to DDMMYYYY000000, but I'm afraid that it could also change non-date value that matches DDMMYYYY-format coincidentally.

The PythonCaller with this script converts "DDMMYYYY" (eight digits and a valid date) to "DDMMYYYY000000". It's not perfect, since it converts unconditionally every "DDMMYYYY" even if it's not a date value actually. FYI.

 

import re
from datetime import datetime
class FeatureProcessor(object):
    def input(self, feature):
        for attr in feature.getAllAttributeNames():
            if attr.startswith('fme_'):
                continue
            try:
                value = str(feature.getAttribute(attr))
                m = re.match('''^\d{8}$''', value)
                if m:
                    datetime.strptime(value, '%d%m%Y')
                    feature.setAttribute(attr, '%s000000' % value)
            except:
                pass
        self.pyoutput(feature)

You can send all attributes to the NullAttributeMapper so don't need to know names in advance

 

 

That is true. However, as I (have to) use a FeatureReader, the features in FME have no information about their original data type. This means that strings, dates and even some of the numbers are treated as strings in FME. So I have a mixture of format and user attributes marked as strings irrespectively of being dates. ThusI have to find a way to recognize which attributes were originally dates. For this task, I use the schema object and PythonCaller.

 

 


The PythonCaller with this script converts "DDMMYYYY" (eight digits and a valid date) to "DDMMYYYY000000". It's not perfect, since it converts unconditionally every "DDMMYYYY" even if it's not a date value actually. FYI.

 

import re
from datetime import datetime
class FeatureProcessor(object):
    def input(self, feature):
        for attr in feature.getAllAttributeNames():
            if attr.startswith('fme_'):
                continue
            try:
                value = str(feature.getAttribute(attr))
                m = re.match('''^\d{8}$''', value)
                if m:
                    datetime.strptime(value, '%d%m%Y')
                    feature.setAttribute(attr, '%s000000' % value)
            except:
                pass
        self.pyoutput(feature)
What I have used so far is:

 

0684Q00000ArN5lQAF.png

 


What I have used so far is:

 

 

I think that it would be the most reliable way, if the native data type of target attributes was definitely "date" or "datetime".

 


Hi,

I thought I could quickly summarize all the differences that occured to me when comparing features from an Oracle Spatial database with features from an SDE Geodatabase:

 

* Date/Timestamp-Attributes: Difference, when time-part is 00:00:00. Solution: PythonCaller if generic workflow with FeatureReader, DateFormatter if static workflow

 

* Geometry: SDE saves vertices in a predefined grid. Solution: ArcSDEGridSnapper

 

* Numbers: SDE saves small floats (e.g. 0.000022) in scientific notation (e.g. 2.2e-5). Solution: PythonCaller

In addition I stumpled upon the behavior of ChangeDetector, that format attributes (even when not exposed) are included in the comparison as soon as "Match All Except Selected Attributes " is selected. At least that is what I encoutered in the generic workflow using the FeatureReader.

Greetings,

 

André


Hi,

I thought I could quickly summarize all the differences that occured to me when comparing features from an Oracle Spatial database with features from an SDE Geodatabase:

  • Date/Timestamp-Attributes: Difference, when time-part is 00:00:00. Solution: PythonCaller if generic workflow with FeatureReader, DateFormatter if static workflow
  • Geometry: SDE saves vertices in a predefined grid. Solution: ArcSDEGridSnapper
  • Numbers: SDE saves small floats (e.g. 0.000022) in scientific notation (e.g. 2.2e-5). Solution: PythonCaller

In addition I stumpled upon the behavior of ChangeDetector, that format attributes (even when not exposed) are included in the comparison as soon as "Match All Except Selected Attributes " is selected. At least that is what I encoutered in the generic workflow using the FeatureReader.

Greetings,

 

André


Reply