Skip to main content
Solved

How to convert ambiguous dates to unique dates with time zone information?


ewb_fme
Contributor
Forum|alt.badge.img+4

I have a time series with ambiguous dates due to the daylight savings time change that I would like to convert to unique datetimes with time zone information:

 

  • 2021-10-31 02:00 > 2021-10-31 02:00:00+02:00
  • 2021-10-31 02:15 > 2021-10-31 02:15:00+02:00
  • 2021-10-31 02:30 > 2021-10-31 02:30:00+02:00
  • 2021-10-31 02:45 > 2021-10-31 02:45:00+02:00
  • 2021-10-31 02:00 > 2021-10-31 02:00:00+01:00
  • 2021-10-31 02:15 > 2021-10-31 02:15:00+01:00
  • 2021-10-31 02:30 > 2021-10-31 02:30:00+01:00
  • 2021-10-31 02:45 > 2021-10-31 02:45:00+01:00

I tried to do this in FME with the "TimeZoneSet"-function as described in https://community.safe.com/s/article/converting-time-and-date-fields-to-local-timezones. This works for all datetimes except the ambigous ones above (I get null values).

 

In Python this can easily be done by using pandas and the option "ambigous='infer'":

import pandas as pd
df = pd.read_csv(file, sep=';', parse_dates=['dt'])
df['dt'] = df['dt'].dt.tz_localize('CET', ambiguous='infer')

For this reason I wanted to use this code in a PythonCaller. But unfortunately this does not work. I assume this is because FME can not handle pandas DataFrame objects, see https://community.safe.com/s/question/0D54Q000080hYEoSAM/displaying-data-frame-in-python-caller.  I got errors like:

  • AttributeError: Can only use .dt accessor with datetimelike values

 

Any tips on how I could solve this problem?

Best answer by ebygomm

ebygomm wrote:

It is possible to transform the datetime values using pandas, but you'll need to create the DataFrame within the pythoncaller. How is your source data structured?

I suspect a better way would be to import the csv file, do the transformation with pandas and then process the csv file directly into features from there.

 

Something like this

import fme
import fmeobjects
import pandas as pd
 
 
class FeatureProcessor(object):
    def __init__(self):
 
        self.df = pd.read_csv(FME_MacroValues['SourceDataset_CSV2'], sep=';', parse_dates=['dt'])
        self.df['dt'] = self.df['dt'].dt.tz_localize('CET', ambiguous='infer')
 
 
    def input(self, feature):
        for index, row in self.df.iterrows():
            f = fmeobjects.FMEFeature()
            f.setAttribute('dt', str(row['dt']))
            f.setAttribute('attr_value_1', row['attr_value_1'])
            f.setAttribute('attr_value_2', row['attr_value_2'])
            self.pyoutput(f)
 
    def close(self):
        pass

 

View original
Did this help you find an answer to your question?

7 replies

  • June 2, 2022

I was able to solve this problem with ordered and sampled input and then process the datetime values manually in the AttributeManager. This works well but is not really a robust solution (e.g. different timestamp resolution, more than one timezone switch in the dataset etc.)


ewb_fme
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • June 8, 2022

Isn't it possible to use a pandas DataFrame to transform datetime values? Any other recommendations?


ebygomm
Influencer
Forum|alt.badge.img+31
  • Influencer
  • June 8, 2022
ewb_fme wrote:

Isn't it possible to use a pandas DataFrame to transform datetime values? Any other recommendations?

It is possible to transform the datetime values using pandas, but you'll need to create the DataFrame within the pythoncaller. How is your source data structured?


ewb_fme
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • June 8, 2022
ebygomm wrote:

It is possible to transform the datetime values using pandas, but you'll need to create the DataFrame within the pythoncaller. How is your source data structured?

Thank you for your answer! My source is a csv-file with a "Date Time" column containing values like

  • 2021-10-31 02:00
  • 2021-10-31 02:15
  • ...

 

screenshot 

My solution so far:

I use a PythonCreator to import a csv file, do my "datetime transformations" with pandas (as described above), and then export a new csv file that I can then import into FME with a CSV-reader and do further transformations. Like this I can solve my problem. However I doubt that this is really best practice how to work with pandas and fme. What do you think of this solution?


ebygomm
Influencer
Forum|alt.badge.img+31
  • Influencer
  • Best Answer
  • June 8, 2022
ebygomm wrote:

It is possible to transform the datetime values using pandas, but you'll need to create the DataFrame within the pythoncaller. How is your source data structured?

I suspect a better way would be to import the csv file, do the transformation with pandas and then process the csv file directly into features from there.

 

Something like this

import fme
import fmeobjects
import pandas as pd
 
 
class FeatureProcessor(object):
    def __init__(self):
 
        self.df = pd.read_csv(FME_MacroValues['SourceDataset_CSV2'], sep=';', parse_dates=['dt'])
        self.df['dt'] = self.df['dt'].dt.tz_localize('CET', ambiguous='infer')
 
 
    def input(self, feature):
        for index, row in self.df.iterrows():
            f = fmeobjects.FMEFeature()
            f.setAttribute('dt', str(row['dt']))
            f.setAttribute('attr_value_1', row['attr_value_1'])
            f.setAttribute('attr_value_2', row['attr_value_2'])
            self.pyoutput(f)
 
    def close(self):
        pass

 


ewb_fme
Contributor
Forum|alt.badge.img+4
  • Author
  • Contributor
  • June 9, 2022
ebygomm wrote:

It is possible to transform the datetime values using pandas, but you'll need to create the DataFrame within the pythoncaller. How is your source data structured?

Good idea! This has worked like a charm. Thank you!


  • January 28, 2023

The ‘.dt’ accessor only works with datetimelike values, and if you try to use it on non-date-time-like objects, it will throw an error.

 

The '.dt' accessor can extract the year, month, day, hour, minute, and second information from a datetime value.

 

You can easily fix the AttributeError: can only use .dt accessor with datetimelike values by using the pd.to_datetime(df['column'], errors='coerce') function to transform missing values into NaN and then use the .dt accessor to get the date values.


Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings