Question

Question of the Week: Number Rounding Errors in Datetimes

  • 28 February 2020
  • 0 replies
  • 18 views

Userlevel 4
Badge +25

Happy Friday FME'ers,

Computers are fascinating to me because what you do in a piece of software doesn't often reflect what is going on underneath. As this article says,

Computers operate in binary, meaning they store data and perform calculations using only zeros and ones... However, multiple binary digits can be used to represent large numbers and perform complex functions.

So, although your computer is carrying out complex binary calculations for everything you do, you don't get to see that... except for cases like today's question-of-the-week, where it comes bubbling to the surface.

Question of the Week

Q) I have some dates with fractional seconds and I want to truncate them to whole seconds. In the fme help it says the @int function converts by truncation, which is exactly what I want.

Most dates I'm sending in are truncated. But ones of the type 20200220015759.999000 are getting rounded up. In the case of above, it's an invalid date because it rounds up to 60 seconds.

Is this a bug or am I missing something?

Asked here by @jakethepainter (and similarly here at least five years ago by @ebygomm)

 

A) The quick answer (thanks @david_r) is to use a StringReplacer transformer to simply get rid of values after the decimal place. However, I think it's worth explaining why this is an issue, and what difference it makes by being a datetime value.

Binary/Float Conversions

I've talked about this before (the most memorable article being on our blog) so I'll make this brief. In decimal, 1/3 = 0.333333 etc, the point being that wherever you truncate this value, there is a small error against the true result.

Binary computations have the same issue, but with different numbers. So 1/10 in decimal is 0.1, but in binary it returns a recurring number that ends up being truncated, introducing a small error.

It means that when software converts from float to binary (or vice versa) it can include a small rounding error, without it being obvious to users.

As this article says, this is not a bug; it's the nature of binary floating-point arithmetic.

Presumably for this user, the error when 20200220015759.999000 is converted to binary pushes it over the nearest integer (for example it becomes 20200220015760.000001) and rounding down brings it back to 20200220015760

So it's not a bug?

It's really a matter of opinion here, so here's mine. Note that my opinion does not constitute this being official Safe policy!

If you're using functions like int() or floor() - or for any other oddities like this when using a function inside of the arithmetic editor (yes, 0.1 + 0.2 = 0.30000000000000004) - then this is not a bug.

These sort of functions are inherently tied to low-level behaviour. If you wrote this as a piece of code in Python, C, tcl, or whatever, you'd get the same result.

I'd say, it's up to the user to be aware of the issues (and I concede we could perhaps document this somewhere).

On the other hand... I'd expect transformers to insulate users against this. Transformers are high-level tools. I don't know how, but I'd want this to round correctly. So I'm currently talking to our developers to see if they can do anything in this case.

However, the data here is not a true number, but a datetime...

Is it different because this is a datetime?

Yes, absolutely. At least, that's my opinion again! A datetime isn't a number as such, it's just a string that happens to be made up of numbers! So in this case using a StringReplacer is absolutely the way to go.

However, perhaps there are better solutions?

Are you rounding down just because you know that it will produce a valid datetime? Perhaps you could use the Round function (instead of Int) and then put the data through a DateTimeConverter.

In the DateTimeConverter, you don't change the format, but you do apply the Repair Overflow option:

Now your date will enter as 20200220015760 but it will exit as 20200220015800 - i.e it is now a completely valid timestamp. If you just want to round to the nearest second, with a guaranteed correct datetime, this is the way to go I think.

It also does look like you may have created this datetime with the DatetimeStamper transformer. I've already asked if we can make it an option to have integer datetimes, with fractional seconds excluded. So I'm hopeful that option will appear soon and resolve the problem before it even occurs.

I see a couple of ideas on the community; this one: Date and Time Rounder; and this one: DateFormatter to Handle Milliseconds - so interested parties may want to go and vote those up, or add a comment about what scenarios you are trying to handle.

The Safe internal reference, in case it's of use, is FMEENGINE-10228

Other Notable Questions

As always, there were a number of items to catch my interest...

  • Datetimes in Input FileNames
  • Value used in Radian to Degrees Conversion
    • A user asked via our support queue what the conversion rate is between degrees and radians in FME. A developer let us know that we use exactly: 180.0 / 3.14159265358979323846, which should come out to around 57.29577951308232 degrees per radian.
  • While Loops
  • Delimited Text whose Values Contain the Delimiter
    • Here's an interesting question from @david.benoit, and I'm certainly infuriated when someone gives me data like this! Lots of answers. Maybe do a cleanup on the source data? Perhaps the Field Qualifier Character parameter in the CSV reader will help? Or read with the Textline reader and use Python to sort it out?
  • The Wrong Log File gets Written to...
    • A workspace writes to the wrong log file. As @tibor points out, use the Save As option to save a workspace, to force it to write to the same logfile name as the workspace. Simply renaming the file in Windows Explorer won't do it.
  • Batch Deploy Won't Work for AGOL
    • Asked by @anders_d and it's probably because Batch Deploy is getting old now. It was made at a time before web services were really a thing in FME. The best technique now is the two-workspace helper/processor model.

0 replies

Be the first to reply!

Reply