Every time I deviate from using unix milliseconds as my timestamps, I end up regretting it. If we use unix seconds, we get endless bugs from people forgetting to convert to millis when comparing against the current time. If we use Date objects, it's an even larger surface of potential bugs. Every Date interface I've ever seen makes it far too easy to accidentally create a relative time (i.e. anything that can't be mapped unambiguously to a single unix-millis timestamp; usually a datetime that defaults to the current timezone). Does anyone have a preferred method that avoids these pitfalls?
At the end of the day I always come back to "solution with lots of possible bugs" or "unix millis everywhere". And I always choose the latter. It means we can't use nice date features in a lot of databases, but...eh? They've never seemed worth it.
I would say: Think carefully about what kind of 'time' you are trying to represent. Instants like unixtime are common in many problem domains, but there are plenty of situations where other choices are appropriate.
For example, I wrote an event registration marketplace some time ago. You might think "start time for event" would naturally fit as a unix timestamp, but it's a mistake. If you have an event at 10am in Las Vegas, moving it to Chicago shouldn't suddenly change the start time. And never store "all day" dates as a timestamp (i.e. as midnight of that date); timezone issues can easily produce off-by-one dates.
Basically, 'time' is not a single thing. You usually want to represent it the way your users think about it - and that isn't always like a unix timestamp (although it very often is).
Unix timestamps are bound to UTC[1], so it would never be correct to have a timestamp represent number of seconds since 1970 _somewhere_. You would have to convert the timezone for any location you want it to be relative to.
Yes yes. But your user doesn't think in terms of seconds since the epoch. Your problem domain includes two pieces of information: a localdatetime and a location. 10am 12/22/17 in Las Vegas. They set those fields separately in pretty much every conceivable UI, and you store them separately in pretty much every conceivable database.
If you store your event time as epoch time, you're storing "time of event in UTC" and "location of event". Which means every change of location requires you to recalculate and update the epoch time of the event. When users change Las Vegas to Chicago, you must update two fields - and if you ever screw it up, you have no way of knowing what the user originally intended.
This is a very bad case for epoch time. You may want to create an index on the calculated epoch time for queries, but store the datetime as localdatetime - the form that most accurately represents the source data.
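A minimal sketch of that approach in Python (assuming the zoneinfo module; the field and function names here are only illustrative): keep the local datetime and the location as the source of truth, and derive the epoch value only when you need an index or a comparison.

    from datetime import datetime
    from zoneinfo import ZoneInfo

    def epoch_millis_for_event(local_dt: datetime, tz_name: str) -> int:
        # Derive an indexable epoch value from the stored local datetime + location.
        return int(local_dt.replace(tzinfo=ZoneInfo(tz_name)).timestamp() * 1000)

    # Stored fields: the user's intent, kept separately.
    event_local = datetime(2017, 12, 22, 10, 0)   # 10am 12/22/17
    event_tz = "America/Los_Angeles"              # Las Vegas

    print(epoch_millis_for_event(event_local, event_tz))

    # Moving the event to Chicago only touches the location field;
    # the derived epoch is recalculated from the unchanged local time.
    print(epoch_millis_for_event(event_local, "America/Chicago"))

If the derived index ever gets out of sync, the user's original intent is still right there in the two stored fields.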
You are of course free to ignore this advice and learn the hard way. Just stay off my lawn.
Maybe here's an example that's more useful: Some years ago, I worked on an in-house IaaS platform that included several workflow modules for the sysadmins (alert dispatching, time-tracking, etc.).
One of those components was a scheduler for one-time and recurring tasks. For example, when you delete a system, you set a timer for 14 days to have the scheduler remind you (via e-mail or by issuing an alert into the queue) to delete the system's backup. There were also a lot of recurring tasks that needed to be performed daily or weekly. Now if you have a task that's configured to run daily at 9 AM, it's tempting to implement the timestamp series as
while (true) { timestamp += 86400; }
And indeed, that's how it was done in the existing code. But that means that once DST starts or ends, your daily-at-9-AM-task suddenly happens at 8 AM or 10 AM instead. Whether that's a problem depends on the type of task and how the sysadmins organize their work. And then there's the monthly recurrence, which is even messier with plain timestamps.
I cannot recall all details anymore, but I definitely remember that twice a year, after each DST change, someone would go through the list of recurring tasks, and move the times forward (or backward) by one hour manually.
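For the daily-at-9-AM case, here's a sketch of the DST-safe alternative in Python (zoneinfo again; the names are illustrative): advance the occurrence in local wall-clock time and only convert to an epoch at the end, rather than adding 86400 to a timestamp.

    from datetime import datetime, timedelta
    from zoneinfo import ZoneInfo

    def daily_occurrences(start_local: datetime, tz_name: str, count: int):
        # Yield epoch seconds for a task that runs at the same local wall-clock time every day.
        tz = ZoneInfo(tz_name)
        current = start_local
        for _ in range(count):
            # Attaching the zone each time means DST transitions change the UTC offset,
            # not the local 9 AM wall-clock time.
            yield int(current.replace(tzinfo=tz).timestamp())
            current += timedelta(days=1)

    # Daily at 9 AM local time, across the spring-forward boundary (US DST began 2017-03-12).
    for ts in daily_occurrences(datetime(2017, 3, 10, 9, 0), "America/Chicago", 4):
        print(datetime.fromtimestamp(ts, ZoneInfo("America/Chicago")))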
EDIT: Maybe the simplest (though not easiest) solution to the irregular month lengths would be to attach giant thrusters to Earth and push it away from the sun a bit, so that our year is 366 days long instead of 365. Then we make a calendar with 6 months of 61 days each. As a bonus, it would reverse some of the effects of global warming. (Alternatively, go to 368 days and have 16 months of 23 days each.)
A much simpler calendar has 364 = 13 * 28 days: 13 months of 28 days (EXACTLY 4 weeks) each. This one is so simple that it might be worth having 1-2 extra-calendar days per year, even with a 365.24...-day year.
It depends on the application semantics. For example, if I scheduled a breakfast for 08:00 in Las Vegas but then moved the event to Chicago, I wouldn't want the time to be auto-converted to 10:00 (or whatever is appropriate given DST), because that's a silly time to have breakfast.
It doesn't need to come up "a lot" for your application to feel buggy. It just needs to come up.
Your UI has a selector for Date/Time and a selector for Location. They might even be different screens. The Location may have been prefilled with a guess based on geoip. Or maybe they saved the wrong 'Springfield'.
The user took an action to change Location. If that has the side effect of changing Time, the user will be surprised.
In some sense, Postgres agrees with you, since the underlying storage for a timestamp is something morally equivalent to that. (By default, a 64-bit count of microseconds since 2000-01-01.)
However, you do need to do date arithmetic from time to time, whether using a wrapped epoch time in the database or in the application. "One day from now" turns out to be complicated enough that we delegate to libraries to get it right; and Postgres's implementation of these features is solid. When you want to `GROUP BY` day, for example, there are performance benefits to doing that on the database side -- and for analysts, there is often little alternative but to handle dates with DB provided functionality.
When it comes to date arithmetic, how do you handle that with UNIX timestamps?
The problem with Unix milliseconds is that it's not actually always increasing, thanks to leap seconds. A positive leap second makes the clock repeat a second: the fractional part of the last second of the day runs up to .999..., then resets to .000... and runs through that second again.
I'm with Google on this one... Just smear the leap seconds out in the general case, and anything that needs to be within 500ms of UTC can be handled as a special case.
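Roughly, a smear spreads the extra second over a long window instead of repeating a second. A sketch, assuming a 24-hour linear smear centred on the leap second (the window length and helper name are assumptions for illustration, not any particular implementation):

    def leap_fraction_absorbed(t: float, leap_second_utc: float, window: float = 86400.0) -> float:
        # Fraction of the extra (positive) leap second already absorbed at time t,
        # spread linearly over a window centred on the leap second.
        start = leap_second_utc - window / 2
        if t <= start:
            return 0.0
        if t >= start + window:
            return 1.0
        return (t - start) / window

The smeared clock stays monotonic and never repeats a second, at the cost of drifting up to half a second from true UTC during the window.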
The standard Python datetime library does not allow any operations between TZ-aware and TZ-unaware datetimes. You're expected to explicitly convert between the two.
The Django PostgreSQL adapter will aggressively show warnings in all of the cases where you try to insert a TZ-unaware datetime into a TZ-aware column.
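For reference, a minimal example of the first point against the standard library:

    from datetime import datetime, timezone

    aware = datetime.now(timezone.utc)
    naive = datetime.now()

    try:
        print(aware - naive)
    except TypeError as e:
        # "can't subtract offset-naive and offset-aware datetimes"
        print(e)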
It sounds like what you really want is a wrapped "instant" class, which contains something more or less equivalent to a Unix timestamp inside it, but prevents you from accidentally interpreting it using the wrong epoch or units. That's what a Postgres timestamp is. Internally, it's a modified Julian date + nanoseconds since midnight UTC, or something like that. But you don't care if you don't work inside the Postgres source.
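A minimal sketch of such a wrapper in Python (the class and method names are illustrative): the epoch and the unit are fixed inside the type, so callers can't confuse seconds with milliseconds or reinterpret the raw number.

    from dataclasses import dataclass
    from datetime import datetime, timezone

    @dataclass(frozen=True)
    class Instant:
        # An absolute point in time, stored internally as unix milliseconds.
        _millis: int

        @classmethod
        def now(cls) -> "Instant":
            return cls(int(datetime.now(timezone.utc).timestamp() * 1000))

        @classmethod
        def from_unix_seconds(cls, seconds: float) -> "Instant":
            return cls(int(seconds * 1000))

        def to_unix_millis(self) -> int:
            return self._millis

        def to_datetime_utc(self) -> datetime:
            # The value is only ever interpreted with the epoch/unit it was stored in.
            return datetime.fromtimestamp(self._millis / 1000, tz=timezone.utc)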
It is regrettable that APIs make it easy to create datetime objects without timezones, which is why sane people have moved to e.g. Joda Time and the libraries based on it.
When you talk about Date objects, do you mean datetime? Because a Date with no time component has tons and tons of real-world uses that a unix timestamp would be inappropriate for.
Can you provide examples? It only seems useful to me for display purposes if you know you'll never need to worry about any specific local time zone (or if you will, then you know what time zone that is and know how to deal with it).
Just general calendar arithmetic. There are tons of cases where you need to do stuff like calculate "one month before", "one month after", "beginning of the year", ... --that's a mess with unix timestamps, but trivial with a good calendar library (or postgres, for that matter).
Well, yes, you can convert on input and/or output from/to unix timestamps, sure (and you can with postgres, too), but the point is that you have to convert first, you can't do date arithmetic on unix timestamps directly (because there is no fixed relation between unix timestamps and days).
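To make that concrete, a small sketch with Python's datetime plus dateutil (dateutil is an assumption here; a Postgres interval would play the same role in SQL):

    from datetime import date
    from dateutil.relativedelta import relativedelta

    d = date(2017, 3, 31)

    print(d - relativedelta(months=1))   # 2017-02-28: "one month before" clamps the day
    print(d + relativedelta(months=1))   # 2017-04-30
    print(d.replace(month=1, day=1))     # 2017-01-01: beginning of the year

    # With a bare unix timestamp you'd have to pick a fixed number of seconds
    # for "a month", and there is no correct constant to pick.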