Timestamps done right (getkerf.wordpress.com)
As many have said elsewhere in the thread, what is `m` supposed to mean: minute or month? If the smallest value you can represent with this special syntax is 1s, you're excluding people who work with sub-second timestamps. Unless you want them to write something like 1m0.0000415s, but then you have the same problem you started with (long series of digits being hard for humans to read).
More problems: different cultures have different concepts of a "month" (because they have different calendars). Also, the units that you want to use depend heavily on the application. `1y` is different from `52w` is different from `365d`, but all of those concepts are very similar to users. I expect people to create bugs where these tokens are mixed. When you write `now + 1y` do you mean "this time on this date, one year from now" or do you mean "this instant plus `60*60*24*365` seconds" (which is slightly greater/less, I forget which way)? How do you communicate that to the user? If they can't control which sense ("human" time vs "physical" time) the larger units represent, then those people have to calculate the time offsets by hand.
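To make the "human" vs "physical" year distinction concrete, here is a minimal Python sketch. The helper names `plus_physical_year` and `plus_calendar_year` are made up for illustration, and the calendar version deliberately ignores the Feb 29 edge case. Which result is later depends on whether a leap day falls inside the interval.

```python
from datetime import datetime, timedelta, timezone

# "Physical" year: a fixed number of seconds, as in the comment above.
SECONDS_PER_NOMINAL_YEAR = 60 * 60 * 24 * 365

def plus_physical_year(t: datetime) -> datetime:
    """Add exactly 60*60*24*365 seconds (hypothetical helper)."""
    return t + timedelta(seconds=SECONDS_PER_NOMINAL_YEAR)

def plus_calendar_year(t: datetime) -> datetime:
    """Same month, day, and time one year later (Feb 29 not handled here)."""
    return t.replace(year=t.year + 1)

start = datetime(2016, 1, 21, 12, 0, tzinfo=timezone.utc)
physical = plus_physical_year(start)   # 2017-01-20 12:00 UTC
calendar = plus_calendar_year(start)   # 2017-01-21 12:00 UTC
difference = calendar - physical       # one day, because 2016 contains Feb 29

print(physical.isoformat(), calendar.isoformat(), difference)
```

Starting from a date after Feb 29 in a year not followed by a leap day, the two results coincide, which is exactly why mixing the two senses tends to produce intermittent bugs.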
Just pick nanos or micros since some Epoch (the Unix one is one reasonable choice), use them everywhere internally, and let the user pretty-print them. If you want to support high-energy timeseries where the time scale is smaller, either make your nanos fractional or use femtos/attos/zeptos/plancks.
https://azure.microsoft.com/en-us/blog/summary-of-windows-az...
For ages JodaTime actually nailed it, and the Java 8 date API was based off this.
> Not an add on type as in R or Python or Java.
Again, let's talk about the modern version of the language and not act like prior screw ups are the end all for a language.
Also
> 2012.01.01 + 1m1d
How is that more clean than:
> new DateTime(2012, 1, 1).plusMonths(1).plusDays(1)
> D + 1m1d != D + 1d1m
Mixing time units like days, months, years (where units are intransitive) is, in my opinion, a bad idea.
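The non-commutativity is easy to reproduce. Below is a rough Python sketch; `add_months` is a hypothetical helper (the standard library has no month arithmetic) and it assumes the common "clamp to the last day of the month" rule, which may not be what Kerf does.

```python
import calendar
from datetime import date, timedelta

def add_months(d: date, months: int) -> date:
    """Shift by whole months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

d = date(2011, 2, 28)

# D + 1d1m: add the day first, then the month.
day_then_month = add_months(d + timedelta(days=1), 1)   # 2011-03-01 -> 2011-04-01

# D + 1m1d: add the month first, then the day.
month_then_day = add_months(d, 1) + timedelta(days=1)   # 2011-03-28 -> 2011-03-29

print(day_then_month, month_then_day)
```

The two orders differ by three days for this starting date, so any syntax that lets you write `1m1d` has to document which unit is applied first.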
The only sane way I've found to work with time is to convert any timestamp into a seconds-since-the-epoch value when doing any internal work and then convert back to the timestamp format for display. As an added bonus your code won't get super messy when you start getting timestamps from different sources that are formatted differently. Everything gets normalized to the internal representation before you do work on it.
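A small Python sketch of that normalize-then-display workflow follows. The two input formats and the helper names `to_epoch_seconds` and `to_display` are invented for the example, and it assumes every source string carries an explicit UTC offset.

```python
from datetime import datetime, timezone

def to_epoch_seconds(text: str, fmt: str) -> float:
    """Parse a timestamp whose format includes a UTC offset (%z) into epoch seconds."""
    return datetime.strptime(text, fmt).timestamp()

def to_display(epoch_seconds: float) -> str:
    """Convert back to a human-readable UTC string only at the edge."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")

# Two sources, two formats, two offsets; both become plain numbers.
a = to_epoch_seconds("2016-01-22T04:41:03+0000", "%Y-%m-%dT%H:%M:%S%z")
b = to_epoch_seconds("22/01/2016 06:41:03 +0100", "%d/%m/%Y %H:%M:%S %z")

elapsed = b - a            # 3600.0 seconds
print(elapsed, to_display(b))
```

All arithmetic happens on the numbers; formatting concerns stay in the two helper functions.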
For storing a timestamp, absolutely, you should use an integer-based format, counting discrete somethings since somewhen.
For working with timestamps, you need all the nuance, you need something that can manipulate the different parts of it independent of each other.
And finally, for displaying or reading timestamps, you need all the localization and parsing crap to figure out what "020304" means. Fourth of March 2002? Third of February 2004?
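The "020304" ambiguity can be shown directly. This Python sketch tries three plausible field orders; the list of candidate formats is an assumption picked for the example, not an exhaustive set.

```python
from datetime import datetime

# Candidate field orders (illustrative, not exhaustive).
CANDIDATE_FORMATS = {
    "day-month-year": "%d%m%y",
    "month-day-year": "%m%d%y",
    "year-month-day": "%y%m%d",
}

def interpretations(text: str) -> dict:
    """Return every date the string could mean under the candidate formats."""
    results = {}
    for name, fmt in CANDIDATE_FORMATS.items():
        try:
            results[name] = datetime.strptime(text, fmt).date()
        except ValueError:
            pass  # this field order does not yield a valid date
    return results

readings = interpretations("020304")
for name, value in readings.items():
    print(name, value)
```

All three orders parse successfully and give three different dates, so the format has to come from out-of-band knowledge about the source.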
I agree that this sort of first-class datetime-type representation may not be appropriate for every language, but myself, I find it refreshing and brilliant, and I'd love to see more languages support this sort of syntax instead of overloading strings or using complicated objects or APIs.
It's like comparing the power and ease of using regular expressions in Perl or Ruby versus in Java or Python. In Perl and Ruby, regexes are built into the syntax of the language itself. They're a truly first-class type, like strings and integers are in all four languages, and like lists and dicts/hashes/associative arrays are in Perl, Ruby, and Python.
I'd love to see datetime objects promoted to similar first-class native syntax support in this way in more languages. It won't be appropriate everywhere, but in the right language it'd be amazing.
> overloading strings
I'm pretty sure writing "+ 1m" is more of an overloading of a string than ".plusMinutes(1)".
Length, in either direction, does not correlate to "clean". Clarity and intent do. Clean Code (http://www.amazon.ca/Clean-Code-Handbook-Software-Craftsmans...) does a fantastic job talking about this; it's worth picking up if you haven't read it in the past.
If character count were really the ultimate measure of cleanliness, then we'd all be programming pointlessly (https://en.wikipedia.org/wiki/Tacit_programming) and using single-character names for any variables that slipped through. (That's not to say that shorter is never better, but rather that, when it is better, its brevity is not the only reason.)
The IDE provides context-sensitive cues as one types so that you don't have the cognitive paper-cut from having to think for even a second if "1m" means "one minute" or "one month".
At a deeper level, chaining method calls to build up an object is a perfectly understandable way to modify something like a date.
Maybe it's not "first-class" (not sure what this means in context?), but it's definitely there and not in a third-party JAR or anything.
However, it's bad for other reasons, the first being that it extends java.util.Date (the Javadoc seems to admit this) and combined with the related java.sql.Date (which also extends java.util.Date) makes for a very confusing API.
For this reason, Oracle recommends just using the new Date APIs, [2] and mapping a SQL TIMESTAMP to LocalDateTime.
1. https://docs.oracle.com/javase/8/docs/api/java/sql/Timestamp...
2. http://www.oracle.com/technetwork/articles/java/jf14-date-ti...
However, the new java.time APIs (created by Stephen Colebourne) are excellent and probably one of the best-designed APIs. It's funny what twenty years' difference can make :)
1. Leap seconds? Those don't exist, right? 2. Years date from 1900. 3. January is Month 0 (ignoring the fact that there is already a widespread convention of numbering January 1).
Of course, the "fix" of Calendar didn't attempt to fix any of the POSIX-did-it-first problems but instead mostly limited itself to supporting other locales by allowing non-Gregorian calendars.
However, getting it wrong the first time is different than getting it wrong again, as you pointed out with the Calendar type. 0 == JANUARY, but most other things are indexed from 1...
Third time's the charm, I guess :) Getting the input from Stephen Colebourne (of Joda-Time fame) was undoubtedly a key point in getting this done.
I'm convinced most languages do timestamps wrong. Specifically, they separate out the "time" component from the "date" component.
What's a more common operation? Counting how many events happened in a span of time? Or shifting every timestamp by 15 days?
Timestamps should generally be designed for extremely fast and lightweight comparison, but keep enough information that a shift is doable. From my experience, all you need is: a unix timestamp, a microsecond (or nanosecond) offset, and the source timezone.
In this case, if you want to find elapsed time between two timestamps you simply subtract the unix timestamps and offsets. Very CPU friendly and easily vectorizable. You can do this even if the timestamps originated at different timezones (since everything is UTC under the hood).
What if you want to shift the date? Or group by date? Turns out computing the date on the fly is a very cheap operation. Easily can do hundreds of millions/second on a single high-end server core.
You can use whatever calendar system floats your boat.
Any timestamp system that relies on year/month/day semantics is rarely going to be optimized for the most common operations users do with timestamps. Even worse, for simple comparison you run into all the weird edge-cases that you wouldn't have cared about if you stuck with a unix timestamp under the hood.
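Here is a minimal Python sketch of the layout described above: a Unix timestamp, a sub-second offset, and the source timezone. The class name `Stamp` and its fields are hypothetical, and a fixed UTC offset in minutes stands in for a full timezone, which is a simplification.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone

@dataclass(frozen=True)
class Stamp:
    unix_seconds: int           # seconds since 1970-01-01T00:00:00Z
    nanos: int                  # sub-second part, 0..999_999_999
    source_offset_minutes: int  # UTC offset at the source, used only for display

    def total_nanos(self) -> int:
        return self.unix_seconds * 1_000_000_000 + self.nanos

    def elapsed_nanos(self, earlier: "Stamp") -> int:
        """Pure integer subtraction; the source timezones never matter."""
        return self.total_nanos() - earlier.total_nanos()

    def local_date(self) -> date:
        """Calendar date at the source, computed on demand."""
        tz = timezone(timedelta(minutes=self.source_offset_minutes))
        return datetime.fromtimestamp(self.unix_seconds, tz=tz).date()

a = Stamp(1453437715, 409_682_000, 0)      # recorded in UTC
b = Stamp(1453437716, 9_682_000, -300)     # recorded at UTC-5

print(b.elapsed_nanos(a))                  # 600000000
print(a.local_date(), b.local_date())      # 2016-01-22 2016-01-21
```

Comparison and elapsed time touch only integers, while the calendar date is derived lazily, so grouping by date costs one conversion per row rather than being baked into the storage format.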
Basically, he's big on Kerf because he's involved in the development. The above blog post will tie it with Kx and other APLs.
https://getkerf.wordpress.com/ is the official Kerf blog. And Kerf is not meant to be free...
What the article sez: GET KERF IT'S THE BEST AND ONLY WAY TO SOLVE THIS
Numerically sortable date [YYYY+][mm][dd].[HH][MM][SS],[NNNNNNNNN]
date -u +%Y%m%d.%H%M%S,%N
20160122.044145,052215000
Absolute dates with nanos as seconds since epoch
date -u +%s.%N
1453437715.409682000
Hyphenated,"chunks", with nanos and TZ offset.
date -uIns
2016-01-22T04:41:03,121501000+0000
But...the display and storage of a timezone is a datetime, and it doesn't appear to have a timezone attached. Meaning I'd still rather just store/retrieve/work with millis since epoch, since that avoids any ambiguity about what timezone the timestamp takes place in. With just a datetime...I can ~assume~ it's GMT, but...is it? Millis since epoch are the same across all timezones, there is no ambiguity; datetimes vary, and I have to make assumptions, and make sure my library/driver/etc handles conversions correctly.
It seems unlikely to me that any language could become mainstream without being open source. The expectation that a compiler or interpreter should have its source available is only growing.
At the same time, kx's kdb+ is an example of a product that sits in a niche and has generated significant revenue despite being closed source. kdb+ has achieved this by primarily targeting financial services, which is a sector that's less sensitive to closed source than most, and is able to spend money on whichever product solves their problem.
If kerf manages to gain an edge over its competitors for a particular set of problems then sure, it can thrive in the long run the way kdb+ has.
View -> Page Style -> No Style
date -u +'%FT%H:%M:%S %Z'
date -u +%FT%TZ
:)
date -uIs
Shorter but sadly not RFC3339 compliant, although it is ISO8601 compliant.
I like Windows NT's system of seconds-since-1600-or-so in increments of 100 nanoseconds. That system has served me well. 100ns isn't crazy to read directly and often from a hardware register (talk to a hardware engineer about latching a rapid counter sometime...), it covers a reasonable range for humans (handles everyone currently alive, for their whole lifetimes and most events they care about) and it fits pretty well with events that occur in multiprocessor systems (I'll be saying something different if I ever work on systems with terahertz clock rates, though).
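For reference, converting between that Windows representation and Unix time is a fixed shift plus a scale. The sketch below is in Python; it relies on the Windows epoch being 1601-01-01 UTC, which is 11,644,473,600 seconds before the Unix epoch. The function names are made up for the example.

```python
TICKS_PER_SECOND = 10_000_000              # one tick = 100 ns
EPOCH_DIFFERENCE_SECONDS = 11_644_473_600  # 1601-01-01 to 1970-01-01, in seconds

def filetime_to_unix(ticks: int) -> tuple:
    """Return (unix_seconds, nanoseconds) for 100 ns ticks since 1601-01-01 UTC."""
    seconds, remainder = divmod(ticks, TICKS_PER_SECOND)
    return seconds - EPOCH_DIFFERENCE_SECONDS, remainder * 100

def unix_to_filetime(unix_seconds: int, nanos: int = 0) -> int:
    """Inverse conversion; anything finer than 100 ns is truncated."""
    return (unix_seconds + EPOCH_DIFFERENCE_SECONDS) * TICKS_PER_SECOND + nanos // 100

ticks = unix_to_filetime(1453437715, 409_682_000)
print(ticks, filetime_to_unix(ticks))
```

Since both formats are just integer counts from a fixed instant, the round trip is lossless down to the 100 ns tick.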
Ultimately that was the point of the article, but the tone of the article about how they got it right and everyone else is wrong bugged me.
My approach has been to store everything in epoch form, do all of my calculations and manipulations from there, then build tools that convert back to a human-readable representation when and where it is needed. I think the idea that you have a problem completely solved, though, just prevents the search for any improvements.
[0] Assuming that today + 1 year is defined to be 2017-01-21, and today + 2 years is defined to be 2018-01-21; and, if not, then one faces other problems with intuition.
Dealing with sub-second precision isn't that hard. You just need either a second value that holds the microseconds/nanoseconds or 64 bit time that counts nanoseconds since the epoch instead of seconds.
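Both layouts mentioned above are interchangeable, as this short Python sketch shows. The names `split` and `join` are illustrative, and the range estimate assumes a signed 64-bit counter and an average year of 365.25 days.

```python
NANOS_PER_SECOND = 1_000_000_000

def split(total_nanos: int) -> tuple:
    """Single nanosecond count -> (seconds, nanoseconds) pair."""
    return divmod(total_nanos, NANOS_PER_SECOND)

def join(seconds: int, nanos: int) -> int:
    """(seconds, nanoseconds) pair -> single nanosecond count."""
    return seconds * NANOS_PER_SECOND + nanos

total = 1_453_437_715_409_682_000
seconds, nanos = split(total)              # (1453437715, 409682000)

# How far a signed 64-bit nanosecond counter reaches on either side of the epoch.
INT64_MAX = 2**63 - 1
range_years = INT64_MAX / NANOS_PER_SECOND / (365.25 * 24 * 60 * 60)

print(seconds, nanos, round(range_years, 1))
```

The single 64-bit count is simpler to compare and subtract, at the cost of a range of roughly 292 years around the epoch.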
If you get timestamp data from a system outside of your control, though, you always have to make sure you know what it means. At least half the time it seems like a date without a timezone isn't UTC but is in whatever the originating timezone was, and the developers didn't include a timezone offset in the data...
Timezones... the bane of my existence.
Exactly. So, the first example starts with a date in a way of representing dates that will register immediately for even a lay person. The developer intends to add time to that date. The example does this with an addition operator then a value with letters representing recognizable units of time. Matter of fact, this was so obvious that I knew what the author was doing before I read the explanation. I'd probably do "min," "sec," "hr," etc to aid intuition, though. Esp avoid confusion on months vs minutes for m.
Then, there's the other example. It appears to create an object. It then calls a method on that clearly adds one month. It also calls a method of that method, that object... idk that language so I don't really know the semantics of what it's doing... to add 1 day.
One is definitely more clear and intuitive than the other. It also has the rare property of being easier to type. Epic win over whatever the other thing is. Not to say the other one was bad: still pretty clear. Just not as much as a straight-forward expression.
The Java way isn't really that different. Just more verbose, but again, I don't mind trading a few keystrokes for clarity.
Take matrix algebra for example: what takes less time to process (e.g. when debugging code) --
`a = b'*c+d` or
`a = matrixAdd(matrixMult(matrixTranspose(b),c),d)`
or perhaps
`a = b.transpose().times(c).add(d)`? What if there are hundreds of operations like that?
Now, there _might_ be people for whom syntax (3) is the clearest -- for example, these could be people who know some programming, but are not familiar with matrix data or operations. However, their convenience, or that of your grandparents, doesn't really matter: if it's a one-off job for them, figuring it out would take a very small fraction of their time and is not worth optimizing for, and if it's not, extra time to learn the syntax would more than pay for itself when they have to regularly work with it. We don't use words to describe such expressions in print for exactly the same reason! :)
2011-02-28 + 1d1m = 2011-04-01
(edit: stupid leap-year!)
2011-02-28 + 1d1m = (2011-02-28 + 1d) + 1m = 2011-03-01 + 1m = 2011-04-01
Also, you have these overflow rules:
2011-03-31 + 1m = 2011-04-30
2011-04-30 - 1m = 2011-03-30
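These overflow rules mean month arithmetic does not round-trip. Here is a short Python sketch, assuming the same "clamp to the last valid day" rule; `add_months` is a hypothetical helper, since the standard library does not provide one.

```python
import calendar
from datetime import date

def add_months(d: date, months: int) -> date:
    """Shift by whole months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

start = date(2011, 3, 31)
forward = add_months(start, 1)     # 2011-04-30 (April has no 31st)
back = add_months(forward, -1)     # 2011-03-30, not the 31st we started from
round_trips = back == start        # False

print(forward, back, round_trips)
```

In other words, `(D + 1m) - 1m` is not guaranteed to equal `D`, which is one more reason to treat month offsets as a separate concept from durations.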
Adding 1 month or 1 year to the current date is usually a mistake. Quick question, what do you expect to happen when you code "Jan 31 + 1m"? What do you expect to happen when you code "Feb 29 2016 + 1y"? If you are thinking about doing this, ask yourself if it wouldn't make more sense to define your time by days instead. Jan 31 + 30 days, or Feb 29 + 365.25 days. Of course days are easy to implement on epoch time as well ( time + 30 * SECONDS_PER_DAY ).
Which is why you shouldn't do it, which is just what I said. :-)
> Adding 1 month or 1 year to the current date is usually a mistake.
No, I can think of many use-cases where this is useful. For example, what should Siri do if you tell it to "move today's 1 o'clock to the next month" ?
Jan 31 + 1m = Feb 28/29
Feb 29 + 1y = Feb 28
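The year rule above can be sketched in a few lines of Python and compared with the fixed-day alternative suggested earlier. `add_years` is a made-up helper that assumes "clamp Feb 29 to Feb 28"; other systems roll over to Mar 1 instead.

```python
from datetime import date, timedelta

def add_years(d: date, years: int) -> date:
    """Same month and day in the target year; Feb 29 falls back to Feb 28."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # Feb 29 does not exist in the target year
        return d.replace(year=d.year + years, day=28)

leap_day = date(2016, 2, 29)
clamped = add_years(leap_day, 1)                    # 2017-02-28
fixed_days = leap_day + timedelta(days=365)         # also 2017-02-28

# The fixed-day approach for "one month" depends on the year instead:
jan31_2015 = date(2015, 1, 31) + timedelta(days=30)  # 2015-03-02
jan31_2016 = date(2016, 1, 31) + timedelta(days=30)  # 2016-03-01

print(clamped, fixed_days, jan31_2015, jan31_2016)
```

Neither approach is wrong; they answer different questions, so the code should make clear which one is intended.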
SECONDS_PER_DAY is not a constant, because of leap seconds. (Or it is, and shifting between UTC and UT1 causes them to appear. I forgot. It's messy no matter how you slice it.)
Luckily most people can ignore leap seconds, just like pretty much every system time library. Because they don't happen on regular intervals it is impossible to code a fixed rule for dealing with leap seconds so very few things even attempt it.
Time is hard enough to deal with already and few human scale things care about 1 second differences that happen once every 3-5 years or so.
> what should Siri do if you tell it to "move today's 1 o'clock to the next month" ?
This is an interesting question, because there are at least two valid options. If it is the 31st of the month, then maybe you want it to happen on the first of next month. Something a human secretary might intuit. On the other hand, you might mean moving the appointment back a whole month, unless it's the 31st and then you mean to have it a day earlier on the next month...
Like I said, this kind of logic gets you in ambiguous edge case hell in a hurry. FWIW, I don't think Siri even attempts to deal with a request like that.
A better solution is probably to require the person to be a bit more explicit in their request "Siri, move my 1 o'clock to next Tuesday".
Besides, these kinds of interactions aren't impossible with epoch time, they just require more work. One can argue that it would be good to make this sort of thing a little tougher as it will encourage the programmer to think harder about what they are doing and reconsider if it is a good idea.
An operation like "1 day from this timestamp" is locale-dependent. And SECONDS_PER_DAY is not a constant.
Besides, I can practically guarantee that the libraries discussed here don't deal with leap seconds. They're going to get it just as wrong.
Except for daylight saving time, which adds an hour to or removes an hour from the day.
In C#, the largest unit of a TimeSpan is a day. You can't have a TimeSpan of "one month and one day", because adding that to a DateTime would add a different amount of absolute time depending on the DateTime. You can have a TimeSpan of "32 days", and adding that to any DateTime yields a consistent result.
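Python's `datetime.timedelta` makes the same design choice, so it can serve as a stand-in to illustrate the point (this is Python, not the C# API). The sketch uses naive datetimes, so daylight saving time is deliberately out of the picture.

```python
from datetime import datetime, timedelta

span = timedelta(days=32)

starts = [
    datetime(2016, 1, 31),
    datetime(2016, 2, 29),
    datetime(2015, 12, 15, 8, 30),
]
shifted = [s + span for s in starts]

# Every start moves by the same absolute amount: 32 * 86400 seconds.
elapsed_seconds = {
    (after - before).total_seconds() for before, after in zip(starts, shifted)
}

# timedelta has no "months" unit at all, so the ambiguous span cannot be built.
try:
    timedelta(months=1)
    months_supported = True
except TypeError:
    months_supported = False

print(elapsed_seconds, months_supported)
```

Keeping calendar offsets out of the duration type forces month and year shifts to go through an explicit, separately named operation.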