Going long long on time_t (openbsd.org)
http://www.openbsd.org/papers/eurobsdcon_2013_time_t/mgp0000...
http://www.openbsd.org/papers/eurobsdcon_2013_time_t/mgp0000...
etc.
Here's the whole thing in one page, minus images:
https://gist.github.com/anonymous/6757266/raw/3469464cb802e7...
a) Theo De Raadt works much further down the stack
b) This is probably something put up by the EuroBSDCon folks after converting his slides (can someone who was at the Con weigh in on this? If he actually presented his talk like this, that's extra bad).
A few takeaways:
- Embedded 32bit is everywhere. Sure they'll fix the obvious ones, but I'm sure some things will be forgotten about. This problem might not be taken seriously after the Y2K debacle.
- The OpenBSD guys & gals like to implement new designs and ideas. Sometimes radical. (but I already knew that)
- A transitional solution can end up being the solution and stick around forever :(
- Theorem "In operating systems, increased popularity leads to greater resistance to change". Probably true in most "products".
I wish people would stop saying things like that. I was one of many programmers who spent some overtime over the course of about a year on fixing code in 1998-9. I was just the junior programmer at the time; the head guy in the department had some nearly sleepless nights around then.
I can promise you that in one East Bay school district, nobody would've gotten their paychecks, report cards would not have worked, DNS would have stopped working for the entire school network, and finance & budget would have had some really crazy errors in the output -- those are the systems I still remember requiring the most attention.
Y2K was a "debacle" because a bunch of people busted their asses fixing old code.
It feels a lot like pulling two consecutive all-nighters as an engineer on a project, to bring the project up to its deadline on-time and under-budget, only to have your department manager the next morning stroll in, well-rested, coffee cup in hand, and say, "See? Told you it was no big deal."
I remember a lot of hype and no actual problems. I suspect others do as well, and this could lead to people thinking about the whole problem as over-hyped and a non-issue.
Not saying that it's correct, but that's the catch-22 of preventing problems: if you do too good a job, nobody will notice =/
"%" PRI_TIME_T
time_t x = ...;
printf("%lld\n", (long long) x);
The above code works on any platform, as long as time_t contains no values that do not fit in long long. Using a macro for the format specifier would require defining such a macro on other systems, because they have to be able to compile much of this code on Linux, OS X, or even Windows in some cases.
#ifdef LONG_LONG_TIME_T
#define PRI_TIME_T "lld"
#else
#define PRI_TIME_T "ld"
#endif
There, I just made it so that you can do:
time_t x = ...;
printf("%" PRI_TIME_T "\n", x);
I don't much like how it looks, but it works and it'd be easy to make portable code with it. All you need is a preprocessor flag to indicate when you are using long long.
The real problem is that it'd be easier to get old embedded systems upgraded to 64-bit in the next 25 years than to get those old systems retrofitted with such an annoying syntax. Forcing everything to a known, "wide enough" integer width is probably the best you can really do with format strings anyway.
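A minimal sketch of the PRI_TIME_T idea from the comments above, in compilable form. LONG_LONG_TIME_T is the hypothetical build flag from that comment (it is not a standard macro), and format_time is a helper name made up for illustration. The "ld" fallback is only correct on platforms where time_t really is a long.

```c
#include <stdio.h>
#include <time.h>

/* LONG_LONG_TIME_T is a hypothetical build flag (not a standard macro):
 * define it on platforms where time_t is a long long. */
#ifdef LONG_LONG_TIME_T
#define PRI_TIME_T "lld"
#else
#define PRI_TIME_T "ld"   /* assumes time_t is a long on this platform */
#endif

/* Illustrative helper: write t into buf as decimal text, with no cast.
 * Returns the number of characters snprintf would have written. */
int format_time(char *buf, size_t len, time_t t)
{
    return snprintf(buf, len, "%" PRI_TIME_T, t);
}
```

The trade-off the thread argues about is visible here: the call site needs no cast, but every platform has to get the macro definition right.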
I don't see why it would be a good idea to convert "time_t" to "long long". Having an alias specifically for time_t is part of what makes this kind of work doable. I could see maybe introducing another alias like "time64_t" or something, but once you convert it to "long long" the type is no longer tagged in a way that makes it easy to find and more importantly suggests to the programmer they ought to NOT make assumptions about its size. Heck, in a perfect world I'd either introduce a new % symbol specifically for time_t width or have a macro that expands to represent its width (not to mention make it mandatory to use compiler warnings about string formats not matching argument widths).
I also noticed the comment about "would love better compiler tools -- none found so far". Certainly there are things like Sparse (https://sparse.wiki.kernel.org/index.php/Main_Page) which make correctness verification easier.
time_t asdf = time(NULL);
And now you want to use printf to print that value to the screen. You can cast to 'long long', which is guaranteed to be at least 64 bits wide, and ensure no loss of precision occurs:
printf("%lld", (long long)asdf);
That will work whether time_t is 32 bit or 64 bit.
"int" makes some sense. Word-size of the machine.
"short", "long", "long long" are all non-sensical. You use them when you want to trade size and range. When you want to make that trade-off, you care what their sizes are.
Instead of lower/upper bounds on their sizes, which aren't very useful, they should just have specific sizes. At which point, you might as well use uint32_t, and uint64_t in place of long and long long.
Prefer the sized int types over the "long"/"long long" ones when you can, for saner coding.
Use uintptr_t and such when you need a ptr-sized int, rather than a specific size.
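A small sketch of the distinction drawn above, assuming a C11 compiler (for _Static_assert) and a platform that provides the exact-width types. pointer_roundtrips is a name invented for this example.

```c
#include <limits.h>
#include <stdint.h>

/* Exact-width types have a fixed size wherever they exist. */
_Static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t is 32 bits");
_Static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

/* long long only promises a lower bound: at least 64 bits. */
_Static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long is >= 64 bits");

/* Illustrative helper: uintptr_t is sized to hold a pointer, whatever the
 * pointer width is, so a void * converted to it and back compares equal. */
int pointer_roundtrips(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits == p;
}
```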
And why wouldn't int64_t be available in kernel space?
In order to find and change occurrences of time_t in ports more easily, they could use the Coccinelle tool.[1] The following semantic patch would find and replace variable declarations of type time_t:
@sys_types@
@@
#include <sys/types.h>
@time_t depends on sys_types@
identifier x;
@@
- time_t x
+ long long int x
;
Replacing printf format specifiers is more difficult, so the following semantic patch will find printf statements which use time_t variables, which can then be edited manually:
@sys_types@
@@
#include <sys/types.h>
@stdio@
@@
#include <stdio.h>
@printf depends on sys_types && stdio@
identifier x;
@@
time_t x;
...
* printf(..., x, ...);
These can be used as follows:
$ spatch --sp-file foo.cocci --dir /path/to/ports
where `foo.cocci` is the name of one of the semantic patches above.
(sorry.)
That was part of a larger changeset that enabled 64bit time_t on 2013-08-13. The 'long' type is changed to 'time_t', which is now 64bit everywhere on OpenBSD.
I didn't see this talk, but I think it's confusing because we are just seeing the slides. Here is what I got from it:
remove time_t from network/on-disk/database formats
Right now if you have a userspace app that has some kind of binary disk format, say a database, and you use the time_t typedef your binary files will not be portable between systems which have differently sized time_t's. However, if you use 'long long' or 'int64_t' and cast time_t's to those, your files will be portable and 64bit everywhere.
If you're using time_t in network formats, systems with different time_t sizes will confuse each other!
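One way to picture the advice above: write timestamps as a fixed 8-byte field instead of dumping a raw time_t. This is a rough sketch, not anything from the slides. encode_time and decode_time are made-up names, and big-endian byte order is an arbitrary choice for the example.

```c
#include <stdint.h>
#include <time.h>

/* Store t as 8 big-endian bytes, regardless of sizeof(time_t). */
void encode_time(unsigned char out[8], time_t t)
{
    uint64_t v = (uint64_t)(int64_t)t;   /* widen first, then reinterpret */
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Read the 8-byte field back. On a system with a 32-bit time_t the final
 * cast can still lose information for values outside its range. */
time_t decode_time(const unsigned char in[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    /* uint64_t -> int64_t is implementation-defined for large values,
     * but behaves as two's complement on common compilers. */
    return (time_t)(int64_t)v;
}
```

Because the field width and byte order are fixed, a file or packet written on a 32-bit time_t system can be read on a 64-bit one and vice versa.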
remove as many (time_t) casts as possible
This is trouble because you don't know the size of time_t, if it's 32bit you might be truncating! It's better to cast to a 64bit size, that will always work, at least for the next 292 billion years.
I'm also not sure why the OpenBSD people think that there will be lots of problems with ports [2], I'm typing this on a NetBSD system with 64 bit time_t, things that broke have had patches pushed upstream.
[1] http://www.openbsd.org/papers/eurobsdcon_2013_time_t/mgp0003... [2] http://www.openbsd.org/papers/eurobsdcon_2013_time_t/mgp0003...
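To make the truncation worry and the "292 billion years" figure from the cast discussion above concrete, here is a small sketch. It uses int32_t as a stand-in for a 32-bit time_t, and both function names are invented for the example.

```c
#include <stdint.h>

/* Stand-in for a cast to a 32-bit time_t. Values outside the int32_t
 * range do not survive; the exact result is implementation-defined
 * (common compilers wrap around). */
int64_t through_32bit(int64_t seconds)
{
    return (int64_t)(int32_t)seconds;
}

/* Roughly how many years fit in a signed 64-bit count of seconds,
 * using 365.25-day years (31557600 seconds each). */
int64_t years_in_int64(void)
{
    return INT64_MAX / 31557600LL;
}
```

2147483647 is the last second a signed 32-bit counter can hold (early 2038); one second later the value no longer round-trips. The 64-bit range works out to a little over 292 billion years.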
printf("%lld\n", (long long) x); is ugly but it works on every system, including existing systems with 32-bit time_t that don't make any changes to their headers.
You are familiar with the concept of headers, right? ;-)
Just put that in the project's common header (which, if it is dealing with so many different OSes, invariably already has a bunch of platform abstractions). I deliberately structured the solution so the only "extra work" that is needed is for whatever platform has created a long long time_t (and if you really wanted to, you could probably get rid of even that work and base the entire thing off of sizeof(long long) even without using something like autoconf).
> printf("%lld\n", (long long) x); is ugly
Wow, we couldn't be of more different opinions. I'd argue the virtue of that approach is that it is less ugly and more likely to be easily accepted as a change for crufty old 32-bit code in some embedded system that everyone has forgotten about.
And end up with even more horrible and less readable format strings.
I personally find inttypes format strings readable.
Of course, there would be performance implications - not to mention the added complexity for implementors.
[1]http://www.gnu.org/software/libc/manual/html_node/Customizin...
You could work out all sorts of namespacing and automatic reassignment schemes, of course.
I can't think of anything without trade-offs off the top of my head (I doubt that anything exists), but in my (limited) experience it's very workable.
It's so much easier to port code between the two worlds when you don't have to litter the #include prelude of every C file with conditionals.
I'm not familiar with the OpenBSD kernel. Is there any good reason (for OpenBSD or any other kernel) why <stdint.h> shouldn't be available -- or at least why int64_t shouldn't be available in some header?
It's deeply ugly to have to maintain "%llu", "%lu", and various other format strings for a single (uint64_t) or indeed (time_t). It's also ugly to up-cast everything to the largest potential size whenever you use format strings.
This seems like a case of choosing deep ugliness over superficial ugliness...
"%" PRIu64 ": The time is: %" PRI_TIME_T, x, y
Seems nicer to me than:
"%llu: The time is: %lld", (long long unsigned)x, (long long)y /* time_t is signed? Who knows? */
What type would you use if you wanted to print uint128_t? %llld ?
Finally, I think rejecting a standard C header file because it is "ugly" and coming up with your own solution is unnecessarily fragmenting things, especially when it isn't clearly better (IMO it is clearly worse).
I'm not sure that is true. It accurately captures the reality that you don't know the size of the type, but that you have determined what the maximum size can be and hopefully made considerations for it.
I should think there isn't even necessarily a performance cost, as it wouldn't be hard to trick out a compiler to recognize what was going on and optimize accordingly.
> What type would you use if you wanted to print uint128_t? %llld ?
IIRC, there is no standard portable format string length modifier for 128-bits (I think some platforms used %q for it, but that's definitely not portable), so literally nothing. Format strings suck.
> Finally, I think rejecting a standard C header file because it is "ugly" and coming up with your own solution is unnecessarily fragmenting things, especially when it isn't clearly better (IMO it is clearly worse).
Note that as the presentation points out, the better thing to do is whatever is going to be easily adopted. In this case, where people are already using format strings, and already working with a time_t that might be only 32-bits wide, this might actually be that solution.
I just checked an OpenBSD machine's inttypes.h. Some of the format macros (I presume the ones present in the standard) are there. So I can't say they rejected the header.
It would probably be weird to introduce one there that isn't in the standard. OTOH %lld for long long is in the standard.
It's Comic Sans with JPEG artifacts. The only consolation is that it doesn't blink.
[1]: http://threebean.org/presentations/fedmsg-flock13/
http://rwmj.wordpress.com/2012/01/31/tech-talk-pse-1-1-0/#co...
and embed terminals and other programs along the way.
> int64_t is a much nicer type than "long long".
Yes, except that "long long" has been available for longer, as a common compiler extension before C99 standardized it. int64_t was also added in C99 but is tricky to include in software up to the mid 2000's due to slow adoption of the standard.
You can use "long long" without headers in most C compilers from the last 20 years. int64_t when present is usually just a typedef to "long long". Keep it simple.
> "int" makes some sense. Word-size of the machine.
Except that it isn't. That was its original intent but for historical reasons, it is a 32 bit integer in almost all cases now, regardless of machine word size.
#if __WORDSIZE == 64
typedef long int int64_t;
#else
typedef long long int int64_t;
#endif
You don't want to use "long long" because that's not necessarily 64-bits. You want to use int64_t which guarantees it is 64-bits.
And then, the correct format specifier for that is "%" PRIi64, and not "%ld" or "%lld" which will break on different platforms.
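For comparison, a sketch of the int64_t route described above, using only the standard <inttypes.h> macros. format_time_i64 is a hypothetical helper name. It uses PRId64, which behaves the same as PRIi64 for printf-style output.

```c
#include <inttypes.h>
#include <stdio.h>
#include <time.h>

/* Illustrative helper: widen time_t to int64_t, then let PRId64 pick the
 * right length modifier ("ld" or "lld") for this platform.
 * Returns the number of characters snprintf would have written. */
int format_time_i64(char *buf, size_t len, time_t t)
{
    int64_t wide = (int64_t)t;
    return snprintf(buf, len, "%" PRId64, wide);
}
```

This still needs a cast at the call site, just like the long long approach, but the format string no longer depends on whether int64_t is a long or a long long underneath.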
Unless they're doing something really weird with time_t values, I don't think there's any reason they should know whether it's long long or int64_t or whatever under the hood.
And I thought `%lld` actually means `long long`... So, http://ideone.com/SJJFPs seems like a proper approach to me. That said, this assumes the compiler supports %lld (%I64d or similar might be required for older compilers), so a better cross-compiler approach would be along the lines of `printf("test: " TIME_FMT "\n", (TIME_FMT_CAST)t)`. Or, ahem, maybe, `print_time(t)`.
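A rough sketch of that last suggestion: keep the format and the cast in one place behind a function. TIME_FMT and TIME_FMT_CAST follow the names in the comment above; the definitions chosen here are assumptions, and this print_time takes an extra FILE * argument (not in the original suggestion) so the destination is explicit.

```c
#include <stdio.h>
#include <time.h>

/* Assumed definitions for a compiler that supports %lld. An older
 * compiler might need a different pair here. */
#define TIME_FMT "%lld"
#define TIME_FMT_CAST long long

/* Print t as decimal text to out. Callers never see the format or the
 * cast. Returns what fprintf returns: characters written, or negative
 * on error. */
int print_time(FILE *out, time_t t)
{
    return fprintf(out, TIME_FMT, (TIME_FMT_CAST)t);
}
```

With this, a port to a compiler with different format support touches two macro definitions instead of every printf call.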
So that's why we prefer "long long" rather than "int64_t", which I thought was your original question.