They needed to have a locale matching the language of the localised time string they wanted to parse, they needed to use strptime to parse the string, they needed to use timegm() to convert the result to seconds when seen as UTC. The man pages pretty much describe these things.
The interface of these things could certainly be nicer, but most of the things they bring up as issues aren't even relevant for the task they're trying to do. Why do they talk about daylight saving time being confusing when they're only trying to deal with UTC, which doesn't have it?
    #define _GNU_SOURCE  /* on glibc, exposes strptime() and timegm(), which are not ISO C */
    #include <stdio.h>
    #include <time.h>

    int main(void) {
        struct tm tm = {0};
        const char *time_str = "Mon, 20 Jan 2025 06:07:07 GMT";
        const char *fmt = "%a, %d %b %Y %H:%M:%S GMT";

        // Parse the time string
        if (strptime(time_str, fmt, &tm) == NULL) {
            fprintf(stderr, "Error parsing time\n");
            return 1;
        }

        // Convert to Unix timestamp (UTC)
        time_t timestamp = timegm(&tm);
        if (timestamp == -1) {
            fprintf(stderr, "Error converting to timestamp\n");
            return 1;
        }

        // time_t is not guaranteed to be long, so cast before printing
        printf("Unix timestamp: %lld\n", (long long)timestamp);
        return 0;
    }
It is a C99 code snippet that parses the UTC time string and safely converts it to a Unix timestamp, and it follows best practices from the SEI CERT C standard, avoiding locale and time zone issues by using UTC and timegm(). You can avoid the pitfalls of mktime() by using timegm(), which works directly with UTC time.
Where is the struggle? Am I misunderstanding it?
Oh by the way, must read: https://www.catb.org/esr/time-programming/ (Time, Clock, and Calendar Programming In C by Eric S. Raymond)
I thought the default output of date(1), with TZ unset, is something like
Mon Jan 20 06:07:07 UTC 2025
That's the busybox default anyway
The first sentence of your link reads:
>The C/Unix time- and date-handling API is a confusing jungle full of the corpses of failed experiments and various other traps for the unwary, many of them resulting from design decisions that may have been defensible when the originals were written but appear at best puzzling today.
It’s twelve lines or more, if you include the imports and error handling.
Spreadsheets and SQL will coerce a string to a date without even being asked to. You might want something more structured than that, but you should be able to do it in far less than 12 lines.
C has many clunky elements like this, which makes working with it like pulling teeth.
But only when you don't want them to; when you do want them to, it's still a pain.
In C++, maybe. In C, not necessarily. If you're not willing to reinvent the wheel why'd you choose C anyway?
I've wasted so many dreary hours trying to figure out crappy time processing APIs and libraries. Never again!
Try not to have to do this sort of thing. You might have to though, and then you'll have to figure out what adding months means for your app.
It does remove a lot of the ambiguity of "I wonder what this stdlib's quirks are in their date calculations" but it also seems like a non-trivial amount of effort to port every time.
Surely we'll have everything patched up by then..
I wonder if people will still be repeating the "Y2k myth" myth as things start to fail.
[0] https://en.wikipedia.org/wiki/Year_2038_problem#Implemented_...
The overflow happens at 2038-01-19T03:14:08Z.
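That instant is one second past the largest value a signed 32-bit time_t can hold. A rough Python sketch of where the boundary falls and what a wrapped counter would decode to (the names INT32_MAX, wrap_int32 and to_utc are made up for illustration, and the wrap assumes two's complement behaviour):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def wrap_int32(n):
    """Interpret n modulo 2**32 as a signed 32-bit integer (two's complement)."""
    return (n + 2**31) % 2**32 - 2**31


def to_utc(seconds):
    """Convert seconds since the epoch to an aware UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_ok = to_utc(INT32_MAX)                   # last representable second
wrapped = to_utc(wrap_int32(INT32_MAX + 1))   # what the next tick decodes to
print(last_ok.isoformat())
print(wrapped.isoformat())
```

The wrapped value lands in December 1901, which is why affected systems jump backwards rather than simply stopping.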
ClickHouse has the "parseDateTimeBestEffort" function: https://clickhouse.com/docs/en/sql-reference/functions/type-... and here is its source code: https://github.com/ClickHouse/ClickHouse/blob/74d8551dadf735...
I won't believe anyone who tells me that handling time in C/C++ isn't perilous.
Explanation: you can learn heap sort or FFT or whatever algorithm there is and implement it. But writing your own calendar from scratch, one that will, for example, run a cron job at 3 am on the day of a DST transition and that works in every TZ, is work for many people over many months, if not years...
Also, naming things, cache coherency, and off by one errors are the two hardest problems in computer science.
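If you are happy for the time to perhaps be wrong around the hours when the time zone offset changes, this is an easy hack:

    import time

    def time_mktime_utc(_tuple):
        # Treat the broken-down time as local time with DST forced off
        result = time.mktime(_tuple[:-1] + (0,))
        # mktime(gmtime(result)) - result is the local UTC offset; undo it
        return result * 2 - time.mktime(time.gmtime(result))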
If you are just using it for display this is usually fine as time zone changes are usually timed to happen when nobody is looking.
[Missing scene]
"We are releasing HTTP/1.1 specifications whereby expirations are passed as seconds to expire instead of dates as strings."
Why such self flagellation?
Without any marking, it could be anything
I did write such code in RISC-V assembly (for a custom command-line tool on Linux that prints the statx syscall output). So don't be scared: with a bit of motivation, you'll figure it out.
I wrote conversion code, I know what I am talking about.
[0] https://www.ibm.com/docs/en/aix/7.1?topic=c-ctime-localtime-...
mktime() parses the time string which lacks any information on time zones
then the article uses timegm() to convert it to unixtime on the assumption that it was in UTC
also it's about C
No, mktime() doesn't parse a string. Parsing the string is done by strptime(). mktime() takes the output of strptime(), which is a C structure or the equivalent in Python - a named tuple with the same fields.
> the time string lacks any information on time zones
Not necessarily. Time strings often contain a time zone. If the string you happen to be parsing doesn't contain a time zone, you could always append one. If it did have a time zone, you could always change it to UTC. So this isn't the problem either.
The root cause of the issue is that the "struct tm" that strptime() outputs doesn't have a field for the time zone, so if the string has one, it is lost. mktime() needs that missing piece of information. It solves that problem by assuming the missing time zone is local time.
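To make the "assumes local time" point concrete, here is a small Python sketch (Python's time module wraps the same libc calls). It assumes a Unix system where time.tzset() is available; mktime_in_zone is a made-up helper, and "EST5" and "UTC0" are plain POSIX TZ strings chosen for illustration:

```python
import calendar
import os
import time


def mktime_in_zone(broken_down, tz):
    """Run time.mktime() with TZ temporarily set to tz (Unix only)."""
    old = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()
    try:
        return int(time.mktime(broken_down))
    finally:
        # Restore the caller's time zone setting
        if old is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = old
        time.tzset()


# The literal "GMT" in the string is matched but never stored anywhere
parsed = time.strptime("Mon, 20 Jan 2025 06:07:07 GMT",
                       "%a, %d %b %Y %H:%M:%S GMT")

as_utc = calendar.timegm(parsed)          # fields interpreted as UTC
as_est = mktime_in_zone(parsed, "EST5")   # same fields interpreted as UTC-5
print(as_utc, as_est, as_est - as_utc)
```

The same parsed fields give results five hours apart, purely because of the process's TZ setting.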
> then the article uses timegm() to convert it to unixtime on the assumption that it was in UTC
It does, but timegm() is not a POSIX function so isn't available on most platforms. gmtime() is a POSIX function and is available everywhere. It doesn't convert a "struct tm", but it does allow you to solve the core problem the article labours over, which is finding out what time zone offset mktime() used. With that piece of information it's trivial to convert to UTC, as the above code demonstrates in 2 lines.
> also it's about C
The Python "time" module is a very thin wrapper around the POSIX libc functions and structures. There is a one-to-one correspondence, mostly with the same names. Consequently any experienced C programmer will be able to translate the above Python to C. I chose Python because it expresses the same algorithm much more concisely.
>>> from email.utils import parsedate_tz, mktime_tz
>>> mktime_tz(parsedate_tz("Fri, 17 Jan 2025 06:07:07"))
1737094027
It converts RFC 2822 time into a POSIX timestamp ([mean solar] seconds since epoch--elapsed SI seconds not counting leap seconds).
You could use `"%a %b %d %H:%M:%S %Z %Y"` for `fmt` (which is indeed the default for `date`) and it would work with yours.
Both result in the same timestamp.
date.l:
    %{
    int fileno (FILE *);
    FILE *f;
    int printf(const char *__restrict, ...);
    #include <time.h>
    char *strptime(const char *s, const char *f, struct tm *tm);
    struct tm t;
    %}
    a (Mon|Tue|Wed|Thu|Fri|Sat|Sun)
    b (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)
    d [0-2][0-9]|3[01]
    H [0-2][0-9]
    M [0-5][0-9]
    S [0-5][0-9]
    Y [1-9][0-9][0-9][0-9]
    %option nounput noinput noyywrap
    %%
    {a}[ ]{b}[ ]{d}[ ]{H}:{M}:{S}[ ]UTC[ ]{Y} {
        strptime(yytext,"%a %b %d %H:%M:%S UTC %Y",&t);
        printf("%ld\n",mktime(&t));
    }
    .|\n
    %%
    int main(){yylex();exit(0);}

    flex -8Cem date.l
    cc -O3 -std=c89 -W -Wall -pipe lex.yy.c -static -s -o yydate
    date|yydate
This works for me. No need for timegm(). But if I substitute %Z or %z for "UTC" in strptime() above then this does not work.
Fun fact: strptime() can make timestamps for dates that do not exist on any calendar.
    echo "Thu Jun 31 01:59:26 UTC 2024"|yydate
In fairness, it’s not something that should happen much at all, if ever.
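If you do want to reject something like the June 31 input instead of silently getting a timestamp for July 1, one option is a round-trip check. A hedged Python sketch (strict_timegm is a made-up name; it relies on calendar.timegm normalising out-of-range days the same way the C conversion functions do):

```python
import calendar
import time


def strict_timegm(year, month, day, hour, minute, second):
    """Return the UTC timestamp, or raise ValueError if the fields do not
    survive a round trip (meaning the date was silently normalised)."""
    fields = (year, month, day, hour, minute, second)
    ts = calendar.timegm(fields)
    # Convert back and compare: "June 31" comes back as July 1
    if tuple(time.gmtime(ts)[:6]) != fields:
        raise ValueError("not a real calendar date: %r" % (fields,))
    return ts
```

The same idea works in C: call timegm() or mktime(), convert the result back with gmtime() or localtime(), and compare the fields with what was parsed.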
[0]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1019716
But you don't want to be processing data in locale-dependent ways using the crap available in ISO C.
Until you have to handle leap years properly. Leap year rules in the long run are a bit funky. Just have a look at Wikipedia.
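For reference, the Gregorian rule itself fits in one line; the funky part is the century exception. A minimal Python sketch (is_leap is just an illustrative name, and this ignores pre-Gregorian calendars entirely):

```python
def is_leap(year):
    """Gregorian rule: every 4th year is a leap year, except century years
    that are not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


for y in (1900, 2000, 2024, 2100):
    print(y, is_leap(y))
```

So 2000 was a leap year, but 1900 was not and 2100 will not be, which is where naive "divisible by 4" checks go wrong.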
(do not use gogol search since they are now forcing javascript by default)
But the title specifically says "from a UTC string", so it _is_ a UTC string, always.
> ascii string you are showing us to actually make sense in our specific locale
Locale and TZ are two completely separate things. You can use any locale in any TZ. You can use any locale in any location, too.
However you also run into day to day business issues like:
* What if it's now a Holiday and things are closed?
* What if it's some commonly busy time like winter break? (Not quite a single holiday)
* What if a disaster of some kind (even just a burst water pipe) halts operations in an unplanned way?
Usually flexibility needs to be built in. It can be fine to 'target' +3 months, but specify it as something like +3m(-0d:+2w) (so: add '3 months' ignoring the day of month, clamp the day of month to a valid value, and allow 0 days before or 14 days after).
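One possible reading of the +3m(-0d:+2w) idea, as a Python sketch. The function names and the (earliest, target, latest) return shape are assumptions for illustration, not an established notation or API:

```python
import calendar
from datetime import date, timedelta


def add_months(d, months):
    """Shift d by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def target_window(d, months, before=timedelta(0), after=timedelta(weeks=2)):
    """Return (earliest, target, latest) for a '+Nm(-before:+after)' rule."""
    target = add_months(d, months)
    return target - before, target, target + after


print(target_window(date(2024, 11, 30), 3))
```

Here 30 November plus three months clamps to the last day of February, and the window then allows up to two weeks of slack after that target.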
72 business hours sounds more like human time than computer time anyways.
Noon UTC is "12:00Z".
RFC 3339 is nice for this reason: every timestamp carries an explicit offset, and UTC is written with a trailing Z.
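A small Python sketch of producing and parsing the "Z" form. The helper names are made up, and the parser deliberately accepts only whole-second UTC timestamps, not the numeric offsets or fractional seconds that RFC 3339 also allows:

```python
from datetime import datetime, timezone

RFC3339_UTC = "%Y-%m-%dT%H:%M:%SZ"


def format_rfc3339_utc(dt):
    """Render an aware datetime as an RFC 3339 UTC timestamp (whole seconds)."""
    return dt.astimezone(timezone.utc).strftime(RFC3339_UTC)


def parse_rfc3339_utc(text):
    """Parse the 'Z' form only; anything else raises ValueError."""
    return datetime.strptime(text, RFC3339_UTC).replace(tzinfo=timezone.utc)


noon = datetime(2025, 1, 20, 12, 0, tzinfo=timezone.utc)
print(format_rfc3339_utc(noon))
print(int(parse_rfc3339_utc("2025-01-20T06:07:07Z").timestamp()))
```

Because the zone is pinned by the format, no locale or TZ setting is involved in either direction.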
[1] POSIX-2024 incorporates C17, not C23, but in practice the typical POSIX environment going forward will likely be targeting POSIX-2024 + C23, or just POSIX-2024 + extensions; and hopefully neither POSIX nor C will wait as long between standard updates as previously.
It's not posix, but it's pretty available
So, while `timegm` is not standardized in C99 or POSIX, it is a practical solution in most real-world environments, and alternatives exist for portability, and thus: handling time in C is not inherently a struggle.
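For the portability alternative, the arithmetic behind a timegm() replacement is short enough to carry around. A sketch in Python for brevity, based on the widely circulated "days from civil" algorithm (proleptic Gregorian calendar, no leap seconds); the function names are illustrative and this is not any libc's actual implementation:

```python
def days_from_civil(year, month, day):
    """Days since 1970-01-01 in the proleptic Gregorian calendar."""
    y = year - (month <= 2)        # count Jan/Feb as part of the previous year
    era = y // 400                 # 400-year cycle; floor division handles y < 0
    yoe = y - era * 400            # year of era, 0..399
    mp = (month + 9) % 12          # March = 0, ..., February = 11
    doy = (153 * mp + 2) // 5 + day - 1
    doe = yoe * 365 + yoe // 4 - yoe // 100 + doy
    return era * 146097 + doe - 719468


def portable_timegm(year, month, day, hour, minute, second):
    """UTC broken-down time to seconds since the epoch, with no TZ lookup."""
    days = days_from_civil(year, month, day)
    return ((days * 24 + hour) * 60 + minute) * 60 + second


print(portable_timegm(2025, 1, 20, 6, 7, 7))
```

Translating this to C is mechanical, as long as the integer divisions are made to round toward negative infinity for years before the era boundary.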
As for the link, it says "You may want to bite the bullet and use timegm(3), even though it’s nominally not portable.", but see what I wrote above.
[1] https://zolk3ri.name/cgit/m4conf/about/
Nice job though.
Oh look, m4sugar.m4 is taken from GNU Autoconf and is GPLv3. The m4conf project's master license doesn't mention this; it's a "BSD1" type license (like BSD2, but with no requirement for the copyright notice to appear in documentation or accompanying materials). Oops!
m4sugar says that it requires GNU m4, not just any POSIX m4.
I wrote the configure script in shell because that's what I decided I can depend on being present in the target systems. I deliberately rejected any approach involving m4 to show that a GNU style configure script can be obtained in a straightforward way, without convoluted metaprogramming.
There is a lot of copy-paste programming in my configure script, but it doesn't have to be that way; a script of this type can be better organized than my example. Anyway, you're not significantly disadvantaged writing in shell compared to m4, especially if you're mostly interested in probing the environment, not complex text generation.
I don't think that it's enough to test for header files being present. Almost all my tests target library features: including a header, calling some functions and actually linking a program. The contents of headers vary from system to system and with compiler options.
You mean build-time (with m4conf, that is).
The license of m4conf itself is ISC.
m4sugar could be vetted so that it becomes smaller than 120 KB.
I don't know if it is messy, look at the example configuration file. For me, it is more straightforward and less bloated than autotools, for example.
> I don't think that it's enough to test for header files being present. Almost all my tests target library features: including a header, calling some functions and actually linking a program. The contents of headers vary from system to system and with compiler options.
This is configurable as well in base.m4.
Oh and by the way:
> I don't think that it's enough to test for header files being present.
It checks for functions, too, not just header files, along with CPU features.