This question is hinted at in this one, but the answer to that question doesn't answer this question at all, and I've found conflicting suggestions and hints scattered around.
My problem is relatively simple, but in digging into it, I'm getting a bit tripped up.
Suppose I have a string in a format like this: 2023-06-07 03:04:56 -0700
The goal is to normalize this into an epoch timestamp (time_t in C). I assumed this would be simple enough, but it seems not. The gotcha here seems to be the -0700 at the end.
It seems that strptime(3) may ignore the %z specifier (again, I've seen conflicting reports as to how it is handled in different implementations). FWIW, I'm using Linux/glibc, so I care more about whether it works there than about the fact that it's not in the C standard.
Playing around with it a little bit, it seemed to me like strptime does ignore the timezone offset. The hour in the struct tm is simply the hour in the string; it isn't adjusted based on the timezone offset at all. Supposedly that's what the non-standard tm_gmtoff member is for, but when I read it I just get a gigantic value that is definitely much larger than any UTC offset in seconds, so I'm not sure what to make of that either.
As an example:
#define _XOPEN_SOURCE
#include <stdio.h>
#include <string.h>
#include <time.h>
int main()
{
    struct tm tm;
    time_t epoch;
    char buf[40];
    strcpy(buf, "2023-06-07 03:04:56 -0700");
    memset(&tm, 0, sizeof(tm));
    strptime(buf, "%Y-%m-%d %H:%M:%S %z", &tm);
    printf("Parsed datetime %s (hour %d, offset %lu)\n", buf, tm.tm_hour, tm.__tm_gmtoff);
    tm.tm_isdst = -1;
    setenv("TZ", "US/Eastern");
    epoch = mktime(&tm);
    printf("Parsed datetime -> epoch %lu\n", epoch); // 7:04AM UTC
    epoch = timegm(&tm);
    printf("Parsed datetime -> epoch %lu\n", epoch); // 3:04AM UTC
    return 0;
}
When run on https://www.onlinegdb.com/online_c_compiler, this gives:
Parsed datetime 2023-06-07 03:04:56 -0700 (hour 3, offset 18446744073709526416)
Parsed datetime -> epoch 1686121496
Parsed datetime -> epoch 1686107096
Note that the -0700 offset in the string is arbitrary, and so is the local time zone on the system. For example, -0700 is Pacific Time, but the system could be in Eastern Time. The local time zone is completely irrelevant to the problem: the conversion should use the offset given in the string, and, importantly, the local time zone should not mess up the answer.
Above, the correct answer is 10:04AM UTC (what the string obviously should convert to). Blindly using mktime gives the wrong answer, and timegm is even more off. The problem seems to be that the offset is not taken into account here. The second answer using timegm would be correct, if the struct tm had +7 hours added to it for the offset, or if timegm added +7 hours to the answer based on something in the struct tm, such as tm_gmtoff. But neither of those things seems to happen.
Short of writing a manual function to parse the %z in the time string and manually add this offset to the time_t, is there a better "builtin" way of doing this with standard functions? (Portability isn't super important here, as long as it works in glibc.) Given this would seem to be a very common type of conversion, I'm thinking there must be a way to do this properly without manually doing calculations, using gmtime. I thought this was what tm_gmtoff was for but it seems otherwise - am I missing something here?
A few issues ...
1. __tm_gmtoff is signed, so it printed incorrectly with %lu.
2. __tm_gmtoff is set correctly (e.g. -7 * 3600).
3. setenv("TZ",...) does not work. It uses the local timezone set by the system (e.g. -0700 is US/Pacific(?) DST, but I got -0400, which is US/Eastern DST).
4. timegm will ignore __tm_gmtoff/tm_gmtoff [AFAICT].
5. Use timegm and apply tm_gmtoff manually to get the correct timezone.
Here is the somewhat corrected code (in stages). It may still be broken. Important to read the comments:
Here is the program output: