How to convert a very long string to a double in portable C


I want to convert a very long string of numbers to a double in a portable way in C. In my case, portable means that it would work in Linux and Windows. My ultimate goal is to be able to pack a string of numbers into an 8-byte double and fwrite/fread to/from a binary file. The number is always unsigned.

I am using this string to pack a 4 digit year, 2 digit month, 2 digit day, 4 digit HH:MM, 1 digit variable, and a 10 digit value. So, trying to pack 23 bytes into 8 bytes.

I have tried all of the standard things:

char myNumAsString[] = "1234567890123456789";

char *ptr;
char dNumString[64];
double dNum;


dNum = atol(myNumAsString);
sprintf(dNumString, "%lf", dNum);

dNum = atof(myNumAsString);
sprintf(dNumString, "%lf", dNum);

dNum = strtod(myNumAsString, &ptr);
sprintf(dNumString, "%lf", dNum);

sscanf(myNumAsString, "%lf", &dNum);
sprintf(dNumString, "%lf", dNum);

None of these work; they all round off the last several digits. Is there a portable way to do this?


There are 2 best solutions below

0
On BEST ANSWER

Take advantage of the fact that part of the string is a timestamp rather than an arbitrary set of digits.

With 60 minutes per hour, 24 hours per day, 365.25 days per year, y years, one digit and a 10-digit value, there are 60*24*365.25*y*10*pow(10,10) combinations, or about 5.3e16 * y.

An 8-byte, 64-bit number has 1.8e19 combinations. So if the range of years is 350 or less (like 1970 to 2320), things will fit.
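The estimate above can be checked with integer arithmetic. This is a minimal sketch; the function names are made up for illustration, and 525960 is simply 60*24*365.25 for an average year.

```c
#include <stdint.h>

/* Distinct (minute, digit, 10-digit value) combinations in one average
   year of 365.25 days: 525960 minutes * 10 * 10^10. */
static uint64_t combos_per_year(void) {
    const uint64_t minutes_per_year = 525960;  /* 60 * 24 * 365.25 */
    return minutes_per_year * 10u * 10000000000ull;
}

/* Whole years whose combinations fit in a 64-bit unsigned integer. */
static uint64_t max_years_in_u64(void) {
    return UINT64_MAX / combos_per_year();
}
```

Here max_years_in_u64() evaluates to 350, which matches the "350 or less" figure.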

Assuming a Unix timestamp, and that OP can convert a time string to time_t (check out mktime()):

#include <stdint.h>
#include <time.h>

time_t epoch = 0;  // Jan 1, 1970. Adjust as needed.

// digit1 is 0-9, digit10 is 0-9999999999; t must not precede epoch.
uint64_t pack(time_t t, int digit1, unsigned long long digit10) {
  uint64_t packed = digit1 * 10000000000ull + digit10;
  uint64_t tminutes = (uint64_t)(t - epoch) / 60;  // whole minutes since epoch

  packed += tminutes * 100000000000ull;
  return packed;
}

Reverse to unpack.
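One way the reverse step might look is sketched below. The names pack_fields, unpack_fields, VALUE_SPAN and DIGIT_SPAN are hypothetical, the epoch is passed as a parameter rather than read from a global, and time_t is assumed to be an integer count of seconds (as it is on Linux and Windows). The pack side repeats the same layout so the round trip can be seen in one place.

```c
#include <stdint.h>
#include <time.h>

#define VALUE_SPAN 10000000000ull  /* 10^10: range of the 10-digit value */
#define DIGIT_SPAN 10ull           /* range of the single digit */

/* Layout: minutes_since_epoch * 10^11 + digit1 * 10^10 + digit10. */
uint64_t pack_fields(time_t t, time_t epoch, int digit1,
                     unsigned long long digit10) {
    uint64_t minutes = (uint64_t)(t - epoch) / 60;
    return (minutes * DIGIT_SPAN + (uint64_t)digit1) * VALUE_SPAN + digit10;
}

/* Peel the fields off in the opposite order with % and /. */
void unpack_fields(uint64_t packed, time_t epoch, time_t *t, int *digit1,
                   unsigned long long *digit10) {
    *digit10 = packed % VALUE_SPAN;
    packed /= VALUE_SPAN;
    *digit1 = (int)(packed % DIGIT_SPAN);
    packed /= DIGIT_SPAN;                 /* what remains is whole minutes */
    *t = epoch + (time_t)(packed * 60);
}
```

Seconds are lost in the round trip, since only whole minutes are stored.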


Or a more complete portable packing (code untested)

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
// pack 23 digit string
// "YYYYMMDDHHmm11234567890"
// Returns (uint64_t)-1, i.e. UINT64_MAX, on a malformed string.
uint64_t pack(const char *s) {
  struct tm tm0 = {0};
  tm0.tm_year = 1970 - 1900;
  tm0.tm_mon = 1-1;
  tm0.tm_mday = 1;
  tm0.tm_isdst = -1;
  time_t t0 = mktime(&tm0);  // local-time epoch; 0 only where local time is UTC
  struct tm tm = {0};
  char sentinel;  // a successful match here would mean trailing characters
  int digit1;
  unsigned long long digit10;
  if (strlen(s) != 4+2+2+2+2+1+10) return -1;
  if (7 != sscanf(s, "%4d%2d%2d%2d%2d%1d%10llu%c", &tm.tm_year,
          &tm.tm_mon, &tm.tm_mday, &tm.tm_hour, &tm.tm_min,
          &digit1, &digit10, &sentinel)) return -1;
  tm.tm_year -= 1900;
  tm.tm_mon--;
  tm.tm_isdst = -1;
  time_t t = mktime(&tm);  // interprets the fields as local time

  double diff_sec = difftime(t, t0);
  unsigned long long diff_min = diff_sec/60;
  return diff_min * 100000000000ull + digit1*10000000000ull + digit10;
}
0
On

You can save bits once you know that the fields cannot take arbitrary values.

  • HH:MM : 0<=HH<=23 <32 : 5 bits, 0 <= MM <= 59 <64 : 6 bits
  • DD : 1 <= DD <= 31 < 32 : 5 bits
  • mm (month) : 1 <= mm <= 12 < 16 : 4 bits

So instead of 8 bytes you only need 20 bits, which is less than 3 bytes.

  • YYYY : do you really need to accept any year between 0 and 9999? If you can limit the range of interest to about two centuries, 8 bits are enough.

So a full date could fit in as little as 4 bytes instead of 12.

But if you want to add a 10-digit number plus a 1-digit variable, that will not fit in the 4 remaining bytes, because the greatest uint32_t is 4294967295: enough for any 9-digit number but only about half of the 10-digit numbers.
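The bit counts used in this answer can be reproduced with a small helper. This is only an illustrative sketch, and bits_needed is a made-up name.

```c
#include <stdint.h>

/* Smallest number of bits able to hold every value in 0..max_value.
   Returns 0 when max_value is 0 (a single possible value needs no bits). */
static unsigned bits_needed(uint64_t max_value) {
    unsigned bits = 0;
    while (max_value != 0) {
        bits++;
        max_value >>= 1;
    }
    return bits;
}
```

For example, bits_needed(59) is 6, bits_needed(23) and bits_needed(31) are 5, bits_needed(12) is 4, and bits_needed(9999999999) is 34, which is why a 10-digit value cannot fit in a 32-bit field. By the same count, a digit in 0..9 needs 4 bits.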

If 32 years were enough, you could represent up to 34359738367 (2^35 - 1), that is, 10 digits and a variable taking the values 0, 1 or 2.

Let's look at this more precisely; the transformations would be:

#include <stdint.h>

static const uint16_t orig_year = 1970;  // assumed base year; pick to suit your data

uint64_t timestamp;
uint8_t minute(uint64_t timestamp) { return timestamp & 0x3f; }          // bits 0-5
uint8_t hour(uint64_t timestamp) { return (timestamp >> 6) & 0x1f; }     // bits 6-10
uint8_t day(uint64_t timestamp) { return (timestamp >> 11) & 0x1f; }     // bits 11-15
uint8_t month(uint64_t timestamp) { return (timestamp >> 16) & 0x0f; }   // bits 16-19
uint16_t year(uint64_t timestamp) { return orig_year + ((timestamp >> 20) & 0x3f); } // max 64 years
uint64_t ten_digits(uint64_t timestamp) { return (timestamp >> 26) & 0x7FFFFFFFF; } // 35 bits
uint8_t var(uint64_t timestamp) { return (timestamp >> 61) & 0x7; } // 8 values for the one digit variable
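The accessors above only read fields, so a matching writer is needed to build the 64-bit value. The sketch below assumes the layout just described (minute in bits 0-5, hour in 6-10, day in 11-15, month in 16-19, year offset in 20-25, the 10-digit value in 26-60, the variable in 61-63). ORIG_YEAR, pack_bits and the unpack_* names are hypothetical, and the base year 1970 is an assumption.

```c
#include <stdint.h>

#define ORIG_YEAR 1970u  /* assumed base year; not given in the answer */

/* Shift each field into its slot; masks drop any out-of-range bits. */
uint64_t pack_bits(unsigned year, unsigned month, unsigned day,
                   unsigned hour, unsigned minute,
                   uint64_t ten_digits, unsigned var) {
    return  (uint64_t)(minute & 0x3f)
         | ((uint64_t)(hour   & 0x1f) << 6)
         | ((uint64_t)(day    & 0x1f) << 11)
         | ((uint64_t)(month  & 0x0f) << 16)
         | ((uint64_t)((year - ORIG_YEAR) & 0x3f) << 20)
         | ((ten_digits & 0x7FFFFFFFFull) << 26)
         | ((uint64_t)(var    & 0x07) << 61);
}

/* Readers for the fields exercised in the round trip. */
unsigned unpack_month(uint64_t ts) { return (unsigned)((ts >> 16) & 0x0f); }
unsigned unpack_year(uint64_t ts)  { return ORIG_YEAR + (unsigned)((ts >> 20) & 0x3f); }
uint64_t unpack_ten_digits(uint64_t ts) { return (ts >> 26) & 0x7FFFFFFFFull; }
unsigned unpack_var(uint64_t ts)   { return (unsigned)((ts >> 61) & 0x07); }
```

Because the masks silently truncate, a caller would still want to range-check inputs (for instance a variable above 7) before packing.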

If you can accept only 4 values for the one digit variable, the last part becomes:

uint16_t year(uint64_t timestamp) { return orig_year + ((timestamp >> 20) & 0x7f); } // max 128 years
uint64_t ten_digits(uint64_t timestamp) { return (timestamp >> 27) & 0x7FFFFFFFF; } // bits 27-61
uint8_t var(uint64_t timestamp) { return (timestamp >> 62) & 0x3; } // 4 values for the one digit variable

You could save even more bits by storing an absolute number of minutes since an epoch, but the computations would be much more complex.
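As a rough illustration of that extra computation, the sketch below counts minutes since 1970-01-01 00:00 directly from calendar fields, without mktime() and therefore without any dependence on the local time zone. It uses a widely circulated days-from-civil formula for the proleptic Gregorian calendar, not anything from this answer; the function names are hypothetical and the inputs are assumed to be valid dates.

```c
#include <stdint.h>

/* Days from 1970-01-01 to the given proleptic Gregorian date.
   Internally the year is treated as starting on March 1, so the leap day
   falls at the end of the year. */
static int64_t days_from_civil(int64_t y, unsigned m, unsigned d) {
    y -= (m <= 2);                                   /* Jan/Feb belong to the previous year */
    const int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = (unsigned)(y - era * 400);  /* year of era, 0..399 */
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + (int64_t)doe - 719468;     /* 719468 shifts the origin to 1970-01-01 */
}

/* Whole minutes since 1970-01-01 00:00 for the given date and time. */
static int64_t minutes_since_epoch(int64_t y, unsigned m, unsigned d,
                                   unsigned hh, unsigned mm) {
    return (days_from_civil(y, m, d) * 24 + hh) * 60 + mm;
}
```

A minute counter like this needs about 26 bits for a century (roughly 52.6 million minutes), compared with 27 bits for the separate year, month, day, hour and minute fields with a 7-bit year.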