[Monotone-devel] nvm.dates review

monotone-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] nvm.dates review

From:	Zack Weinberg
Subject:	[Monotone-devel] nvm.dates review
Date:	Sat, 25 Oct 2008 18:19:03 -0700

diff -ruN S-vanilla/dates.hh S-dates/dates.hh
--- S-vanilla/dates.hh  2008-05-13 20:07:26.693813111 -0700
+++ S-dates/dates.hh    2008-10-25 17:07:22.986962447 -0700
@@ -19,31 +19,62 @@

 struct date_t
 {
-  // For the benefit of the --date option.
-  date_t() : d("") {}
-  bool valid() const { return d != ""; }
+  // initialize to an invalid date
+  date_t() : d(-1) {}

Under what circumstances would one want a date_t with an invalid value?

+  // initialize from a unix timestamp
+  date_t(u64 d);

I am not sure I like having this be a constructor instead of a factory
function.  I don't feel strongly about it, but it was more
self-documenting the other way.  Also, you still have the ::now() and
::from_string() factories, so now there's both factories and public
constructors, which is confusing.

Also, throughout, please make all your comments be complete sentences.

+  // initialize from multiple values

"Initialize from broken-down time" would be clearer.  That is the term
used by the documentation of the C <time.h> functions that deal in
this representation.

+  date_t(int sec, int min, int hour, int day, int month, int year);

Why doesn't this take milliseconds as well?

...
+  // Date comparison operators
+  bool operator <(struct date_t const & other) const

No "struct", please (same for all the others)

+  // Our own gmtime function which converts the internal Unix epoch
+  // into a struct tm.
+  void gmtime(struct tm & tm) const;
+  void mktime(struct tm const & tm);

Two problems here: first, you must not use struct tm for internal
interchange.  It is not as portable as one might think (yes, it's in
C89, but BSD, SysV, and Windows all extended it in incompatible ways),
and because of those extensions it is unsafe to fill one in by hand.
You should use it only in the few cases where you have to use a
<time.h> function that takes or returns it, and you should strive to
avoid doing that wherever possible.  Fixing this is a requirement for
merging back to n.v.m.

Second, as a matter of clarity, please do not give any member function
(of anything) the same unqualified name as a C library function!  I
was quite confused the first time I read the code.

+date_t::date_t(int sec, int min, int hour, int day, int month, int year)
+{
+  // general validity checks
+  I((year >= 1970) && (year <= 9999));

Here and elsewhere - since you took out the buffer that would be too
small in CE 10000, why are you still restricting the year to <= 9999?
The year limit you should enforce is 2147483647, since all system
structures for broken-down time that I am aware of (i.e. struct tm +
the several similar things Windows has) use a 32-bit *signed* year
field.  The corresponding seconds-since-1970 count was in the old
version of the file, although of course it'll need multiplying by
1000.

+  I((month >= 1) && (month <= 12));
+  I((day >= 1) && (day <= 31));
+  I((hour >= 0) && (hour < 24));
+  I((min >= 0) && (min < 60));
+  I((sec >= 0) && (sec < 60));

sec <= 60.  Leap seconds do happen.

+bool
+date_t::valid() const
+{
+  // year 10000 limit
+  return d < u64_C(253402300800000);

See above.

+  // This is the only place we rely on the system's time and date function
+  // which might operate on different epochs (i.e. 1980-01-01, as some
+  // Windows, old MacOS and VMS systems used). We immediately transform that
+  // to a struct tm representation, which is independent of the system's
+  // epoch.
   time_t t = time(0);
   struct tm b = *gmtime(&t);

Since your endpoint format is a millisecond count, why do the
expensive conversion to broken-down time and back?  All you need to do
is calculate and memoize the epoch offset used by this system, and
then add that value to what you get back from time() and multiply by
1000.

static s64
get_epoch_offset()
{
  static s64 epoch_offset;
  static bool know_epoch_offset = false;

  if (know_epoch_offset)
    return epoch_offset;

  std::time_t epoch = 0;
  std::tm t = *std::gmtime(&epoch);
  // you should use the internal function that does this directly, of course;
  // and take care it doesn't bomb out if the offset is negative
  // (system epoch earlier than unix epoch)
  epoch_offset =
      date_t::from_broken_down_time(t.tm_sec, t.tm_min, t.tm_hour,
                                    t.tm_mday, t.tm_mon + 1,
                                    t.tm_year + 1900)
      .as_unix_epoch();

  know_epoch_offset = true;
  return epoch_offset;
}

date_t
date_t::now()
{
  std::time_t t = std::time(0);
  date_t d;
  d.d = u64(t)*1000 + get_epoch_offset();
  return d;
}

Easy!

(Since the internal representation is in milliseconds, it would be
nice to use gettimeofday() where available, but that's not urgent.)

-unsigned int const MIN = 60;
-unsigned int const HOUR = MIN * 60;
-unsigned int const DAY = HOUR * 24;
-unsigned int const YEAR = DAY * 365;
-unsigned int const LEAP = DAY * 366;
+u64 const SEC = u64_C(1000);
+u64 const MIN = 60 * SEC;
+u64 const HOUR = 60 * MIN;
+u64 const DAY = 24 * HOUR;
+u64 const YEAR = 365 * DAY;
+u64 const LEAP = 366 * DAY;

Why was this necessary?


+string
+date_t::as_iso_8601_extended() const
+{
+  struct tm tm;
+  gmtime(tm);
+  return (FL("%04u-%02u-%02uT%02u:%02u:%02u")
+             % (tm.tm_year + 1900) % (tm.tm_mon + 1) % tm.tm_mday
+             % tm.tm_hour % tm.tm_min % tm.tm_sec).str();
+}

Another advantage of not using struct tm is, you could report the
milliseconds too.

+u64
+date_t::as_unix_epoch() const
+{
+  return d;
+}

This function should maybe have the word "millisecs" somewhere in its
name.

+void
+date_t::gmtime(struct tm & tm) const
 {
   // these types hint to the compiler that narrowing divides are safe
   u64 yearbeg;
-  u32 year;
+  u16 year;
   u32 month;
   u32 day;
-  u32 secofday;
+  u32 msofday;
   u16 hour;
-  u16 secofhour;
+  u32 msofhour;
   u8 min;
+  u16 msofmin;
   u8 sec;
+  u16 msec;

It might be more efficient to divide out the milliseconds at the very
beginning.  (And come to think of it, you could do seconds, minutes,
and hours that way too, and deal in day counts for the rest of the
code, which might be more comprehensible than how I wrote it
originally.)

+      // ignore milliseconds, if present, or go back to the end of the date
+      // string to parse the digits for seconds.

It's fractional seconds (you could have an arbitrary number); why
ignore them, when you support millisecond precision?

+  // make sure we are still before year 10'000

Please stick to English notation for numbers (i.e. 10,000) -- but this
check will be changing anyway...

   // more than four digits in the year
-  OK("120070301T184113", "12007-03-01T18:41:13");
+  NO("120070301T184113");   // equals 12007-03-01T18:41:13

I expect this will be going away again after you take out the
four-digit year check.

zw

[Prev in Thread]

Current Thread

[Next in Thread]

[Monotone-devel] nvm.dates review, Zack Weinberg <=
- [Monotone-devel] Re: nvm.dates review, Markus Wanner, 2008/10/27
  - Re: [Monotone-devel] Re: nvm.dates review, Markus Wanner, 2008/10/27

Prev by Date: Re: [Monotone-devel] new test --no-workspace
Next by Date: [Monotone-devel] changing behavior of --message and --message-file for propagate
Previous by thread: [Monotone-devel] new test --no-workspace
Next by thread: [Monotone-devel] Re: nvm.dates review
Index(es):
- Date
- Thread