Re: current-posix-second is a disastrous mistake
Taylor R Campbell 17 Dec 2010 03:40 UTC
> Date: Thu, 16 Dec 2010 16:36:50 -0800
> From: Ray Dillinger <bear-65eDfwRo+1xeoWH0uzbU5w@public.gmane.org>
>
> In practice, on machines with POSIX time available, there is no
> error in any interval which does not include the midnight following
> June 30 or December 31 in any year.
Therein lies the danger.
99.999998% of the time, your POSIX clock appears to behave well.
Stress tests of your networked system show that it is robust to random
clock errors. You put the system into production, and it runs without
a hitch for two years. Random hardware clock failures in individual
nodes with probability much greater than 0.000002% are handled
gracefully by the network as a whole.
Then, during a leap second, every node *simultaneously* observes an
erratic clock. Would you like to be on call on New Year's -- nay, at
New Year's Second -- to clean up the mess?
I propose that you implement SECONDS-SINCE-UTC-EPOCH by calling
ntp_gettime (or adjtimex/ntp_adjtime on Linux) and computing the
formula
    ntptimeval.time.tv_sec + ntptimeval.tai - 10 + 63072000
(or timex.time.tv_sec + timex.tai - 10 + 63072000 on Linux).
ntptimeval.time represents the POSIX time. ntptimeval.tai represents
the current TAI - UTC offset, which is stored in the kernel's time_tai
variable. Thus, ntptimeval.tai - 10 gives the number of leap seconds
inserted since 1972, because 1972-01-01T00:00:00Z was
1972-01-01T00:00:10 in TAI.
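As a rough illustration of the arithmetic only (a Python sketch, not the actual ntp_gettime or adjtimex call), the formula can be evaluated from two plain integers. The function and constant names here are made up for the example. The formula is evaluated exactly as written above, and the sample TAI - UTC offset of 34 is the value that was in effect in December 2010.

```python
import calendar

# TAI - UTC at 1972-01-01T00:00:00Z, as stated in the text above.
TAI_MINUS_UTC_AT_1972 = 10

# The constant 63072000 is the POSIX timestamp of 1972-01-01T00:00:00Z,
# i.e. two 365-day years (1970 and 1971) of 86400 seconds each.
EPOCH_OFFSET = calendar.timegm((1972, 1, 1, 0, 0, 0))


def leap_seconds(tai_offset):
    """Number of leap seconds implied by a TAI - UTC offset."""
    return tai_offset - TAI_MINUS_UTC_AT_1972


def seconds_since_utc_epoch(tv_sec, tai_offset):
    """Evaluate the formula from the text, term by term as written.

    tv_sec stands in for ntptimeval.time.tv_sec (or timex.time.tv_sec),
    tai_offset for ntptimeval.tai (or timex.tai).
    """
    return tv_sec + tai_offset - 10 + EPOCH_OFFSET


print(EPOCH_OFFSET)      # 63072000
print(leap_seconds(34))  # 24
```

Keeping the arithmetic in a pure function like this makes it easy to test separately from whatever code actually reads the kernel variables.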
If your kernel is informed of a pending leap second, then at the end
of the leap second, it will simultaneously increment ntptimeval.tai
and rewind the clock, i.e. hold ntptimeval.time.tv_sec fixed for a
second. Thus, the formula I gave will increment normally at the leap
second: it represents a well-behaved clock.
If your kernel is not informed of a pending leap second, then at the
end of the leap second, it will increment ntptimeval.time.tv_sec
normally while holding ntptimeval.tai fixed. Thus, the formula I gave
will increment normally at the leap second: it represents a
well-behaved clock.
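The two cases above can be checked with a toy simulation. This is a sketch under simplifying assumptions: each kernel is modelled as a list of (tv_sec, tai) readings taken once per elapsed second, the starting values 1000 and 33 are arbitrary, and the helper names are hypothetical. It models only the behaviour described in the two preceding paragraphs, not any real kernel.

```python
POSIX_1972 = 63072000  # constant from the formula above


def formula(tv_sec, tai):
    """The proposed clock: tv_sec + tai - 10 + 63072000."""
    return tv_sec + tai - 10 + POSIX_1972


def informed_kernel(start_sec, tai, leap_index, count):
    """One (tv_sec, tai) reading per elapsed second.

    At the end of the second numbered leap_index (the leap second),
    tai is incremented and tv_sec is held fixed for one second.
    """
    readings = []
    tv_sec = start_sec
    for i in range(count):
        readings.append((tv_sec, tai))
        if i == leap_index:
            tai += 1      # kernel knows about the leap second
        else:
            tv_sec += 1   # ordinary second
    return readings


def uninformed_kernel(start_sec, tai, count):
    """tv_sec increments every second; tai never changes."""
    return [(start_sec + i, tai) for i in range(count)]


def steps(values):
    """Differences between consecutive readings."""
    return [b - a for a, b in zip(values, values[1:])]


informed = informed_kernel(start_sec=1000, tai=33, leap_index=2, count=6)
uninformed = uninformed_kernel(start_sec=1000, tai=33, count=6)

print(steps([tv_sec for tv_sec, _ in informed]))   # [1, 1, 0, 1, 1]
print(steps([formula(*r) for r in informed]))      # [1, 1, 1, 1, 1]
print(steps([formula(*r) for r in uninformed]))    # [1, 1, 1, 1, 1]
```

The raw tv_sec of the informed kernel stalls for one second, while the formula advances by exactly one per elapsed second in both cases.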
What if your kernel's time_tai is not set by the operator or by ntpd
at boot, so that it uses a default `TAI - UTC offset' of 0? The
formula I gave still represents a well-behaved clock. Your node might
refuse to talk with peers whose clocks appear a few seconds off from
yours, because their time_tai values are correct. But this problem
manifests immediately and is easily remediable by the operator. Your
calendars might be off by a few seconds. But do you care? Probably
not. If you did, you would have set time_tai correctly.
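To make the last point concrete, here is a small sketch comparing two hypothetical nodes that read the same tv_sec values, one with time_tai set to 34 (the TAI - UTC offset in effect in December 2010) and one left at 0. The tv_sec values are arbitrary and the names are illustrative only.

```python
def formula(tv_sec, tai):
    """The proposed clock: tv_sec + tai - 10 + 63072000."""
    return tv_sec + tai - 10 + 63072000


tv_secs = range(2000, 2005)  # hypothetical POSIX seconds, same on both nodes

correct = [formula(t, 34) for t in tv_secs]  # time_tai set correctly
unset = [formula(t, 0) for t in tv_secs]     # time_tai left at 0

# The two nodes disagree by a constant, visible from the first reading.
offsets = {c - u for c, u in zip(correct, unset)}

# The node with time_tai unset still ticks once per second.
steps_unset = {b - a for a, b in zip(unset, unset[1:])}

print(offsets)      # {34}
print(steps_unset)  # {1}
```

The disagreement is a fixed offset rather than a glitch at some later moment, which is why it shows up immediately instead of at the next leap second.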