How to force chrony to sync NOW?

13.3k Views Asked by At

I run a DB cluster that needs to be in time. Sadly, sometimes my VM hoster is moving the VM with such DB node to another host and then the time is lacking a second or more. My DB node then shuts down and is restarted by systemd.

My systemd file contains this:

ExecStartPre=-+/usr/bin/chronyc -a makestep
ExecStart=/usr/local/bin/.......

I expected this to sync my time immediately after such time lag shut down the database. But due to my logs it took up to 7 minutes until the difference was recognized and fixed. My database detected the gap on every restart and shut down again. Finally, I get this chronyd log:

Nov 16 10:25:51 dc3-sirius chronyd[164166]: System clock was stepped by 0.000020 seconds
Nov 16 10:26:07 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:26:23 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:26:39 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:26:55 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:27:11 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:27:27 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:27:43 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:27:59 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:28:15 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:28:31 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:28:47 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:28:59 dc3-sirius chronyd[164166]: Source 81.169.199.94 replaced with 212.71.244.243
Nov 16 10:29:03 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:29:19 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:29:32 dc3-sirius chronyd[164166]: Selected source 109.230.227.90
Nov 16 10:29:35 dc3-sirius chronyd[164166]: System clock was stepped by 0.003850 seconds
Nov 16 10:29:51 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:30:07 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:30:23 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:30:39 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:30:55 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:31:11 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:31:27 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:31:43 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:31:59 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:32:13 dc3-sirius chronyd[164166]: Can't synchronise: no majority
Nov 16 10:32:15 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:32:31 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:32:33 dc3-sirius chronyd[164166]: Selected source 109.230.227.90
Nov 16 10:32:33 dc3-sirius chronyd[164166]: System clock wrong by 1.101260 seconds, adjustment started
Nov 16 10:32:48 dc3-sirius chronyd[164166]: System clock was stepped by 1.003151 seconds
Nov 16 10:33:04 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:33:21 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:33:37 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:33:51 dc3-sirius chronyd[164166]: Selected source 162.159.200.123
Nov 16 10:33:53 dc3-sirius chronyd[164166]: System clock was stepped by 0.409613 seconds

As you can see, it started to sync the clock after >7 minutes:

My DB detected the issue at 10:25:51. From this, the above command was executed several times to re-sync the clock before every database restart. But it needed until 10:32:33 and 10:33:53 to really finally fix the clock.

Any idea how I can force the clock to get synchronized directly and not many minutes later?

1

There are 1 best solutions below

0
On BEST ANSWER

I finally found a solution to keep chrony and also force an immediate time sync in case of a time lag (detected by the DB node). The solution was to restart the chronyd service - simulating a reboot of the system.

I changed my systemd file of the database to look like this:

ExecStartPre=-+systemctl restart chronyd
ExecStartPre=/bin/sleep 5
ExecStart=/usr/local/bin/cockroach start ...

In the /etc/chrony.conf file I added the lines:

initstepslew 0.5 pool.ntp.org
makestep 0.5 -1

This forces chronyd to re-sync the time on a restart if the time offset is bigger than 0.5 seconds.

This finally makes my system re-sync directly and then restart the database node immediately.

You can look here to find more information about the chrony.conf options.