The Downfalls of Relying on Internet Public Time
A number of protocols are recognised as the lifeblood of the internet, such as DNS, which resolves hostnames to IP addresses, and NTP/PTP, which deliver the timing information that keeps communicating systems in sync and able to exchange messages securely.
Chances are that you or your company are working on digital transformation at the moment, and I would expect DNS and time synchronisation to be some way down the list of tasks to complete because they are perceived as low priority for implementation.
Installing a local time source in your network isn’t always as easy as it sounds. There are a few common reasons why this is the case:
- You’re unable to install an antenna on the roof or your DC hosting company can’t provide you with a GNSS signal
- Your environment is in the Public Cloud
- The business case for the equipment or work to install a local grandmaster with a GNSS time source doesn’t add up for your use case.
In these scenarios, you will likely opt to consume time from a public NTP server, of which there are hundreds available for free. There are options available from most of the major tech companies, government bodies and academic organisations. This option is the default for many.
One downside is that the quality of public time sources can vary a lot. Take the following plot as an example: it tracks three public NTP time sources over a 24-hour period (note: the server taking the measurements is hosted in AWS).
- Green: AWS NTP Server
- Yellow: Facebook
- Purple: NIST
The X-axis is the 24-hour time period that is being sampled. The Y-axis shows the level of accuracy to UTC, measured in fractions of a second.
Both the AWS and Facebook time sources hold a pretty steady time, each consistently within +/- 20 microseconds of its mean. The NIST time source, however, shows peaks and troughs of up to 2.5 milliseconds. Even outside the peaks, it drifts from its mean by around 100 microseconds, 5x that of the AWS and Facebook time sources.
Another monitoring system we have deployed in Austin, Texas observes a nightly slowdown, lasting four hours, of a NIST NTP server in Colorado. The slowdown adds around 7ms of drift to the NTP source during that four-hour window before it returns to normal. It’s as regular as clockwork, and it’s plotted in orange below.
At the same time, the frequency of the NIST source also increases by up to 10ms from its baseline, which is typically in the sub-1ms range; this is shown in the graph below. The frequency is the amount of correction required between the received time update and the system’s host clock each time a sync occurs.
Further investigation showed that this was due to network routing changes between the time server at NIST and our lab in Austin, which occur in the same four-hour window each night. Close to the NIST server, a couple of extra hops are added to the path, and each of them has a very high level of jitter in its round-trip time. The likely root cause is ISP congestion, which triggers the routing change.
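To see why a routing change can show up as a time error, it helps to look at the standard NTP on-wire calculation from RFC 5905. The sketch below is illustrative only: the helper simulate_exchange and the delay figures are assumptions made up for the example, not measurements from our lab. NTP assumes the outbound and return paths take equally long, so any asymmetry between them appears as an offset error of half the difference.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation (RFC 5905), all times in seconds.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate_exchange(true_offset, outbound, inbound, processing=0.0001):
    """Hypothetical helper: build the four timestamps for a client whose
    clock is `true_offset` seconds behind the server, given one-way delays."""
    t1 = 100.0                         # read from the client clock
    t2 = t1 + true_offset + outbound   # read from the server clock
    t3 = t2 + processing               # read from the server clock
    t4 = t3 - true_offset + inbound    # read from the client clock
    return t1, t2, t3, t4


# Symmetric path: 10 ms each way, client clock is actually correct.
sym_offset, sym_delay = ntp_offset_and_delay(*simulate_exchange(0.0, 0.010, 0.010))

# Asymmetric path (assumed figures): the return leg picks up 14 ms of extra delay.
asym_offset, asym_delay = ntp_offset_and_delay(*simulate_exchange(0.0, 0.010, 0.024))

print(f"symmetric:  offset={sym_offset * 1000:.3f} ms, delay={sym_delay * 1000:.3f} ms")
print(f"asymmetric: offset={asym_offset * 1000:.3f} ms, delay={asym_delay * 1000:.3f} ms")
```

In the asymmetric case the client clock is still correct, yet the calculation reports an offset of roughly 7 ms, because the extra 14 ms on one leg is split evenly between the two directions. Jittery extra hops change this asymmetry from one poll to the next, which is one plausible way a routing change could produce the kind of error described above.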
This might all be very interesting, but you’re likely asking what it means in practice.
If you are using NTPD, Chrony, W32Time or similar daemons/services to synchronise the clocks on your networked devices, chances are you will be unaware of drift events such as those discussed earlier, because these daemons/services lack built-in reporting and notification capabilities. Drift events affecting accuracy could compromise your security effectiveness or your ability to comply with regulatory measures: you don’t know whether you are always consuming time from an accurate source, and the traceability of the time source is limited.
Relying on a single time source can result in large amounts of drift during the day on key systems in your network. Add to that the fact that some systems may poll their NTP server only once every few hours, and the times at which different systems in your network log events can differ widely. It is not easy to follow events through a multi-hop environment when the time is not accurate, regardless of whether it is being done automatically or by a SIEM solution.
There are a few things we can do to mitigate the risks mentioned so far. These include:
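As a rough back-of-the-envelope sketch of how infrequent polling adds up, the snippet below assumes a clock that runs at a constant frequency error and is only corrected at each poll. The 20 ppm figure and the four-hour interval are assumed purely for illustration; real oscillators vary, and daemons that discipline the clock frequency between polls will do considerably better.

```python
def accumulated_drift_ms(drift_ppm, poll_interval_hours):
    """Worst-case drift, in milliseconds, built up between two polls by a
    clock with a constant frequency error of `drift_ppm` parts per million."""
    seconds_between_polls = poll_interval_hours * 3600
    return drift_ppm * 1e-6 * seconds_between_polls * 1000


# Assumed example: a 20 ppm clock that is only corrected every 4 hours.
print(f"{accumulated_drift_ms(20, 4):.0f} ms of drift before the next correction")
```

Under those assumptions the clock is a few hundred milliseconds out just before each correction, which is more than enough to put log entries from two systems in the wrong order.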
Configuring Multiple Time Sources
It’s good practice to configure at least four time sources on your system. With two time sources, your host cannot tell which of the two is correct if they disagree. Using three allows a timing daemon/service, such as Chrony, TimeKeeper or NTPD, to identify a falseticker, which is a source that is providing an incorrect time. The benefit of a fourth time source is that one source can fail and you can still identify a falseticker. Another good practice is to diversify your time sources: don’t pick NTP servers from the same organisation or location, but spread them around. This will help protect against geographical, infrastructure or company outages.
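The reasoning about two, three and four sources can be sketched with a simplified interval-intersection check in the style of Marzullo's algorithm. This is not the selection logic that Chrony, TimeKeeper or NTPD actually ship, which is considerably more involved; the function name, source names and numbers below are all hypothetical. Each source is treated as an interval (its offset plus or minus an error bound), and the sources whose intervals share the most crowded region are taken to be truechimers.

```python
def find_falsetickers(sources):
    """Split sources into (truechimers, falsetickers, has_majority).

    `sources` maps a name to (offset_seconds, error_bound_seconds).
    If has_majority is False, the split should not be trusted.
    """
    edges = []
    for offset, bound in sources.values():
        edges.append((offset - bound, 0))  # 0 = interval opens
        edges.append((offset + bound, 1))  # 1 = interval closes
    edges.sort()  # at equal values, opens sort before closes

    # Sweep the edges and remember the region covered by the most intervals.
    best = count = 0
    best_lo = best_hi = None
    for i, (value, kind) in enumerate(edges):
        if kind == 0:
            count += 1
            if count > best:
                best = count
                best_lo, best_hi = value, edges[i + 1][0]
        else:
            count -= 1

    truechimers, falsetickers = [], []
    for name, (offset, bound) in sources.items():
        covers = offset - bound <= best_lo and offset + bound >= best_hi
        (truechimers if covers else falsetickers).append(name)

    has_majority = best * 2 > len(sources)
    return truechimers, falsetickers, has_majority


# Hypothetical readings: three sources agree within 1 ms, one is 7 ms out.
readings = {
    "source_a": (0.0002, 0.001),
    "source_b": (-0.0001, 0.001),
    "source_c": (0.0003, 0.001),
    "source_d": (0.0070, 0.001),
}
print(find_falsetickers(readings))
```

With only two disagreeing sources the function reports no majority, which mirrors the point above that the host cannot tell which one is right. With four sources, one can drop out and the remaining three can still outvote a single falseticker.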
Client Accuracy Alerting and Reporting
Being able to instantly see and report on the synchronisation state of clients and time sources is invaluable, particularly in a large estate: how long ago each one last synchronised its clock, and what the accuracy and frequency were at the last sync and over time. It helps ensure that all systems, both clients and sources, are operating normally. If your system can proactively alert you to issues and anomalies, even better.
If you operate in a regulated industry, such as financial trading, there could well be regulations that dictate the level of accuracy to UTC to which records need to be time-stamped. If you look at the websites of industry regulators, such as the SEC in the US or the FCA in the UK, you will find multiple reports of fines being issued due to inaccurate trading records, some of which are due to an inaccurate time being used. The traceability and accuracy of the time source you use in your trading infrastructure will no doubt come under the spotlight more over the coming years.
Keysight’s TimeKeeper is an enterprise-grade solution that can help you ensure that your systems are receiving an accurate time, show whether client devices are syncing and whether their accuracy is within a definable window, and automatically notify you if a client or time source stops syncing or the accuracy exceeds a warning or compliance threshold.
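As a minimal sketch of what such a check might look like, the snippet below classifies a client from two values: how long ago it last synchronised and its offset at that sync. The SyncStatus type, the state names and every threshold are hypothetical, chosen only for illustration; it is not how any particular product implements alerting, and real limits would come from your own policy or regulator.

```python
from dataclasses import dataclass


@dataclass
class SyncStatus:
    name: str
    seconds_since_sync: float  # age of the last successful sync
    offset_seconds: float      # measured offset at that sync


def classify(status, warn_offset=0.0005, breach_offset=0.001, max_sync_age=3600.0):
    """Return an alert level for one client. All defaults are assumed values."""
    if status.seconds_since_sync > max_sync_age:
        return "NOT_SYNCING"
    magnitude = abs(status.offset_seconds)
    if magnitude >= breach_offset:
        return "COMPLIANCE_BREACH"
    if magnitude >= warn_offset:
        return "WARNING"
    return "OK"


# Hypothetical estate: one healthy client, one drifting, one that has gone quiet.
estate = [
    SyncStatus("app-01", seconds_since_sync=64, offset_seconds=0.00002),
    SyncStatus("app-02", seconds_since_sync=64, offset_seconds=-0.0025),
    SyncStatus("db-01", seconds_since_sync=14400, offset_seconds=0.00001),
]
for client in estate:
    print(client.name, classify(client))
```

Checking the sync age before the offset is a deliberate choice in this sketch: a small offset recorded hours ago says little about how accurate the clock is now.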
If you would like to learn more about TimeKeeper or any other Keysight product please drop me a message or email me at firstname.lastname@example.org
Thanks for reading,