A Timely Discovery: Examining Our AMD 2nd Gen Ryzen Results
by Ian Cutress & Ryan Smith on April 25, 2018 11:15 AM EST
Last week, we published our AMD 2nd Gen Ryzen Deep Dive, covering our testing and analysis of the latest generation of processors to come out from AMD. Highlights of the new products included better cache latencies, faster memory support, an increase in IPC, an overall performance gain over the first generation products, new power management methods for turbo frequencies, and very competitive pricing.
In our review, we changed some of our testing. The big differences were two-fold: the jump from Windows 10 Pro RS2 to Windows 10 Pro RS3, and the inclusion of the Spectre and Meltdown patches to mitigate the potential security issues. These patches are still being rolled out by motherboard manufacturers, with the latest platforms being first in that queue. For our review, we tested the new processors with the latest OS updates and microcode updates, and re-tested the Intel Coffee Lake processors as well. Due to time restrictions, the older Ryzen 1000-series results were reused rather than re-tested.
Due to the tight deadline of our testing and results, we pushed both our CPU and gaming tests live without as much formal analysis as we typically like to do. All the parts were competitive; however, it quickly became clear that some of our results were not aligned with those from other media. Initially we were under the impression that this was a result of the Spectre and Meltdown (or Smeltdown) updates, as we were one of the few media outlets to go back and perform retesting under the new standard.
Nonetheless, we decided to undertake an extensive internal audit of our testing to ensure that our results were accurate and completely reproducible or, failing that, to understand why our results differed. No stone was left unturned: hardware, software, firmware, tweaks, and code. As a result of that process we believe we have found the reason our testing differed so much from the results of others, and interestingly it opened a sizable can of worms we were not expecting.
An extract from our Power testing script
What our testing identified is that the source of the issue comes down to timers. Windows uses timers for many things, such as synchronization or ensuring linearity, and some monitoring and overclocking software requires the timer with the finest granularity; specifically, it often requires the High Precision Event Timer (HPET). HPET is very important, especially when it comes to determining whether 'one second' of PC time is equivalent to 'one second' of real-world time. The way that Windows 8 and Windows 10 implement their timing strategy, compared to Windows 7, means that in rare circumstances the system time can drift over time, and this is often highly dependent on how the motherboard manufacturer implements certain settings. HPET is a motherboard-level timer that, as the name implies, offers a very high level of timer precision beyond what other PC timers can provide, and can mitigate this issue. This timer has been shipping in PCs for over a decade, and under normal circumstances it should be nothing but a boon to Windows.
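To make the 'one second of PC time versus one second of real-world time' comparison concrete, here is a minimal sketch of the arithmetic involved. It is not taken from our test scripts: the function names are hypothetical, and it assumes the reference interval comes from some trusted source outside the system under test.

```python
def drift_ppm(local_elapsed_s: float, reference_elapsed_s: float) -> float:
    """Return how far the local clock ran fast (+) or slow (-), in parts per million.

    local_elapsed_s:     interval as measured by the system under test
    reference_elapsed_s: the same interval as measured by a trusted reference
                         (assumed to be available; how it is obtained is out of scope)
    """
    if reference_elapsed_s <= 0:
        raise ValueError("reference interval must be positive")
    return (local_elapsed_s - reference_elapsed_s) / reference_elapsed_s * 1_000_000


def seconds_gained_per_day(ppm: float) -> float:
    """Convert a drift rate in ppm into seconds gained (or lost) over 24 hours."""
    return ppm * 86_400 / 1_000_000
```

For example, a local clock that reports 1.00001 seconds for every reference second is running about 10 ppm fast, which adds up to a little under a second per day.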
However, it sadly appears that reality diverges from theory, sometimes extensively so, and that our CPU benchmarks for the Ryzen 2000-series review were caught in the middle. Rather than HPET being a benefit to testing, our investigation found that when it is forced as the sole system timer, it can sometimes be a hindrance to system performance, particularly gaming performance. Worse, because HPET is implemented differently on different platforms, the actual impact of enabling it isn't even consistent across vendors: the effects of using HPET can vary from system to system and from one implementation to another.
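One way to get a rough feel for why the choice of timer can affect performance is to measure how long a single timer query takes. The sketch below is an illustration only, not our benchmark code: the function name is hypothetical, the figure it returns includes Python interpreter overhead, and which hardware timer the operating system uses behind the call depends on platform configuration.

```python
import time


def timer_call_cost_ns(clock=time.perf_counter_ns, calls: int = 100_000) -> float:
    """Estimate the average cost of one call to `clock`, in nanoseconds.

    The result includes Python loop and call overhead, so treat it as an
    upper bound that is useful for comparing configurations, not as an
    exact hardware latency.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        clock()
    end = time.perf_counter_ns()
    return (end - start) / calls


if __name__ == "__main__":
    info = time.get_clock_info("perf_counter")
    print("implementation:", info.implementation)
    print("reported resolution (s):", info.resolution)
    print("approx. cost per query (ns):", timer_call_cost_ns())
```

Running this on the same machine under different timer configurations and comparing the per-query cost gives a first indication of whether timer-heavy software is likely to be affected.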
And that brings us to the state of HPET, our Ryzen 2000-series review, and CPU benchmarking in general. As we'll cover in the next few pages, HPET plays a very necessary and often very beneficial role in system timer accuracy. That role is important enough that it's not desirable to completely disable HPET (indeed, in many systems this isn't even possible), and certain classes of software, such as overclocking and monitoring software, may even require it. However, for a few different reasons it can also be a drain on system performance, and as a result HPET shouldn't always be used. So let's dive into the subject of hardware timers, precision, Smeltdown, and how it all came together to make a perfect storm of volatility for our Ryzen 2000-series review.
242 Comments
JlHADJOE - Wednesday, April 25, 2018 - link
This.
It stands to reason that there is no actual performance difference, just an inconsistency in how time is measured. For that matter, we're not even sure whether either system is accurately timing itself.
IMO we shouldn't be trusting the benchmarked system's timer at all. Run an ntp server elsewhere on the network and get time from that before and after each benchmark. Likewise all gaming results really should go through an external tool like FCAT.
AFAIK it's only in the PC industry that benchmarks trust the system being measured to do book/time keeping for itself, which is kinda nuts considering the system clock will be going from base to boost and each core will be running at different frequencies, and the whole system is subject to thermal swings.
ReverendCatch - Wednesday, April 25, 2018 - link
Agreed, using the system to basically audit itself, is kind of a flaw in the design of testing.
However, easily applying a third party time index isn't so easy? I guess you could film each game's performance on the monitor with a high speed camera, but parsing that data would be nightmarish at best.
Easiest way would be to use an external computers (such as a web time server) timestamp before the test, and when it finishes, with the variation being the average ping time to the server. I guess. But that changes the way testing and benchmarks are done.
eddman - Wednesday, April 25, 2018 - link
The solution already exists. DigitalFoundry does it. They capture the output video with an external device and then run it through a special software that is able to determine frame times and produce a frame rate graph. This is how they manage to determine exact frame rates for consoles even.
Cooe - Wednesday, April 25, 2018 - link
FCAT testing. Super expensive to do right (requires beefy enough hardware on both the dedicated capture rig & its actual video capture card itself such that the video capture of whatever's being tested doesn't drop a single frame [as the capture rig isn't what's being tested/analyzed, it needs to be as close to perfect frame-pacing/capture as possible]), but suuuuper freaking awesome haha.
I'm pretty sure Digital Foundry's FCAT analysis software was even designed in-house. Lol Richard's steezy FCAT testing has become like his calling card by this point.
BillyONeal - Wednesday, April 25, 2018 - link
If HPET results in a system call, it is both. The Meltdown and Spectre mitigations make ordinary system calls *much* more expensive, and AMD's platform isn't mitigating those yet.
Topweasel - Wednesday, April 25, 2018 - link
More stringent testing of HPET needs to be done. It could be the case that everything is performing the same in all tests but the results are reporting the wrong numbers (which I would assume would be the case for the HPET not forced results). But forcing the HPET when not expected could be causing other timer related issues in the programming that could result in loss of performance.
ReverendCatch - Wednesday, April 25, 2018 - link
Yeah, basically. It's the time portion that is problematic. It's been the case since reviewers were reviewers and using FPS.
A more accurate measure would be frames rendered for the same, identical test, for each system. Most games do not provide such information or tests, though.
Alistair - Wednesday, April 25, 2018 - link
No, I believe he was saying that if you aren't messing around with extreme OC and altering base clocks etc., the time portion is always accurate. The raw performance does change from the CPU overhead of HPET in Intel systems, by a lot in some cases.
BillyONeal - Wednesday, April 25, 2018 - link
Not just extreme OC; anything that changes the clock speed, for example the CPU down clocking at idle, will change the rate of TSC relative to "real time". HPET exists to be the arbiter of "real time" unmoored from CPU frequency.