The 2020 Mac Mini Unleashed: Putting Apple Silicon M1 To The Test
by Andrei Frumusanu on November 17, 2020 9:00 AM EST
Last week, Apple made industry news by announcing new Mac products based on the company’s new Apple Silicon M1 SoC, marking the first step of a planned 2-year roadmap to transition from Intel-based x86 CPUs to the company’s own in-house designed microprocessors running on the Arm instruction set.
During the launch we had prepared an extensive article based on the company’s closely related Apple A14 chip, found in the new generation iPhone 12 phones. This includes a rather extensive microarchitectural deep-dive into Apple’s new Firestorm cores, which power both the A14 and the new Apple Silicon M1. I would recommend a read if you haven’t had the opportunity yet.
For the past few days, we’ve been able to get our hands on one of the first Apple Silicon M1 devices: the new Mac mini 2020 edition. While in our analysis article last week we had based our numbers on the A14, this time around we’ve measured the real performance on the actual new higher-power design. We haven’t had much time, but we’ll be bringing you the key datapoints relevant to the new Apple Silicon M1.
Apple Silicon M1: Firestorm cores at 3.2GHz & ~20-24W TDP?
During the launch event, one thing that was, in typical Apple fashion, missing from the presentation was any detail on the clock frequencies of the design, as well as the TDP it can sustain at maximum performance.
We can confirm that in single-threaded workloads, Apple’s Firestorm cores now clock in at 3.2GHz, a 6.7% increase over the 3GHz frequency of the Apple A14. As long as there's thermal headroom, this clock also applies to all-core loads, where alongside the 4x 3.2GHz performance cores we also see the 4x Icestorm efficiency cores at 2064MHz, quite a lot higher than the 1823MHz on the A14.
Alongside the four performance Firestorm cores, the M1 also includes four Icestorm cores which are aimed at low idle power and increased power efficiency for battery-powered operation. Both the 4 performance cores and 4 efficiency cores can be active in tandem, meaning that this is an 8-core SoC, although performance throughput across all the cores isn’t identical.
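As a quick check of the clock figures quoted above, the relative uplift over the A14 can be computed directly. This is only a minimal sketch: the helper name `percent_uplift` is our own illustration, and the inputs are simply the frequencies stated in the text.

```python
def percent_uplift(new_mhz: float, old_mhz: float) -> float:
    """Return the relative increase of new_mhz over old_mhz, in percent."""
    return (new_mhz / old_mhz - 1.0) * 100.0


# Frequencies as quoted in the article (MHz).
firestorm_uplift = percent_uplift(3200, 3000)  # M1 vs A14 performance cores
icestorm_uplift = percent_uplift(2064, 1823)   # M1 vs A14 efficiency cores

print(f"Performance cores: +{firestorm_uplift:.2f}%")
print(f"Efficiency cores:  +{icestorm_uplift:.2f}%")
```

The efficiency cores see the larger relative clock bump of the two, at a little over 13%.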
The biggest question during the announcement event was the power consumption of these designs. Apple had presented several charts including performance and power axes, however we lacked the comparison data needed to come to any proper conclusion.
As we had access to the Mac mini rather than a MacBook, power measurement was rather simple, as we can just hook up a meter to the AC input of the device. It should be noted, with a huge disclaimer, that because we are measuring AC wall power here, the power figures aren’t directly comparable to those of battery-powered devices, as the Mac mini’s power supply will incur an efficiency loss greater than that of mobile devices, nor to the TDP figures that contemporary vendors such as Intel or AMD publish.
It’s especially important to keep in mind that what we usually refer to as TDP in processors is actually only a subset of the figures presented here, as beyond just the SoC we’re also measuring DRAM and voltage regulation overhead, something which is included neither in TDP figures nor in your typical package power readout on a laptop.
Starting off with an idle Mac mini in its default state when powered on, connected via HDMI to a 2560p144 monitor, with Wi-Fi 6 and a mouse and keyboard, we’re seeing total device power at 4.2W. Given that we’re measuring AC power into the device, which can be quite inefficient at low loads, this makes quite a lot of sense and represents an excellent figure.
This idle figure also serves as a baseline for following measurements where we calculate “active power”, meaning our usual methodology of taking total power measured and subtracting the idle power.
During average single-threaded workloads on the 3.2GHz Firestorm cores, such as GCC code compilation, we’re seeing device power go up to 10.5W with active power at around 6.3W. The active power figure is very much in line with what we would expect from a higher-clocked Firestorm core, and is extremely promising for Apple and the M1.
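The “active power” methodology described above is a plain subtraction, sketched here under the assumption that the 4.2W idle reading is a stable baseline. The names `IDLE_W` and `active_power` are hypothetical and only serve the illustration.

```python
IDLE_W = 4.2  # idle wall power of the Mac mini in watts, as measured in the article


def active_power(total_w: float, idle_w: float = IDLE_W) -> float:
    """Active power = total measured wall power minus the idle baseline."""
    if total_w < idle_w:
        raise ValueError("total power is below the idle baseline")
    return total_w - idle_w


# Single-threaded GCC compilation: 10.5W measured at the wall.
print(f"{active_power(10.5):.1f} W active")
```

Note that the result still includes DRAM, voltage regulation and power supply losses, so it is an upper bound on what the SoC itself draws.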
In workloads which are more DRAM heavy and thus incur a larger power penalty on the LPDDR4X-class 128-bit 16GB of DRAM on the Mac mini, we’re seeing active power go up to 10.5W. Already with these figures the new M1 is mighty impressive, drawing less than a third of the power of a high-end Intel mobile CPU.
In multi-threaded scenarios, power highly depends on the workload. In memory-heavy workloads where the CPU utilisation isn’t as high, we’re seeing 18W active power, going up to around 22W in average workloads, and peaking around 27W in compute heavy workloads. These figures are generally what you’d like to compare to “TDPs” of other platforms, although again to get an apples-to-apples comparison you’d need to further subtract some of the overhead as measured on the Mac mini here – my best guess would be a 20 to 24W range.
Finally, on the part of the GPU, we’re seeing a lower power consumption figure of 17.3W in GFXBench Aztec High. This would contain a larger amount of DRAM power, so Apple’s GPU is definitely extremely low-power, drawing far less than the peak power that the CPUs can draw.
Memory Differences
Besides the additional cores on the part of the CPUs and GPU, one main performance factor of the M1 that differs from the A14 is the fact that it’s running on a 128-bit memory bus rather than the mobile 64-bit bus. Across 8x 16-bit memory channels and at LPDDR4X-4266-class memory, this means the M1 hits a peak of 68.25GB/s memory bandwidth.
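The peak bandwidth figure follows from the bus width and the transfer rate. The sketch below reproduces the arithmetic; the function name is our own, and the 64-bit comparison assumes the same 4266MT/s transfer rate, which the text does not state explicitly for the A14.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


channels, channel_width_bits = 8, 16  # 8x 16-bit channels = 128-bit bus
m1_peak = peak_bandwidth_gbs(channels * channel_width_bits, 4266)

# Assumption for comparison only: a 64-bit bus at the same transfer rate.
mobile_peak = peak_bandwidth_gbs(64, 4266)

print(f"128-bit bus: {m1_peak:.3f} GB/s")
print(f" 64-bit bus: {mobile_peak:.3f} GB/s")
```

Doubling the bus width at the same transfer rate doubles the theoretical peak, which is where the M1 gains its headroom over a 64-bit mobile configuration.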
In terms of memory latency, we’re seeing a (rather expected) reduction compared to the A14, measuring 96ns at 128MB full random test depth, compared to 102ns on the A14.
Of further note is the 12MB L2 cache of the performance cores, although here it seems that Apple continues to do some partitioning as to how much a single core can use, as we’re still seeing some latency uptick after 8MB.
The M1 also contains a large SLC cache which should be accessible by all IP blocks on the chip. We’re not exactly certain, but the test results do behave a lot like on the A14 and thus we assume this is a similar 16MB chunk of cache on the SoC, as some access patterns extend beyond that of the A14, which makes sense given the larger L2.
One aspect we’ve never really had the opportunity to test is exactly how good Apple’s cores are in terms of memory bandwidth. Inside of the M1, the results are ground-breaking: A single Firestorm achieves memory reads up to around 58GB/s, with memory writes coming in at 33-36GB/s. Most importantly, memory copies land at 60 to 62GB/s depending on whether you’re using scalar or vector instructions. The fact that a single Firestorm core can almost saturate the memory controllers is astounding and something we’ve never seen in a design before.
Because one core is able to make use of almost the whole memory bandwidth, having multiple cores access things at the same time doesn’t actually increase the system bandwidth, but due to congestion actually lowers the effective achieved aggregate bandwidth. Nevertheless, this 59GB/s peak bandwidth of one core is essentially also the speed at which memory copies happen, no matter the number of active cores in the system, again, a great feat for Apple.
Beyond the clock speed increase and the L2 increase, this memory boost is also very likely to help the M1 differentiate its performance beyond that of the A14, and offer up tough competition against the x86 incumbents.
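To put the single-core numbers in perspective, they can be expressed as a share of the 68.25GB/s theoretical peak quoted earlier. This is an illustrative sketch only: `share_of_peak` is a hypothetical helper, and the inputs are the rounded figures from the text rather than raw measurement data.

```python
PEAK_GBS = 68.25  # theoretical peak memory bandwidth quoted in the article


def share_of_peak(achieved_gbs: float, peak_gbs: float = PEAK_GBS) -> float:
    """Return achieved bandwidth as a percentage of the theoretical peak."""
    return achieved_gbs / peak_gbs * 100.0


# Single Firestorm core figures as quoted in the text (GB/s).
single_core = {
    "read": 58.0,
    "write (low)": 33.0,
    "write (high)": 36.0,
    "copy (low)": 60.0,
    "copy (high)": 62.0,
}

for name, gbs in single_core.items():
    print(f"{name:>12}: {share_of_peak(gbs):.0f}% of peak")
```

Reads and copies from one core land in the region of 85 to 91% of the theoretical peak, which is why adding more cores has little room left to raise aggregate bandwidth.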
682 Comments
andrewaggb - Tuesday, November 17, 2020
Pretty much. There's no reason to think the cores will be better on a chip with more of them. The only thing that is a possibility (certainly not a given) is that the clock speed will be substantially higher which should put Apple in the lead. That said, the previous review showed a very modest IPC improvement this time around even with huge reorder buffers and an 8-wide design. So I suspect apple's best course for improved performance is higher clocks but that always runs counter to power usage so we'll see. AMD and Intel will probably have to go wider to compete with Apple for single thread IPC in the long run.
GPU-wise it's pretty decent for integrated graphics but if you want to play games you shouldn't be running Mac OS or using integrated graphics. It'll be interesting to see if Apple's market share jumps enough to pull in some game development.
Eric S - Tuesday, November 17, 2020
I don’t think any of these benchmarks are optimized for TBDR. Memory bound operations could be significantly faster if optimized for the chip. Many render pipelines could run 4X faster. I’m curious to see iOS graphics benchmarks run on this that are more representative. Of course I hope we see apps and games optimized for TBDR as well.
Spunjji - Thursday, November 19, 2020
@andrewaggb - Agreed entirely. The cores themselves aren't going to magically improve, and it's not clear from the meagre scaling between A14 at 5-10W and M1 at 10-25W that they can make them a lot faster with clock speed increases. But a chip with 12 Firestorm cores and 4 Icestorm cores would be an interesting match for the 5900X, and if they beef the GPU up to 12 cores with a 192bit memory interface and/or LPDDR5 then they could have something that's actually pretty solid for the vast majority of workloads.
I don't think games are going to be moving en-masse from Windows any time soon, but I guess we'll see as time goes on.
Stephen_L - Tuesday, November 17, 2020
I feel very lucky that I didn’t use your mindset when I decided to buy AMD R5-1600X instead of an Intel i5 for my pc.
Spunjji - Thursday, November 19, 2020
@YesYesNo - you responded to a comment about how they *will* be releasing faster chips by talking about how they haven't done so yet. This is known. You're kind of talking past the people you're replying to - nobody's asking you to reconsider how you feel about the M1 based on whatever comes next, but it doesn't make sense to assume this is the absolute best they can do, either.
andreltrn - Tuesday, November 17, 2020
This is not their High-end chip! This is a chip for low-end devices such as fan-less laptops. They attacked that market first because this is where they will make the most money. High-end pros won't go for a new platform until it is proven and they are 100% sure that they will be able to port their workflow to it. They are starting with the low-end and will probably follow up with a 10 or 12 core chip in the spring for the high-end laptop and the iMac.
vlad42 - Tuesday, November 17, 2020
I just do not see Apple using any but a low power mobile chip for consumer devices.
Think about it, about half the time we did not see Apple release a tablet optimized A#X chip for the iPad. In their recent earnings reports the combined iPad and Mac revenue is still only half that of the iPhone. By using the same chip for the iPad and all Mac machines, except the Mac Pro, maybe Apple will actually update the soc every year.
If Apple were to provide a higher performing chip for consumer devices, then it would probably be updated only once every few years. Apple just does not make enough money from high end laptops and the iMac to justify dedicated silicon for those products without pulling an Intel and reusing the soc for far too many product cycles. Just look at the Mac Pros. The engineering resources needed to design the most recent x86 Mac Pro are a drop in the bucket compared to designing and taping out a new soc. Despite this, Apple has only been updating the Mac Pro lineup once every 5-7 years!
The problem is that by the time they are willing to update those theoretical high end consumer chips, they will have long since been made obsolete. Who in their right mind would purchase a "high end" laptop or an iMac if it is outperformed by an entry level Air or an iPad or was lacking in important features (hardware codec support, the next stupid version of HDCP needed for movies/TV shows, etc.). Even worse for Apple is if their customers buy a non-Apple product instead. Much of Apple's current customer base does not actually need a Mac. They would be fine with any decent quality high end laptop or any all-in-one with a screen that is not hot garbage.
Eric S - Tuesday, November 17, 2020
They are working on updates for the high end. I expect they will be amazing. At least two higher end chips are in late design or early production.
Eric S - Tuesday, November 17, 2020
You are probably right in that they may only be updated every few years, but the same can be said of the Xeon which also skips generations.
vlad42 - Tuesday, November 17, 2020
But the Xeon chips are a bad example because Intel shot themselves in the foot through a combination of complacency, tying their next gen products too tightly to the manufacturing process and a shortage of 14nm capacity. We used to get new Xeons if not every year, then at least every time there was an architecture update.
A better more recent comparison would be with AMD which has always updated the Threadripper lineup. Granted, we technically do not know if the Threadripper Pro lineup will be updated every year, but it very likely will be.