Qualcomm Previews Snapdragon X Elite SoC: Oryon CPU Starts in Laptopsby Ryan Smith on October 24, 2023 3:00 PM EST
While Qualcomm has become wildly successful in the Arm SoC market for Android smartphones, their efforts to parlay that into success in other markets has eluded them so far. The company has produced several generations of chips for Windows-on-Arm laptops, and while each has incrementally improved on matters, it’s not been enough to dislodge a highly dominant Intel. And while the lack of success of Windows-on-Arm is far from solely being Qualcomm’s fault – there’s a lot to be said for the OS and software – silicon has certainly played a part. To make serious inroads on the market, it’s not enough to produce incrementally better chips – Qualcomm needs to make a major leap in performance.
Now, after nearly three years of hard work, Qualcomm is getting ready to do just that. This morning, the company is previewing their upcoming Snapdragon X Elite SoC, their next-generation Arm SoC designed for Windows devices. Based on a brand-new Arm CPU core design from their Nuvia subsidiary dubbed “Oryon”, the Snapdragon X Elite is to be the tip of the iceberg for a new generation of Qualcom SoC designs. Not only is it the heart and soul of Qualcomm’s most important Windows-on-Arm SoC to date, but it will eventually be in smartphones and a whole lot more.
But we’re getting ahead of ourselves. For now let’s focus on the Snapdragon X Elite SoC and the Oryon cores underpinning it.
While this morning’s announcement from Qualcomm is far from a deep dive on the hardware, it’s our first look at what will be Qualcomm’s flagship SoC, and the new CPU cores within it. With a projected launch date of mid-2024, the first laptops based on the SoC are still several months away from hitting retail shelves – and about a year delayed overall. None the less, Qualcomm has finished their silicon development work, and with the chip’s specifications locked down, the company is now on to polishing things for a launch next year.
The Oryon CPU cores within the Snapdragon X Elite are the culmination of Qualcomm’s Nuvia acquisition from early 2021, and an even longer period of work for the Nuvia team. The ambition of the team, and the importance of the custom Arm architecture CPU cores, cannot be overstated. So the Snapdragon X Elite is going to be an interesting chip on multiple levels, as it sets the pace for the next generation of Qualcomm chip designs.
|Snapdragon Compute (Windows-on-Arm) Silicon
|Snapdragon X Elite
2C Turbo: 4.3GHz
|4 x C-A76
|4 x C-A76
|4 x C-A55
|4 x C-A55
SD X Elite
8cx Gen 3
45 TOPS (INT8)
|Hexagon 8cx Gen 3
|8 x 16-bit
|8 x 16-bit
|8 x 16-bit
|8 x 16-bit
|Wi-FI 7 + BE 5.4
|Wi-Fi 6E + BT 5.1
|Wi-Fi 6 + BT 5.1
|Wi-Fi 5 + BT 5.0
Starting with a high-level look at the chip, the Snapdragon X Elite is a high-performance SoC designed to power Windows-on-Arm laptops. Qualcomm isn’t listing any official TDPs, but the company has told us that the Elite is designed to scale across a “broad range” of thermal designs. Active cooling will be needed to get the most out of the Elite, but according to Qualcomm, passive/fanless designs are possible as well, and we should expect to see some retail devices designed as such.
Qualcomm is fabbing the chip on an unspecified 4nm process. Given their previous performance issues with Samsung’s 4nm line, it’s a very safe bet that they’re building this chip at TSMC – possibly using the N4P line. The silicon itself is a traditional monolithic die, so there is no use of chiplets or other advanced packaging here (though the wireless radios are discrete).
CPU: Oryon By The Dozen
The star of the show (if you’ll forgive the pun) is Oryon, Qualcomm’s new custom-designed Arm CPU core. Designed by the Nuvia team that Qualcomm acquired in 2021, Oryon is the first high-performance, fully-custom Arm CPU core created by Qualcomm in several years. And following multiple generations of lackluster Snapdragon Compute SoCs built out of Arm Cortex-A/X designs and functionally bigger versions of Qualcomm’s mobile SoCs, Oryon marks a major change in direction for Qualcomm.
Being that this is a preview, there are no significant architectural details to share on Oryon at this time. We don’t know the width, or various buffer sizes, execution ports, etc. But what we do know is that Qualcomm didn’t aim low with this SoC – the Nuvia team was working on a server-grade CPU core prior to their acquisition, and that kind of aggressive design has carried over into Oryon as well. Which, after all, was one of the major goals of Qualcomm’s acquisition, as they have desired a high performance CPU core to push them ahead of the other laptop (and eventually mobile) chip makers.
The Snapdragon X Elite SoC ships with 12 Oryon CPU cores – and that’s it. Unlike Qualcomm’s 8cx family of designs, there are no distinct “efficiency” and “performance” cores based on different microarchitectures; this is a homogenous CPU design, more akin to traditional PC processors. This means that Oryon needs to pull double duty, excelling in performance in heavy workloads without chewing up a bunch of power in light workloads.
The Oryon CPU cores are broken up into three clusters of 4 cores each. We’re still waiting on further technical details, of course, but it’s a safe assumption that each cluster is on its own power rail, so that unneeded clusters can be powered down when only a handful of cores are called for.
Just on this basis alone, Snapdragon X Elite looks like a far more potent performer than the 8CX chips it replaces. The 8cx Gen 3 offered just 4 performance cores (Cortex-X1) and another 4 eficiency cores (Cortex-A78), so Snapdragon X Elite will hit the streets with 50% more CPU cores never mind the higher performance of those cores. For a laptop chip, Qualcomm is throwing a lot of CPU cores at the matter.
With regards to clockspeeds, in an all-core turbo workload, all 12 Oryon CPU cores can hit run at up to 3.8GHz, power and thermal headroom permitting. Meanwhile in lighter workloads, the chip supports turboing up to 4.3GHz on 2 cores. Qualcomm’s slide on this matter shows a core from each cluster, but it’s unclear whether this is some kind of prime/favored core in action (where only certain cores are designed/validated for those speeds) or if it’s simply a stylistic choice.
Either way, Qualcomm is aiming to turbo to relatively high clockspeeds for their laptop chip, a notable distinction from their much more modestly clocked 8CX chips. While high clockspeeds alone do not make for a fast chip, one of the performance bottlenecks the 8CX chips were their pokey clockspeeds, so if Oryon offers as high an IPC rate as we suspect it will, then this would go a long way towards boosting Qualcomm’s CPU performance to compete with the industry’s strongest players.
Memory: 128-bit LPDDR5x
Feeding the beastly Oryon CPU cores (as well as the rest of the chip) is a 128-bit LPDDR5x memory bus. This is less remarkable than the CPU side of the chip, but it’s important to note all the same. With the previous 8CX chips only supporting LPDDR4x, this brings Qualcomm back to parity with the latest PC chips in terms of memory technology support. And with supported data rates as high as LPDDR5x-8533, this will give Qualcomm one of the fastest memory controllers on the market.
Qualcomm is also quoting a total of 42MB of cache in the system sitting between the various processor blocks and system memory. Given the explicit mention of “total cache”, this is almost certainly L2 + L3. Previous Qualcomm designs have offered a 6MB shared L3 (last level) cache. If that’s the case again here, then that would mean there’s 3MB of L2 cache available for each CPU core – or some permutation thereof.
GPU: Latest Generation Adreno
On the graphics side of matters, Snapdragon X Elite incorporates Qualcomm’s latest generation Adreno GPU. As is typical for Qualcomm in these matters, the company is saying virtually nothing about the architecture employed here, though it goes without saying that this is the latest and greatest iteration of Qualcomm’s in-house GPU design.
From a feature perspective, this is a DirectX 12-class GPU with ray tracing support, mirroring the capabilities Qualcomm introduced with last year’s Snapdragon 8 Gen 2 mobile SoC. Within the Windows ecosystem, it will almost certainly qualify as a DirectX 12 Ultimate (feature level 12_2) design.
Qualcomm is quoting a single throughput figure for the design: 4.6 TFLOPS at an unspecified bit depth/format (we’d guess FP32). Qualcomm has not previously disclosed similar figures for the 8CX chips, so it’s hard to say how this will compare. Or even how it will compare to other integrated GPUs, since there’s a lot more to real-world GPU performance than pure FLOPS.
The display controller portion of the GPU offers support for up to 4 DisplayPort displays. Besides an internal display for the laptop, it can drive a further 3 external displays (all DP 1.4), with one output being 5K capable, while the rest are 4K.
Finally, the SoC is getting Qualcomm’s latest video processing block (VPU) as well. This latest design not only support AV1 decoding, but in a first for a Qualcomm SoC, AV1 encoding as well.
NPU: Hitting Hard with Hexagon
Next to the use of Oryon CPU cores, Qualcomm’s other big bet with the Snapdragon X Elite SoC is on the AI/neural processing unit side of things with their latest generation Hexagon NPU. Qualcomm is expecting that AI use will continue to rapidly grow over the next few years, and that the next big push is going to be AI models running locally on users’ systems. So they have invested a significant amount of resources in bulking up their Hexagon NPU for this generation of chips (X Elite and 8 Gen 3).
The end result is a heavily revised NPU, which should greatly exceed the 8CX Gen 3’s NPU performance. Qualcomm is quoting 45 TOPS of performance here for modest precision INT8, whereas 8CX Gen 3 was previously quoted at 15 TOPS for an unspecified data format.
Unlike their CPU and GPU, Qualcomm is sharing some architectural details here about the NPU, and what they’ve done to boost its performance. The tensor accelerator block, used in the densest matrix math, is outright 2.5x faster than before. Backing that (and the rest of the NPU) is a 2x larger shared memory/cache (though Qualcomm is not disclosing the actual size). Qualcomm is targeting large language models (LLMs) in particular with this change, as these are notoriously memory bound; according to the company, the chip will have enough resources to run a 13 billion parameter Llama 2 model locally.
Qualcomm has also made some power delivery changes to help drive more performance/efficiency out of the NPU. The power-hungry tensor block is now on its own power rail, with the rest of the NPU sitting on a separate shared rail. The company has also made some further undisclosed improvements to how they handle micro-tiling of inferencing workloads, which directly impacts how well they can split up workloads to keep the various sub-blocks of the NPU as busy as possible while minimizing intermediate memory operations.
I/O: USB4, PCIe 4, & Discrete Wi-Fi 7
Rounding out the Snapdragon X Elite, let’s talk I/O.
For internal I/O, the SoC offers PCIe 4.0 connectivity for NVMe storage. Elsewhere, the company is using PCIe 3 to supply connectivity to their modem and Wi-Fi solutions. No mention has been made of whether there are any free PCIe lanes for further peripherals.
For external I/O, the SoC supports USB4. According to Qualcomm, it can drive up to 3 such Type-C ports, and there are also a pair of USB 3.2 Gen2 outputs, and a single USB 2.0 output for internal use.
As noted earlier, both Wi-Fi and the modem are discrete for this product. The chip is intended to be paired with Qualcomm’s FastConnect 7800 silicon in the form of an M.2 card. The 7800 is their latest-generation Wi-Fi 7 solution, with support for 4 spatial streams as well as Bluetooth 5.4. The modem pairing is the Snapdragon X65, a high-performance 5G modem which was also available for the 8CX Gen 3.
The fact that neither wireless system is integrated into the SoC is unusual for Qualcomm, but perhaps not too surprising since they want to bring the Elite to market ASAP. Integrating these modules would take further time, and as a laptop SoC, Qualcomm doesn’t need to be as space efficient. In any case, the official line from Qualcomm is that the discrete modem is for OEM flexibility – to give OEMs the option to either include a modem or not – though Qualcomm of course will be strongly encouraging OEMs to include one as a major feature differentiator of the platform.
As we don’t have enough architectural details to make any meaningful performance projections, the best thing we have for now are Qualcomm’s vague comparisons to their competitors. This is also the closest thing Qualcomm has provided to energy efficiency data for the chip (though, as always, target clockspeeds for a SKU play a massive part there).
With 12 performance cores, Qualcomm is pushing hard on multi-threaded performance. In fact, multi-threaded performance is the only CPU performance comparisons Qualcomm makes, as there are no single-threaded comparisons to speak of. Make of that what you will.
Against what is implied to be an Intel 12 core mobile CPU design, Qualcomm is reporting that Snapdragon X Elite delivers 2x the multi-threaded performance in Geekbench 6. Or at iso-performance, they hit the same mark at one-third the power consumption.
Even against Intel’s best 14-core (H-class) chips, Qualcomm still reports that they lead by 60% in performance, and again are consuming one-third the power at iso-performance. Undoubtedly, a lot of this is down to the process node used, as TSMC N4 should be delivering a significant advantage over the Intel 7 process used on Intel’s current chips. This is also why the “moving target” aspect is so critical, as Snapdragon X Elite should be competing with the Intel 4 based Meteor Lake lineup by the time it launches next year.
More interesting, perhaps, is that Qualcomm is reporting a 50% multi-threaded performance advantage over an unspecified "Arm-based competitor,” This is meant to imply Apple, but depending on just how vague Qualcomm wishes to be, MediaTek does offer some Windows-on-Arm chips as well.
Qualcomm also expects to lead in GPU performance in 3DMark Wildlife Extreme. Which again, with a process node advantage and a tendency to build bigger iGPUs overall, is not surprising.
As always, these claims should be taken with a large grain of salt, especially for a platform that is still several months away from launching.
Snapdragon X Elite: Coming Mid-2024
Wrapping things up, Qualcomm is at this point putting the final touches on the Snapdragon X Elite. The company has deemed it one of their “most pivotal platform announcements in the company's recent history”, and for good reason. The Oryon CPU core being introduced here will eventually be at the heart of a good deal more products, so how competitive Oryon is will make or break Qualcomm’s next few generations of designs.
Devices based on the Snapdragon X Elite should be available in mid-2024. Which on that schedule, should see the Snapdragon X Elite competing against Intel’s Meteor Lake (Core Ultra) chips, AMD’s Phoenix chips (Ryzen Mobile 7000), and whatever the latest available iteration is of Apple’s M-series chips.