AMD Evolving Fast to Survive in the Server Market Jungle
by Johan De Gelas on June 18, 2013 6:00 AM EST
Low Power Server CPUs
When you read about micro servers, the spotlight is on very low power Atom and ARM based machines. The reality of the server market is, however, very different. As you can probably tell from the reasoning above, a server with an Atom or ARM CPU is hardly a good solution for homogeneous web farms.
In fact, with the exception of SeaMicro, micro servers have largely failed in that market. Quite a few server vendors have offered dense Atom-based solutions, but those solutions were poorly received. Simply cramming tens of Atom-based servers into a small chassis is a pretty bad idea:
1. The single-threaded performance of the Atom is too low, even for web farms.
2. The power you save by using lower-power CPUs is negated by cable management costs and the larger number of PCBs and PHYs, as the back-of-the-envelope sketch below illustrates.
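To make that second point concrete, here is a minimal back-of-the-envelope model of how fixed per-node overhead erodes the perf/watt advantage of many small nodes. Every number in it is an illustrative assumption, not a measured figure from any real product:

```python
# Back-of-the-envelope model: why cramming many low-power nodes into a
# chassis can lose on performance per watt. All numbers are illustrative
# assumptions, not measurements.

def perf_per_watt(nodes, perf_per_node, cpu_watts, overhead_watts):
    """Aggregate performance divided by total wall power.

    overhead_watts models the fixed per-node cost that does not shrink
    with the CPU: PHYs, extra PCBs, VRM losses, fans, and cabling.
    """
    total_perf = nodes * perf_per_node
    total_power = nodes * (cpu_watts + overhead_watts)
    return total_perf / total_power

# 40 Atom-class nodes: the CPUs sip power, but the fixed overhead dominates.
atom_farm = perf_per_watt(nodes=40, perf_per_node=1.0,
                          cpu_watts=8, overhead_watts=12)

# 5 low-power Xeon-class nodes: the same kind of overhead is amortized
# over roughly ten times the per-node performance.
xeon_farm = perf_per_watt(nodes=5, perf_per_node=10.0,
                          cpu_watts=45, overhead_watts=15)

print(f"Atom farm: {atom_farm:.3f} perf/W")  # ~0.050
print(f"Xeon farm: {xeon_farm:.3f} perf/W")  # ~0.167
```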
To sum it up: the performance per watt of those servers did not and will not thrill anyone. SeaMicro, the pioneer of this niche market, was successful despite the Atom processors inside its micro servers, not thanks to them. Once SeaMicro offered low-power Xeons instead of Atoms inside its Borg-cube-inspired server, demand for its products really took off.
And even Calxeda, the champion of low power servers, admits that the current "ultra low power but low performance" micro server market is a small one. Calxeda has high expectations for its next generation of servers, as they will be based on the more powerful Cortex A15 and A57 CPUs.
SeaMicro and Calxeda succeeded where others failed because they understood that an optimized PCB and network fabric were necessary to make the micro server concept work. SeaMicro reduced the server board to its bare minimum ("credit card size"), turned unnecessary features off, and connected all I/O via a high performance 3D torus interconnect.
Calxeda integrated several servers on one PCB and connected them together with a 2D torus network fabric. The result was a low power draw per server, not just per CPU. Once you add a CPU with good enough single-threaded performance, things get very interesting.
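For readers unfamiliar with torus fabrics, here is a minimal sketch of how addressing works in one. The 8x8x8 geometry and coordinate scheme are assumptions for illustration, not SeaMicro's or Calxeda's actual implementation:

```python
# Minimal sketch of a 3D torus fabric like the one SeaMicro used to link
# its node cards. The 8x8x8 geometry and coordinate addressing here are
# illustrative assumptions, not the actual SeaMicro layout.

DIMS = (8, 8, 8)  # 512 nodes, wrap-around links in every dimension

def neighbors(coord, dims=DIMS):
    """Return the six directly connected neighbors of a node."""
    result = []
    for axis in range(len(dims)):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]  # torus wrap-around
            result.append(tuple(n))
    return result

def hops(a, b, dims=DIMS):
    """Shortest hop count between two nodes, accounting for wrap-around."""
    return sum(min((x - y) % d, (y - x) % d)
               for x, y, d in zip(a, b, dims))

print(neighbors((0, 0, 0)))        # includes wrapped neighbors like (7, 0, 0)
print(hops((0, 0, 0), (7, 7, 7)))  # 3 hops thanks to wrap-around, not 21
```

The wrap-around links are the point: they keep worst-case hop counts low while the fabric itself, rather than external switches and cables, carries the traffic, which is exactly the overhead the first-generation Atom boxes never addressed.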
Comments
silverblue - Wednesday, June 19, 2013
The 14nm node has been delayed until 2015. AMD's power usage would be better if they didn't implement CnQ on a per-module basis. Sharing resources is one thing, but power circuitry? The dynamic L2 is a big step in the right direction though, especially considering there's up to 8MB of it. They could always go inclusive on the caches as they did with Jaguar.
Klimax - Wednesday, June 19, 2013
14nm wasn't delayed. (Only desktop won't receive it.)

1008anan - Tuesday, June 18, 2013
Johan, in the latest Top500 supercomputer list, three supercomputers sport 22 nm Intel Ivy Bridge E5 Xeons, including the most powerful supercomputer in the world (which uses 32 thousand Ivy Bridge E5s). The Ivy Bridge E5s, although not yet listed on the Intel price list, are already available to select customers. Ivy Bridge E7s come out in weeks. AMD's Warsaw gross margins will be squeezed.
Let us assume you are right about power efficient micro-servers. Here is the problem:
--Haswell ultra low power Xeon E3s come out in weeks
--Avoton is likely released by September.
What do you think this will do to Seattle and Berlin CPU gross margins, keeping in mind how much GlobalFoundries rips AMD off on fabrication? Seattle CPU gross margins will also be reduced by the fact that it cannot run native x86 code.
That said, Seattle and Berlin CPUs will do okay in the second half of 2014 (I doubt AMD will be able to sell them in volume before then). The real Seattle killers will be ultra low power 14 nm Broadwell Xeon E3s and Airmont, Avoton's 14 nm successor. Both should be available in Q4 2014.
In 2015, Intel will release micro servers powered by 14 nm multi-socket Xeon Phis. Not PCI Express Xeon Phis, but micro servers consisting entirely of Xeon Phis that are better at processing x86 code than any ARM Holdings derivative.
The medium-term Intel roadmap for new server chips and server uncore functionality will be extremely difficult for AMD to compete against.
JohanAnandtech - Thursday, June 20, 2013
Seattle is not built by GlobalFoundries, and I imagine the R&D on Seattle is quite cheap. After all, it is not as if AMD had to reinvent the A57 core. And a lot can happen in three quarters, so I don't think Seattle should be compared to Airmont.

bji - Tuesday, June 18, 2013
I think AMD still has a chance. Intel's most recent releases have shown that per-core performance is asymptotically approaching the fastest x86 is going to get. Haswell was insignificantly faster than Ivy Bridge, which itself was only marginally faster than Sandy Bridge. It's clear that the year-on-year speed increases are quickly approaching zero.

In this environment AMD has some room to catch up. If they can hold on for a few more years, they should have time to improve until their IPC is as good as Intel's, or close enough that it doesn't really matter.
Depending on how you look at it, it either sucks or is great that x86 is approaching its performance limit. Most would say it sucks because they always want something faster (even if 99.9% of the time they don't actually need it!). I am actually kind of happy that things are slowing down considerably. Rather than focusing on performance for performance's sake, perhaps we can start focusing on the entire computing infrastructure and make it work better for people instead of just faster. Lower power devices, more ubiquitous devices, more useful, more user friendly... these are all great trends, and it's where the future competition in the computing world will predominantly be.
It reminds me of gaming consoles and their effect on games. I know a lot of people dislike the fact that consoles set a performance standard that it was no longer beneficial for most games to go beyond, leading to a sort of performance stagnation. On the other hand, companies could spend more resources on making better games and fewer on ever more detailed eye candy. I actually appreciate that, because I think gameplay is more important than graphics.
Anyway, I don't think you can start sounding AMD's death knell yet. They're going to reach near performance parity with Intel in a few years just because they have more headroom for performance improvement than Intel as we approach the limit. At that point, x86 won't really be getting significantly faster year by year but it sure will get cheaper. I am sure we can all appreciate at least that aspect of it.
inighthawki - Tuesday, June 18, 2013
x86 is far from hitting its performance limit. The reason Intel's CPUs haven't drastically improved performance in the last few generations is that their efforts have been focused elsewhere, most notably the integrated graphics chip, which has made significant strides since Sandy Bridge and consumes a large portion of the die. Haswell put most of its focus on Iris and power consumption. Ivy Bridge was already fast enough for 99.99% of the user base, so it makes strong business sense to focus efforts elsewhere and spend those transistors on more important matters.

If Intel wanted to increase CPU performance, they could. But they have no reason to bother. Other features are more important, and their competition (AMD) is still a generation or two behind in performance. There's not a whole lot of pressure for faster CPUs.
1008anan - Tuesday, June 18, 2013
inighthawki is right. In addition to what you said, Intel has focused on reducing the TDP envelope of its CPU cores rather than increasing their single-threaded performance. Since Intel uses the same architecture for everything from E7 Xeons to the ultra-low power Haswell SoCs that power tablets (two 256-bit SIMD units with FMA), the single-threaded performance of E7 Xeons is constrained by the need to fit Haswell and Broadwell into tablets.

However, this will not last. Intel now has three x86 architectures:
--main Core architecture
--Avoton and Avoton successors
--Xeon Phi (including multisocketed E5 and E7 versions)
Intel will be able to focus on widening the main core's vector units from 512 to 1024 bits in future micro-architectures, as well as improving macro fusion, branch predictors, and buffer sizes (larger unified reservation stations, instruction decode queues, reorder buffers), and increasing the number of parallel execution ports, the L1, L2, and L3 caches, etc.
The reason is that Avoton's successors will focus on low TDP parts over time, freeing Intel to prioritize single-threaded performance in the main core. Similarly, Xeon Phi's focus on multi-threaded performance allows the main core to concentrate on single-threaded performance.
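To put rough numbers on the vector-width argument: peak per-core throughput scales with vector width, FMA support, and execution-port count. A minimal sketch follows; the 512-bit and 1024-bit configurations are the speculation above, not announced parts:

```python
# Peak single-precision FLOPs per core per cycle as a function of vector
# width and FMA port count. Haswell's 2x 256-bit FMA ports are public;
# the wider configurations are the speculation discussed above.

def peak_flops_per_cycle(vector_bits, fma_ports, element_bits=32):
    lanes = vector_bits // element_bits  # parallel elements per vector
    return lanes * 2 * fma_ports         # FMA counts as 2 FLOPs per lane

for name, bits, ports in [("Haswell, 2x 256-bit FMA", 256, 2),
                          ("Hypothetical 512-bit, 2 ports", 512, 2),
                          ("Hypothetical 1024-bit, 2 ports", 1024, 2)]:
    print(f"{name}: {peak_flops_per_cycle(bits, ports)} SP FLOPs/cycle")
# Prints 32, 64, and 128: each doubling of vector width doubles peak
# throughput without touching clocks or the rest of the pipeline.
```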
In any case, I would argue that single-threaded performance has improved more since Conroe than you imply. Anand Shimpi explains why better than I could here:
http://www.anandtech.com/show/6355/intels-haswell-...
bji - Tuesday, June 18, 2013
You refuted your own point at the end when you said "There's not a whole lot of pressure for faster CPUs". This is exactly why x86 is asymptotically approaching its IPC limit. There simply are not enough dollars chasing increased x86 performance to justify the huge costs involved in advancing it significantly. With mobile markets eating away at desktop and laptop market share, the number of tasks that require ever-increasing CPU performance dwindling, and the cost of improving x86 performance rising, there just isn't enough money chasing improved x86 performance to justify the billions it costs to make significant headway.

inighthawki - Tuesday, June 18, 2013
I apologize if I misinterpreted your original point; I was speaking strictly from a technological perspective. There are plenty of ways to improve performance, just very little will to do so. You are correct on that one.

However, I'm not sure I agree that the reason is solely the need to justify R&D costs. I think it's more simply that performance is high enough that they have the opportunity to spend their time on more important things. Take away the costs they've put in since Sandy Bridge on improving integrated graphics performance and lowering power consumption, and I bet they could easily have ended up with a chip at LEAST 50% faster than it is now. It's just that nobody really wants or needs that.
bji - Wednesday, June 19, 2013
I think we are in agreement. And I must say, things sure are different than they were 15 years ago when the gigahertz wars were heating up. Back then there was almost nothing you could do with a computer that couldn't benefit from more speed. Now... well, it's easy to go 4 or 5 years without bothering to upgrade, and the year-on-year speed bumps are barely noticeable.