AMD Evolving Fast to Survive in the Server Market Jungle
by Johan De Gelas on June 18, 2013 6:00 AM EST
Posted in: AMD, CPUs, Arm, Enterprise
Low Power Server CPUs
When you read about micro servers, the spotlight is on very low power servers built around Atom and ARM CPUs. The reality in the server market is, however, very different. As you probably understand from the reasoning above, a server with an Atom or ARM CPU is hardly a good solution for homogeneous webfarms.
In fact, with the exception of Seamicro, many micro servers have utterly failed in that market. Quite a few server vendors have offered dense Atom-based solutions, but those solutions were poorly received. Simply cramming tens of Atom-based servers into a small chassis is a pretty bad idea:
1. The single-threaded performance of the Atom is too low, even for webfarms.
2. The power gains that you make by using lower power CPUs are negated by the cable management costs and the larger number of PCBs and PHYs.
To sum it up: the performance per watt of those servers did not and will not thrill anyone. Seamicro, the pioneer of this niche market, was successful despite, not thanks to, the Atom processor inside its micro servers. When Seamicro offered low power Xeons instead of Atoms inside its Borg Cube inspired server, demand for its products really took off.
And even Calxeda, the champion of low power servers, admits that the current “ultra low power but low performance” micro server market is a small one. Calxeda has high expectations for its next generation of servers, as they will be based on the more powerful Cortex A15 and A57 CPUs.
Seamicro and Calxeda succeeded where others failed because they understood that an optimized PCB and network fabric were necessary to make the micro server concept work. Seamicro reduced the server board to its bare minimum (“credit card size”), turned unnecessary features off and connected all I/O via a high performance 3D torus interconnect.
Calxeda integrated several servers on one PCB and connected them together with a 2D torus network fabric. The result was a low power draw per server, not just per CPU. Once you add a CPU with good enough single threaded performance, things get very interesting.
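For readers unfamiliar with torus fabrics, here is a minimal sketch of the idea: every node links to its nearest neighbors along each axis, and the edges wrap around, so no node sits at a "border". The grid sizes used below (4x4 and 8x8x8) are purely hypothetical; the article does not state the actual dimensions of the Calxeda or Seamicro fabrics, and the function names are made up for illustration.

```python
def torus_neighbors(node, dims):
    """Return the direct neighbors of `node` in a torus with side lengths `dims`.

    Works for any number of axes: 2 for a 2D torus, 3 for a 3D torus.
    """
    neighbors = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            coord = list(node)
            # The modulo makes the link wrap around at the edge of the grid.
            coord[axis] = (coord[axis] + step) % size
            neighbors.add(tuple(coord))
    # For axes of size 1 or 2 the wrapped links can coincide or point back
    # at the node itself; the set and the discard handle those cases.
    neighbors.discard(tuple(node))
    return sorted(neighbors)


def torus_hops(a, b, dims):
    """Minimum number of hops between nodes a and b.

    On each axis a packet can travel either way around the ring,
    so the shorter of the two directions is taken.
    """
    return sum(
        min(abs(x - y), size - abs(x - y))
        for x, y, size in zip(a, b, dims)
    )


if __name__ == "__main__":
    # Hypothetical 4x4 2D torus: a corner node still has four neighbors.
    print(torus_neighbors((0, 0), (4, 4)))
    # Hypothetical 8x8x8 3D torus: six neighbors per node.
    print(len(torus_neighbors((0, 0, 0), (8, 8, 8))))
    # Opposite corners of the 4x4 grid are only two hops apart.
    print(torus_hops((0, 0), (3, 3), (4, 4)))
```

The wraparound is what keeps hop counts low without a central switch: in the hypothetical 4x4 grid, opposite corners are two hops apart instead of six in a plain mesh.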
80 Comments
andrewaggb - Wednesday, June 19, 2013 - link
I like AMD. But they are in a tough spot and I don't see them getting out of it with any of their roadmaps. The only interesting product is their ARM server chip. Everything else appears to be more of the same with slight improvements. That's not going to change the market dynamics. Intel squeezed up to 50% better idle battery life (and 30-100% faster graphics) with haswell and is targeting 2x cpu performance for the next atom and has their new xeon phi powering the fastest supercomputer (even if it's slightly less efficient than some of the other machines). They are setting aggressive targets and meeting them.
AMD needs to hit targets that are at least 30-100% better than current offerings at something that matters. Be it IPC, power consumption, graphics, maximum clocks, core counts... but something.
They should just sell the PS4 or XBOne chips. 8 cores and gddr5 or edram. That would give probably a 100% increase in apu graphics performance and double the cores which would give some cpu tasks a large increase.
silverblue - Tuesday, June 18, 2013 - link
I'm not sure I agree. Even with Steamroller fixing multithreading, that in itself won't result in a big gain. If Cinebench is anything to go by, FX scales decently enough there so there's not a huge amount of performance left on the table without a decent single core IPC gain.
Each BD core only contains 2 ALUs and 2 AGUs. AMD argues that dropping the third of each was a necessary evil, however whereas Stars couldn't really utilise them, Core certainly can. Additionally, FlexFPU isn't helping - theoretically, AMD shouldn't lag behind in AVX but they do significantly, so they'd need to boost the FPU to 256-bit per pipe in order to gain parity... but they're not doing this with Steamroller as far as I've heard (in fact, the only thing they're doing is removing an MMX unit). FX's memory controller doesn't appear to be that good, either, and the significant power savings from high density libraries don't come until Excavator (though Steamroller does get the dynamic L2, which is neat).
Not having an FX or Opteron Steamroller 8-core (or higher) is a bit daft regardless of the above, unless it's not a big enough jump, OR Excavator has been pushed forwards.
jjj - Tuesday, June 18, 2013 - link
A bit alarming that there are no 12-16 Steamroller cores on 28 or 20nm on the roadmap and what does that tell us about desktop (they don't really have a public roadmap for 2014 desktop)?
I do wonder if Warsaw is one huge die or they are still going with 2 dies with 8 cores each.
On the ARM side it will be interesting to see power consumption on 28nm, at least this way they'll have the product sooner and I do wish the DIY players and the SoC makers would set some standards for ARM desktops. At the very least there is a HTPC and file server market that can expand to desktop.
SilentSin - Tuesday, June 18, 2013 - link
I wonder if AMD is taking an Intel release approach to this. The high core-count EP server CPUs coming out a year after the smaller ones with some tweaks. Steamroller is set to be released for desktop in Kaveri form this year- same die as Berlin I am guessing. But you are right that I don't see a Steamroller FX chip or a server variant with 4m/8c which is puzzling. I hope that they aren't going to rely on fused-off Kaveri 2m/4c CPU only parts for clients as it simply isn't enough horsepower. The BD architecture is actually extremely competitive with anything Intel has when used in a proper Linux environment.
Hopefully these products come out on time, AMD really needs timely execution to have any hope here. Other 64-bit ARM solutions are just around the corner and they will be left in the dust if they have constant delays.
Gigaplex - Tuesday, June 18, 2013 - link
"The BD architecture is actually extremely competitive with anything Intel has when used in a proper Linux environment."
Only for integer based workloads. It still struggles with floating point.
TiredOldFart2 - Tuesday, June 18, 2013 - link
Integer workload is most of the workload for now. Fp intensive will naturally migrate onto opencl over time, as it is more efficient to do so.
sanaris - Monday, February 17, 2014 - link
You are a newbie in the server market. Server folks will never go to a closed language, because they value their development time.
And of course server processors need as much floating point as possible in the 64/128 wide range.
The funny "32 fp performance" numbers which opencl and cuda creators are showing to us are irrelevant because you need as much significant bits as possible - 32 minus exponent minus reserve - conclusion is 32 fp is irrelevant.
jjj - Tuesday, June 18, 2013 - link
Intel got greedy and lazy, AMD had an opportunity in 200+$ desktop with many cores (preferably on 20nm so they can fit 16 cores in a reasonable die size). AMD also has an opportunity to make an APU similar in perf to Xbox One, for a nice budget gaming box capable of playing console ports for quite a few years and more when the APU gets CFed with a discrete card.
We'll see how big of a GPU Kaveri has, it's debatable if it's worth using a big enough GPU on 28nm.
In ARM consumer, both Intel and M$ are somewhat alienating customers so the ARM players should push on all fronts.
Sadly it seems AMD doesn't have the resources to be aggressive and fight too many battles.
sanaris - Monday, February 17, 2014 - link
AMD had opportunity everywhere since the times they made opteron 180 and overall transition to manycore era. But they threw their chance with opteron decision to drop floating point. And of course AMD needs manual cache regulation instruction set.
JDG1980 - Tuesday, June 18, 2013 - link
I don't think we can necessarily infer desktop strategy from the server roadmap. AMD already has a really, really small market share (something like 4%) in big x86 server parts. Their market share on the desktop is nowhere near that bad. Sure, they're going to release the APUs first for Steamroller since that's a mainstream product and has a bigger market than the enthusiasts. But once they get that done, they're going to need to do something to keep themselves in the news until the next architecture, and putting together a FX chip wouldn't be too much of a stretch. I do think that Socket AM3+ is dead; the FX parts will most likely be on the same socket as the APUs from now on. This will minimize engineering costs.