It feels like the build-up to Vega has been just shy of forever. And in a way it has been: AMD first announced the architecture back in March of 2016, and we’ve known for almost as long that Polaris would be a volume play, not a play for the high performance market. It would be Vega that would be AMD’s next high-end GPU, and the true successor to 2015’s unique (if mistimed) Fiji GPU.

Since then AMD has slowly been releasing details on the Vega architecture. First came bits and pieces related to the PlayStation 4 Pro, which incorporated some Vega features. Then came a much more significant information dump at the start of this year with the company's Vega teaser, closely aligned with the announcement of the Radeon Instinct accelerator, which set some fairly specific guidelines for the kind of performance AMD was aiming for. Until today, all we’ve known for sure is that Vega would be ambitious but not unrealistic, and that AMD wanted to launch it by the end of June.

So for today’s AMD Financial Analyst Day, AMD has released a little bit more information as part of the next step of their campaign. The first Vega product to be released has a name, it has a design, and it has performance figures. Critically, it even has a release date. I hesitate to call this a full announcement in the typical sense – AMD is still holding some information back until closer to the launch – but we now finally have a clear picture of where the Vega generation kicks off for AMD.

Say hello to the Radeon Vega Frontier Edition.

AMD Workstation Card Specification Comparison
|  | Radeon Vega Frontier Edition | Radeon Pro Duo (Polaris) | Radeon Pro WX 7100 | Radeon Fury X |
|---|---|---|---|---|
| Stream Processors | 4096 | 2 x 2304 | 2304 | 4096 |
| Texture Units | ? | 2 x 144 | 144 | 256 |
| ROPs | 64? | 2 x 32 | 32 | 64 |
| Boost Clock | ~1.59GHz | 1243MHz | 1243MHz | 1050MHz |
| Single Precision | ~13 TFLOPS | 11.5 TFLOPS | 5.7 TFLOPS | 8.6 TFLOPS |
| Half Precision | ~26 TFLOPS | 11.5 TFLOPS | 5.7 TFLOPS | 8.6 TFLOPS |
| Memory Clock | ~1.88Gbps HBM2 | 7Gbps GDDR5 | 7Gbps GDDR5 | 1Gbps HBM |
| Memory Bus Width | 2048-bit | 2 x 256-bit | 256-bit | 4096-bit |
| Memory Bandwidth | ~480GB/sec | 2 x 224GB/sec | 224GB/sec | 512GB/sec |
| VRAM | 16GB | 2 x 16GB | 8GB | 4GB |
| Typical Board Power | ? | 250W | 130W | 275W |
| GPU | Vega (1) | Polaris 10 | Polaris 10 | Fiji |
| Architecture | Vega | Polaris | Polaris | GCN 1.2 |
| Manufacturing Process | GloFo 14nm | GloFo 14nm | GloFo 14nm | TSMC 28nm |
| Launch Date | 06/2017 | 05/2017 | 10/2016 | 06/2015 |
| Launch Price | (A profitable number) | $999 | $649 | $649 |

How AMD is going to sell this card and its target market is a huge subject (and arguably the most important aspect of today’s announcement) but I think everyone would rather dive into the specs first. So let’s start there.

The Radeon Vega Frontier Edition is meant to be a powerhouse of a card for AMD. While we’re still not quite sure which Vega chip we’re looking at – AMD has confirmed two chips so far, and it now looks like the second chip is going to be farther down the line – the Vega that we are getting now is AMD’s high-end play. And this is reflected in the Vega Frontier Edition’s high-performance specifications.

AMD previously indicated with the Radeon Instinct MI25 announcement that they would be targeting 25 TFLOPS or better of half precision (FP16) performance on their high-end Vega parts, and the Vega Frontier Edition will be delivering on the “or better” part of that, with performance in the ballpark of 26 TFLOPS. With the caveat that AMD isn’t providing precise numbers at this time – like the Polaris launch, they’re being a bit unspecific – this indicates that AMD is hitting their earlier performance goals.

Furthermore, as part of today’s announcement we finally know that the Vega Frontier Edition will ship with 64 CUs (4096 stream processors), which also allows us to narrow down the clockspeeds of the card. To hit 26 TFLOPS FP16, the card needs to be able to boost to around 1.59GHz. This is particularly notable because AMD’s reference Polaris designs only go to 1.34GHz, which means that AMD is expecting to clock Vega cards a good bit higher than their Polaris cards.

Hitting these high clockspeeds is important for AMD, because while Vega is slated to be a powerful chip, in terms of execution width it’s actually not an especially wide chip. With 64 CUs, the Vega FE is no wider than AMD’s last-generation Fiji-based cards like the Radeon Fury X, which had the same 64 CU configuration. This means that relative to its predecessor, Vega FE is going to rely on the combination of significantly higher clockspeeds and its architectural enhancements to deliver the big performance gains AMD is touting for the card.

Meanwhile, although we now have a good picture of the CU situation for Vega FE, we have a bit of a mystery on our hands with the ROP situation. AMD has not disclosed the number of ROPs, but it has disclosed that the card will have a pixel fillrate of about 90 Gigapixels/second, which doesn’t divide neatly into the earlier clockspeed. It’s quite likely that we’re looking at a 64 ROP design and that AMD’s figures are factoring in a more conservative clockspeed (e.g. the base clock rather than the boost clock). But for the moment, this discrepancy means that the ROP configuration of the Vega FE isn’t as clear as the CU configuration.
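
A quick check illustrates why the numbers don't line up. This assumes the conventional one pixel per ROP per clock; the 64-ROP count is our inference here, not an AMD-confirmed specification:

```python
# If Vega FE has 64 ROPs, what clockspeed does ~90 Gpix/sec imply?
FILLRATE_GPIX_PER_SEC = 90.0   # AMD's quoted figure
ROPS_ASSUMED = 64              # unconfirmed; matches Fiji's configuration

implied_clock_ghz = FILLRATE_GPIX_PER_SEC / ROPS_ASSUMED  # 1 pixel/ROP/clock
print(round(implied_clock_ghz, 3))   # 1.406 - well below the ~1.59GHz boost clock
```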

Moving on, AMD has confirmed that the Vega FE will have approximately 480GB/sec of memory bandwidth, and in a big surprise, that this will come from 16GB of HBM2 memory. AMD has been heavily promoting their use of HBM2 and what it means for their memory architecture – the High Bandwidth Cache Controller – particularly banking on the fact that the HBM2 specification allows for memory speeds of up to 2Gbps/pin. Couple that with the fact that we’ve already seen an early Vega chip with 2 HBM2 stacks (indicating a 2048-bit bus), and AMD’s figures confirm that they’re going to be coming very close to 2Gbps for the Vega FE. The card’s 480GB/sec of bandwidth means that the HBM2 stacks need to be clocked at 1.875GHz, just short of that 2Gbps figure.
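
That per-pin figure falls out of simple arithmetic on AMD's quoted bandwidth. A sketch using the article's numbers, with the 2048-bit width resting on the two-stack package AMD has already shown:

```python
# Deriving the HBM2 per-pin data rate from the quoted memory bandwidth.
BANDWIDTH_GB_PER_SEC = 480     # AMD's quoted ~480GB/sec
BUS_WIDTH_BITS = 2048          # 2 HBM2 stacks x 1024 bits each

data_rate_gbps = BANDWIDTH_GB_PER_SEC * 8 / BUS_WIDTH_BITS  # Gbps per pin
print(data_rate_gbps)   # 1.875, just short of HBM2's 2Gbps/pin ceiling
```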

However the far more interesting news here isn’t the bandwidth, but the amount of total HBM2 memory. AMD has revealed that the Vega FE will ship with 16GB of memory, a figure that until today didn’t look to be possible. And to be honest, we’re still not sure how AMD is doing it.

Along with the higher data rate of HBM2, the other major feature of the newer HBM specification is that it allows for much higher capacity stacks, up to 4GB for a 4 layer (4-Hi) stack, or 8GB for an 8 layer (8-Hi) stack, solving the capacity issues that dogged the HBM1-powered Fiji GPU. However to date, neither Samsung nor SK Hynix have offered 8-Hi HBM2 in their parts catalogs. No specific reason has been given, but the use of TSVs means that the greater the number of layers, the harder the memory is to assemble. For the moment, a pair of 8-Hi stacks is by far the most likely explanation for how AMD has put 16GB of VRAM on the Vega FE. However until we hear more from AMD, it’s not the only possibility, and given that 8-Hi HBM2 still isn’t on anyone’s parts list, it may be that AMD has adopted more creative measures to get this much VRAM on a Vega card.
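
Enumerating the possibilities makes the 8-Hi conclusion easy to see. This sketch assumes the HBM2 specification's 8Gb (1GB) per DRAM layer and the two-stack package; which configuration AMD actually chose remains unconfirmed:

```python
# How much VRAM do two HBM2 stacks provide at each standard stack height?
GB_PER_LAYER = 1        # HBM2 spec: 8Gb (1GB) per DRAM die
STACKS = 2              # the two-stack package seen on early Vega chips

for layers in (2, 4, 8):                       # 2-Hi, 4-Hi, 8-Hi
    print(f"{layers}-Hi: {STACKS * layers * GB_PER_LAYER}GB total")
# Only 8-Hi stacks reach 16GB on a two-stack package.
```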

Rounding out the specifications, as AMD’s focus in today’s announcement was on compute performance and memory, the company hasn’t disclosed too much else about the hardware itself. The TDP has not been announced; however, as we’re looking at a high-end card, 250W or higher is a very good assumption, especially as AMD’s renderings of the Vega FE cards all show the card sporting a pair of 8-pin PCIe power sockets.

What we do know is that there will be two different Vega FE cards, which AMD is calling the Vega Frontier Blue Edition and the Vega Frontier Gold Edition. Information on the differences between these cards is very limited, but from the information AMD has provided, the Blue Edition will be a traditional blower-cooled card, while the Gold Edition will be taking a page from the Radeon Fury X and will feature an external radiator attached via an integrated closed loop liquid cooling system (CLLC). The use of a CLLC on the Fury X in 2015 made it a very quiet card, but at the same time it was also a necessary move due to its 275W TDP. So this may be a sign that the Gold card will have a particularly high TDP.

Otherwise at an architectural level, AMD hasn’t revealed any further details on the Vega architecture, and indeed it’s not clear if there’s much they left uncovered in this winter’s Vega architecture teaser. HBM2, the improved geometry engine, the new FP16-capable CUs, and the tiling rasterizer were all touched upon in today’s announcement. And as mentioned previously, these new architectural optimizations, along with Vega FE’s high clockspeed, are going to be the major components of the card’s performance.

Finally, with today’s announcement AMD has further narrowed down the launch date of their first Vega cards. The Vega Frontier Edition cards will be the first to ship, and as per the company’s earlier promise of an H1’17 launch, the cards will be available in late June. AMD has not announced a price for the cards yet, and like the detailed specifications themselves, I expect we’ll see that closer to next month’s launch.

But what is clear right now is that the Vega Frontier Edition is not meant to be a consumer gaming card, and that takes us into what I think is the more interesting aspect of today’s announcement: the target market.

Vega Frontier Edition’s Target Market: AI, Machine Learning, and other Professionals

  • BurntMyBacon - Thursday, May 18, 2017 - link

    I suppose it's not as obvious to me as to you. The article doesn't present any benchmarks, much less gaming benchmarks. So I ask again, based on what premise?

    Theoretical (Max) single / double / half precision TFLOPS perhaps. Historically these numbers haven't been very useful for comparing gaming performance between vendors.

    Memory bandwidth - suffers the same problem as above.

    Pixel fill rate - only a small part of the story that may be more or less useful depending on application.

    Outside benchmarks - please share. I'd love to see some actual gaming (or gaming-related) performance numbers. The validity of leaked or cherry-picked benchmarks is perhaps questionable, but they aren't by necessity incorrect. Don't let that keep you from sharing.

    I apologize if I sound obstinate to you. The truth is, I actually like your conclusion and I am inclined to agree with it based on a combination of specifications, theoretical performance numbers, historical tendencies, market forces, and "gut feeling". However, I have yet to see anywhere near enough evidence to conclude more than a very rough wag on this one. Known changes to the architecture make historical tendencies a fuzzy approximation at best. We don't yet know how much (or little) they will affect the gaming performance.
  • Meteor2 - Wednesday, May 17, 2017 - link

    The software stack slide is telling; CUDA owns the acceleration space and OpenCL isn't very popular. AMD doesn't have an answer here and it's too late anyway because CUDA is established and it works. Nvidia are being bastards and not supporting OpenCL 2 either, locking the market in.
  • Yojimbo - Wednesday, May 17, 2017 - link

    It's not too late, but it's going to take a lot of hard work, determination, and resources. Koduri's claim that 10 or 20 engineers working for a few weeks on each framework is all that's necessary is not auspicious. Developers need assurance that AMD are going to actively support the ecosystem, something that they haven't been doing up to this point. That number of engineers for that amount of time is probably what it took them to be able to run one chosen benchmark well that matched up particularly well with their chosen architecture (my guess is that the high bandwidth cache is a prime candidate for an advantage to focus on). As far as I know, for general usage, the BLAS libraries in GPUOpen are significantly slower than NVIDIA's cuBLAS.

    There's a lot more to support in GPU computing than just machine learning, as well. If they only focus on machine learning they will lose a lot of opportunities from companies that want to do simulations and analytics along with machine learning, which is probably the majority of them. Each application has its own issues, and the people in those market segments are mostly not machine learning experts. AMD has 3,000 employees in its Radeon Technologies Group. NVIDIA has 10,000 employees, and they don't have thousands of them sitting around doing nothing.

    As far as OpenCL, even when NVIDIA's OpenCL support was more current, CUDA had the advantage because NVIDIA actively supported the space with high performance libraries. If NVIDIA controls both the architecture and the programming model they are able to bring features to market much faster and more efficiently, which is pretty important with the pace of innovation that's going on right now. My guess is that opening CUDA would probably be a more beneficial action for the community than supporting OpenCL at the moment, unless opening CUDA meant losing control of it.
  • BurntMyBacon - Wednesday, May 17, 2017 - link

    Yes, it would be more beneficial to the community to open CUDA up to other vendors. However, I think it is about as likely to happen as opening up PhysX or G-Sync. nVidia doesn't exactly have a reputation for opening up proprietary tech.
  • Yojimbo - Wednesday, May 17, 2017 - link

    Well, they are open sourcing their upcoming deep learning accelerator ASIC. They recently open sourced a lot of GameWorks, I think. They will open source things when it's beneficial to them. That's the same thing that can be said for AMD or most tech companies. Considering their output with these initiatives and market position, AMD open sourcing GPUOpen and FreeSync was beneficial to them.

    NVIDIA has a history of not waiting around for committees and forging on their own the things they need/want, such as with CUDA, NVLink, and G-Sync. They are trying to build high-value platforms. They spend the money and take the risks in doing that and their motivation is profitability.

    I also don't expect them to open up CUDA at the moment.
  • Meteor2 - Wednesday, May 17, 2017 - link

    I'd forgotten NVLink. I'm not seeing any answer to that at all from AMD and NVLink 2 is fundamental to accelerated HPCs which are vaguely programmable.
  • tuxRoller - Wednesday, May 17, 2017 - link

    OpenCAPI (phenomenal latency). Given the huge number of lanes that Naples has, it would be a good target for OpenCAPI.
    There's also the in-progress Gen-Z fabric, but that's not close to being standardized, and is more of an interconnect between nodes, I suppose.
  • Yojimbo - Wednesday, May 17, 2017 - link

    OpenCAPI is managed by the host processor (CPU). From the way I've seen it described, it is not meant for accelerators to use to link to each other without going through a host processor that supports OpenCAPI. AMD presumably might support OpenCAPI on their CPUs, but they can't count on their GPUs being used with their CPUs. Perhaps CCIX plans to allow accelerator to accelerator interconnect, I'm not sure.

    Regardless, both OpenCAPI and CCIX are not available yet, and certainly not enabled on the upcoming Vega products.
  • tuxRoller - Friday, May 19, 2017 - link

    CAPI is definitely managed by the processor. OpenCAPI is supposed to be a "ground up" rethink of a high-speed, low-latency interconnect for various on-node resources.
    IBM appears to be calling theirs "BlueLink" (or something similar), and that's supposed to be released this year.
    I would be astonished if AMD hasn't done some work towards supporting one of these standards on Vega, even if not on this exact chip.
    Regardless, these are the options for AMD, and since they are "standards" the hope is that others will begin buying into the spec as well.
    If OpenCAPI remains a star network then AMD can offer a competitive, even if not identical, solution instead of ceding the market to Nvidia.
  • BurntMyBacon - Thursday, May 18, 2017 - link

    @Yojimbo: "Well, they are open sourcing their upcoming deep learning accelerator ASIC."

    Good Point.

    @Yojimbo: "They recently open sourced a lot of GameWorks, I think."

    I didn't know about that, but I'm not sure how much this one matters as GameWorks products already run on competitors' GPUs, and we already know that GameWorks is optimized for nVidia GPUs (why wouldn't it be?).

    @Yojimbo: "They will open source things when it's beneficial to them. That's the same thing that can be said for AMD or most tech companies."

    I generally agree, but I would be remiss if I did not point out HyperTransport as an example of an AMD open standard initiative that predated its proprietary Intel counterpart (QPI). It was also introduced in 2001, ahead of their Athlon 64 architecture, at a time when AMD was doing well due to the recent success of the Athlon processors. It doesn't happen as often as I'd like, but sometimes tech companies will prioritize long term interoperability and viability over short term gain.

    @Yojimbo: "Considering their output with these initiatives and market position, AMD open sourcing GPUOpen and FreeSync was beneficial to them."

    I don't disagree. Historically, AMD has found themselves in this position far more often than nVidia.

    @Yojimbo: "NVIDIA has a history of not waiting around for committees and forging on their own the things they need/want, such as with CUDA, NVLink, and G-Sync."

    They don't need to wait on a committee to open up a standard. AMD forged ahead with Mantle until such a point as the industry was ready to work the issue. They also don't necessarily need to give up control of their standards to allow other vendors to use them. It would even be reasonable to demand a licensing fee for access to some of their technologies.

    @Yojimbo: "They are trying to build high-value platforms. They spend the money and take the risks in doing that and their motivation is profitability."

    Keeping proprietary tech proprietary is their right. When they are dominating a one- or two-player market, the risk is low, the returns are high, and the decision makes sense. If there were several major players or they weren't the dominant player, this would be far more risky and interoperability would be a bigger concern. Given the current market, I expect they'll keep their proprietary tech in house.
