NVIDIA's GeForce GTX Titan, Part 1: Titan For Gaming, Titan For Compute
by Ryan Smith on February 19, 2013 9:01 AM EST

The launch of the Kepler family of GPUs in March of 2012 was something of a departure from the norm for NVIDIA. Over the years NVIDIA has come to be known, among other things, for their big and powerful GPUs. NVIDIA had always produced a large 500mm2+ GPU to serve both as a flagship GPU for their consumer lines and the fundamental GPU for their Quadro and Tesla lines, and had always launched with that big GPU first.
So when the Kepler family launched first with the GK104 and GK107 GPUs – powering the GeForce GTX 680 and GeForce GT 640M respectively – it was unusual to say the least. In place of “Big Kepler”, we got a lean GPU that was built around graphics first and foremost, focusing on efficiency and in the process forgoing a lot of the compute performance NVIDIA had come to be known for in the past generation. The end result of this efficiency paid off nicely for NVIDIA, with GTX 680 handily surpassing AMD’s Radeon HD 7970 at the time of its launch in both raw performance and in power efficiency.
Big Kepler was not forgotten however. First introduced at GTC 2012, GK110 as it would come to be known would be NVIDIA’s traditional big, powerful GPU for the Kepler family. Building upon NVIDIA’s work with GK104 while at the same time following in the footsteps of NVIDIA’s compute-heavy GF100 GPU, GK110 would be NVIDIA’s magnum opus for the Kepler family.
Taped out later than the rest of the Kepler family, GK110 has taken a slightly different route to get to market. Rather than launching in a consumer product first, GK110 was first launched as the heart of NVIDIA’s Tesla K20 family of GPUs, the new cornerstone of NVIDIA’s rapidly growing GPU compute business.
Oak Ridge National Laboratory's Titan Supercomputer
Or perhaps as it’s better known, the GPU at the heart of the world’s fastest supercomputer, Oak Ridge National Laboratory’s Titan supercomputer.
The Titan supercomputer was a major win for NVIDIA, and likely the breakthrough they’ve been looking for. A fledgling business merely two generations prior, NVIDIA and their Tesla family have quickly shot up in prestige and size, much to the delight of NVIDIA. Their GPU computing business is still relatively small – consumer GPUs dwarf it and will continue to do so for the foreseeable future – but it’s now a proven business for NVIDIA. More to the point however, winning contracts like Titan is a major source of press and goodwill for the company, goodwill the company intends to capitalize on.
With the launch of the Titan supercomputer and the Tesla K20 family now behind them, NVIDIA is ready to turn their attention back to the consumer market. They are now ready to bring their big and powerful GK110 GPU to consumers, and in typical NVIDIA fashion they intend to make a spectacle of it. In NVIDIA’s mind there’s only one name suitable for the first consumer card born of the same GPU as their greatest computing project: GeForce GTX Titan.
GeForce GTX Titan: By The Numbers
At the time of the GK110 launch at GTC, we didn’t know if and when GK110 would ever make it down to consumer hands. From a practical perspective GTX 680 was still clearly in the lead over AMD’s Radeon HD 7970. Meanwhile the Titan supercomputer was a major contract for NVIDIA, and something they needed to prioritize. 18,688 551mm2 GPUs for a single customer is a very large order, and at the same time orders for Tesla K20 cards were continuing to pour in each and every day after GTC. In the end, yes, GK110 would come to the consumer market. But not until months later, after NVIDIA had the chance to start filling Tesla orders. And today is that day.
Much like the launch of the GTX 690 before it, NVIDIA intends to stretch this launch out a bit to maximize the amount of press they get. Today we can tell you all about Titan – its specs, its construction, and its features – but not about its measured performance. For that you will have to come back on Thursday, when we can give you our benchmarks and performance analysis.
 | GTX Titan | GTX 690 | GTX 680 | GTX 580
Stream Processors | 2688 | 2 x 1536 | 1536 | 512 |
Texture Units | 224 | 2 x 128 | 128 | 64 |
ROPs | 48 | 2 x 32 | 32 | 48 |
Core Clock | 837MHz | 915MHz | 1006MHz | 772MHz |
Shader Clock | N/A | N/A | N/A | 1544MHz |
Boost Clock | 876MHz | 1019MHz | 1058MHz | N/A
Memory Clock | 6.008GHz GDDR5 | 6.008GHz GDDR5 | 6.008GHz GDDR5 | 4.008GHz GDDR5 |
Memory Bus Width | 384-bit | 2 x 256-bit | 256-bit | 384-bit |
VRAM | 6GB | 2 x 2GB | 2GB | 1.5GB
FP64 | 1/3 FP32 | 1/24 FP32 | 1/24 FP32 | 1/8 FP32 |
TDP | 250W | 300W | 195W | 244W |
Transistor Count | 7.1B | 2 x 3.5B | 3.5B | 3B |
Manufacturing Process | TSMC 28nm | TSMC 28nm | TSMC 28nm | TSMC 40nm |
Launch Price | $999 | $999 | $499 | $499 |
Diving right into things then, at the heart of the GeForce GTX Titan we have the GK110 GPU. By virtue of this being the 2nd product to be launched based on the GK110 GPU, there are no great mysteries here about GK110’s capabilities. We’ve covered GK110 in depth from a compute perspective, so many of these numbers should be familiar to our long-time readers.
GK110 is composed of 15 of NVIDIA’s SMXes, each of which is in turn composed of a number of functional units. Every SMX packs 192 FP32 CUDA cores, 64 FP64 CUDA cores, 64KB of L1 cache, 65K 32-bit registers, and 16 texture units. These SMXes are in turn paired with GK110’s 6 ROP partitions, each one composed of 8 ROPs and 256KB of L2 cache, and connected to a 64-bit memory controller. Altogether GK110 is a massive chip, coming in at 7.1 billion transistors and occupying 551mm2 on TSMC’s 28nm process.
For Titan NVIDIA will be using a partially disabled GK110 GPU. Titan will have all 6 ROP partitions and the full 384-bit memory bus enabled, but only 14 of the 15 SMXes will be enabled. In terms of functional units this gives Titan a final count of 2688 FP32 CUDA cores, 896 FP64 CUDA cores, 224 texture units, and 48 ROPs. This makes Titan virtually identical to NVIDIA’s most powerful Tesla, K20X, which ships with the same configuration. NVIDIA does not currently ship any products with all 15 SMXes enabled, and while NVIDIA will never really explain why – yield, power, or otherwise – if nothing else it leaves them an obvious outlet for growth should they need to further improve Titan’s performance: enabling that 15th SMX.
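For those who want to check the unit math themselves, the short Python sketch below (our own illustration, not anything sourced from NVIDIA) rebuilds Titan’s totals from the per-SMX and per-partition figures given above.

```python
# Back-of-the-envelope check of Titan's functional unit counts, derived from
# the per-SMX and per-ROP-partition figures quoted above.

FP32_CORES_PER_SMX = 192
FP64_CORES_PER_SMX = 64
TEXTURE_UNITS_PER_SMX = 16
ROPS_PER_PARTITION = 8

enabled_smxes = 14      # GK110 has 15 SMXes; Titan ships with one disabled
rop_partitions = 6      # all 6 partitions enabled, keeping the full 384-bit bus

fp32_cores = enabled_smxes * FP32_CORES_PER_SMX        # 2688
fp64_cores = enabled_smxes * FP64_CORES_PER_SMX        # 896
texture_units = enabled_smxes * TEXTURE_UNITS_PER_SMX  # 224
rops = rop_partitions * ROPS_PER_PARTITION             # 48

print(fp32_cores, fp64_cores, texture_units, rops)     # 2688 896 224 48
```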
Of course functional units are only half the story, so let’s talk about clockspeeds. As a rule of thumb bigger GPUs don’t clock as high as smaller GPUs, and Titan will be adhering to this rule. Whereas GTX 680 shipped with a base clock of 1006MHz, Titan ships at a more modest 837MHz, making up for any clockspeed disadvantage with the brute force behind having so many functional units. Like GTX 680 (and unlike Tesla), boost clocks are once more present, with Titan’s official boost clock coming in at 876MHz, while the maximum boost clock can potentially be much higher.
On the memory side of things, Titan ships with a full 6GB of GDDR5. As a luxury card NVIDIA went for broke here and simply equipped the card with as much RAM as is technically possible, rather than stopping at 3GB. You wouldn’t know that from looking at their memory clocks though; even with 24 GDDR5 memory chips, NVIDIA is shipping Titan at the same 6GHz effective memory clock as the rest of the high-end GeForce 600 series cards, giving the card 288GB/sec of memory bandwidth.
To put all of this in perspective, on paper (and at base clocks), GTX 680 can offer just shy of 3.1 TFLOPS of FP32 performance, 128GTexels/second texturing throughput, and 32GPixels/second rendering throughput, driven by 192GB/sec of memory bandwidth. Titan on the other hand can offer 4.5 TFLOPS of FP32 performance, 187GTexels/second texturing throughput, and 40GPixels/second rendering throughput, driven by 288GB/sec of memory bandwidth. This gives Titan 46% more shading/compute and texturing performance, 25% more pixel throughput, and a full 50% more memory bandwidth than GTX 680. Simply put, thanks to GK110 Titan is a far more powerful GPU than anything GK104 could accomplish.
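These paper figures follow from the usual rules of thumb: FP32 throughput as cores x 2 FLOPs (one FMA per cycle) x clock, texture and pixel fill rates as units x clock, and memory bandwidth as effective memory clock x bus width / 8. The Python sketch below (our own, using base clocks only and ignoring boost) reproduces the numbers and percentage gaps quoted above.

```python
# Paper-spec math at base clocks; boost clocks are ignored. Assumed formulas:
# FP32 FLOPS = cores * 2 (FMA) * clock, fill rates = units * clock,
# bandwidth = effective memory clock * bus width / 8.

def paper_specs(cores, tex_units, rops, core_mhz, mem_ghz_eff, bus_bits):
    clock_hz = core_mhz * 1e6
    return {
        "fp32_tflops": cores * 2 * clock_hz / 1e12,
        "gtexels_per_s": tex_units * clock_hz / 1e9,
        "gpixels_per_s": rops * clock_hz / 1e9,
        "gb_per_s": mem_ghz_eff * bus_bits / 8,
    }

gtx680 = paper_specs(1536, 128, 32, 1006, 6.008, 256)  # ~3.09 TFLOPS, ~129 GT/s, ~32 GP/s, ~192 GB/s
titan = paper_specs(2688, 224, 48, 837, 6.008, 384)    # ~4.50 TFLOPS, ~187 GT/s, ~40 GP/s, ~288 GB/s

for metric in gtx680:
    gain = titan[metric] / gtx680[metric] - 1
    print(f"{metric}: +{gain:.0%}")  # +46%, +46%, +25%, +50%
```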
Of course with great power comes great power bills, and Titan is no exception. In their drive for efficiency NVIDIA got GTX 680 down to a TDP of 195W with a power target of 170W, a remarkable position given both the competition and NVIDIA’s prior generation products. Titan on the other hand will have a flat 250W power target – in line with prior generation big NVIDIA GPUs – staking out its own spot on the price/power hierarchy, some 28%-47% higher in power consumption than GTX 680. These values are almost identical to the upper and lower theoretical performance gaps between Titan and GTX 680, so performance is growing in line with power consumption, but only just. From a practical perspective Titan achieves a similar level of efficiency as GTX 680, but as a full compute chip it’s unquestionably not as lean. There’s a lot of compute baggage present that GK104 didn’t have to deal with.
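As a quick aside, the 28%-47% range above is simply Titan’s 250W power target measured against GTX 680’s 195W TDP and 170W power target respectively; a minimal sketch:

```python
# Where the 28%-47% figure comes from: Titan's 250W power target versus
# GTX 680's 195W TDP and 170W power target.

titan_power_target = 250
gtx680_tdp, gtx680_power_target = 195, 170

print(f"vs 195W TDP:          +{titan_power_target / gtx680_tdp - 1:.0%}")           # +28%
print(f"vs 170W power target: +{titan_power_target / gtx680_power_target - 1:.0%}")  # +47%
```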
157 Comments
chizow - Friday, February 22, 2013 - link
Um, GF100/110 are absolutely in the same league as this card. In the semiconductor industry, size = classification. This is not the first 500+mm^2 ASIC Nvidia has produced; the lineage is long and distinguished: G80, GT200, GT200b, GF100, GF110.
*NONE* of these GPUs cost $1K, only the 8800Ultra came anywhere close to it at $850. All of these GPUs offered similar features and performance relative to the competition and prevailing landscape. Hell, GT200 was even more impressive as it offered a 512-bit memory interface.
Increase in number of transistors is just Moore's law, that's just expected progress. If you don't know the material you're discussing please refrain from commenting, thank you.
CeriseCogburn - Sunday, February 24, 2013 - link
Wait a minute doofus, you said the memory cost the same, and it's cheap. You entirely disregarded the more than doubled core transistor footprint, the R&D for it, the yield factor, the high build quality, and the new and extra top tier it resides in, not to mention its awesome features the competition did not develop and does not have, AT ALL.
4 monitors out of the box, Single card 3d and surround, extra monitor for surfing, target frame rate, TXAA, no tesselation lag, and on and on.
Once a product breaks out far from the competitions underdeveloped and undeveloped failures, it EARNS a price tier.
You're living in the past, you're living with the fantasy of zero worldwide inflation, you're living the lies you've told yourself and all of us about the last 3 top tier releases, all your arguments exposed in prior threads for the exaggerated lies they were and are, and the Charlie D RUMORS all you of this same ilk repeat, even as you ignore the years-long DEV time and the entire lack of production capability with your tinfoil hat whine.
The market has changed you fool. There was a SEVERE SHORTAGE in the manufacturing space (negating your conspiracy theory entirely) and still there's pressure, and nVidia has developed a large range of added features the competition is entirely absent upon.
You didn't get the 680 for $350 (even though you still 100% believe Charlie D's lie filled rumor) and you're not getting this for your fantasy lie price either.
CeriseCogburn - Sunday, February 24, 2013 - link
NONE had the same or much bigger die sizes.
NONE had 7.1 BILLION engineer traced research die points.
NONE had the potential downside low yield.
NONE had the twice plus expensive ram in multiples more attached.
NONE is the amount of truth you told.
Stuka87 - Tuesday, February 19, 2013 - link
Common sense would say nVidia is charging double what they should be.

384bit memory is certainly not a reason for high cost as AMD uses it in the 79x0 series chips. A large die adds to cost, but the 580 had a big die as well (520mm2), so that can't be the whole reason for the high cost (the GK110 does have more transistors).
So it comes down to nVidia wanted to scalp customers.
As for your comments on AMD, what proof do you have that AMD has nothing else in the works? Not sure what crap you are referring to. I have had no issues with my AMD cards or their drivers (or my nVidias for that matter). Just keep on hating for no reason.
AssBall - Tuesday, February 19, 2013 - link
You speak of common sense, but miss the point. When have you ever bought a consumer card for the pre-listed MSRP? These cards will sell to OEMs for compute and to enthusiasts via Nvidia's partners for much less.

So it comes down to "derp Nvidia is a company that wants to make money derp".
Calling someone a hater for unrealistic reasons is much less of an offense than being generally an idiot.
TheJian - Wednesday, February 20, 2013 - link
A chip with 7.1B transistors is tougher to make correctly than 3B. Which card has 6GB of 6GHz memory from AMD that's $500 with this performance? The 7990 is $900-1000 with 6GB and is poorly engineered compared to this (nearly double the watts, two slots more heat, etc etc).

This is why K20 costs $2500. They get far fewer of these perfect than much simpler chips. Also as said before, engineering these is not free. AMD charges less you say? Their bottom line for last year shows it too... a 1.18B loss. That's why AMD will have no answer until the end of the year. They can't afford to engineer an answer now. They just laid off 30% of their workforce because they couldn't afford them. NV hired 800 people last year for new projects. You do that with profits, not losses. You quit giving away free games or go out of business.
Let me know when AMD turns a profit for a year. I guess you won't be happy until AMD is out of business. I think you're hating on NV for no reason. If they were anywhere near scalping customers they should have record PROFITS but they don't. Without Intel's lawsuit money (300mil a year) they'd be making ~1/2 of what they did in 2007. You do understand a company has to make money to stay in business correct?
If NV charged 1/2 the price for this they would be losing probably a few hundred on each one rather than probably a $200 profit or so.
K20 is basically the same card for $2500. You're lucky they're pricing it at $1000 for what you're getting. Amazon paid $2000 each for 10000 of these as K20's. You think they feel robbed? So by your logic, they got scalped 20,000 times since they paid double the asking here with 10000 of them?...ROFL. OK.
What it comes down to is NV knows how to run a business, while AMD knows how to run one into the ground. AMD needs to stop listening to people like you and start acting like NV or they will die.
AMD killed themselves the day they paid 3x the price they should have for ATI. Thank Hector Ruiz for that. He helped to ruin Motorola too if memory serves...LOL. I love AMD, love their stuff, but they run their business like idiots. Kind of like Obama runs the country. AMD is running a welfare business (should charge more, and overpays for stuff they shouldn't even buy), obama runs a welfare country, and pays for crap like solyndra etc he shouldn't (with our money!). Both lead to credit downgrades and bankruptcy. You can't spend your way out of a visa bill. But both AMD and Obama think you can. You have to PAY IT OFF. Like NV, no debt. Spend what you HAVE, not what you have to charge.
Another example. IMG.L, just paid triple what they should have for the scrap of MIPS. I think this will be their downfall. They borrowed 22million to pay 100mil bid for mips. It was worth 30mil. This will prove to be Imaginations downfall. That along with having chip sales up 90% but not charging enough to apple for them. They only made 30mil for 6 MONTHS! Their chip powers all of apples phones and tablets graphics! They have a hector ruiz type running their company too I guess. Hope they fire him before he turns them into AMD. Until Tegra4 they have the best gpu on a soc in the market. But they make 1/10 of what NV does. Hmmm...Wrong pricing? Apple pockets 140Bil over the life of ipad/iphone...But IMG.L had to borrow 22mil just to buy a 100mil company? They need to pull a samsung and raise prices 20% on apple. NV bought icera with 325mil cash...Still has 3.74B in the bank (which btw is really only up from 2007 because of Intel's 300mil/yr, not overcharging you).
CeriseCogburn - Sunday, February 24, 2013 - link
Appreciate it. Keep up the good work, as in telling the basic facts called the truth to the dysfunctional drones.

no physx
no cuda
no frame rate target (this is freaking AWESOME, thanks nVidia)
no "cool n quiet" on the fly GPU heat n power optimizing max fps
no TXAA
no same game day release drivers
EPIC FAIL on dual cards, yes even today for amd
" While it suffers from the requirement to have proper game-specific SLI profiles for optimum scaling, NVIDIA has done a very good job here in the past, and out of the 19 games in our test suite, SLI only fails in F1 2012. Compare that to 6 out of 19 failed titles with AMD CrossFire."
http://www.techpowerup.com/reviews/NVIDIA/GeForce_...
nVidia 18 of 19, 90%+ GRADE AAAAAAAAAA
amd 13 of 19 < 70% grade DDDDDDDDDD
Iketh - Tuesday, February 19, 2013 - link
please drag yourself into the street and stone yourself

CeriseCogburn - Sunday, February 24, 2013 - link
LOL awww, now that wasn't very nice... May I assume you aren't in the USA and instead in some 3rd world hole with some 3rd world currency and economy where you can't pitch up a few bucks because there's no welfare available? Thus your angry hate filled death wish?

MrSpadge - Tuesday, February 19, 2013 - link
Don't worry.. price will drop if they're actually in a hurry to sell them.