GPU Performance

Since the iPad Air uses the same A7 silicon as the iPhone 5s, it also uses the same on-die GPU as the 5s: IMG’s PowerVR G6430. This is a 4-cluster configuration of IMG’s latest graphics hardware, running at some relatively high frequency. I already went into some detail on the G6430 in our 5s review so I won’t rehash that here, but we’re basically looking at a shift to a more efficient scalar architecture.

I still don’t have confirmation of clock speeds, but I believe we’re looking at a max GPU clock of around 450MHz. As you’ll see from the results below, there’s only a small difference in peak GPU performance between the iPad Air and iPhone 5s, implying very similar clocks. The difference is that the iPad Air should be able to sustain its max frequency longer than the iPhone 5s can.

Mobile SoC GPU Comparison

                    | PowerVR SGX 554MP4 | PowerVR G6430 | PowerVR G6430
--------------------|--------------------|---------------|--------------
Used In             | iPad 4             | iPhone 5s     | iPad Air
SIMD Name           | USSE2              | USC           | USC
# of SIMDs          | 32                 | 4             | 4
MADs per SIMD       | 4                  | 32            | 32
Total MADs          | 128                | 128           | 128
GFLOPS @ 300MHz     | 76.8 GFLOPS        | 76.8 GFLOPS   | 76.8 GFLOPS
GFLOPS as Shipping  | 68.1 GFLOPS (?)    | 115.2 GFLOPS  | 115.2 GFLOPS
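
The GFLOPS rows in the comparison above follow from simple arithmetic: total MADs, times two floating point operations per MAD per clock, times the clock speed. The sketch below reproduces those figures. The function names are made up for illustration, and the 450MHz clock is the unconfirmed estimate from the text, not a published spec.

```python
def total_mads(simds: int, mads_per_simd: int) -> int:
    """Total multiply-add units across all SIMDs."""
    return simds * mads_per_simd


def peak_gflops(mads: int, clock_mhz: float) -> float:
    """Peak GFLOPS, counting each MAD as 2 FLOPs per clock."""
    return mads * 2 * clock_mhz / 1000.0


def clock_for_gflops(mads: int, gflops: float) -> float:
    """Clock in MHz that a given peak GFLOPS figure would imply."""
    return gflops * 1000.0 / (mads * 2)


sgx554mp4 = total_mads(32, 4)   # iPad 4: 32 SIMDs x 4 MADs
g6430 = total_mads(4, 32)       # iPhone 5s / iPad Air: 4 SIMDs x 32 MADs

print(round(peak_gflops(sgx554mp4, 300), 1))  # both GPUs match at 300MHz
print(round(peak_gflops(g6430, 450), 1))      # assumes the ~450MHz estimate
print(round(clock_for_gflops(sgx554mp4, 68.1)))  # clock implied by the iPad 4 row
```

Read this way, the 68.1 GFLOPS figure for the iPad 4 corresponds to a clock of roughly 266MHz, which is why that entry carries a question mark: it is only as good as the clock estimate behind it.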

Since we’re talking about an A7 here and not an X-series SoC, there’s still only a 64-bit wide memory interface. As memory bandwidth is a key enabler of GPU performance I was curious to see how GPU performance compared to the outgoing iPad 4 with its much wider memory interface. Do keep in mind that the A7 does include a large system cache on-die, which can help improve effective memory bandwidth.
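
To put the interface width in perspective, peak theoretical DRAM bandwidth is just the bus width in bytes multiplied by the transfer rate. The following is a minimal sketch of that calculation. The transfer rates (1600 MT/s for the A7, 1066 MT/s for the iPad 4) and the 128-bit width for the iPad 4 are assumptions for illustration and are not confirmed in this article; only the A7's 64-bit width comes from the text.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfers_mt_s: float) -> float:
    """Peak theoretical bandwidth in GB/s: bytes per transfer x transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_mt_s / 1000.0


# Assumed configurations (hypothetical data rates, see the note above).
a7_64bit = peak_bandwidth_gbs(64, 1600)
ipad4_128bit = peak_bandwidth_gbs(128, 1066)

print(f"A7, 64-bit:      {a7_64bit:.1f} GB/s")
print(f"iPad 4, 128-bit: {ipad4_128bit:.1f} GB/s")
```

These are theoretical peaks only. What the GPU can actually sustain depends on the memory controller and, in the A7's case, on how much traffic the on-die system cache absorbs.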

GFXBench 2.7

We'll start our GPU performance analysis with a look at low level results using GFXBench/GLBenchmark 2.7. The low level tests, particularly the offscreen ones, should give us some idea as to whether or not there's any increase in GPU frequency for the iPad Air vs. iPhone 5s implementations of A7.

GLBenchmark 2.7 - Fill Test (Onscreen)

GLBenchmark 2.7 - Fill Test (Offscreen)

Looking at the fill rate results, there's a 4.5% increase in performance compared to the iPhone 5s. That could be the magnitude of clock increase that we're seeing between A7s. Apple could very well be relying on more thermal headroom in the iPad Air to provide any real world GPU performance advantages over the iPhone 5s.
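
If the fill rate gain came entirely from clock speed, the implied frequency is easy to estimate. The sketch below assumes fill rate scales linearly with GPU clock and takes the unconfirmed ~450MHz figure as the iPhone 5s baseline; both are assumptions, so treat the output as a ballpark rather than a measurement.

```python
def scaled_clock_mhz(base_clock_mhz: float, perf_gain_pct: float) -> float:
    """Clock implied by a performance gain, assuming perfectly linear scaling."""
    return base_clock_mhz * (1 + perf_gain_pct / 100.0)


BASE_CLOCK_MHZ = 450.0  # estimated iPhone 5s GPU clock, not confirmed

# 4.5% is the fill rate gain reported for the iPad Air over the iPhone 5s.
print(round(scaled_clock_mhz(BASE_CLOCK_MHZ, 4.5), 1))
```

A gain this small is also within the range that thermal behavior or run-to-run variance could explain, so it does not by itself prove a higher clock.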

GLBenchmark 2.7 - Triangle Throughput (Onscreen)

GLBenchmark 2.7 - Triangle Throughput (Offscreen)

We see an even smaller gap between the Air and 5s in the triangle throughput tests (2.9%). There doesn't seem to be any substantial difference in GPU frequency between A7 implementations here. The regression in triangle rate performance compared to the iPad 4 is explained by differences in how Series 6 and Series 5XT GPUs scale in width. Whereas 5XT replicated nearly the entire GPU for "multi-core" versions, multi-cluster versions of Rogue only replicate at the shader array. The result? We don't see the same sort of peak triangle setup scaling we did back on multi-core 5XT parts. I'm not sure I'm particularly happy with the magnitude of the regression here, but I haven't seen any real world cases where it matters yet.

Next up are the game simulation tests. We'll start with the more strenuous of the two: T-Rex HD.

GLBenchmark 2.7 - T-Rex HD (Onscreen)

Here we get closer to Apple's claims of a 2x increase in performance. The iPad Air delivers 75% more performance than the iPad 4 in this test. Once again the iPhone 5s pulls ahead but that's because the onscreen tests render at display resolution, which is lower on the 5s.

GLBenchmark 2.7 - T-Rex HD (Offscreen)

Offscreen performance sees similar scaling: ~69% better performance compared to the iPad 4.

GLBenchmark 2.5 - Egypt HD (Onscreen)

GLBenchmark 2.5 - Egypt HD (Offscreen)

3DMark

We're once again running 3DMark's newest Unlimited mode which does its best to run independently of v-sync and at a standard resolution across all devices. I've also included 3DMark Extreme results below that feature a few more comparison points.

3DMark Unlimited - Ice Storm

The overall Ice Storm scores show a 36% improvement in performance over the iPad 4 and an 8% increase compared to the iPhone 5s. I'm guessing the gap over the 5s comes down to the CPU frequency advantage of the A7 in the Air.

3DMark Unlimited - Graphics Score

If we focus exclusively on the GPU tests (which themselves are still CPU bound), the iPad Air's performance advantage over the iPad 4 grows to over 60%.

3DMark Unlimited - Graphics Test 1

3DMark Unlimited - Graphics Test 2

3DMark Unlimited - Physics Score

I'm still not entirely sure what's going on with the 3DMark Physics test, but this is now the second review in a row where Cyclone shows no performance increase at all over Swift, despite this being largely a CPU test.

3DMark - Ice Storm (Extreme)

3DMark - Graphics Score (Extreme)

3DMark - Graphics Test 1 (Extreme)

3DMark - Graphics Test 2 (Extreme)

3DMark - Physics Score (Extreme)

Basemark X

Basemark X is a new addition to our mobile GPU benchmark suite. There are no low level tests here, just some game simulation tests run at both onscreen (device resolution) and offscreen (1080p, no vsync) settings. The scene complexity is far closer to GLBenchmark 2.7 than the new 3DMark Ice Storm benchmark, so frame rates are pretty low.

I'm still having random issues with Basemark X reliably running both on and offscreen tests on iOS 7. Unfortunately I could only get onscreen results for the iPad Air, which came in at 46% faster than the iPad 4. Note the iPad mini and iPhone 5s benefit from having lower native resolutions here, which is why they perform so well.

Basemark X (Onscreen)

444 Comments

  • Ryan Smith - Wednesday, October 30, 2013 - link

    Which is the difference between having me do something, and having Anand do something. There is a very good reason he's The Boss, after all. He's a machine (figuratively speaking).;-)
  • Krysto - Wednesday, October 30, 2013 - link

    > A single CPU core can use up 8GB/s of bandwidth. I’m still vetting other SoCs, but so far I haven’t come across anyone in the ARM camp that can compete with what Apple has built here.

    Didn't Exynos 5250 have 17.5 GB/s bandwidth, according to Samsung's whitepaper? That would be more than 8GB/s per core.
  • raptorious - Wednesday, October 30, 2013 - link

    Maybe if you're talking about L1 or L2 bandwidth, but not DRAM. The DRAM interface on that chip is 2x32 bit and it's clocked at DDR3-1600, which means it can reach a _peak theoretical max_ of 12.8 GB/s. Theoretical bandwidth is one thing, what the CPU can actually sustain to DRAM is a whole different story.
  • AbbyYen - Wednesday, October 30, 2013 - link

    the SHIELD is A BEAST!

    well done NVIDIA, now made it into tablet form with those battery. I don't care if it is a cylinder type, protrude the tablet if you have to so it won't lay flat on table and won't need no magic cover!

    iPad for leisure only, play and usefulness is still Android!
  • darkcrayon - Wednesday, October 30, 2013 - link

    Shield is much lower resolution, obviously. Remember also the iPad is (finally) getting official physical controller support, which makes it a potentially much more versatile device than the Shield. And clearly iOS is no slouch when it comes to developer support for games. About the only thing nvidia has here is the willingness to pay for more exclusives. How many people do you think would get a general purpose iPad and use it for awesome games vs buying a one-trick pony like the Shield?
  • ddriver - Wednesday, October 30, 2013 - link

    LOL, this is a new high (or low) in manipulative reviews, so now the A7 is faster than a Haswell? With all the money AT pockets in, one would think they'd at least write a set of in-house REAL CPU benchmarks, especially after all the b1tching about manufacturers cheating in the 3rd party benchmarks that are widely available...
  • Wilco1 - Wednesday, October 30, 2013 - link

    It's not as unlikely as you think. An old Cortex-A9 gets half the performance of SB on SPECINT, and A15 is much closer still. Apple's A7 is even wider so should be able to beat a low clocked Haswell. That doesn't mean it can beat the 4GHz beasts - for that Apple would need to clock their A7 much higher and increase core counts, and that is unlikely for now.

    However I fully agree with you on Anand's state of benchmarks. That he still keeps using SunSpider (one of the most cheated on benchmarks) while at the same time talking about frequency changes as cheating is beyond me. And the claim of A7 being 6-wide is very dubious, it seems to me it's 4-way - it hasn't that much higher IPC than an A15.
  • thunng8 - Wednesday, October 30, 2013 - link

    Not much lower IPC? You've got to be kidding right?

    A 1.9ghz A15 scores 920 in single core geekbench
    A 1.3ghz Apple A7 scores 1400.

    IPC looks to be about double on the Apple A7. In what world is double preset close.

    Your assertion that deliberately changing frequency when detecting benchmark is not cheating is laughable.
  • Wilco1 - Wednesday, October 30, 2013 - link

    Actually the 64-bit score includes hardware acceleration of crypto functions, which artificially inflates the score. The 32-bit integer score is 1065 for 1.3GHz A7 vs 895 for 1.6GHz 5410. That's about 46% IPC gain, or just one generation. A57 will have almost the same IPC as A7, and is still 3-way OoO.

    Your idea that running a CPU at its designed maximum frequency is cheating is ridiculous. Ever done any benchmarking? I think not. The first thing you do when benchmarking is disable DVFS so you get meaningful and repeatable results at the maximum frequency.
  • errorr - Wednesday, October 30, 2013 - link

    But how does that make benchmarks worthwhile if DVFS is disabled? I want to know if an aggressive throttle is applied in situations. If the DVFS table throttles in the middle of a benchmark the problem is the table.

    This just shows where ARM needs the most help compared to the other players. Intel and AMD are way ahead in power management and the newer graphics even throttle much more intelligently, quicker, and with better granularity than any ARM chip. This is one of the last low hanging fruits in performance for such power limited chips. On die voltage regulation is a must for future generations.
