In the past couple of years we've seen the creation of a number of new low level graphics APIs. Arguably the first major initiative was AMD's Mantle API, which promised to improve performance on any GPUs that used their Graphics Core Next (GCN) architecture. Microsoft followed suit in March of 2014, with the announcement of DirectX 12 at the 2014 Game Developers Conference. While both of these APIs promise to give developers more direct access to graphics hardware in the PC space, there was still no low level graphics API for mobile devices, with the exception of future Windows tablets. That changed in the middle of 2014 at WWDC, where Apple surprised a number of people by revealing a new low level graphics and compute API that developers could use on iOS. That API is called Metal.

The need for a low level graphics API in the PC space has been fairly obvious for some time now. The level of abstraction in earlier versions of DirectX and OpenGL allows them to work with a wide variety of graphics hardware, but this comes with a significant amount of overhead. One of the biggest issues caused by this is reduced draw call throughput. A simple explanation of a draw call is that it is the command sent by the CPU which tells the GPU to render an object (or part of an object) in a frame. CPUs are already hard-pressed to keep up with high-end GPUs even with a low level API, and the increased overhead of a high level graphics API further reduces the number of draw calls that can be issued in a given period of time. This overhead mainly exists because most graphics APIs will do shader compilation and state validation (ensuring API use is valid) when a draw call is made, which takes up valuable CPU time that could be used to do other things like physics processing or drawing more objects.
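To make the overhead argument concrete, here is a minimal sketch of a per-frame CPU budget. The per-call costs are invented placeholder numbers, not measurements of any real API or device, and the function name `max_draw_calls` is hypothetical. Only the shape of the calculation matters: every microsecond the API adds to a draw call reduces how many calls fit into a frame.

```python
def max_draw_calls(frame_budget_ms, submit_cost_us, api_overhead_us):
    """Return how many draw calls fit into the CPU time available for one frame.

    All inputs are hypothetical: submit_cost_us is the unavoidable CPU cost of
    issuing a call, api_overhead_us is extra per-call work done by the API,
    such as state validation or shader compilation.
    """
    per_call_us = submit_cost_us + api_overhead_us
    return int(frame_budget_ms * 1000 // per_call_us)


# Assumed numbers for illustration only: about 16.67 ms per frame at 60 fps.
FRAME_BUDGET_MS = 1000 / 60

# Same submit cost, with and without the assumed per-call API overhead.
high_level = max_draw_calls(FRAME_BUDGET_MS, submit_cost_us=5, api_overhead_us=20)
low_level = max_draw_calls(FRAME_BUDGET_MS, submit_cost_us=5, api_overhead_us=0)

print(high_level, low_level)
```

This model assumes the CPU spends the entire frame issuing draw calls, which a real game never does, so actual limits would be lower in both cases.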

Because a draw call involves the CPU preparing materials to be rendered, developers can use tricks such as batching, which involves grouping together items of the same type to be rendered with a single draw call. Even this can present its own issues, such as objects not being culled when they are out of the frame. Another trick is instancing, which involves making a draw call for a single object that appears many times, and having the GPU duplicate it to various coordinates in the frame. Despite this, the overhead of the graphics API combined with the time that it takes the CPU itself to issue a draw call ultimately limits how many can be made. This reduces the number of unique objects developers can put on screen, as well as the amount of CPU time that is available to perform other tasks. Low level graphics APIs aim to address this by removing much of the overhead that exists in current graphics APIs.
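The effect of grouping on draw call counts can be sketched in a few lines. This is a simplified illustration, not how any particular engine or API implements it: the scene, the dictionary keys (`mesh`, `material`, `position`), and both function names are hypothetical. Objects that share a mesh and material are collected into one group, and the list of positions in each group stands in for the per-instance data that an instanced draw would hand to the GPU.

```python
from collections import defaultdict


def naive_draw_calls(objects):
    """One draw call per object: the worst case for CPU overhead."""
    return len(objects)


def batched_draw_calls(objects):
    """Group objects that share a mesh and material into one call each."""
    batches = defaultdict(list)
    for obj in objects:
        batches[(obj["mesh"], obj["material"])].append(obj["position"])
    return len(batches), dict(batches)


# Hypothetical scene: 100 trees and 50 rocks, identical within each type.
scene = [{"mesh": "tree", "material": "bark", "position": (i, 0)} for i in range(100)]
scene += [{"mesh": "rock", "material": "stone", "position": (i, 5)} for i in range(50)]

naive = naive_draw_calls(scene)
batched, groups = batched_draw_calls(scene)

print(naive, batched)
```

The sketch also hints at the culling problem mentioned above: once 100 trees are submitted as one group, the CPU no longer decides tree by tree whether each one is in the frame.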

The question to ask is why Apple and iOS developers need a low level graphics API for their mobile games. The answer ends up being the same as in the PC space. While the mobile space has seen tremendous improvements in both CPU and GPU processing power, the pace of CPU improvements is slowing when compared to GPU improvements. In addition, the increases in GPU processing power have always been of a greater magnitude than the CPU increases. You can see this in the chart above, which shows the level of the CPU and GPU performance of the iPad relative to its original model. Having CPU performance improve by a factor of twelve in less than five years is extremely impressive, yet it pales in comparison to the GPU performance which, in the case of the iPad Air 2, is 180 times faster than its original version.
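The size of that gap follows directly from the two figures quoted above. A quick calculation, using only those two numbers:

```python
# Figures quoted in the text: iPad Air 2 relative to the original iPad.
CPU_SPEEDUP = 12
GPU_SPEEDUP = 180

# How much further the GPU has pulled ahead of the CPU over the same period.
gap = GPU_SPEEDUP / CPU_SPEEDUP
print(f"GPU improvement is {gap:.0f}x the CPU improvement")
```

In other words, for every unit of CPU performance gained since the original iPad, the GPU has gained fifteen.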

Because of this widening gap between CPU and GPU speeds, it appears that even mobile devices have begun to experience the issue of the GPU being able to draw things much faster than the CPU can issue commands to do so. Metal aims to address this issue by cutting through much of the abstraction that exists in OpenGL ES, and this is possible in part because of Apple's control over both the hardware and software in their devices. Apple designs their own CPU architectures, and while they don't design the GPU architecture, it's clear they're free to do what they want with the IP to create the GPUs they need.

The other side of the discussion is compatibility. Much of the abstraction in higher level graphics APIs is done to support a wide variety of hardware. Low level graphics APIs often are not as portable or widely compatible as high level ones, and this is also true of Metal. The iOS Metal API currently only works on devices that use GPUs based on Imagination Technologies' Rogue architecture, which limits it to devices that use Apple's A7, A8, and A8X SoCs.

This can pose a dilemma for developers, as programming only for Metal limits the number of users they can target with their application. The number of older iPads and iPhones still in use, as well as Apple's insistence on selling the original iPad Mini and iPod Touch which use their A5 SoC from 2011, can limit the market for games that use Metal. If I were to make a prediction, it would be that Metal's adoption among iOS developers will grow substantially in the next year or two when devices that use the A5 and A6 chips are retired from sale.

Kishonti Informatics, the developer of the GFXBench GPU benchmarking application, have released a new version of their benchmark. The new benchmark is called GFXBench Metal, and it's essentially the same benchmark as the normal GFXBench 3.0 / 3.1. The difference is that this version of the benchmark has been built to use Apple's Metal API rather than OpenGL ES. Although it's not one of the first Metal applications, it's one of the first benchmarks that can give some insight into what improvements developers and users can see when games and other 3D applications are built using Metal rather than OpenGL ES.

Before getting into the results, I did want to address one disparity that may be noticed about the non-Metal iPad Air 2 results. It appears that Apple has been making some driver optimizations for the A8X GPU with iOS releases that have come out since our original review. Because of this, the iPad Air 2's performance in the OpenGL version of GFXBench 3.0 is noticeably improved over our original results. To avoid incorrectly characterizing the improvements that Metal brings to the table, all of the iPad tests for the OpenGL and Metal versions of the benchmark were re-run on iOS 8.3. Those are the results that are used here. Testing with the iPhone 5s and 6 revealed that there are no notable improvements to the performance of Apple A7 and A8 devices.

GFXBench 3.0 Driver Overhead Test (Offscreen)

GFXBench 3.0's driver overhead test is one we don't normally publish, but in this circumstance it's one of the most important tests to examine. What this test does is render a large number of very simple objects. While that sounds like an easy task, the test renders each object one by one, and issues a separate draw call for each. This is essentially the most inefficient way possible to render the scene, as the GPU will be limited by the draw call throughput of the CPU and the graphics API managing them.

In this test, it's clear that Metal provides an enormous increase in performance. Even the lowest performance improvement for a device on Metal compared to OpenGL is still well over a 3x increase. While this test is obviously very artificial, it's an indication that Metal does indeed provide an enormous improvement in draw call throughput for developers to take advantage of.

GFXBench 3.0 Manhattan (Offscreen)

GFXBench 3.0 T-Rex HD (Offscreen)

While the driver overhead test is an interesting way of looking at how Metal allows for more draw call throughput, it's important to look at how it performs with actual graphics tests that simulate the type of visuals you would see in a 3D game. In both the Manhattan and T-Rex HD parts of GFXBench we do see an improvement when using Metal instead of OpenGL ES, but the gains are not enormous. The iPad Air 2 shows the greatest improvement, with an 11% increase in frame rate in T-Rex HD, and an 8.5% increase in Manhattan.
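One way to reason about why the gains differ so much between the driver overhead test and the full scenes is a simple bottleneck model. This is an assumption made for illustration, not something GFXBench or Apple documents: it treats CPU and GPU work as fully overlapped, so the slower of the two sets the frame time. The function names and all timings are hypothetical.

```python
def frames_per_second(cpu_ms, gpu_ms):
    """Toy model: CPU and GPU work overlap, so the slower one sets the pace."""
    return 1000 / max(cpu_ms, gpu_ms)


def speedup(cpu_before, cpu_after, gpu_ms):
    """Frame-rate ratio after reducing CPU time while GPU time stays fixed."""
    return frames_per_second(cpu_after, gpu_ms) / frames_per_second(cpu_before, gpu_ms)


# Hypothetical timings in milliseconds, chosen only to show the two regimes.
# Draw-call-heavy scene: the CPU dominates, so cutting CPU cost pays off fully.
cpu_bound = speedup(cpu_before=40, cpu_after=10, gpu_ms=5)

# GPU-heavy scene: the GPU dominates before and after, so frame rate is unchanged.
gpu_bound = speedup(cpu_before=12, cpu_after=3, gpu_ms=20)

print(cpu_bound, gpu_bound)
```

Real workloads sit somewhere between these extremes, since overlap is never perfect and parts of a frame can be CPU-limited, which would be consistent with gains that are modest rather than zero.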

The relatively small improvements in these real world benchmarks illustrate an important point about Metal, which is that it is not a magic bullet to boost graphics performance. While there will definitely be small improvements due to general API efficiency and lower overhead, Metal's real purpose is to enable new levels of visual fidelity that were previously not possible on mobile devices. An example of this is the Epic Zen Garden application from Epic Games. The app renders at 1440x1080 with 4x MSAA on the iPad, and it displays 3500 animated butterflies on the screen at the same time. This scene has an average of 4000 draw calls per frame, which is well above what can currently be achieved with OpenGL ES on mobile hardware.
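The Zen Garden figures can be turned into a rough sense of scale. The resolution, MSAA level, and draw call count below come from the text. The frame rate is not stated there, so `ASSUMED_FPS` is a placeholder chosen purely for illustration.

```python
# Figures quoted in the text for Epic Zen Garden.
WIDTH, HEIGHT = 1440, 1080
MSAA_SAMPLES = 4
DRAW_CALLS_PER_FRAME = 4000

# Assumption for illustration: the text does not state the frame rate.
ASSUMED_FPS = 30

pixels = WIDTH * HEIGHT
samples = pixels * MSAA_SAMPLES
draw_calls_per_second = DRAW_CALLS_PER_FRAME * ASSUMED_FPS
# CPU time available per draw call if the whole frame were spent issuing them.
budget_us_per_call = 1_000_000 / draw_calls_per_second

print(pixels, samples, draw_calls_per_second, round(budget_us_per_call, 2))
```

Under that assumed frame rate, each draw call would have to cost the CPU only a few microseconds even if nothing else ran during the frame, which gives a sense of why per-call API overhead matters at this draw call count.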

I think that Metal and other low level graphics APIs have a bright future. The introduction of Metal on OS X can simplify the process of bringing games to both Apple's desktop and mobile platforms. In the mobile space, developers of the most complicated 3D applications and games will be eager to adopt Metal as they begin to hit the limits of what visuals can be accomplished under OpenGL ES. While there are titles like Modern Combat 5 which use both Metal and OpenGL ES depending on the device, that method of development prevents you from using any of Metal's advantages effectively, as they will not scale to the OpenGL ES version. I cannot stress enough how much the continued sale of Apple A5 and A6 devices impedes the transition to using Metal only, and I hope that by the time Apple updates their product lines again those devices will be gone from sale, and eventually gone from use. Until that time, we'll probably see OpenGL ES continue to be used in most mobile game titles, with Metal serving as a glimpse of the mobile games that are yet to come.


  • Despoiler - Monday, June 15, 2015 - link

    Any discussion concerning these newer APIs vs old ones MUST include frame time plots and min FPS. I'm guessing that if we had those for this comparison we would see lower frame times and much higher min FPS.
  • tipoo - Monday, June 15, 2015 - link

    So looks like a large difference in theoretical driver overhead tests, much smaller difference in actual frame rates in full scenes, as expected with any of these low overhead APIs. Not an instant game changer in anything GPU bound, but could be leveraged to get more out of the CPUs further down the line.

    I do hope Apple adopts Vulkan on Mac as well when that comes out though, having that everywhere would be amazing for competition with DX12.
  • pSupaNova - Tuesday, June 16, 2015 - link

    Apple are in the steering group, unless the effort is botched they will implement this API.
  • akdj - Monday, June 15, 2015 - link

    Will you be posting the same type of benchmarking scores for OS X and its implementation of Metal? Seems with the significant changes and fickle 'back and forth' Apple is using with AMD/nVidia's solutions beside the latest iGPUs from could glean a bit more 'need to know' when it comes to deciding which model to buy (including old or pre owned/refurb stock) based on their need for compute vs. 'gaming' performance
    Personally I 'live in' After Effects. Seeing Adobe's chief whatever officer quoted as a 6-8x performance increase is exciting to someone that spends 6/10 hours in a specific program each day (six of my ten hours each workday on average was how that's supposed to read!)

    Thanks --- and you never know. Swift was announced as X-Plat this year at WWDC;). I know it's a high level language, not low level API --- but I'd not be surprised Apple trying to keep the two players happy. Course they're married to their own APIs but we've seen Adobe's willingness to 'accept' that horsepower for rendering off the GPU on both APIs now for a couple of years

    Thanks again, all have a great week!
  • name99 - Monday, June 15, 2015 - link

    "Low level graphics APIs aim to address this by removing much of the overhead that exists in current graphics APIs."

    I don't think this is quite the right way to look at it. I think a better model is that something like OpenGL makes a large number of promises about what state is modified when, what state persists across which calls, etc. You generally don't need all these promises, but that's how the API has evolved over time, and so the API has to fulfill its side of the contract, even when no-one cares.

    Low level APIs promise much less, with the corollary, of course, that it's now you that has to ensure that such promises as you care about are kept (eg tracking and rotating multiple buffers as they flow through different rendering stages, to ensure that you're neither waiting on a buffer nor leaking buffers).

    However even this is not as bad as it seems. The issue is not just that an API makes promises it has to enforce, it's a question of what the API exposes and abstracts. C makes few promises, but it exposes so much that there are severe constraints as to what the compiler can do automatically. In principle this means I can write a kick-ass parallel program, but for many purposes it's just a whole lot easier to use a higher level (eg functional) language. The higher level language does not expose details that for the most part don't need to be exposed, I can't use [and rely upon] those details, and so the compiler has more power to optimize my code.
    What this means is that even when the user doesn't want to handle low level details and so delegates the work to SceneKit or MetalKit or CoreAnimation, those can ALSO do their jobs better than OpenGL because, once again, state is not exposed in such a way that (even though the user does not care) in principle that state could be important and so has to be tracked and updated by OpenGL.

    You can call this "overhead" if you want, but it misses the point. The issue is not that OpenGL should engage in various micro-optimizations to remove this overhead, it's that OpenGL simultaneously exposes too much state, and exposes that state via obsolete and unhelpful abstractions.
  • bji - Monday, June 15, 2015 - link

    100% this. Read this folks. If you are not a developer, you may get some insight into what the actual issues are with regard to API abstractions and what kinds of problems actually need to be addressed as software architectures evolve.
  • dmunsie - Monday, June 15, 2015 - link

    "Until that time, we'll probably see OpenGL ES continue to be used in most mobile game titles, with Metal serving as a glimpse of the mobile games that are yet to come."

    Isn't it likely that the bigger titles will have both OpenGL ES and Metal support in their engines? And for smaller developers, Unity or Scene/SpriteKit will provide both OGL and Metal support as well.
  • felaki - Monday, June 15, 2015 - link

    _Please_ include battery power consumption and separate CPU/GPU utilization % rates in future tests about Metal/D3D12/Vulkan. In applications that are already GPU bound and that are well pipelined without CPU-GPU sync bubbles, it is expected that there won't be much of a FPS improvement, and for a lot of mobile targeting games that is very often the case that the code has already been optimized well to use batching and instancing techniques.

    For these types of applications, the big prospect of migrating a game that already runs at 60fps on a mobile device on to Metal, is the chance that one could do 60fps with, say, 20% of active CPU consumption while the game is running, while before with OpenGL ES one might have had the 60fps with e.g. 80% of CPU consumption. This would not show up as a difference in fps, but it is still a huge difference for the consumer, since it could mean a longer sustained gameplay experience with one battery charge.
  • taigebu - Monday, June 15, 2015 - link

    +1 ! We need to see the difference in power consumption. If I remember correctly it was one of the selling points Apple used last year when they introduced Metal on iOS: developers could either have the same performance for less battery usage or add more particles/effects/AI without it impacting the initial performance or battery usage.
  • toyotabedzrock - Monday, June 15, 2015 - link

    What is the battery advantage provided by Metal if the developers keep detail levels the same? This might lead to more game time.
