Memory Performance: 16GB DDR3-1333 to DDR3-2400 on Ivy Bridge IGP with G.Skill
by Ian Cutress on October 18, 2012 12:00 PM EST
Memory reviews are, in my opinion, actually quite hard to do. There are plenty of memory kits available that are nice and cheap, and the easy way to differentiate between them in a review is usually through synthetics – without too much effort we can find memory comparison articles online that deal solely in synthetics. The downside of synthetics is that they rarely emulate real-world performance. When the requests came in for a comparison of memory kits available on the market, I struggled to find real-world examples where memory truly matters by significant margins, and benchmarks to match. Fast forward a month or so, and we have compiled a series of tests built around one of the most memory-limited scenarios common to most users – IGP performance – using memory from DDR3-1333 to DDR3-2400. We have also mixed in some encoding and compression tests, and you may be surprised to hear that USB 3.0 performance is also affected by memory speed. In this article we also review the memory kits that G.Skill has graciously provided from their Ares, Sniper, RipjawsX, RipjawsZ and TridentX brands.
Memory in a Nutshell
Graphical performance is all about vector calculations - moving data from memory to the compute units for calculation then placing it back out again where required. High end graphics cards do this quite well, with the high end NVIDIA GTX680 video cards achieving a rated bandwidth of ~192 GB/s. In comparison, integrated graphics have a tough time. Their main memory store is the system memory, which can vary from 10 GB/s to 50 GB/s depending on the platform. There are architectural decisions made in both circumstances (discrete and IGP) to reduce the importance of memory bandwidth, and software can be written to hide the memory bandwidth or memory latency issues. But the fact still remains that memory bandwidth is key and vital for a good number of real-world applications and usage scenarios.
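To put the system memory figures in context, theoretical peak bandwidth can be estimated from the data rate. The sketch below assumes the standard 64-bit (8-byte) DDR3 channel and a dual-channel platform; the function name and defaults are illustrative, and real-world throughput will be lower than this ceiling.

```python
def peak_bandwidth_gbs(data_rate, channels=2, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    data_rate is the DDR3 rating (e.g. 1333 for DDR3-1333), i.e. millions
    of transfers per second. bus_bytes=8 assumes a 64-bit channel.
    """
    return data_rate * 1e6 * bus_bytes * channels / 1e9


for rate in (1333, 1600, 1866, 2133, 2400):
    print(f"DDR3-{rate}, dual channel: {peak_bandwidth_gbs(rate):.1f} GB/s")
```

Even the fastest kit in this range stays well below the ~192 GB/s quoted for the discrete card, which is why the IGP is so sensitive to memory speed.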
The future of memory is a little mysterious to say the least. Current systems run DDR3 SDRAM at speeds from 800 MHz to 3000 MHz, with kits varying in price, performance, power usage, and whether the memory controller can handle such a speed. Those 3000 MHz modules cost a pretty penny, and are reputed to only work with 1 in 10 Ivy Bridge processors. The immediate future for memory still lies in DDR3 – the next iteration, DDR4, is still several years away. We are told that on the Intel side of things, Haswell is DDR3, as will be Broadwell, the Haswell replacement. Reports expect DDR4 to be less than 10% of the market in late 2014 (early adoption in the high end space), but 50%+ across 2015. DDR4 is expected to have a base speed of 2133 MHz up to 3200 MHz for initial enthusiast applications – though given the rise in enthusiast speeds this could seemingly be pushed to 4266 MHz+ over the course of the development cycle. DDR4 is also expected to be a single module per channel, paving the way for up-to-quad channel in the mainstream arena.
There are also exciting technologies being developed in the memory space, for both NAND and DRAM – memristors, ReRAM, stacked memory, spintronics et al. If history is anything to go by, as long as these technologies are not hindered by patents, trolls or physics, each could lead to interesting products coming to market. Though we may have to wait several years, and chances are that only one or two will come through for their respective markets, and the rest will go the way of Betamax and HD-DVD.
Back to our DDR3 memory, G.Skill was kind enough to provide us several kits for this overview of memory performance. Most DDR3 kits on sale for the vast majority of users come in speeds from 1333 MHz to 2133 MHz. Anything above DDR3-2133 is definitely in the enthusiast range, and as such G.Skill also sent us a DDR3-2400 kit to test for this overview. In due course we also have a DDR3-2666 kit to test, so stay tuned for that review.
Not All About The MHz
But memory is not all about the MHz, just as computer speed is not all about the MHz and cores. The sub-timings decide when memory should be accessed and what delays are put in place between read and write cycles. These sub-timings are arguably more important than the MHz number, as we will see in the review. The main timings on display to the public are the following:
CAS Latency (CL)
RAS to CAS (tRCD)
RAS Precharge (tRP)
Row Active Time (tRAS)
Row Cycle Time (tRC)
Command Rate (CR)
For a very extensive look into memory, our last big memory article went into obscene depth on how memory works. Please read it here, and I will confess that I do not understand it after just one reading, and need a pen and paper when going through it thoroughly. One of the most important images of that memory article is the following:
Shown here are a pair of "back-to-back" reads. Our example Row Cycle Time (tRC) lets us transfer up to 16 bytes of data with a minimum Page open time of 24T using CL-tRCD-tRP-tRAS timings of 6-6-6-18.
Using this image, from left to right, we can explain what the timings mean to a certain degree:
tRAS determines the length of time between initialization and the requirement for the memory row to recharge. Within this tRAS we need a tRCD to initialize the column of the row from which we would like to read. After the tRCD is the CL, which provides a read latency. There are also other features which allow for reads across multiple columns within the tRAS; however, in order to move to the next row the tRAS needs to end, and the tRP allows the next row to precharge.
All this means that:
If tRAS is a low number, it is quick to read from different rows. If it is a high number, reading from different columns is easier.
If CL is a low number, reading from within a row (and the columns) is quicker.
If tRCD is low, more CLs can be initialized inside the tRAS.
If tRP is low, then the overall time (tRAS+tRP) to jump between row reads is quicker.
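As a rough illustration of the last point, the row-to-row time (tRAS + tRP) can be converted from clock cycles to nanoseconds. This is a minimal sketch: the function name is hypothetical, and the clock period is taken as 2000 / data rate, which assumes the usual DDR convention of two transfers per clock.

```python
def row_cycle_ns(data_rate, tras, trp):
    """Approximate time to jump between row reads (tRAS + tRP) in ns.

    data_rate is the DDR3 rating (e.g. 1333); DDR moves two transfers
    per clock, so one clock lasts 2000 / data_rate nanoseconds.
    """
    clock_period_ns = 2000.0 / data_rate
    return (tras + trp) * clock_period_ns


# The 6-6-6-18 example from the figure: 18 + 6 = 24 clocks per row cycle.
print(row_cycle_ns(2000, tras=18, trp=6))
```

Note that a faster kit with looser timings can still have a shorter row cycle in nanoseconds, because each clock is shorter.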
When we buy a memory kit, we usually get a SKU number and a description of the modules at hand. Let us look at the first kit we will be testing today:
F3-1333C9Q-16GAO
4x4 GB DDR3-1333 9-9-9-24 1.50V
The first line describes the module in the form of a SKU, which allows for stock checking. In this case, G.Skill’s naming scheme makes it simple – F3 means DDR3; 1333C9 means 1333 MHz with CL9; Q means quad module kit; 16G means it is a 16GB kit; A means the Ares branding; and O means our kit is colored orange.
The second line is a little more readable. First we get the size of the kit (4x4 GB) then the speed (DDR3-1333). Next are the sub-timings, which will always appear in the order of CL-tRCD-tRP-tRAS. This means that our three main sub-timings are 9-9-9, and the tRAS is 24. The last bit of information is the voltage of the kit.
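The description line is regular enough to be split into fields programmatically. The following is a small sketch, not a vendor tool: the function name, field names, and the regular expression are assumptions based only on the format shown in this article.

```python
import re

# Matches e.g. "4x4 GB DDR3-1333 9-9-9-24 1.50V" or "4 x 4 GB DDR3-2400 10-12-12-31 1.65 V".
_KIT_PATTERN = re.compile(
    r"(?P<modules>\d+)\s*x\s*(?P<size_gb>\d+)\s*GB\s+"
    r"DDR3-(?P<speed>\d+)\s+"
    r"(?P<cl>\d+)-(?P<trcd>\d+)-(?P<trp>\d+)-(?P<tras>\d+)\s+"
    r"(?P<voltage>\d+(?:\.\d+)?)\s*V"
)


def parse_kit(description):
    """Split a kit description into size, speed, sub-timings and voltage."""
    match = _KIT_PATTERN.fullmatch(description.strip())
    if match is None:
        raise ValueError(f"unrecognised kit description: {description!r}")
    fields = {
        name: int(value)
        for name, value in match.groupdict().items()
        if name != "voltage"
    }
    fields["voltage"] = float(match.group("voltage"))
    fields["total_gb"] = fields["modules"] * fields["size_gb"]
    return fields


print(parse_kit("4x4 GB DDR3-1333 9-9-9-24 1.50V"))
```

The sub-timings come out in the same CL-tRCD-tRP-tRAS order in which they are printed on the kit.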
So What Can I Do About Sub-Timings?
As a general rule, lower is better. Memory kits on the market will vary in their sub-timings – you can purchase DDR3-1333 9-9-9, DDR3-1600 11-11-11, all the way up to DDR3-3000 12-14-14. The question then becomes how to decide between similar kits. Imagine the following kits:
DDR3-1333 9-9-9
DDR3-1600 10-10-10
DDR3-1866 11-11-11
In a lot of scenarios, an enthusiast may take one look at these numbers and tell a user that these kits are equivalent – boosting the memory speed but increasing the sub-timing latencies results in similar performance. There is one way to determine whether a kit might be better than another, and that is to look at the calculable latency of the kit.
Calculating this value is a simple enough formula:
2000 x (CL / Speed ) = Latency in nanoseconds (ns)
Thus for the three kits above:
DDR3-1333 9-9-9 has a latency of 13.5 ns
DDR3-1600 10-10-10 has a latency of 12.5 ns
DDR3-1866 11-11-11 has a latency of 11.79 ns
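The formula is easy to script when comparing a longer list of kits. A minimal sketch, with a hypothetical function name, that reproduces the three figures above to rounding:

```python
def cas_latency_ns(data_rate, cl):
    """Latency in nanoseconds: 2000 x (CL / Speed)."""
    return 2000.0 * cl / data_rate


for rate, cl in [(1333, 9), (1600, 10), (1866, 11)]:
    print(f"DDR3-{rate} CL{cl}: {cas_latency_ns(rate, cl):.2f} ns")
```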
This latency essentially tells us which kit is fastest at non-sequential reads. Non-sequential reads are important in a lot of variable scenarios, such as video games whereby the user could perform one of a billion different actions and as such different elements of the memory have to be loaded.
The downside of this test is that it does not take into account consecutive reads. When dealing with conversion, video editing, or anything that requires a large dataset to be read sequentially, we have to look at how long reads are processed.
The way to check this with DDR3 is as follows:
Cycle time in ns = 1000 / (Memory Speed / 2)
Bit time in ns = 1000 / Memory Speed
The time to read a single word of data (word is a technical term meaning 64 bits) is given by the Cycle Time multiplied by the CL. The time to read eight words is the Cycle Time multiplied by the CL then add seven lots of Bit Time. Let us go through the memory kits above with this method.
DDR3-1333 9-9-9 has a Cycle Time of 1.5 ns and a Bit Time of 0.75 ns
The time to read one word is 1.5*9 = 13.5 ns
The time to read eight words is 13.5 + 7 * 0.75 = 18.75 ns
DDR3-1600 10-10-10 has a Cycle Time of 1.25 ns and a Bit Time of 0.625 ns
The time to read one word is 1.25 * 10 = 12.5 ns
The time to read eight words is 12.5 + 7 * 0.625 = 16.875 ns
DDR3-1866 11-11-11 has a Cycle Time of 1.072 ns and a Bit Time of 0.536 ns
The time to read one word is 1.072 * 11 = 11.79 ns
The time to read eight words is 11.79 + 7 * 0.536 = 15.54 ns
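The same arithmetic can be wrapped in a short function. This is a sketch of the high-level model described here (Cycle Time multiplied by CL, plus one Bit Time per additional word); the function name and the `words` parameter are illustrative.

```python
def read_time_ns(data_rate, cl, words=1):
    """Time to read `words` consecutive 64-bit words, in nanoseconds."""
    cycle_time = 1000.0 / (data_rate / 2.0)  # ns per clock cycle
    bit_time = 1000.0 / data_rate            # ns per transfer
    return cycle_time * cl + (words - 1) * bit_time


for rate, cl in [(1333, 9), (1600, 10), (1866, 11)]:
    one = read_time_ns(rate, cl, words=1)
    eight = read_time_ns(rate, cl, words=8)
    print(f"DDR3-{rate} CL{cl}: 1 word {one:.2f} ns, 8 words {eight:.2f} ns")
```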
In both the short reads and long reads, DDR3-1866 11-11-11 wins out of the three kits. But what if it was not so clear cut?
The following kits have the following timings and results:
DDR3-2000 at 9-9-9 reads one word in 9 ns and eight words in 12.5 ns
DDR3-1600 at 7-7-7 reads one word in 8.75 ns and eight words in 13.125 ns
This means that the DDR3-2000 kit should be better for longer reading workloads, whereas the DDR3-1600 kit should be better for random reads.
I should stress (and add a disclaimer) that this comparison is all at the high level, as we are only talking about memory speed and CAS Latency – everything else plays its part, and I highly suggest reading Rajinder’s memory article to get a deeper look as to how this all works.
Personally, I use these formulas when overclocking competitively – if I have two kits, one of which can do DDR3-2000 6-7-7 and the other is DDR3-2666 11-13-13, I can decide which one is more appropriate for the benchmark in question.
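For a pair of kits like this, one way to frame the decision is to ask how many consecutive words have to be read before the higher-speed kit overtakes the lower-latency one. The sketch below applies the same high-level model (speed and CL only); the function names and the `max_words` cut-off are assumptions for illustration, and it ignores every other sub-timing.

```python
def read_time_ns(data_rate, cl, words=1):
    """Time to read `words` consecutive 64-bit words, in nanoseconds."""
    return 2000.0 * cl / data_rate + (words - 1) * 1000.0 / data_rate


def first_win(kit_a, kit_b, max_words=64):
    """Smallest word count at which kit_b reads strictly faster than kit_a.

    Each kit is a (data_rate, cl) tuple. Returns None if kit_b never
    pulls ahead within max_words.
    """
    for words in range(1, max_words + 1):
        if read_time_ns(*kit_b, words) < read_time_ns(*kit_a, words):
            return words
    return None


# DDR3-2000 CL6 against DDR3-2666 CL11.
print(first_win((2000, 6), (2666, 11)))
```

Under this simplified model the DDR3-2666 11-13-13 kit only pulls ahead at around 20 consecutive words, so the DDR3-2000 6-7-7 kit would be the pick for a benchmark dominated by short, random reads.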
The command rate (CR), the funny little number at the end of the timings, is often quoted as 1T or 2T depending on the memory kit, how many modules are installed, and the motherboard settings. The command rate is the address and command decode latency, essentially the delay between accessing and decoding data - this delay allows the memory time to be accessed without errors.
In an ideal world there would be no latency at all. In performance tests, a setting of 1T is shown to be quicker than 2T for synthetic benchmarks. Whether a user can feel the difference (in essence it adjusts peak bandwidth as well) is debatable, but the slower the kit as standard, the more of a difference will be felt between the two options. The argument also exists that a setting of 2T will allow the kit to be overclocked higher.
By default 2T is usually selected for memory kits that contain more modules - on the off chance that one module of the kit cannot perform at the stated speed using 1T timings, defaulting to 2T will make sure more modules pass the binning process.
Standards and The Issue With Memory
Contrary to the most popular of beliefs, memory kits do not work as stated out of the box. The number of times I have walked through a large LAN event and found people playing games on $2000+ water cooled systems, only to find that their kit of DDR3-2400 is actually running at DDR3-1333, astounds me. It is a lot more common than you think, and there is probably someone you know who is guilty of this. Making sure memory is set at its rated speed is an important part of the process, and as enthusiasts we have a job to make sure that is the case.
Rant aside, this is an important point – when we buy a processor, it always runs at the stated speed. When we plug it into the system, there is no fiddling required. If every time I installed a processor I had to go into the BIOS and adjust it so it runs above 1.2 GHz or 1.6 GHz, I would be annoyed. So why is there this discontinuity on the memory side? Why do we have to go into the BIOS to adjust the memory speed to what it says on the box?
The issue is largely down to compatibility. When a processor is installed into the board, the processor knows that it will go into a board that has the right socket, it knows that there will be pins for a certain number of PCIe lanes or for data transfer to the chipset. It also knows that there will be memory on the end of some pins that runs at a designated multiplier as dictated by the BIOS. The issue with memory is that the memory does not know where it will be plugged into.
A DDR3 module or kit could be plugged into any DDR3 compatible motherboard, and paired with AMD, Intel, or any other processor capable of DDR3, such as server parts. As processor design is now putting the memory controller onto the CPU itself, the capabilities of that memory controller can vary wildly. On a Xeon processor, the system may only accept 1600 MHz maximum due to the capable multipliers, so it would be foolish to try and boot the system with a 2133 MHz kit attempting to apply full speed. We could plug a DDR3-2666 kit into a Sandy Bridge system, but the memory controller would refuse to run at 2666 MHz. However, take the same motherboard and an Ivy Bridge processor, and the memory should be able to work. Then at the high end, remember I mentioned that there are DDR3-3000 memory kits that only work with 10% of Ivy Bridge i7-3770K processors? There is that too. I could plug a four-module DDR3 kit into a 990FX board, a P67 motherboard, a B75 motherboard, or something nice and obscure. The memory does not know what processor or memory controller it is going to get, but the processor does know that it will get DDR3 when it is plugged in. There are a lot more variables on the memory side which are unpredictable.
With that being said, we have seen some Kingston memory with plug-and-play capabilities. This memory was limited in speed and availability, and did not catch on in the way that it should. Speaking with memory vendors, the main barrier to this being applied globally is the motherboards themselves – the motherboard should be able to recognize a plug-and-play kit then adjust accordingly. There are already standards set in place (JEDEC, XMP – more later on these), so if the plug-and-play does not work, then the speed will be reduced down to the one that works. It sounds simple, but then again how do we confirm that the memory works? If it boots into an operating system, or if it survives 72 hours of MemTest86 or Linpack? Do people want to wait 3 days to get the system at the speed the kit is rated? The answer is almost certainly no, hence why we are limited to adjusting a BIOS setting to get the speed we want.
I have floated the idea of having software with the memory kit to enable XMP through the operating system, but the main barrier to that is the need for the software to work with every motherboard available. The next thought was whether the motherboard manufacturers could create the software, to enable a JEDEC or XMP setting on the next boot through software. As expected, the answer was the complication of so many modules and so many motherboards. The answer to this new problem would be to add standards to the memory and the motherboards so this all works – but there are already standards. For this to work, it would require a deep partnership between a motherboard manufacturer and a memory vendor, potentially aiding sales from both sides. We will see.
In the meantime, make sure your friends and family are running their memory at rated speed!
Enough! Where Is All The Memory?
This review takes into account five kits from DDR3-1333 to DDR3-2400. Many thanks to G.Skill for providing us with these memory kits, one of each from their Ares, RipjawsX, Sniper, RipjawsZ and TridentX series. Specifically, we have the following kits:
4 x 4 GB DDR3-1333 9-9-9-24 1.50 V : F3-1333C9Q-16GAO (Ares)
4 x 4 GB DDR3-1600 9-9-9-24 1.50 V : F3-12800CL9Q-16GBXL (RipjawsX)
4 x 4 GB DDR3-1866 9-10-9-28 1.50 V : F3-14900CL9Q-16GBSR (Sniper)
4 x 4 GB DDR3-2133 9-11-10-28 1.65 V : F3-17000CL9Q-16GBZH (RipjawsZ)
4 x 4 GB DDR3-2400 10-12-12-31 1.65 V : F3-2400C10Q-16GTX (TridentX)
Over the next few pages, we run down all of these kits.
Calin - Friday, October 19, 2012
I remember the times when I had to select the speed of the processor (and even that of the processor's bus) with jumpers or DIP switches... It wasn't even so long ago, I'm sure anandtech.com has articles with mainboards with DIP switches or jumpers (jumpers were soooo Pentium :p but DIP switches were used in some K6 mainboards IIRC )
Ecliptic - Friday, October 19, 2012
Great article comparing different speed ram at similar timings but I'd be interested in seeing results at different timings. For example, I have some ddr3-1866 ram with these XMP timings:
1333 @ 6-6-6-18
1600 @ 8-8-8-24
1866 @ 9-9-9-27
The question I have is if it better to run it at the full speed or lower the slower speed and use tighter timings?
APassingMe - Friday, October 19, 2012
+ 1
+ 2, if I can get away with it. I've always wondered the same thing. I have seen some minor formulas designed to compare... something like frequency divided by timing, in order to get a comparable number. But that is pure theory for the most part, I would like to see how the differences in the real world effects different systems and loads.
Spunjji - Friday, October 19, 2012
But in all seriousness, I would find that to be much more useful - it's more likely to actually be used for IGP gaming.
If you could go as far as to show the possible practical benefits of the higher-speed RAM (e.g. new settings /resolutions that become playable) that would be spiffing.
vegemeister - Friday, October 19, 2012
Stop using 2 pass for benchmarks. Nobody is trying to fit DVD rips onto CD-Rs anymore. Exact file size *does not matter*. Using the same CRF for every file in a set (say, a season of a television series) produces a much better result and takes less time (you pretty much avoid the first pass).
IanCutress - Friday, October 19, 2012
The 2-pass is a feature of Greysky's x264 benchmark. Please feel free to email him if you would like him to stop doing 2-pass. Or, just look at the 1st pass results if the 2nd pass bothers you.
rigel84 - Friday, October 19, 2012
Hi, I don't know if I somehow skipped it in the article, but if I buy a 3570k and some 1866mhz memory, wouldn't I have to overclock the CPU in order for them to run at that speed? I'm pretty sure I had to overclock my RAM on my P4 2,4ghz, in order to use the extra mhz.. Does my memory fail me or has things changed?
IanCutress - Friday, October 19, 2012
No, you do not have to overclock the CPU. This has not been the case since the early days :D. Modern computer systems have a BIOS option to adjust the memory strap (1333/1600/1866 et al.) as required. On Intel systems and these memory kits, all that is needed is to set XMP - you need not worry about voltages or sub-timings unless you are overclocking the memory.
CaedenV - Friday, October 19, 2012
as there is an obvious difference with ram speed for onboard graphics, the next obvious question is one of how much memory is needed to prevent the system from throwing things back on the HDD?
The reason I ask is that 16GB, while relatively cheap today, is still a TON of ram by today's standards, and people who are on a budget where they are playing with igp are not going to be able to afford an i7, and much less be willing to fork over ~$100 for system memory. However, if there is no performance hit moving down to 8GB of system memory it becomes much more affordable for these users to purchase better performing ram because the price points are even closer together between the performance tiers. As I understand memory usage, there should be no performance hit so long as there is more memory available than is actively being used by the game, so the question is how much is really needed before hitting that need for more memory? is the old standard of 4GB enough still? or do people need to step up to 8GB? or, if nothing is getting passed onto a dedicated GPU, do igp users really need that glut of 16GB of ram?
Lastly, I remember my first personal build being a Pentium 3 1GHz machine for a real time editing machine for college. I remember it being such an issue because the Pentium 4 was out, but was tied to Rambus memory which had a high burst rate, but terrible sustained performance, and so I agonized for a few months about sticking with the older but cheaper platform that had consistent performance, vs moving up to the newer (and terribly more expensive) P4 setup which would perform great for most tasks, but not as well for rendering projects. Anywho, I ended up getting the P3 with 1GB of DDR 133 memory. I cannot remember the actual price off hand (2001), but I do remember that the system memory was the 2nd most expensive part of the system (2nd to the real time rendering card which was $800). It really is mindblowing how much better things have gotten, and how much cheaper things are, and one wonders how long prices can remain this low with sales volumes dropping before companies start dropping out and we have 2-3 companies that all decide to up prices in lock step.
IanCutress - Friday, October 19, 2012
With memory being relatively cheap, on a standard DDR3 system running Windows 7, 8 GB would be the minimum recommendation at this level. As I mentioned in my review, in my work load the most I have ever peaked at was 7.7 GB, and that was while playing a 1080p game with all the extras alongside lots of Chrome tabs and documents open at the same time.
Ideally this review and comparison should be taken from the perspective that you should know how much memory you are using. For 99.9% of the populace, that usually means 16GB or less. Most can get away with 8, and on a modern Windows OS I wouldn't suggest anything less than that. 4GB might be ok, but that's what I have in my netbook and I sometimes hit that.