Saturday, July 19, 2008

Everything You Need to Know About Dual Channel

Introduction

The system RAM prevents the PC from achieving its maximum performance. This happens because the processor (CPU) is faster than the RAM and usually has to wait for the RAM to deliver data. During this wait time the CPU is idle, doing nothing (that's not absolutely true, but it fits our explanation). In a perfect computer, the RAM would be as fast as the CPU. Dual channel is a technique used to double the maximum theoretical transfer rate between the memory controller and the RAM, thus improving system performance. In this tutorial we will explain everything you need to know about dual channel technology: how it works, how to set it up, how to calculate transfer speeds and more.

Before explaining what dual channel is, let’s first explain how RAM memory is traditionally connected to the system.

Memory is controlled by a circuit called the memory controller. In the case of Intel CPUs, this circuit is physically inside the chipset (more specifically, inside the north bridge chip, which Intel calls the MCH, Memory Controller Hub). In the case of current AMD CPUs (i.e. CPUs based on the AMD64 architecture), it is inside the CPU; older AMD CPUs like the Athlon XP use the same scheme as Intel CPUs.

RAM is connected to the memory controller through a series of wires. These wires are divided into three groups: data, address and control. The wires of the data bus carry data that is being read (i.e. transferred from the memory to the memory controller and then to the CPU) or written (i.e. transferred from the memory controller to the memory, coming from the CPU). The wires of the address bus tell the memory modules exactly where (i.e. at which address) that data must be retrieved from or stored. The control wires send commands to the memory modules, telling them what kind of operation is being done, for example whether it is a write (store) or a read operation. Another important wire present on the control bus is the memory clock signal. We summarize this in Figure 1. Our drawing is based on an Intel system. On AMD CPUs the memory controller is inside the CPU and thus the memory bus comes directly from the CPU with no “middleman”.

Figure 1: How memory is accessed.

The memory speeds (clock rates), maximum capacity and types (DDR, DDR2, DDR3, etc.) a system can accept are defined by the chipset (Intel) or by the CPU (AMD). For example, whether an Intel system accepts DDR3 memories depends on the chipset (and on the motherboard providing the right kind of memory sockets) and not on the CPU. AMD systems currently can’t work with DDR3 memories because the embedded memory controller can’t recognize this technology.

As for the clock rates, if the memory controller can only generate a clock rate of, let’s say, 667 MHz (333 MHz x 2), your DDR2-800 memories will work at 667 MHz on this particular system. This is a physical limitation of your memory controller. Usually you will see this kind of limitation only on Intel systems, as AMD CPUs can recognize DDR2 memories up to 800 MHz (socket AM2 CPUs) or up to 1,066 MHz (socket AM2+ Phenom CPUs).
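The clock rate limitation described above can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are our own, and it assumes the modules simply fall back to the fastest DDR clock rate the memory controller can generate.

```python
def effective_ddr_clock(module_mhz: int, controller_max_mhz: int) -> int:
    """Return the DDR clock rate (in MHz) at which the modules will run.

    Assumption: the modules run at whichever is lower, their own rated
    clock or the maximum clock the memory controller can generate.
    """
    return min(module_mhz, controller_max_mhz)


# DDR2-800 modules on a memory controller limited to 667 MHz
print(effective_ddr_clock(800, 667))  # 667
# The same modules on a controller that supports up to 1,066 MHz
print(effective_ddr_clock(800, 1066))  # 800
```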

Another interesting point is the maximum amount of memory the system can recognize. Most Intel CPUs have a 32- or a 36-bit memory address bus (here we are referring to the address bus available on the CPU external bus, i.e. on the CPU front side bus). This allows the CPU to recognize up to 4 GB (2^32 bytes) or 64 GB (2^36 bytes) of memory, respectively. But since it is the memory controller that accesses the memory (not the CPU directly), this middleman may limit the maximum amount of RAM your system can have. For example, Intel P35 and G33 chipsets can only access up to 8 GB of RAM (2 GB per memory socket). In addition, the motherboard manufacturer may not provide enough memory sockets on the motherboard to reach the maximum amount of RAM that the CPU can theoretically access. For example, if a manufacturer produces a motherboard based on the Intel G33 chipset with only two memory sockets, the maximum amount of memory you can have is 4 GB (2 GB per socket), even though the chipset is capable of accessing up to 8 GB.
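The arithmetic above can be checked with a short Python sketch. The function names are ours, and the sketch assumes that each address selects one byte and that the usable amount of RAM is simply the smallest of the three limits (CPU address bus, chipset, and sockets on the motherboard).

```python
GB = 2 ** 30  # bytes in one gigabyte (binary)


def addressable_gb(address_bits: int) -> int:
    """Memory (in GB) reachable with the given address bus width,
    assuming one byte per address."""
    return 2 ** address_bits // GB


def max_installable_gb(address_bits: int, chipset_limit_gb: int,
                       sockets: int, gb_per_socket: int) -> int:
    """Smallest of the CPU, chipset and motherboard socket limits."""
    return min(addressable_gb(address_bits),
               chipset_limit_gb,
               sockets * gb_per_socket)


print(addressable_gb(32))  # 4
print(addressable_gb(36))  # 64
# G33 chipset (8 GB limit, 2 GB per socket) with four sockets
print(max_installable_gb(36, 8, 4, 2))  # 8
# The same chipset on a motherboard with only two sockets
print(max_installable_gb(36, 8, 2, 2))  # 4
```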

Since all kinds of memory modules available today are 64-bit devices, the memory data bus is 64-bit wide. What dual channel technology does is expand the memory data bus from 64 to 128 bits.

What is Dual Channel?

Dual channel is the ability that some memory controllers have to expand the width of their data busses from 64 to 128 bits. Provided that everything else remains the same (clock speeds, for example), the memory’s maximum theoretical transfer rate is doubled by the use of this technique.

The maximum theoretical transfer rate (MTTR) is calculated using this formula:

MTTR = real clock rate x data transferred per cycle x bits transferred per cycle / 8

Or

MTTR = DDR clock rate x bits transferred per cycle / 8

Memories based on DDR (Double Data Rate) technology such as DDR-SDRAM, DDR2-SDRAM and DDR3-SDRAM transfer two pieces of data per clock cycle. Because of that, they achieve double the transfer rate of traditional memories (such as the original SDRAM) running at the same clock rate, and they are usually labeled with double their real clock rate. For example, DDR2-800 memories in reality work at 400 MHz transferring two pieces of data per clock cycle, and thus are labeled as being an “800 MHz” device, even though the clock signal doesn’t really work at 800 MHz.

So for DDR-based memories the “data transferred per cycle” term in the first formula is two; multiplying the real clock rate by two gives the DDR clock rate used in the second formula.

So a DDR2-800 memory module – which is a 64-bit device, as mentioned before – has a maximum theoretical transfer rate of 6,400 MB/s (800 MHz x 64 / 8). This is why memory modules using DDR2-800 memory chips are also called PC2-6400. This number refers to the memory’s maximum theoretical transfer rate in MB/s (megabytes per second).

If we enable the dual channel technique with DDR2-800 modules, the memory subsystem’s maximum theoretical transfer rate is doubled, jumping from 6,400 MB/s to 12,800 MB/s (800 MHz x 128 / 8), as we will be transferring double the amount of data (128 bits vs. 64 bits) each clock cycle.
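The two MTTR formulas and the DDR2-800 figures can be reproduced with a small Python sketch. The function and parameter names are our own choice, and clock rates are given in MHz so that the result comes out in MB/s.

```python
def mttr_from_real_clock(real_clock_mhz: int, transfers_per_cycle: int,
                         bus_width_bits: int) -> int:
    """First formula: real clock x data per cycle x bus width / 8 (MB/s)."""
    return real_clock_mhz * transfers_per_cycle * bus_width_bits // 8


def mttr_from_ddr_clock(ddr_clock_mhz: int, bus_width_bits: int) -> int:
    """Second formula: DDR clock x bus width / 8 (MB/s)."""
    return ddr_clock_mhz * bus_width_bits // 8


# DDR2-800: real clock of 400 MHz, two transfers per cycle, 64-bit module
print(mttr_from_real_clock(400, 2, 64))  # 6400
# Same module, using the DDR ("labeled") clock rate
print(mttr_from_ddr_clock(800, 64))      # 6400 (single channel)
# Dual channel: the data bus is 128 bits wide
print(mttr_from_ddr_clock(800, 128))     # 12800
```

Both formulas give the same number for the same module, which is why the label DDR2-800 and the name PC2-6400 describe the same part.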

It is very important to note that these transfer rates are “theoretical”. When we calculate them we are assuming that a data transfer will occur at each clock cycle (i.e. that on a DDR2-800 memory 800,000,000 transfers per second will occur), which in fact never happens, because no CPU or memory controller is transferring data 100% of the time. That is why when you measure the actual memory transfer rate of your system using a program such as Sandra you always get a value lower than the maximum theoretical transfer rate.

It is also important to note that the performance increase is achieved only on the memory subsystem; a 100% theoretical performance increase there does not translate into a 100% performance increase in your whole computer. Only a small percentage of this memory performance increase will be reflected in the overall system performance.

At this point we want to explain in detail what physically happens with the memory data bus, because we’ve seen a lot of wrong information being posted on our forums about how the dual channel technique works.

Let’s first assume a system that doesn’t support the dual channel feature (i.e. a single channel system).

When we say that the memory data bus is 64-bit wide, this means that there are 64 wires (yes, physical wires on the motherboard) connecting the memory controller and the memory sockets. These wires are labeled D0 through D63. The memory data bus is shared among all memory sockets. The address and control busses will activate the proper memory socket depending on the address where data must be stored or read from. We illustrate this in Figure 2.

Figure 2: How single channel works.

On systems supporting dual-channel technology, the memory data bus is expanded to 128 bits. This means that on such systems there are 128 wires connecting the memory controller and the memory sockets. These wires are labeled D0 through D127. Since each memory module can only accept 64 bits per cycle, two memory modules are used to fill the 128-bit data bus. So for dual-channel technology to work you need to have an even number of memory modules on your system (assuming that your AMD CPU or Intel chipset supports this technology, of course). If you install just one module this technique won’t work, because memory will still be accessed 64 bits per cycle. In other words, dual channel works by accessing two memory modules in parallel, i.e. at the same time.

Figure 3: How dual channel works.

Because the two modules are accessed at the same time, they must be identical (same capacity, same timings and same clock rate).
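The requirements above (dual channel support, an even number of modules, identical modules) can be written down as a simple check. This is a rough sketch, not a description of how any real memory controller decides: the Module class and its fields are hypothetical, and for simplicity it requires all installed modules to match, which is stricter than requiring each pair to match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Hypothetical description of one memory module."""
    capacity_mb: int
    ddr_clock_mhz: int
    timings: tuple  # e.g. (5, 5, 5, 15)


def dual_channel_possible(modules: list, controller_supports_it: bool) -> bool:
    """Return True if the rules given in the text are all satisfied."""
    if not controller_supports_it:
        return False
    # An even, non-zero number of modules is needed to fill 128 bits
    if len(modules) == 0 or len(modules) % 2 != 0:
        return False
    # Simplifying assumption: every module must equal the first one
    return all(m == modules[0] for m in modules)


a = Module(1024, 800, (5, 5, 5, 15))
b = Module(1024, 800, (5, 5, 5, 15))
c = Module(2048, 800, (5, 5, 5, 15))

print(dual_channel_possible([a, b], True))   # True
print(dual_channel_possible([a], True))      # False (only one module)
print(dual_channel_possible([a, c], True))   # False (different capacity)
print(dual_channel_possible([a, b], False))  # False (no controller support)
```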
