Difference Between DDR4 vs GDDR5/GDDR6 Memory; DDR4 vs LPDDR4 Explained

Modern computers use many different kinds of memory: DDR4, GDDR5, GDDR6, LPDDR4, and so on. While these are all based on DRAM, there are some key differences between them. DDR4 is used in most PCs as the main memory and is the most popular form of DRAM. GDDR5 and GDDR6 are used in graphics cards as dedicated graphics memory. Although they are also based on DRAM, they differ from DDR4 in several ways.

DDR4 vs GDDR5 vs GDDR6: What’s the Difference Between Main Memory and Graphics Memory

Many people confuse the two and use the terms interchangeably. There’s also LPDDR4 memory, used in smartphones and other mobile devices, and HBM, used in servers and exascale computers. In this post, we explore the differences between DDR4 and GDDR5 memory, along with a brief explanation of HBM, LPDDR4 and the newer GDDR6 standard.

Double Data Rate Generation Four (DDR4)

Nearly every kind of memory is based on dynamic random access memory or DRAM. It’s slower than static RAM (SRAM) as it has to be refreshed continuously by the memory controller. At the same time, it is much more affordable, which is the primary reason for its widespread use. SRAM is used as the cache memory in GPUs and CPUs as it’s much faster and more efficient than DRAM.

DDR4 is the latest iteration of DRAM. Released in 2014, it initially focused on reducing the voltage and power consumption rather than increasing the operating frequencies. With the arrival of AMD’s Ryzen processors and their MCM design, high-speed DDR4 memory has become much more relevant. DDR4 memory modules rated at 3600MHz (strictly speaking, 3600 MT/s) out of the box are now widely available, while some can be pushed as high as 5000MHz.

DDR4 vs DDR3

Aside from the obvious (higher frequencies and lower latency), the primary advantage of DDR4 memory over DDR3 is larger DIMM sizes (up to 64 GB, whereas DDR3 is limited to 16 GB). It also draws considerably less power and runs at a lower voltage.

With that out of the way, let’s talk about GDDR5 memory, the predominant memory standard on video cards (now being replaced by GDDR6).

DDR4 Vs GDDR5

  • DDR4 runs at a lower voltage than GDDR5: 1.2 volts, to be exact. GDDR5, on the other hand, is specified at 1.5V (or 1.35V in its low-voltage mode).
  • Both DDR4 and DDR3 use a 64-bit memory controller per channel, which results in a 128-bit bus for dual-channel memory and a 256-bit bus for quad-channel. GDDR5 memory, on the other hand, uses a much narrower 32-bit controller per channel.
  • Where CPU memory configurations have wider but fewer channels, GPUs can support any number of 32-bit memory channels. This is the reason high-end GPUs like the GeForce RTX 2080 Ti and RTX 2080 have a 352-bit and 256-bit bus width, respectively.

Both RTX 20 series cards are connected to 1GB memory chips via 8 (for the 2080) and 11 (for the Ti) 32-bit memory controllers or channels. GDDR5/6 can also operate in what is called clamshell mode, where each channel, instead of being connected to one memory chip, is split between two. This allows manufacturers to double the memory capacity and makes hybrid memory configurations, like the GTX 660 with its 192-bit bus width, possible.
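
The relationship between channel count, bus width and clamshell capacity can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the channel counts are generic rather than tied to a specific card.

```python
def bus_width_bits(channels: int, channel_width_bits: int) -> int:
    """Total bus width = number of channels x width of each channel."""
    return channels * channel_width_bits


def capacity_gb(channels: int, chip_gb: float, clamshell: bool = False) -> float:
    """Total capacity; in clamshell mode each channel is shared by two chips."""
    chips_per_channel = 2 if clamshell else 1
    return channels * chips_per_channel * chip_gb


# CPU side: a few wide 64-bit channels
print(bus_width_bits(2, 64))   # dual channel  -> 128
print(bus_width_bits(4, 64))   # quad channel  -> 256

# GPU side: many narrow 32-bit channels
for n in (8, 11, 12):
    print(n, "channels ->", bus_width_bits(n, 32), "bit")

# Clamshell mode doubles capacity without widening the bus
print(capacity_gb(8, 1.0), capacity_gb(8, 1.0, clamshell=True))
```

Note that clamshell mode changes only the capacity figure; the bus width still depends on the number of channels alone.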

Image: The GTX 670 has four 512 MB chips across eight channels.
Image: A GTX 660 Ti has four memory stacks, the ones on top (packing two per stack) in clamshell mode. This reduces the bus width to 192-bit rather than 256-bit.
  • Another core difference between DDR4 and GDDR5/6 memory involves the I/O cycles. Just like SATA, DDR4 can only perform one operation (read or write) in one cycle. GDDR5 can handle input (read) as well as output (write) on the same cycle, essentially doubling the bus width.
  • All this might put DDR4 memory in a bad light, but each configuration suits its own use case. CPUs are largely sequential in nature while GPUs run thousands of parallel cores. The former benefits from low latency and fewer, wider channels, while GPUs need much higher bandwidth and can tolerate looser timings.

GDDR5 vs GDDR5X vs GDDR6

GDDR6 was preceded by GDDR5X, which was more of a half-generation upgrade. GDDR5X features transfer rates of up to 14 Gbit/s per pin, roughly twice as much as GDDR5.

This was achieved by using a higher prefetch. Unlike GDDR5, GDDR5X has a 16n prefetch architecture (vs 8n on GDDR5). This allows it to fetch 64 bytes (512 bits) of data per access, while GDDR5 was limited to 32 bytes.

GDDR6, like GDDR5X, has a 16n prefetch but it’s divided into two channels. So GDDR6 fetches 32 bytes per channel for a total of 64 bytes just like GDDR5X and twice that of GDDR5. While this doesn’t improve memory transfer speeds over GDDR5X, it allows for more versatility.
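
The byte counts above follow from multiplying the prefetch depth by the interface width. Here is a minimal sketch of that arithmetic, assuming a 32-bit interface for GDDR5 and GDDR5X and two 16-bit channels for GDDR6 (widths inferred from the figures quoted here; the function name is hypothetical).

```python
def bytes_per_access(prefetch_n: int, io_width_bits: int, channels: int = 1) -> int:
    """Bytes fetched per access = prefetch depth x interface width x channels."""
    return prefetch_n * io_width_bits * channels // 8


gddr5 = bytes_per_access(8, 32)                 # 8n prefetch, one 32-bit interface
gddr5x = bytes_per_access(16, 32)               # 16n prefetch, one 32-bit interface
gddr6_channel = bytes_per_access(16, 16)        # 16n prefetch, one 16-bit channel
gddr6 = bytes_per_access(16, 16, channels=2)    # both 16-bit channels together

print(gddr5, gddr5x, gddr6_channel, gddr6)      # 32 64 32 64
```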

GDDR6 can fetch the same amount of data as GDDR5X but over two separate channels, allowing each chip to function either like two smaller chips or like a single wider one.

Other than that, GDDR6 also increases the density to 16Gb (2x compared to GDDR5X) and significantly improves bandwidth by raising the per-pin data rate from 12Gbps to up to 16Gbps.
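
To see what the per-pin figures mean for a whole card, peak bandwidth can be estimated as per-pin rate times bus width, divided by 8 to convert bits to bytes. The sketch below assumes a 256-bit bus purely for illustration; the function name is made up for this example.

```python
def gddr_bandwidth_gb_s(gbps_per_pin: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (ignores refresh and command overhead)."""
    return gbps_per_pin * bus_width_bits / 8


BUS_WIDTH = 256  # bits; an assumed bus width, not taken from a specific card

for rate in (12, 14, 16):
    print(f"{rate} Gbps x {BUS_WIDTH}-bit = {gddr_bandwidth_gb_s(rate, BUS_WIDTH):.0f} GB/s")
```

These are theoretical ceilings; real-world throughput is lower.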

High Bandwidth Memory (HBM)

First popularized by AMD’s Fiji graphics cards, high bandwidth memory or HBM is a low-power memory standard with a wide bus. HBM achieves substantially higher bandwidth compared to GDDR5 while drawing much less power in a small form factor.

HBM adopts clocks as low as 500 MHz to meet a low TDP target and makes up for the loss in bandwidth with a massive bus (usually 4096 bits). AMD’s Radeon RX Vega cards are the best example of an HBM2 implementation in consumer hardware. HBM2 solved the 4GB limit of HBM1, but limited yields coupled with a memory shortage prevented AMD from capitalizing on it on the consumer GPU front.
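
The wide-and-slow trade-off can be put into numbers. The sketch below assumes two transfers per clock (double data rate) and uses hypothetical function names. It works out the bandwidth of a 4096-bit bus at 500 MHz, then the per-pin rate a 256-bit bus would need to match it.

```python
def peak_bandwidth_gb_s(clock_mhz: int, bus_width_bits: int,
                        transfers_per_clock: int = 2) -> float:
    """Peak bandwidth in GB/s from clock, bus width and transfers per clock."""
    megabits_per_second = clock_mhz * transfers_per_clock * bus_width_bits
    return megabits_per_second / 8 / 1000


def required_gbps_per_pin(target_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate a bus of the given width needs to reach target_gb_s."""
    return target_gb_s * 8 / bus_width_bits


hbm = peak_bandwidth_gb_s(500, 4096)       # wide bus, low clock
narrow = required_gbps_per_pin(hbm, 256)   # what a 256-bit bus would need per pin

print(f"HBM-style bus: {hbm:.0f} GB/s")
print(f"A 256-bit bus needs {narrow:.0f} Gbps per pin to match")
```

In this model each HBM pin only has to run at 1 Gbps, which is where the power savings come from.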

LPDDR4 vs DDR4

LPDDR4 is the mobile equivalent of DDR4 memory. Compared to DDR4, it offers reduced power consumption but does so at the cost of bandwidth. LPDDR4 has dual 16-bit channels resulting in a 32-bit total bus. In comparison, DDR4 has 64-bit channels.

At the same time, however, LPDDR4 has a prefetch of 16n per channel, for a total of 256 bits or 32 bytes (16 words x 16 bits) per channel. That adds up to 512 bits or 64 bytes across both channels.

DDR4, on the other hand, has two 8n prefetch banks per channel. The two banks are separate and can execute two independent 8n prefetches. This is done by using a multiplexer to time division multiplex its internal banks.

LPDDR4 also has a more flexible burst length, ranging from 16 to 32 (256 or 512 bits, i.e. 32 or 64 bytes). DDR4, on the other hand, is limited to a burst length of 8 (128 bits), although each bank can perform additional transfers.

To understand what burst length means, you need to know how memory is accessed. When the CPU or cache requests new data, the address is sent to the memory module, where the required row and then the column are located (if the row is not already open, a new row is loaded). Keep in mind that there’s a delay after every step.

After that, the entire column is sent across the memory bus, not all at once but in bursts. For DDR4, each burst is 8 transfers long (16B). With DDR5, it has been increased to as much as 32 (up to 64B). Two transfers of a burst happen per clock, one on each clock edge, which is where the effective data rate comes from.
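
The burst sizes quoted above are consistent with a 16-bit-wide data interface, which is the assumption this small sketch makes (function names are hypothetical). It converts a burst length into bytes moved and clock cycles taken, at two transfers per clock.

```python
def burst_bytes(burst_length: int, data_width_bits: int = 16) -> int:
    """Bytes moved by one burst: number of transfers x interface width."""
    return burst_length * data_width_bits // 8


def burst_clocks(burst_length: int) -> int:
    """Clock cycles one burst occupies, at two transfers per clock."""
    return burst_length // 2


for bl in (8, 16, 32):
    print(f"BL{bl}: {burst_bytes(bl)} bytes in {burst_clocks(bl)} clocks")

# The same BL8 burst across a full 64-bit channel moves 64 bytes
print(burst_bytes(8, data_width_bits=64))
```

The last line shows why the width matters: a burst of 8 over a 64-bit channel already covers a 64-byte cache line.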

This design makes LPDDR4 much more power efficient compared to standard DDR4 memory, making it ideal for use in smartphones with battery standby times of up to 8-10 hours. Micron’s LPDDR4 RAM tops out the standard with a 2133 MHz clock for a transfer rate of 4266 MT/s while Samsung follows shortly after with a clock of 1600MHz and a transfer rate of 3200 MT/s.

DDR4 vs DDR5 Memory

The specifications of the next-gen DDR5 memory standard have been announced, and they’re a substantial step above the existing DDR4 modules. DDR5 aims to reach data rates as high as 4800Mbps per pin, a hefty 50% gain over DDR4’s 3200Mbps. This uplift is achieved via the following advances in the memory structure:

32-Bank Structure: DDR5 uses a 32-bank structure with 8 bank groups, twice as many banks as DDR4’s 16-bank design. This effectively doubles the memory access availability. To complement this, DDR5 also adopts the Same Bank Refresh function. Unlike DDR4, this allows the next-gen memory to keep accessing other banks while one bank in each bank group is refreshing.

Burst Length: With DDR4, the burst length was limited to 8, allowing transfers of up to 16B to the cache at a time. DDR5 increases this to 16, with support for a 32-length mode as well, which allows a 64B cache line fetch with just one DIMM subchannel.

16n Prefetch: The prefetch has also been scaled up to 16n to keep up with the increased burst length. Like DDR4, there will be two memory-bank arrays per channel connected via a MUX resulting in a higher effective prefetch rate.

Lastly, by adopting a Decision Feedback Equalization (DFE) circuit, which eliminates reflective noise during the channels’ high-speed operation, DDR5 increases the speed per pin considerably.

| Feature/Option | DDR4 | DDR5 | DDR5 Advantage |
|---|---|---|---|
| Data rates | 1600-3200 MT/s | 3200-6400 MT/s | Increases performance and bandwidth |
| VDD/VDDQ/VPP (V) | 1.2/1.2/2.5 | 1.1/1.1/1.8 | Lowers power |
| Internal VREF | VREFDQ | VREFDQ, VREFCA, VREFCS | Improves voltage margins, reduces BOM costs |
| Device densities | 2Gb-16Gb | 8Gb-64Gb | Enables larger monolithic devices |
| Prefetch | 8n | 16n | Keeps the internal core clock low |
| DQ receiver equalization | CTLE | DFE | Improves opening of the received DQ data eyes inside the DRAM |
| Duty cycle adjustment (DCA) | None | DQS and DQ | Improves signaling on the transmitted DQ/DQS pins |
| Internal DQS delay monitoring | None | DQS interval oscillator | Increases robustness against environmental changes |
| On-die ECC | None | 128b+8b SEC, error check and scrub | Strengthens on-chip RAS |
| CRC | Write | Read/Write | Strengthens system RAS by protecting read data |
| Bank groups (BG)/banks | 4 BG x 4 banks (x4/x8); 2 BG x 4 banks (x16) | 8 BG x 2 banks (8Gb x4/x8); 4 BG x 2 banks (8Gb x16); 8 BG x 4 banks (16-64Gb x4/x8); 4 BG x 4 banks (16-64Gb x16) | Improves bandwidth/performance |
| Command/address interface | ODT, CKE, ACT, RAS, CAS, WE, A<X:0> | CA<13:0> | Dramatically reduces the CA pin count |
| ODT | DQ, DQS, DM/DBI | DQ, DQS, DM, CA bus | Improves signal integrity, reduces BOM costs |
| Burst length | BL8 (and BL4) | BL16, BL32 (and BC8 OTF, BL32 OTF) | Allows 64B cache line fetch with only 1 DIMM subchannel |
| MIR (“mirror” pin) | None | Yes | Improves DIMM signaling |
| Bus inversion | Data bus inversion (DBI) | Command/address inversion (CAI) | Reduces VDDQ noise on modules |
| CA training, CS training | None | CA training, CS training | Improves timing margin on CA and CS pins |
| Write leveling training modes | Yes | Improved | Compensates for unmatched DQ-DQS path |
| Read training patterns | Possible with the MPR | Dedicated MRs for serial (user-defined), clock and LFSR-generated training patterns | Makes read timing margin more robust |
| Mode registers | 7 x 17 bits | Up to 256 x 8 bits (LPDDR type read/write) | Provides room to expand |
| PRECHARGE commands | All bank and per bank | All bank, per bank, and same bank | PREsb enables precharging a specific bank in each BG |
| REFRESH commands | All bank | All bank and same bank | REFsb enables refreshing of a specific bank in each BG |
| Loopback mode | None | Yes | Enables testing of the DQ and DQS signaling |

DDR5 also increases the memory density from 16Gb all the way up to 64Gb, and both VDD and VPP have gone down to reduce the power draw. Finally, on-chip ECC has been added and the Mode Registers have been significantly expanded. You can see the entire change-list in the table above.
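
The table lists DDR5’s on-die ECC as 128b+8b SEC (single error correction). As a rough check on why 8 check bits go with 128 data bits, the sketch below applies the standard Hamming bound, 2^r >= m + r + 1, to find the smallest number of check bits r for m data bits. It illustrates the bound only; it is not a description of how any particular DRAM implements its ECC, and the function name is made up for this example.

```python
def min_sec_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (Hamming bound for SEC)."""
    r = 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


r = min_sec_check_bits(128)
overhead = r / 128

print(f"128 data bits need at least {r} check bits")
print(f"Storage overhead: {overhead:.2%}")
```

Seven check bits can only distinguish 128 cases, fewer than the 136 needed (128 + 7 bit positions plus the no-error case), so 8 is the minimum.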

Areej
