After months of speculation and several dozen “leaks”, AMD finally took the wraps off its Ryzen 4000 “Renoir” mobile APUs in March. These processors succeed the existing Ryzen 3000 “Picasso” parts built atop the 12nm Zen+ core. Like the mainstream Ryzen 3000 desktop CPUs (Matisse), the Ryzen 4000 mobile processors are based on the 7nm Zen 2 core. As such, you can expect markedly higher single-threaded performance as well as gaming capabilities rivaling contemporary Intel offerings.
With Intel’s Ice Lake parts limited to quad-core designs and the hex-core Comet Lake essentially being another rebrand, AMD is in a very favorable position to mount its first assault on the mobility market in nearly a decade.
AMD Ryzen 4000 Renoir Mobile APUs Specifications
Right off the bat, let’s have a look at the specifications. First, the low-power 15W Ryzen 4000U lineup:

| 15W U series | Cores/Threads | Base Clock | Boost Clock | L2 | L3 | GPU CUs |
|---|---|---|---|---|---|---|
| Ryzen 7 4800U | 8/16 | 1.8 GHz | 4.2 GHz | 4 MB | 8 MB | 8 CUs |
| Ryzen 7 4700U | 8/8 | 2.0 GHz | 4.1 GHz | 4 MB | 8 MB | 7 CUs |
| Ryzen 5 4600U | 6/12 | 2.1 GHz | 4.0 GHz | 3 MB | 8 MB | 6 CUs |
| Ryzen 5 4500U | 6/6 | 2.3 GHz | 4.0 GHz | 3 MB | 8 MB | 6 CUs |
| Ryzen 3 4300U | 4/4 | 2.7 GHz | 3.7 GHz | 2 MB | 4 MB | 5 CUs |

As you can see, the 15W Ryzen 4000U lineup launches with five SKUs, the reason being that slim form-factor notebooks are everywhere these days. It’s what the average user looks for. True to its tradition, AMD has gone with the “moar core” philosophy, and for good reason: many modern applications, from games to browsers, can spread their work across as many as 8 cores.
The Ryzen 4000U lineup starts off with the quad-core Ryzen 3 4300U, which we recently found to be on par with Intel’s Core i7-7700HQ. The Ryzen 5 tier is hex-core and the Ryzen 7 tier is octa-core, and each comes in two variants, one with SMT and one without. The GPU Compute Unit count climbs from 5 on the Ryzen 3 to 8 on the Ryzen 7 4800U.
These will be the world’s first low-power, ultra-thin notebooks with as many as 8 x86 cores and 16 threads. That’s not all, though. There will also be 14″ models featuring the high-performance H-series APUs. The ASUS Zephyrus G14 is one of the first such devices. It packs a Ryzen 7 4800HS and an NVIDIA RTX 2060 paired with a 14″ 1080p 120Hz display. This is the world’s first 14″ notebook to pack the 35W high-performance chip.
It also has a snazzy rear panel with moveable LEDs, something akin to an equalizer. I’m guessing this is mostly targeted towards DJs and other power users. The Zephyrus will also be the first laptop to feature the HS-series APUs.

| 45W H Lineup | Cores/Threads | Base Clock | Boost Clock | L2 | L3 | GPU CUs | TDP |
|---|---|---|---|---|---|---|---|
| Ryzen 9 4900H | 8/16 | 3.3 GHz | 4.4 GHz | 4 MB | 8 MB | 8 CUs | 45 W |
| Ryzen 9 4900HS | 8/16 | 3.1 GHz | 4.3 GHz | 4 MB | 8 MB | 8 CUs | 35 W |
| Ryzen 7 4800H | 8/16 | 2.9 GHz | 4.2 GHz | 4 MB | 8 MB | 7 CUs | 45 W |
| Ryzen 7 4800HS | 8/16 | 2.9 GHz | 4.2 GHz | 4 MB | 8 MB | 7 CUs | 35 W |
| Ryzen 5 4600H | 6/12 | 3.0 GHz | 4.0 GHz | 3 MB | 8 MB | 6 CUs | 45 W |

As you can see, the H series ditches the lower-end Ryzen 3 variant. AMD is probably leaving the U-series chips for the average user while the H series will serve creators and gamers. This is clearly evident from the corresponding core counts. The former offers a lot more variety, with four to sixteen threads, while the latter is essentially a two-option catalog: twelve threads or sixteen.
Interestingly, the Ryzen 7 4800U features one more Compute Unit than the 4800H. While the Ryzen 9 4900H and 4900HS do feature 8 CUs, they will almost always be coupled with a dGPU. As such, the iGPU in these chips won’t be used for high-performance workloads like gaming.
AMD Ryzen 4000 Renoir APU Chip Design: It’s Monolithic with Two CCXs
AMD’s Ryzen 4000 APU lineup is monolithic in design, rather than MCM (Multi-Chip Module) like the Ryzen 3000 desktop series. This means that we’re getting much better latencies and simultaneously higher clocks. We’re basically looking at one CCD comprising two CCXs, paired with a 7nm Vega graphics core over the Infinity Fabric.
Since the integrated graphics communicate with the CPU over the fabric, faster memory will have a notable impact on applications that lean on the iGPU.
Compared to the desktop Ryzen 3000 series, Renoir features a reduced L3 cache. This is understandable as there’s a single die and a huge “GameCache” is not required to keep the inter-chiplet latency in check.
With Renoir, AMD claims an IPC uplift of 25% over the older Picasso APUs along with a 20% lower power draw. In addition to this, the Ryzen 4000 parts also get a healthy 400 MHz (4900H) boost in terms of the core clock, which will have a meaningful impact in real-world scenarios, especially gaming.
Similar to the Matisse counterparts, the Renoir chips are based on TSMC’s 7nm FF process. They pack an insane 9.8 billion transistors into a 25x25x1.38mm package. That’s twice as many as Picasso, while also cutting the die size by 25%.
As far as the fine-grained architectural details are concerned, the same improvements to the front-end and back-end of the core are present as the desktop Ryzen 3000 lineup. You can read more here:
3rd Gen AMD Ryzen Processors Architectural Deep-dive: Chiplets, Game Cache, TAGE and More
The core features of the Zen 2 core architecture are:
- 7nm Process
- TAGE Predictor
- 2x Micro-Op Cache
- 2x Floating Point Bandwidth
- Native AVX256 support
- 2x Load Store Bandwidth
- 2x L3 Cache
- Power Efficiency, Faster Boosts, Security Mitigations
Graphics: Vega is Mega
The graphics part of things is less interesting but equally impressive. Basically, it’s the older Vega GPU with the same number of CUs. However, thanks to the super-efficient 7nm node, AMD has been able to extract a lot more performance out of the old graphics parts.
According to AMD, each Compute Unit in the Ryzen 4000 APUs is as much as 60% faster than Picasso. The overall FP32 compute horsepower increases from 1.41 TFLOPs to 1.79 TFLOPs (27%). These gains come from three areas: significantly higher graphics clock, wider memory bandwidth, and an improved Infinity Fabric design.
- The graphics core clock increases by a healthy 25%, from 1400MHz to 1750MHz, on the newer Vega 8 GPUs. This was mainly made possible by the 7nm node and an improved CU implementation.
- The higher memory bandwidth also contributes to this gain. The Ryzen 4000 APUs have significantly more memory bandwidth than Picasso. The LPDDR4X configuration allows for a 68.3 GB/s peak (4×32-bit LPDDR4X-4266), while the H-series APUs will be getting the standard DDR4-3200 (2×64-bit) memory configuration for 51.2 GB/s of bandwidth. In comparison, the older Ryzen 3000 mobile APUs were limited to just 38.4 GB/s. Unlike dGPUs, the integrated graphics share the system memory with the CPU, so this was a much-needed upgrade.
- Lastly, the Infinity Fabric has also been updated. The graphics-to-fabric bandwidth is twice that of Picasso, further increasing the data flow between the main memory and the GPU.
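The bandwidth and compute figures quoted above can be sanity-checked with a little arithmetic. The following is a rough sketch, not AMD's own methodology: the function names are made up for illustration, the 64 shaders per CU and 2 FLOPs per clock are the usual GCN/Vega peak-throughput assumptions, and DDR4-2400 for Picasso is an assumption chosen because it reproduces the quoted 38.4 GB/s.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s: transfers per second x bytes per transfer."""
    return transfer_rate_mts * 1e6 * (bus_width_bits / 8) / 1e9


def peak_fp32_tflops(compute_units: int, clock_mhz: float) -> float:
    """Theoretical peak FP32 rate: CUs x shaders per CU x FLOPs per clock x clock."""
    shaders_per_cu = 64   # assumption: standard GCN/Vega CU layout
    flops_per_clock = 2   # assumption: one fused multiply-add per shader per clock
    return compute_units * shaders_per_cu * flops_per_clock * clock_mhz * 1e6 / 1e12


# Memory configurations named in the article
lpddr4x_4266 = peak_bandwidth_gbs(4266, 4 * 32)  # Renoir U series, 128-bit bus
ddr4_3200 = peak_bandwidth_gbs(3200, 2 * 64)     # Renoir H series, dual channel
ddr4_2400 = peak_bandwidth_gbs(2400, 2 * 64)     # Picasso (assumed DDR4-2400)

# Vega 8 at the 1750MHz clock quoted for Renoir
renoir_vega8 = peak_fp32_tflops(8, 1750)

# Relative uplift between the two TFLOPs figures quoted by AMD
uplift_pct = (1.79 / 1.41 - 1) * 100

print(f"LPDDR4X-4266: {lpddr4x_4266:.1f} GB/s")
print(f"DDR4-3200:    {ddr4_3200:.1f} GB/s")
print(f"DDR4-2400:    {ddr4_2400:.1f} GB/s")
print(f"Vega 8 FP32:  {renoir_vega8:.2f} TFLOPs ({uplift_pct:.0f}% over 1.41)")
```

These are theoretical peaks. Real-world throughput will be lower, and the iGPU has to share that bandwidth with the CPU cores.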
7nm Process and Power Efficiency
One of the main challenges with mobile CPUs is keeping the power draw low enough so that you can extract a decent battery life out of the device. Renoir achieves this with the help of the 7nm process, which is the most advanced in the industry. The use of LPDDR4X and the monolithic design further cut down the power draw and offer higher performance per watt.
There have also been several software-side and kernel improvements. Compared to Picasso, the power management is vastly superior, with a highly dynamic clock frequency resulting in an almost 60% lower power draw.
One of the key areas which helped with the idle power draw and passive performance is the inclusion of multiple ACPI power profiles. With Picasso, there was only one. This would lead to increased latency when the system jumped from the low-power state (idle) to active. Furthermore, the CPU cores had to cross multiple states to get from the parked to the high-performance mode.
With Renoir, there are three C states. The most reactive C1 has only one high-performance hardware power state while C2 has two. This means that the CPU latency required to go from idle to active will be much lower, and when in use, it won’t step back to low power mode, resulting in a snappier interface and faster application execution.
The processor stays in the high-performance C1 state whenever it is in use, slowly steps down to C2 when left inactive, and switches to C3 after prolonged idleness.
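The step-down behaviour described above can be pictured as a simple lookup on idle time. This is only an illustrative sketch: the thresholds, the `PowerState` enum and `select_state` are hypothetical, since AMD has not published the actual timings or firmware logic.

```python
from enum import Enum


class PowerState(Enum):
    C1 = "most reactive, high-performance"
    C2 = "intermediate"
    C3 = "deepest idle"


# Hypothetical thresholds in seconds of inactivity; the real values are not public.
C2_AFTER_IDLE_S = 2.0
C3_AFTER_IDLE_S = 30.0


def select_state(idle_seconds: float) -> PowerState:
    """Pick a power state from how long the system has been inactive."""
    if idle_seconds >= C3_AFTER_IDLE_S:
        return PowerState.C3  # prolonged idleness
    if idle_seconds >= C2_AFTER_IDLE_S:
        return PowerState.C2  # left inactive for a short while
    return PowerState.C1      # in use: stay in the reactive state


for idle in (0.0, 5.0, 60.0):
    print(f"idle for {idle:>4.0f}s -> {select_state(idle).name}")
```

The point of the extra profiles is that an actively used machine never leaves C1, so there is no wake-up penalty between keystrokes or clicks.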
Overall, the frequency and thereby the power draw are governed by the user activity. Higher user activity will keep the core clocks dynamic, resulting in more frequent peaks while reduced activity will slowly push the CPU cores into parked mode.
SmartShift is something AMD has talked about a lot. In general, when either the dGPU or the CPU is idle, the other can’t exploit the additional power headroom available. Thanks to AMD’s SCF (Scalable Control Fabric), which is part of the Infinity Fabric, this can be fixed.
This lets the GPU increase its boost clocks when the CPU is drawing less than its allowed power, or vice versa, bolstering gaming performance without any penalties.
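The idea of handing unused power headroom from one processor to the other can be sketched in a few lines. This is a toy model under stated assumptions, not AMD's algorithm: `share_power_budget` and all the wattages are hypothetical, and the real mechanism works on firmware telemetry rather than simple demand figures.

```python
def share_power_budget(cpu_base_w: float, gpu_base_w: float,
                       cpu_demand_w: float, gpu_demand_w: float) -> tuple[float, float]:
    """Split a shared CPU+GPU budget, lending unused headroom to the busier side.

    Each side first gets what it asks for, capped at its base allocation.
    Whatever is left over is then offered to the GPU, and after that to the CPU.
    """
    total_w = cpu_base_w + gpu_base_w
    cpu_w = min(cpu_demand_w, cpu_base_w)
    gpu_w = min(gpu_demand_w, gpu_base_w)
    spare_w = total_w - cpu_w - gpu_w

    # Lend spare power to the GPU if it wants more than its base allocation.
    gpu_extra_w = min(spare_w, gpu_demand_w - gpu_w)
    gpu_w += gpu_extra_w
    spare_w -= gpu_extra_w

    # Anything still unused can go to the CPU.
    cpu_w += min(spare_w, cpu_demand_w - cpu_w)
    return cpu_w, gpu_w


# GPU-bound game: the CPU only needs 20 W of its 40 W, so the GPU can exceed 60 W.
print(share_power_budget(40, 60, cpu_demand_w=20, gpu_demand_w=90))
# CPU-bound workload: the GPU idles at 30 W, so the CPU can exceed 40 W.
print(share_power_budget(40, 60, cpu_demand_w=70, gpu_demand_w=30))
```

In both cases the combined draw never exceeds the shared 100 W budget, which is why the extra boost comes without a thermal or power penalty for the chassis.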
The Dell G5 SE will be the first laptop to feature SmartShift. It will pack the Renoir APUs along with the Navi based Radeon RX 5600M as well as support for FreeSync. Like most OEMs, Dell’s Renoir offerings will land in Q2.
Another new feature that allows maximum performance in the thinnest form factors is the new System Temperature Tracking V2 (STT v2). It uses multiple sensors embedded at the chassis hotspots to determine whether boost clocks can be maintained for longer than usual.
Using STT v2, the Renoir chips can boost higher than traditional chips by evaluating the data received from these sensors and overriding the artificially conservative thermal ceiling. This will be helpful in thermally restrictive chassis, where the thermal bottleneck usually limits the boost clock despite the power draw staying well within range.