This morning the Cinebench R23 benchmarks for Apple’s M1 chip hit the internet. While it is the fastest Arm-based SoC, it’s still a far cry from its x86-based rivals in the mobile PC space.
As you can see in the graph above, the M1 is quite a bit slower than every other octa-core part on the x86 market. It only manages to beat the quad-core Tiger Lake-U flagship, the Core i7-1185G7, and even then not by a big margin. The M1 clearly offers strong single-threaded performance, but it still lags behind Intel and AMD’s latest core designs (Willow Cove and Zen 3). This is despite the fact that the M1 has a node advantage: it’s the first chip to leverage TSMC’s 5nm EUV node. The Tiger Lake and AMD CPUs are based on 10nm and 7nm processes, respectively, which offer roughly half the density and efficiency of the 5nm node.
The main advantage of the M1 core is its super-wide design. While AMD and Intel use a 4-way x86 decoder, the M1 uses an 8-way decoder. This is primarily because x86 instructions vary in length, while Arm instructions have a fixed length. On x86, the decoder has to work out where one instruction ends before it knows where the next one begins, so increasing the decoder width can itself become a bottleneck instead of increasing the ILP.
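To make the decoder point a bit more concrete, here is a minimal Python sketch of the instruction-boundary problem. It is not real Arm or x86 decoding: the 4-byte fixed width matches 64-bit Arm instructions, but the variable-length format below (first byte of each instruction holds its total length) is a made-up toy encoding, and the function names are hypothetical.

```python
def fixed_length_starts(code: bytes, width: int = 4) -> list[int]:
    """Start offsets for a fixed-width ISA.

    Every offset is a simple multiple of `width`, so each one can be
    computed without looking at any other instruction.
    """
    return list(range(0, len(code), width))


def variable_length_starts(code: bytes) -> list[int]:
    """Start offsets for a toy variable-length ISA.

    Assumption for illustration only: the first byte of each instruction
    stores its total length in bytes. The start of instruction N is only
    known after the lengths of instructions 0..N-1 have been read, so the
    walk is inherently sequential.
    """
    starts = []
    offset = 0
    while offset < len(code):
        length = code[offset]
        if length == 0:
            raise ValueError(f"zero-length instruction at offset {offset}")
        starts.append(offset)
        offset += length
    return starts


# Toy stream holding four instructions of lengths 2, 3, 1 and 4 bytes.
toy_stream = bytes([2, 0, 3, 0, 0, 1, 4, 0, 0, 0])

print(fixed_length_starts(bytes(16)))      # 16 bytes of fixed-width code
print(variable_length_starts(toy_stream))  # boundaries found one by one
```

In the fixed-length case, eight decoders can each grab their own 4-byte slot right away. In the variable-length case, every boundary depends on the one before it, which is why going wider costs extra hardware on x86.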
Another major design advantage is the reorder buffer (ROB). The M1 has a massive 630-entry ROB, while Zen 3 has just 256 entries and Sunny Cove around 352. ROB size is closely tied to the process node, as the buffer takes up a lot of die space. The use of TSMC’s 5nm process is primarily what has allowed Apple to increase the ROB size to such an extent, although I don’t expect Intel or AMD to have such a large buffer even when they make the transition to sub-7nm nodes. It’s just not feasible design-wise.
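For a rough feel of why a bigger buffer helps, the following toy simulation counts how many cycles a stream of instructions needs for different buffer sizes. It is only a sketch under heavy assumptions: instructions have no dependencies, retire strictly in order, and the 4-wide dispatch, the 200-cycle "cache miss" latency and the workload mix are all invented for illustration. It does not model the M1, Zen 3 or any Intel core.

```python
from collections import deque


def cycles_to_finish(latencies: list[int], rob_size: int, width: int = 4) -> int:
    """Cycles needed to dispatch and retire every instruction in a toy core.

    Each cycle, up to `width` completed instructions retire in program
    order, then up to `width` new ones are dispatched if the buffer has
    room. An instruction completes `latency` cycles after dispatch.
    """
    if rob_size < 1 or width < 1:
        raise ValueError("rob_size and width must be at least 1")
    rob = deque()   # completion cycle of each in-flight instruction
    next_instr = 0  # index of the next instruction to dispatch
    cycle = 0
    while next_instr < len(latencies) or rob:
        cycle += 1
        # Retire from the head only; a slow instruction blocks younger ones.
        retired = 0
        while rob and retired < width and rob[0] <= cycle:
            rob.popleft()
            retired += 1
        # Dispatch new instructions while there is space in the buffer.
        dispatched = 0
        while (next_instr < len(latencies) and dispatched < width
               and len(rob) < rob_size):
            rob.append(cycle + latencies[next_instr])
            next_instr += 1
            dispatched += 1
    return cycle


# Hypothetical workload: every 100th instruction is a 200-cycle miss.
workload = [200 if i % 100 == 0 else 1 for i in range(2000)]

for size in (256, 352, 630):
    print(size, cycles_to_finish(workload, rob_size=size))
```

When a slow instruction sits at the head of the buffer, the core can only keep working on as many younger instructions as the buffer has entries. A larger buffer hides more of that latency, which is the effect the 630-entry figure is meant to buy.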
All in all, while the M1 chip is an impressive product, Apple’s marketing has blown its capabilities out of proportion, as it does with most of its products. Don’t expect it to replace the x86 offerings anytime soon.