New 2GB Pi 5 has 33% smaller die, 30% idle power savings

Raspberry Pi launched the 2 gig Pi 5 for $50, and besides half the RAM and a lower price, it has a new stepping of the main BCM2712 chip.

BCM2712 C1 vs D0 Stepping chips

This is the BCM2712 D0 stepping. Older Pi 5s shipped with a C1. In their blog post, Raspberry Pi said:

The new D0 stepping strips away all that unneeded functionality, leaving only the bits we need.

Steppings are basically chip revisions that don't change functionality; they usually just fix bugs or tweak the layout. But even tiny design changes could have unintended consequences. I wanted to see exactly what happens when I push one of these new chips to the limits.

First, I wanted a performance baseline, so I ran Geekbench with the latest Pi OS and all the defaults.

Except... apparently Geekbench likes more than 2 gigs of RAM. I couldn't get past the multicore Photo Filter test, since the OS kept running outta memory. A lotta software nowadays is built for an absolute minimum of 4 or 8 gigs. So keep that in mind when you're buying a Pi.

Geekbench 6 kernel Linux OOM killer

Without adding swap, Geekbench was out, but I still wanted to get some raw numbers, so I ran sysbench. It's lighter weight, it runs with limited memory, and it still gives me CPU numbers to compare.

Using a combination of this 10W peltier cooler and the bottom heatsink from an EDAtec fanless case, I ran through a number of overclock scenarios:

Clock Speed (MHz)    sysbench result
2400                 4155
3000                 5175
3100                 5315
3200                 5505
3300                 5715
3400                 5804
3500                 6068

I ran the command sysbench --test=cpu --cpu-max-prime=20000 --num-threads=4 run.
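
To see how closely the scores track clock speed, here's a small Python sketch that uses only the numbers from the table above. The function and variable names are mine, and "score per MHz" is just a convenient ratio for comparison, not something sysbench reports.

```python
# sysbench results from the table above: clock speed (MHz) -> score
RESULTS = {
    2400: 4155,
    3000: 5175,
    3100: 5315,
    3200: 5505,
    3300: 5715,
    3400: 5804,
    3500: 6068,
}

BASELINE_MHZ = 2400  # first row of the table, used as the reference point


def scaling_report(results, baseline_mhz):
    """Return (mhz, clock_gain, score_gain, score_per_mhz) tuples, sorted by clock."""
    base_score = results[baseline_mhz]
    rows = []
    for mhz in sorted(results):
        score = results[mhz]
        rows.append((mhz, mhz / baseline_mhz, score / base_score, score / mhz))
    return rows


if __name__ == "__main__":
    for mhz, clock_gain, score_gain, per_mhz in scaling_report(RESULTS, BASELINE_MHZ):
        print(f"{mhz} MHz: clock x{clock_gain:.3f}, score x{score_gain:.3f}, "
              f"{per_mhz:.3f} per MHz")
```

At 3500 MHz the clock is about 1.46x the baseline and so is the score, so on this CPU-bound test the result scales almost linearly with frequency.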

At 3.6 GHz the Pi wouldn't boot—there were always memory errors and it would completely freeze. At 3.5 GHz, there were still some stability issues, and I couldn't get the Pi to reboot cleanly.

Pi 5 heatsink - Peltier cooling

For any speeds above 3.1 GHz, I also used my pi-overvolt hack, which I cover more in depth in my blog post Hacking Pi firmware to get the fastest overclock.

There's still a hard limit of 1.1V from the PMIC, so besides splicing higher voltages directly into the SoC, the only other hardware-level modification I hadn't tried was delidding the Pi's processor.

Theoretically, this would allow the Peltier cooler to pull heat off the silicon even faster.

Delidding

But there was another reason I wanted to delid a 2 gig Pi 5. Raspberry Pi mentioned in their blog post the D0 stepping was simpler and cheaper to make, since they removed 'dark silicon'. That just means there were portions of the BCM2712 that Broadcom put in but Raspberry Pi never used, like the built-in Ethernet controller. Raspberry Pi built their own Ethernet into the RP1, so they didn't use the controller in the main SoC.

What this means for the D0 is the actual CPU die is smaller. A smaller die fits more chips on a single wafer, meaning individual chips cost less, assuming chip production yields are the same.

I already had a delidded C1 chip from back when I worked with John McMaster and Kleindiek on my Pi 5 silicon deep-dive, so I just needed to delid the D0 chip.

I placed the 2 GB Pi 5 on my workbench, and worked at the corners with a razor blade:

Pi 5 delid heat spreader with razor knife blade

The heat spreader popped off, and I took some measurements, comparing the C1 to the D0 stepping:

Stepping      Width     Height    Die area
BCM2712 D0    6.30mm    5.98mm    37.674mm²
BCM2712 C1    6.47mm    8.63mm    55.836mm²

The D0 is 32.5% smaller than the older version, which would definitely bring down the price per chip, assuming the same yield on a given silicon wafer. It seems they're still using the 16nm process node, so that's a good chunk of 'dark silicon' removed!
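
The area and percentage figures are easy to double-check, and the same measurements give a rough feel for the "more chips per wafer" argument. The sketch below is a back-of-the-envelope estimate only: the 300 mm wafer diameter is my assumption (the wafer size isn't stated anywhere here), the gross-die formula is a common textbook approximation, and it ignores scribe lines, edge exclusion, and yield.

```python
import math

# Measured die dimensions from the table above, in millimetres
DIES_MM = {
    "BCM2712 D0": (6.30, 5.98),
    "BCM2712 C1": (6.47, 8.63),
}

WAFER_DIAMETER_MM = 300.0  # ASSUMPTION: wafer size is not given in the article


def die_area(width_mm, height_mm):
    """Rectangular die area in mm^2."""
    return width_mm * height_mm


def gross_dies_per_wafer(area_mm2, wafer_diameter_mm=WAFER_DIAMETER_MM):
    """Common approximation: wafer area / die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    whole = math.pi * radius ** 2 / area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * area_mm2)
    return math.floor(whole - edge_loss)


if __name__ == "__main__":
    d0 = die_area(*DIES_MM["BCM2712 D0"])
    c1 = die_area(*DIES_MM["BCM2712 C1"])
    print(f"D0 area: {d0:.3f} mm^2, C1 area: {c1:.3f} mm^2")
    print(f"D0 is {(1 - d0 / c1) * 100:.1f}% smaller")
    print(f"Estimated gross dies per wafer: D0 {gross_dies_per_wafer(d0)}, "
          f"C1 {gross_dies_per_wafer(c1)}")
```

Under those assumptions, the D0 works out to roughly 1.5x as many candidate dies per wafer as the C1, before yield is taken into account.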

Direct Die Cooling

I powered up the Pi with the new direct-die cooling arrangement, but still had stability issues at 3.6 GHz. It may have been even a little less stable at 3.5 GHz, but it was hard to tell.

BCM2712 bare die running with no cooling

Just for fun, I pulled off the cooler entirely, and let the Pi run with just the die exposed to the air. It was happy running like this, even running sysbench at 2.8 GHz for 10 seconds without throttling.

So could you run a Pi completely naked? Sure, but the heat spreader does a good job getting more heat off the whole package, so I'd just leave it on.

My takeaway is the Pi's 16 nanometer chip seems to max out around 3.5 Gigahertz.

Thermals and Efficiency - The Goldilocks Pi

The other big question though, is whether the smaller design is any better for efficiency or thermals.

CNX Software did some testing and published a chart showing a significant difference in idle power consumption: 2.7W versus 3.5W.

I haven't done exhaustive testing, but I did run through my stress benchmarks, monitoring power and heat. I ran it on all the Pi 5 models, with identical test parameters. I have more in this GitHub issue, but this chart sums it up:

Test              Pi 5 2GB (D0)   Pi 5 4GB (C1)   4GB Delta       Pi 5 8GB (C1)   8GB Delta
Idle power        2.4W            3.3W            0.9W (+32%)     3.2W            0.8W (+29%)
Idle temp         30°C            32°C            2°C (+6%)       32°C            2°C (+6%)
stress-ng power   8.9W            9.8W            0.9W (+10%)     9.8W            0.9W (+10%)
stress-ng temp    59°C            63°C            4°C (+7%)       64°C            5°C (+8%)

For idle power draw, the improvement almost mirrors the chip size reduction. The chip is 33% smaller, and the idle power draw is almost that much better.
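
The percentages in the table appear to be percent differences relative to the mean of the two readings, rather than savings relative to the older board. That is my inference from the numbers, not something stated in the article. Here's a minimal Python sketch that expresses the power gaps both ways, using only the values from the table:

```python
# Power readings (watts) from the table above
POWER_W = {
    "idle": {"2GB D0": 2.4, "4GB C1": 3.3, "8GB C1": 3.2},
    "stress-ng": {"2GB D0": 8.9, "4GB C1": 9.8, "8GB C1": 9.8},
}


def percent_difference(a, b):
    """Symmetric percent difference: |a - b| relative to the mean of a and b."""
    return abs(a - b) / ((a + b) / 2) * 100


def percent_saving(new, old):
    """How much lower `new` is, relative to the `old` baseline."""
    return (old - new) / old * 100


if __name__ == "__main__":
    for test, readings in POWER_W.items():
        d0 = readings["2GB D0"]
        for board in ("4GB C1", "8GB C1"):
            c1 = readings[board]
            print(f"{test}, 2GB D0 vs {board}: "
                  f"difference {percent_difference(d0, c1):.0f}%, "
                  f"saving vs C1 {percent_saving(d0, c1):.0f}%")
```

Measured against the C1 boards as the baseline, the 2 GB D0 model draws roughly 25 to 27% less power at idle.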

And here are the thermals for my test runs, for completeness:

Pi 5 C1 vs D0 stepping - thermals

Some of the power savings could be chalked up to less RAM, because more RAM requires more power. But that doesn't explain all the results. Thomas Kaiser also found the OPP tables are different with the new chip. The 2 gig model doesn't need as much voltage to hit certain clocks; for example, it uses 805 mV at 2.1 GHz versus 850 mV for the 8 gig model.
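
To get a feel for what that voltage gap could mean, here's a rough sketch using the textbook approximation that dynamic CMOS power scales with capacitance × voltage² × frequency. It assumes identical switched capacitance on both chips and ignores leakage entirely, neither of which is confirmed anywhere in this post, so treat the result as an illustration and not a measurement.

```python
def dynamic_power_ratio(v_new_mv, v_old_mv, f_new_mhz, f_old_mhz):
    """Ratio of dynamic power, new vs old, using P ~ C * V^2 * f.

    ASSUMPTION: the switched capacitance C is the same for both parts,
    so it cancels out. Leakage (static) power is not modelled.
    """
    return (v_new_mv / v_old_mv) ** 2 * (f_new_mhz / f_old_mhz)


if __name__ == "__main__":
    # Voltages reported for 2.1 GHz: 805 mV (2 GB model) vs 850 mV (8 GB model)
    ratio = dynamic_power_ratio(805, 850, 2100, 2100)
    print(f"Estimated dynamic power at 2.1 GHz: {ratio:.3f}x the 8 GB model's")
```

Under those assumptions, the lower voltage alone would account for roughly a 10% drop in dynamic power at that clock.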

As a final efficiency test, I ran my top500 HPL benchmark on the 2 gig Pi 5. HPL is a memory-intensive benchmark, and because the 2 GB model has so little RAM to work with, the overall efficiency for this test was worse, coming in at 2.07 Gflops/W. The 8 GB Pi 5 gets 2.75 Gflops/W.
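
One way to see why RAM matters for HPL: the benchmark factors a dense N × N matrix of 8-byte doubles, so the largest problem you can run is bounded by memory. The sketch below uses a common rule of thumb of sizing the matrix to about 80% of RAM. That fraction is an assumption, and the problem sizes actually used on either board aren't given here.

```python
import math

BYTES_PER_DOUBLE = 8
MEMORY_FRACTION = 0.8  # ASSUMPTION: rule-of-thumb share of RAM given to the matrix


def max_hpl_problem_size(ram_gib, fraction=MEMORY_FRACTION):
    """Largest N such that an N x N matrix of doubles fits in fraction * RAM."""
    usable_bytes = ram_gib * 1024 ** 3 * fraction
    return math.floor(math.sqrt(usable_bytes / BYTES_PER_DOUBLE))


if __name__ == "__main__":
    for ram_gib in (2, 4, 8):
        print(f"{ram_gib} GiB RAM -> N up to about {max_hpl_problem_size(ram_gib)}")
```

With a quarter of the RAM, the 2 GB board is limited to about half the matrix dimension of the 8 GB board. A smaller N tends to leave proportionally less time in the compute-heavy part of the factorization, which would be consistent with the lower Gflops/W figure.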

For now, the biggest difference between the 2, 4, and 8 GB Pi 5s for most people would still be having more RAM. If you know you can run your apps in 2 GB, this is a great little Pi for that. If you can't, then I think Raspberry Pi set up the 4 GB Pi 5 as the 'goldilocks': Not too expensive, with just enough RAM for most uses.

Conclusion

So is it worth stepping down to a 2 GB Pi 5 just to get the simpler D0 chip? No. But is it cool to have a cheaper 2 gig option exist? Yes. Just make sure you have a use case for it that doesn't need a ton of RAM.

Comments

Nice job Jeff. I love the Pi and your in depth review of this new model is great. 30% at idle is a big deal, in my opinion.

Ran a quick search on Raspberry Pi's github linux repo and found where I got my info from re the stuff they took out on D0. From what I can see, they actually removed device tree support for parts of the chip they don't use on C0/C1 that are not present on D0, and folded these changes into the same DTS file. They also seem to have added a DTS specifically for the D0 stepping, which seems to be register changes, i.e. stuff that is present in both variants of the chip but has moved or needs to otherwise be handled differently between C1 and D0. See https://github.com/raspberrypi/linux/pull/5847, specifically for the bits removed see https://github.com/raspberrypi/linux/pull/5847/commits/8be0890e7464324e…. Per that commit, they removed:

- UART3
- UART4
- UARTC (compatible = "brcm,bcm7271-uart";, which is a type never used on any Raspberry Pi device)
- the Ethernet MAC (compatible = "brcm,bcm2711-genet-v5" - i.e. the same one in BCM2711 (Pi 4 series))
- sdio0 (compatible = "brcm,bcm2711-emmc2" - don't know which interface this is and how it relates to previous generations of SoC)

I suspect Broadcom have changed other things - to my very untrained eye, that doesn't seem like it would account for a 33% reduction in die size.

Nice article, just an observation that temperature percentages don't really work that way...

I recommend controlling which CPU workload stress-ng is running, rather than letting it cycle. On Intel's Gracemont cores, --cpu-method=fft seems the most energy-intensive. I don't have a Pi 5 to try it on, but it'd be a good place to start. Check the manpage for other options.

The new RPi5 idle power (2.5W) is very close to the RPi4 idle power (2.7W). I use my RPi4 as a gadget and it does fine with USB 3.0 power limits. I guess RPi5 2GB could be a good replacement now.

The RP1 die handles all the peripherals and IO. I would not be surprised if all the IO pads and peripherals were removed from the Broadcom SOC. The Broadcom die just needs PCIe, DDR and power (maybe I2C to the PMIC). I believe Wifi, Ethernet and USB are on RP1.