Most Stable Raspberry Pi? Better NTP with Thermal Management
256 points | 13 hours ago | 25 comments | austinsnerdythings.com | HN
dfc
6 hours ago
[-]
The transmission site for WWVB has hundreds of water bottles to insulate the equipment from temperature swings.

https://jila.colorado.edu/news-events/articles/spare-time

reply
wongarsu
6 hours ago
[-]
That page deserves its own submission. The juxtaposition in running four atomic clocks with a UPS consisting of "two car batteries, a power supply, a trickle charger (to keep the batteries continuously charged), and two car headlights (for discharging the batteries)" and a thermal mass consisting of water bottles is crazy
reply
jsolson
12 hours ago
[-]
You might have even better precision if you stay away from CPU0 and also set idle=poll in your kernel command line. Lots of things (including other interrupts) often land on CPU0. It would not be my first choice for something where I wanted high timing precision.
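
A minimal sketch of the "stay away from CPU0" idea from user space, assuming Linux and Python's `os.sched_setaffinity`. The helper names `pick_timing_cpu` and `pin_current_process` are made up for illustration. This only sets the affinity of one process; it does not isolate the core from other tasks or interrupts, and it does nothing about `idle=poll`, which is a kernel command-line setting.

```python
import os


def pick_timing_cpu(available):
    """Return the highest-numbered CPU other than CPU0, or None if only CPU0 exists."""
    candidates = sorted(cpu for cpu in available if cpu != 0)
    return candidates[-1] if candidates else None


def pin_current_process(cpu):
    """Pin the calling process to a single CPU (Linux only)."""
    os.sched_setaffinity(0, {cpu})


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # Only consider CPUs this process is currently allowed to run on.
    cpu = pick_timing_cpu(os.sched_getaffinity(0))
    if cpu is not None:
        pin_current_process(cpu)
        print(f"pinned to CPU {cpu}")
```

On a four-core Pi this would pick CPU 3. Whether that core is actually quiet depends on what else the kernel schedules there.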
reply
yc-kraln
11 hours ago
[-]
I came here to post this. We make a lot of the same sorts of optimizations for our OS distro (debian based) -- disabling frequency scaling, core pinning, etc. Critically, CPU0 has a bunch of stuff you cannot push, and you're better off with using one of the other cores as an isolated island.

This is what the scheduler latency looks like on our isolated core:

# Total: 000300000
# Min Latencies: 00001
# Avg Latencies: 00005
# Max Latencies: 00059
# Histogram Overflows: 00000

(those are µs!)

reply
msephton
10 hours ago
[-]
Very cool. What are you running on it? What's your use case?
reply
auspiv
5 hours ago
[-]
Worth a shot! I'll give it a go later today.
reply
auspiv
5 hours ago
[-]
Austin from austinsnerdythings.com here! Posted this last night and went to sleep. Definitely didn't expect to wake up to it #1 on HN this morning!

Some notes that weren't included in this post:

I do have my LEA-M8T generating time pulses at 16 Hz, with a dpoll of -4 in the chrony config. The poll interval of 16 seconds means it gets 256 samples which also improved the stability.
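
For anyone checking the arithmetic: chrony's `poll` and `dpoll` values are base-2 logarithms of an interval in seconds, so the numbers above work out as in this small sketch. The function name `samples_per_poll` is just for illustration, and the result is the number of pulses that arrive in one poll window, not a statement about how many of them chrony's filter keeps.

```python
def samples_per_poll(dpoll, poll):
    """Return (driver interval s, poll interval s, pulses per poll window)."""
    driver_interval_s = 2.0 ** dpoll  # dpoll -4 -> 1/16 s, matching 16 Hz pulses
    poll_interval_s = 2.0 ** poll     # poll 4 -> 16 s
    return driver_interval_s, poll_interval_s, round(poll_interval_s / driver_interval_s)


print(samples_per_poll(-4, 4))  # (0.0625, 16.0, 256)
```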

I do also have that mentioned BH3SAP GPSDO sitting next to me on my desk. I had Claude update the firmware to allow "flywheel" behavior so it continues to generate pulses in the absence of GPS PPS. That's coming in another post. Also had Claude update the firmware to spit out TSIP (trimble proprietary protocol) info for lady heather.

Going to go through the comments and answer a couple other things I see.

Happy to answer any other questions!

reply
perdomon
59 minutes ago
[-]
I would not have considered artificially-working the SOC to maintain a steady temp! When the wattage is that low, however, it makes sense to burn a few extra bucks a year versus some kind of overengineered cooling system.
reply
easygenes
7 hours ago
[-]
You can go further and replace the cheap Pi oscillator crystal with a proper TCXO, as others using them for NTP have done and documented: https://raspberrypi.stackexchange.com/questions/74482/switch...

That should give you 4-5x less drift than his results (though you could pair the techniques for even better figures)

reply
auspiv
30 minutes ago
[-]
Believe me, I have read that post/comment quite a few times. There are actually Pi 4 hats for sale for audiophiles (who seem to believe that you need a 54.000000 MHz system clock, or whatever it is for the Pi 4 (the Pi 3 is 19.2 MHz), for optimal audio) that have an OCXO on them. But in another comment I said I'm not sure my soldering skills are that good.
reply
boricj
7 hours ago
[-]
Instead of dedicating an entire Raspberry Pi with fancy pinning and temperature management by burning CPU time, wouldn't a micro-controller and a precise external oscillator fare better for time-keeping stability? I would assume that an STM32 discovery kit running an NTP server on its Ethernet port could probably do better.
reply
auspiv
5 hours ago
[-]
I do have that as well, but haven't done a write up on it yet. It was a $70 GPSDO from eBay (BH3SAP variant running Fredzo firmware with some changes to enable flywheel (generating pulse in absence of GPS PPS)). I have verified it can feed the Pi for NTP. Believe the STM32 driving the GPSDO can as well but it has no ethernet capability as-is
reply
Joel_Mckay
6 hours ago
[-]
In general, NTP is a time sensitive process, and application processor/SoC are indeed going to have far greater rates of clock slips than an MCU running off an XTAL or TCXO.

RTLinux has a module feature to sync the scheduler to an external pin state. It is an obscure feature...

Adding more processors creates a well-known named-problem with metastability:

https://en.wikipedia.org/wiki/Clock_domain_crossing

Real-time is not guaranteed latency, and the Pi is not like Zynq fpga. =3

reply
avidiax
11 hours ago
[-]
Why not put a resistor (for heating) and a bit of foam insulation on the crystal?

This is way more direct than spacebar heating.

You could also add a transistor attached to the resistor and a GPIO and use the clock drift as a proxy for temperature. PID is probably enough but since you have a 24 hour cycle you could calculate a baseline heating schedule.
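
A rough sketch of what such a PID loop could look like, run against a toy first-order thermal model rather than real hardware. Everything here is an assumption for illustration: the `PID` class, the gains, `HEATER_GAIN_C` and `TAU_S` are invented, the output is a heater duty cycle between 0 and 1, and the demo sets the derivative gain to zero (so it is really a PI loop). Driving an actual GPIO and using clock drift as the temperature proxy is left out.

```python
HEATER_GAIN_C = 30.0  # hypothetical: degrees above ambient at 100% duty
TAU_S = 60.0          # hypothetical thermal time constant in seconds


class PID:
    """Textbook PID with output clamping and simple anti-windup."""

    def __init__(self, kp, ki, kd, out_min=0.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        candidate = self.integral + self.ki * error * dt
        output = self.kp * error + candidate + self.kd * derivative
        if self.out_min <= output <= self.out_max:
            self.integral = candidate  # accumulate only while unsaturated
            return output
        return min(max(output, self.out_min), self.out_max)


def simulate(steps=1200, dt=1.0, ambient=25.0, setpoint=45.0):
    """Run the controller against the toy plant and return the final temperature."""
    pid = PID(kp=0.5, ki=0.05, kd=0.0)
    temp = ambient
    for _ in range(steps):
        duty = pid.update(setpoint, temp, dt)
        # First-order plant: heats toward ambient + HEATER_GAIN_C * duty.
        temp += dt * (HEATER_GAIN_C * duty - (temp - ambient)) / TAU_S
    return temp


print(f"final temperature: {simulate():.2f} C")
```

The baseline heating schedule for the 24 hour cycle would sit on top of this as a feed-forward term, which the sketch does not attempt.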

reply
ErroneousBosh
9 hours ago
[-]
This is a technique that's been used for crystal oscillators for almost a century by now. I have some 1950s crystal ovens that are a little metal box that fits over the crystal (quite a large crystal, about the size of two or three SD cards stacked) and heats up to around 75°C. The crystals were supposed to be specially cut to have close to zero temperature coefficient around that temperature so the slight up and down drift caused by the thermostat wouldn't affect it.

I have test equipment made as recently as the early 2000s that uses a crystal oscillator in an oven as a frequency standard. It takes a good five minutes to fully stabilise.

reply
auspiv
5 hours ago
[-]
I did this at one point, putting the pi directly on some packing foam. It for sure dampened the ambient effects but really I just need to stick the whole Pi in a temperature controlled chamber.
reply
geerlingguy
12 hours ago
[-]
It's an SBC-scale OCXO. I half wonder if adding a larger heatsink, or even putting thermal mass around the existing oscillator could also help, or if the heating is more localized in the PCB itself.

Always fun new things to learn when doing something "simple" like setting up an NTP server!

reply
LeoPanthera
11 hours ago
[-]
Flirc makes a metal Pi case where the CPU is pressed against the metal body of the case, resulting in a huge thermal mass for passive cooling. I have a bunch of them and it works very well. No fan necessary.
reply
MayeulC
9 hours ago
[-]
> or even putting thermal mass around the existing oscillator

I was thinking along these lines as well. Put a metal block on the CPU and oscillator for thermal mass (not sure if separate blocks would be better). Ideally, with a large enough thermal capacity, the block should reach an average temperature and remain there.

Inertia is also good even if the temperature is not constant: clock drift can be measured and compensated. If the temperature rises slowly, the clock speed will increase slowly: the rate can be measured and compensated for. Jitter is the issue here, and thermal inertia should dampen it.

It may also be worth preventing convection from happening on the board. Putting the Pi in a wool sock may not be the best idea depending on its temperature, but an electrically insulating thermal conductor (or an electrical insulation layer + steel wool) may do it.

Heatsinks may also be counter-productive (if they have a small thermal capacity), as their temperature depends on room temperature, which changes during the day.

reply
jauntywundrkind
12 hours ago
[-]
I was thinking it might be nice to add some insulation around the Pi's enclosure, to reduce its cooling significantly. A little bit to tamp down any potential rapid fluctuations in the room's temperature (if someone opens a window, steps out of the bath, whatever). But more so that it could save a watt or two of power, by having the time-burner cores working much less.

You're right that this is an oven-controlled oscillator. The goal generally with ovens is to keep heat! (To an extent of course.)

reply
IlikeKitties
12 hours ago
[-]
> I half wonder if adding a larger heatsink, or even putting thermal mass around the existing oscillator could also help, or if the heating is more localized in the PCB itself.

That would likely make it worse. The trick here is that the other cores are running at essentially their maximum temperature and will dynamically reduce their clock speed if required to keep from going above that limit. In essence, the environment becomes actively temperature controlled. If the ambient heat goes higher, the cores clock lower; if it gets colder, the cores clock higher (up to a point).

If you add too much heat dissipation, the total power used by those cores might not be enough to keep well above ambient.

reply
mytailorisrich
12 hours ago
[-]
Extreme power dissipation would keep temperature stable so that this whole setup might not be needed, though.

Author should experiment with liquid nitrogen ;) [1]

[1] https://www.xda-developers.com/liquid-nitrogen-cooling-raspb...

reply
IlikeKitties
11 hours ago
[-]
The timing crystals don't work better when colder but worse. That's why they are heated in high end time appliances, not cooled.
reply
mytailorisrich
11 hours ago
[-]
Isn't the issue here temperature stability? (Also, humour)
reply
cap11235
10 hours ago
[-]
Right, and they are heated because a hot wire is much simpler than a fridge.
reply
mikewarot
1 hour ago
[-]
I too thought that using an oven-stabilized crystal oscillator would be the best approach, but as I read on, I realized that doing it entirely in software was an interesting way to go, and well worth the journey.
reply
conroydave
37 minutes ago
[-]
love this line FTA: "Is It Worth It? For 99.999% of use cases: absolutely not."
reply
ACCount37
12 hours ago
[-]
It's the good old OCXO - Oven Controlled Crystal Oscillator. But the heating element is the CPU. Fucking hilarious.
reply
astrostl
4 hours ago
[-]
> Instead of software thermal control, I could add an actively cooled heatsink with PWM fan control. This might achieve similar temperature stability while using less power overall.

Love this (honestly). Interesting article!

reply
hnchm
11 hours ago
[-]
There was a paper on this in 2022. Not sure if it's used in production or not.

https://www.usenix.org/conference/nsdi22/presentation/najafi

reply
anonymousDan
9 hours ago
[-]
Cool paper. Their modeling of the temperature response curve seems a more elegant (albeit non-trivial) solution than burning CPU.
reply
tw1984
8 hours ago
[-]
the only contribution of that paper is they found & reported that there are potentially dozens of temperature sensors in a typical server.

the method of using two PPS signals configured to be some delta time apart to detect jitter is not new. the whole learning the tempco of the crystal thing is like 80 years old.

they also avoided touching the most critical part of the issue - how would you be sure that such learned tempco is accurate enough for the estimated bound.

reply
anonymousDan
10 hours ago
[-]
Couldn't you model the effect of temperature on clock drift and try to factor that in dynamically (e.g. using a temperature sensor) instead of burning CPU unnecessarily?
reply
KeplerBoy
8 hours ago
[-]
Sure, that's what the chrony closed loop is already doing (the estimated residual frequency is pretty linear with temperature), but no matter how robust your closed loop is, it's strictly better to not have disturbances in the first place.
reply
mlichvar
9 hours ago
[-]
That's what the chrony tempcomp directive is for. But you would have to figure out the coefficients, it's not automatic.

An advantage of constantly loading at least one core of the CPU might be preventing the deeper power states from kicking in, which should make the RX timestamping latency more stable and improve stability of synchronization of NTP clients.

reply
auspiv
5 hours ago
[-]
Chrony does have ability to do temperature compensation. I've done this and need to do a write up on it. It's not super easy to keep all the parts working together. Basically you feed chrony a table of temperatures and expected clock frequency and it subtracts it out.
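
To make the table idea concrete, here is a small sketch of looking up a compensation value by linear interpolation between measured points. The table values are made up, the function name `interpolate_ppm` is hypothetical, and clamping outside the measured range is a choice made for this sketch, not a claim about what chrony does internally.

```python
from bisect import bisect_right

# Hypothetical (temperature in C, frequency compensation in ppm) points,
# sorted by temperature. Real values would come from your own measurements.
TABLE = [(40.0, -1.0), (50.0, 0.0), (60.0, 2.0)]


def interpolate_ppm(table, temp_c):
    """Linearly interpolate the compensation for temp_c, clamping at the ends."""
    temps = [t for t, _ in table]
    if temp_c <= temps[0]:
        return table[0][1]
    if temp_c >= temps[-1]:
        return table[-1][1]
    i = bisect_right(temps, temp_c)
    (t0, p0), (t1, p1) = table[i - 1], table[i]
    return p0 + (p1 - p0) * (temp_c - t0) / (t1 - t0)


print(interpolate_ppm(TABLE, 45.0))  # -0.5
```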
reply
throwaway81523
11 hours ago
[-]
I wonder about using an RPI Pico for this instead, using the Pico's synchronous PIO gizmo to intercept the PPS pulses.
reply
geerlingguy
6 hours ago
[-]
There was a discussion about this on the time-nuts mailing list that was enlightening; tldr was it's not going to get to the same accuracy as you'd expect compared to a dedicated timing circuit (like a good GPSDO), but it would be a fun learning exercise.
reply
HPsquared
9 hours ago
[-]
A microsecond is still quite a lot if GPS is involved, that's about 1000 light-feet!
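
The conversion is easy to check. This sketch uses the SI-defined speed of light and the exact definition of the foot; the function name `light_distance_m` is just for illustration.

```python
SPEED_OF_LIGHT_M_S = 299_792_458  # exact by SI definition
METERS_PER_FOOT = 0.3048          # exact by definition


def light_distance_m(seconds):
    """Distance light travels in a vacuum in the given time, in meters."""
    return SPEED_OF_LIGHT_M_S * seconds


d = light_distance_m(1e-6)
print(f"{d:.1f} m, {d / METERS_PER_FOOT:.0f} ft")  # 299.8 m, 984 ft
```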
reply
speedgoose
9 hours ago
[-]
How much is it in eagle wingspans ?
reply
dspillett
8 hours ago
[-]
Bald, golden, or other?

Or do we have a defined standard eagle these days?

reply
encom
4 hours ago
[-]
152,4 metric eagles.
reply
adgjlsfhk1
3 hours ago
[-]
When doing GPS stuff, you don't use local time.
reply
jojomodding
9 hours ago
[-]
or 300 light-meters
reply
Kerbonut
11 hours ago
[-]
Wouldn't a temperature compensating algorithm be just as effective?
reply
irjustin
12 hours ago
[-]
I love this. Chasing perfection for perfection alone.
reply
tw1984
8 hours ago
[-]
some ideas -

1. using a second hand single frequency Ublox LEA-M8T GNSS receiver is not a good idea in 2025 when a dual frequency GNSS receiver chip can be purchased online for as little as $20-30. buy one of those ublox ZED-F9P receivers with the 2018/2019 firmware, they get you the exact same timing performance as ZED-F9T with qErr support. PPS jitter is going to drop from 20ns to less than 2ns after qErr compensation.

2. you can replace that highly unstable crystal oscillator with a $10 used OCXO + $20 PLL board and save that PID temperature controller to be used as the secondary oven for that OCXO. people have been doing that for Raspberry Pi for ages.

3. or you can configure your GNSS receiver to output a 10Hz signal from the PPSOUT pin so chrony can get 10 updates per second, GNSS receiver's internal TCXO is going to be more stable than raspberry Pi's crystal oscillator.

4. for more fun - just keep measuring the frequency drift vs. temperature change. a sample set of 24-48 hours of such measurements should be enough to figure out the tempco of that unstable crystal oscillator so you can get chrony to do the drift compensation. from memory, chrony supports such a temperature compensation lookup table to be specified.

good luck and have fun
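
Idea 4 above boils down to a regression of frequency offset against temperature. A minimal sketch, assuming the relationship is close to linear over the range of interest (a real crystal may need a higher-order fit) and assuming you have already logged paired samples. The function name `fit_tempco` and the sample numbers are invented for illustration.

```python
def fit_tempco(temps_c, freqs_ppm):
    """Ordinary least-squares line: returns (slope in ppm per C, intercept in ppm)."""
    n = len(temps_c)
    mean_t = sum(temps_c) / n
    mean_f = sum(freqs_ppm) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_c)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(temps_c, freqs_ppm))
    slope = sxy / sxx
    return slope, mean_f - slope * mean_t


# Made-up samples: temperature in C and measured frequency offset in ppm.
temps = [40.0, 45.0, 50.0, 55.0]
freqs = [1.0, 1.5, 2.0, 2.5]
slope, intercept = fit_tempco(temps, freqs)
print(f"tempco ~ {slope:.3f} ppm/C, offset at 0 C ~ {intercept:.3f} ppm")
```

The fitted slope is what you would then turn into whatever coefficients or lookup points your chrony temperature compensation setup expects.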

reply
auspiv
5 hours ago
[-]
1 - I like to consider myself an expert eBay-er (my ebay account can legally buy alcohol in the US, and I am myself only 36 years old) but I have not found any ZED-F9P for less than $90 ish. I am aware of the Quectel modules, and am considering a set for some RTK experiments. Is that what you're referring to?

2 - I've thought about replacing with ocxo but the pads are tiny. My soldering skills aren't nearly as good as my ebaying skills.

3 - it is outputting 16 Hz, chrony is doing dpoll -4 poll 4.

4 - I have done tempcomp (it is not active for the post referenced by this HN post). Fun stuff and probably my next writeup.

reply
jauntywundrkind
12 hours ago
[-]
Amazing project, great write-up. Would love to see a temperature graph as well! I'm wondering how good the PID controller here is working.

For future improvements, a cheap but effective win might be to put a temperature sensor on the oscillator (or two or three in various places). And use that to drive the PID loop.

Even if just experimental & not long term, it would be nice to have data on how strong the correlation is between the CPU & oscillator temperatures. To see their difference and how much that changes over time. Another graph! CPU vs TCXO (vs ambient?) temperatures over time.

reply
auspiv
5 hours ago
[-]
I for sure thought I included one, updating with that in a min!

Others have mentioned the temperature compensation. I've done it and it sounds like I should do that for the next writeup. A simple DS18B20 close to the Pi produces reasonable results.
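
For reference, reading a DS18B20 on a Pi usually means parsing the two-line text the Linux 1-Wire driver exposes under `/sys/bus/w1/devices/28-*/w1_slave`. A small sketch of that parsing step, assuming that file format; the function name `parse_w1_slave` and the sample bytes are illustrative only.

```python
def parse_w1_slave(text):
    """Return the temperature in C from w1_slave text, or None if the CRC check failed."""
    lines = text.strip().splitlines()
    # The first line ends in YES when the CRC of the reading was valid.
    if len(lines) < 2 or not lines[0].strip().endswith("YES"):
        return None
    # The second line ends in t=<millidegrees C>.
    _, sep, millideg = lines[1].rpartition("t=")
    if not sep:
        return None
    return int(millideg) / 1000.0


SAMPLE = (
    "72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n"
    "72 01 4b 46 7f ff 0e 10 57 t=23125\n"
)
print(parse_w1_slave(SAMPLE))  # 23.125
```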

reply
Kerbonut
11 hours ago
[-]
> put a temperature sensor on the oscillator

At that point, couldn't we just use the temperature value to compensate for the drift?

reply
HPsquared
9 hours ago
[-]
You can do that as well, but (in theory) the correction will be smaller than it otherwise would need to be if the temperature is regulated within a narrower range.
reply
anthk
7 hours ago
[-]
For RPis, an industrial-grade SD card is mandatory.
reply
ckocagil
10 hours ago
[-]
That's a neat software solution. My first inclination would be to grab a soldering iron and replace the crystal with either a TCXO or a socket to provide an external clock disciplined to the 1PPS.
reply
TZubiri
8 hours ago
[-]
Or just put an RTC in it
reply
nottorp
11 hours ago
[-]
The related question is:

Is the Pi going the Pentium 4 route?

reply
heresie-dabord
8 hours ago
[-]
In 2025, there are extremely efficient CPUs from Intel and Apple. Under 5W idle!

The old Intel CPUs were grotesquely inefficient. Every single generation of Raspberry Pi has been well under 5W idle. And just so it's clear, the author is using an old Raspberry Pi 3.

From TFA:

> The RPi 3B’s 19.2 MHz oscillator is physically located near the CPU on the Raspberry Pi board, so by actively controlling CPU temperature, we’re indirectly controlling the oscillator’s temperature.

Also note that the R.Pi can even be further optimised by switching off HDMI.

https://www.pidramble.com/wiki/benchmarks/power-consumption

https://www.jeffgeerling.com/blogs/jeff-geerling/raspberry-p...

reply
nottorp
8 hours ago
[-]
> In 2025, there are extremely efficient CPUs from Intel and Apple. Under 5W idle!

I don't think Intel has any "efficient" CPU that can go passively cooled at load though. Maybe Apple can do it for the low end SoCs.

The Pi 3 can go passively cooled and maybe even without a heatsink at load, but the newer Pis can't. Judging by the progression from 3 to 4 to 5, they will reach P4 levels of heat in the name of speed around ... 7?

> And just so it's clear, the author is using an old Raspberry Pi 3.

Yes, the author has harder problems to solve than what I'm whining about. But my concern is a bit related.

reply
heresie-dabord
7 hours ago
[-]
> The Pi 3 can go passively cooled and maybe even without a heatsink at load, but the newer Pis can't.

The Raspberry Pi 4 can be used without a fan. They are packaged inside keyboards, but both the Raspberry Pi 400 and the Raspberry Pi 500 are passively cooled.

reply
nottorp
7 hours ago
[-]
I have a 4 with a huge ass passive heatsink on at home. It's my minecraft server when I feel like it.

The heatsink is uncomfortable to touch (it's not in a case). Pretty sure it would downclock if i removed it. So it works without a fan, but not without a heatsink (I bet the keyboards have a heatsink for the Pis built in.).

A 3 would have worked fine without any heatsink at all. At least at normal room temperature.

reply
heresie-dabord
6 hours ago
[-]
> Pretty sure it would downclock if i removed it. So it works without a fan, but not without a heatsink

The Raspberry Pi 4 will throttle performance at 80C (if I recall correctly) but it can work without a heatsink. I have a R.Pi 4 working without a passive cooler in an enclosure and it reports 55C-62C most of the time.

> I bet the keyboards have a heatsink for the Pis built in

You are right, both models use a heatsink. The PCB is also different, although the respective CPUs are standard Pi 4 or Pi 5.

Incidentally, the CPU in the R.Pi 400 has a higher clock rate than the standard R.Pi 4, so it performs better.

reply
echelon_musk
6 hours ago
[-]
> I don't think Intel has any "efficient" CPU that can go passively cooled at load though. Maybe Apple can do it for the low end SoCs.

Does the N100 not fit these criteria?

reply
heresie-dabord
6 hours ago
[-]
The N100 requires active cooling.
reply
ThrowawayR2
3 hours ago
[-]
Not necessarily; for example, the LattePanda Iota SBC with an Intel N150 has a passive heatsink option. Also, industrial fanless PCs have been around for a long time even for much more powerful x86 processors.
reply
geerlingguy
6 hours ago
[-]
Technically it can work with passive cooling. Just, you need a pretty hefty heatsink in comparison to other SoCs.
reply
mort96
10 hours ago
[-]
What is this even supposed to mean? What's "the Pentium 4 route"?
reply
nottorp
10 hours ago
[-]
I'm an old fart :)

Intel tried to scale frequency up with the Pentium 4 in the name of performance, and it ended up extremely hot and power hungry. Just like some high end CPUs now, but then it applied to every model from Intel.

I suppose you don't remember when a Raspberry Pi could run fine even without a heatsink, let alone active cooling. That's more recent than the Pentium 4.

reply
esskay
9 hours ago
[-]
It's already there really. Its heat output on the 4, and more so the 5, benefits from active cooling. The good news is the Pi is practically pointless as a product for most people these days, and vastly better options are available cheaper, so unless you genuinely need the GPIO there's little reason to buy one. It's very much their own fault for focusing on commercial applications, but the Pi 5 as a product is practically pointless for consumer use at this point. An old Pi 2 or 3, which don't need any cooling, are very useful still for a range of applications, but the newer ones are in a bit of a weird niche where they're overpriced compared to most options.
reply
sceptic123
6 hours ago
[-]
What are the cheaper options for e.g. running a media centre?
reply
nottorp
6 hours ago
[-]
Probably second hand intel/amd boxes.
reply
esskay
5 hours ago
[-]
yup bingo, you can pick up vastly better devices than a pi for quite literally half the price now.
reply
sgarland
5 hours ago
[-]
Thanks for giving me yet another reminder that I’m old. I caught the reference immediately and thought nothing of it, and then this shattered that.

The early ‘00s were a wild time. Intel boldly stating they expected to get the P4 up to 10 GHz, AMD having to assign clock speed equivalence ratings for their chips… I also remember thinking the P4EE was insanely priced ($1000, or about $1700 in 2025 USD), but now we have >$10K Threadrippers.

reply
mort96
10 hours ago
[-]
What's the point in reading posts like this when the solution "they" came up with is basically, "tell Claude to make a script which does whatever"? I read blog posts to read thoughts from people, not computers
reply
auspiv
5 hours ago
[-]
There is a difference between telling Claude to do something and actually doing it, and writing about it. The fact that this has 199 points as of writing this comment means people want to read about the results. You will probably be quite unhappy to learn that Claude wrote the majority of the blogpost as well -

"Hey claude read my previous posts and this script and generate a new blog post about this temp control stuff. Generate whatever plots you want. Here's how you can access my influxdb with the data. Here's how you can ssh into the Pi to get the exact running scripts. Ok here's my wordpress token - upload it and the pictures. Oh it looks like your .md to wordpress failed really bad, read a previous post to find out how to format stuff. Oh the tables are still not right, try again." <-- literally what I spent an hour doing last night and refining.

Maybe people would find that interesting as well!

reply
stavros
10 hours ago
[-]
Are you similarly frustrated that he didn't sit there 24/7, heating the oscillator with a small lighter when needed, but automated it instead? Why would this be more interesting for you if he'd written the script himself?
reply
mort96
10 hours ago
[-]
> Are you similarly frustrated that he didn't sit there 24/7, heating the oscillator with a small lighter when needed, but automated it instead?

No

> Why would this be more interesting for you if he'd written the script himself?

Was I unclear? I read blog posts to read thoughts from humans, not from computers.

reply
stavros
10 hours ago
[-]
Well, I guess your era of reading is over, sadly.
reply
KeplerBoy
8 hours ago
[-]
It doesn't become claude's insight just because the user instructed it to write some trivial bash script.

Then again these are all well known optimizations (core-pinning, frequency-locking, thermal stabilization for oscillators). The interesting part is the actual measurement results over multiple days. That's something you can't get from a single prompt.

reply
tensegrist
9 hours ago
[-]
But here's the key insight: what's the point in reading posts where the post itself is "tell Claude to write a post about…"
reply