Zen and the Art of Machine Learning Research
114 points | 3 days ago | 11 comments | blog.jxmo.io | HN
HarHarVeryFunny
1 minute ago
> on days we find insight, we sit.

> on days we do not find insight, we sit.

This reminds me of Ed Witten (greatest living physicist?) in an interview by Brian Greene. Greene asked Witten what his day-to-day was like at the Institute for Advanced Study ...

Witten's reply: "I sit at my desk".

jdw64
5 hours ago
I feel that the Zen used in the West and the Zen in East Asia are quite different. I think the Western Zen is probably the one from the 1970s book Zen and the Art of Motorcycle Maintenance. It usually carries a sense of equanimity and beginner's mind. But in East Asia, Zen actually emphasizes aimlessness or non‑purposefulness.

The point where I really feel the difference is that Western Zen seems to be about how to train the self to become stronger, whereas actual Seon (Zen) in East Asia is about going with nature, letting go of the self, and allowing things to flow. In the actual practice of Seon, it's about doubting the self, letting go of attachments, and realizing that achievement, comparison, and the desire for control are all just fleeting. There's a famous phrase, 'Banghachak (放下著)': let it all go.

If anything, I think ancient Roman Stoicism feels more like Zen than Western Zen does

So that's fascinating. When I saw this article, I was expecting it to be about whether we should give up the desire for success, but instead it took a completely different direction, which was surprising

tsumnia
8 minutes ago
> The point where I really feel the difference is that Western Zen seems to be about how to train the self to become stronger, whereas actual Seon (Zen) in East Asia is about going with nature, letting go of the self, and allowing things to flow.

I think the Western sentiment, and why it is attached to strength, comes from a combination of the West's attraction to Eastern martial arts and the reality of plateauing during training. Once you've been doing something for 2 years, you no longer see the massive learning gains you saw as a true novice. However, in that journey toward "mastery" (a term I hate), you have to keep a positive outlook that the practice takes time.

I now use this phrase from my instructor: "Practice makes permanent". There's no such thing as "perfect practice", but whatever you practice is what will stick.

supertroop
5 minutes ago
And if anyone actually read the book, Zen was about processing childhood trauma.
peepee1982
5 hours ago
Similarly, the Western idea of Stoicism seems to focus mostly on controlling or even suppressing your emotions (at least at a surface level), while the Stoicism you rightly call "Roman" (thanks for that, btw) is much more holistic and more of an ethical framework.
jdw64
5 hours ago
Thank you for letting me know correctly.
johndhi
2 hours ago
Who doesn't call stoicism Roman?
BigGreenJorts
1 hour ago
Most pop stoics focus on the Greeks :P
lelanthran
1 hour ago
> Who doesn't call stoicism Roman?

The Greeks?

ffsm8
53 minutes ago
When I read the title I thought it was about running machine learning algorithms on AMD/Zen processors
isoprophlex
3 hours ago
"To be done with doing", from Ursula K. Le Guin's Earthsea novels, always struck me as such a powerful phrase. An entire state of mind boiled down to 5 words. But then again, I remember her saying Eastern philosophy greatly influenced her writing, if I'm not mistaken.
jdw64
2 hours ago
Stories like those in Le Guin's series actually existed in Asia before. There's a strong Taoist influence, you see; more specifically, Chinese-style Taoism rather than a Buddhist perspective.

From the viewpoint of '不立文字 (Bù lì wén zì): truth is not confined to language; language is merely the finger pointing at the truth' — this is closer to Taoism than to Zen. In fact, the Chinese worldview runs deep throughout her worldbuilding. Le Guin's take on 'magic' reflects a profound understanding of Eastern philosophy. The reason Ged doesn't use magic lightly is precisely a matter of balance, and (without giving away spoilers) the final confrontation between Ged and the Shadow is essentially about embracing one's own dark side — which shows a deep grasp of Taoist thought.

Personally, I also love the Earthsea series. The philosophy underlying that world is exactly the kind that resonates especially well with East Asian readers

isoprophlex
2 hours ago
Ha, wow, thanks for the refinement. Indeed, the use of language (especially at the end with the dragons) is a very important theme.

And I agree, it's more than excellent. The judicious magic, the way she manages to naturally - without it becoming a sermon - describe acts of kindness as the biggest miracles, is great.

Highly recommended.

andai
1 hour ago
To be done with doing, would appear to require passive income?
sph
2 hours ago
> Zen actually emphasizes aimlessness or non‑purposefulness

The visual metaphor from Taoists is being like 'uncarved wood'. Western Zen has been bastardised and commercialised, whereas one can look into Taoism to find many of the same concepts that, by virtue of their own simplicity, have remained timeless. The "problem", so to speak, with Zen is being associated with Buddhism, which has a long and intricate history and body of works attached to it, yet moves towards the same simplicity and spontaneity as Taoism.

In the words of Alan Watts, it all starts with the eternal Tao; all other religions are for people that need the same ideas overcomplicated with too many words.

jdw64
2 hours ago
You seem to know quite a lot about the East. Buddhism and Taoism are a bit different, of course, but your understanding is largely in line with how Eastern popular thought actually sees things. It seems like you've done a fair amount of business with Easterners.
turzmo
2 hours ago
Would either of you have a recommendation on where to start learning about either?
sph
1 hour ago
My journey into this world started with Watts' "The Way of Zen", and later, with his posthumous book "Tao: The Watercourse Way"

And I am a big fan of Ron Hogan's "Getting Right with Tao" translation/modern interpretation of the Tao Te Ching.

jdw64
1 hour ago
Lao Tzu: Tao Te Ching (translated by Ursula K. Le Guin)

The Way of Lao Tzu (Wing-tsit Chan)
sph
1 hour ago
I am just another western poser that has sought peace of mind reading Eastern philosophy. I am no expert.
ahartmetz
1 hour ago
Kind of like when I had dark bread in Asia: it was white bread with food coloring.

Some things don't transfer well.

Geezus_42
2 hours ago
That good ol' Protestant work ethic. Idle hands are the Devil's playthings!
keybored
2 hours ago
Given AI’s impact on society, I read this more as Zen And The Practice of Kamikaze.
rented_mule
5 hours ago
Around 2015, I found myself managing back end and machine learning engineers (not researchers) at the same time. Many of the back end engineers wanted to do more ML. Some of them did well when given a chance, but others wanted to revert to back end within a few months. At the same time, one of the ML leaders wanted to step away from ML and only do back end work to support ML.

As I studied these dynamics, something occurred to me... Different people need to see signs of success at different frequencies. Because of the nature of our product, measuring the performance of a new/updated model required the model to be live for at least a full calendar month. So, between initial work and final analysis, it was often a 2 month wait or more. For many back end tasks, you can build a quick prototype, run it to see if it works, and be on your way - the signals come all day long. The varying frequency needs of different people went a long way to determining which of them liked working on ML.

This is sort of a manager's version of feature engineering. ;-) The people on that team taught me a lot!

dalvasorsali
44 minutes ago
I saw the same thing and always wondered how you can manage it effectively.

I had a team of data engineers that wanted to do more data science, and 2 data scientists that both wanted to be data engineers (one of them argued that everyone wants to be a DS and so it was too crowded, saying that they could make more money as a DE).

I also remember a specific instance where, one day, my friend ranted about how he needs to step away from pure front end and that it's a dead end career (he was quite good at it too!) and then the next day at lunch a colleague started complaining about how front end developers get all the credit and he's considering moving.

almarcher
51 minutes ago
Stepping away from the work to find inspiration is necessary: it gives the subconscious time to process everything and present ideas to your conscious mind. I try to pick a wild or almost outlandish idea from time to time, because if I only try what I think will work, then I'm not doing my job.
mrmarket
2 hours ago
excellent essay. what a great read.

like the author said, so much of 'success' or 'progress' (in research but of course also across disciplines) depends upon temperament. just straight up having a good attitude about things. the skills that make a good researcher could not be more transferable: patience, innate curiosity, and a resilience against failure.

that said, these skills are increasingly rare/at a premium given our culture of minimizing discomfort tolerance via hyperconvenience. people have a harder and harder time waiting or failing.

Scene_Cast2
2 hours ago
I think this also stems from ML being more like biology or alchemy and less like math or programming (where you can get down to the first principles, abstractions are rock solid, and non-determinism is limited in scope).
WithinReason
4 hours ago
> If you want to solve a problem, the tried-and-true path to success is to attempt a solution, try it, reach a bottleneck, try to solve it, and only reach for literature when you’ve run out of ideas yourself.

I've found this to be the right balance between using your creativity and getting stuck too long

sdfsefsdf
5 hours ago
Perhaps I've been deep in my own issues for too long, but it seems to me that the author is trying to say "don't trust the current evaluation suites too much"; scores only reflect a small part of the problem. What's interesting is discovering a new, stable evaluation metric, doing something new based on it, and having that new thing yield some unexpected intelligent results
lostdog
6 hours ago
I have some coworkers that are similar in everything--education, work ethic, and intelligence--but some of them tick out ML ideas that work like clockwork, while others get hits rarely if ever. I cannot tell what makes it work for some and not others. Both groups' ideas sound equally good.

Sometimes a coworker will be an ML star for a year or two, but then suddenly run out of steam. It's brutal to watch.

I used to think most smart people had similar distributions of good ideas, and it was just that the hardest working tried out all 50 of their ideas to pick out the 2 good ones. But I've seen smart and hardworking people have a hit rate of 0.
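A quick back-of-the-envelope check on those numbers, offered only as a sketch: assuming (hypothetically) that every idea succeeds independently with the same 2-in-50 chance, the snippet below estimates how often someone would still end up with zero hits after 50 tries. The independence assumption and the helper name `prob_zero_hits` are illustrative and not part of the comment.

```python
def prob_zero_hits(hit_rate: float, n_ideas: int) -> float:
    """Chance that none of n_ideas independent attempts succeeds."""
    return (1.0 - hit_rate) ** n_ideas


# Assumed per-idea success rate, taken from "2 good ones out of 50".
hit_rate = 2 / 50

p_none = prob_zero_hits(hit_rate, 50)
print(f"P(zero hits in 50 ideas) = {p_none:.3f}")
```

Under these assumptions the result is roughly 13%, so a run of 50 misses can happen by chance alone. It would not by itself account for the persistent gaps described above.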

sdsdfsdff344sd
1 hour ago
It's not just ML research; that's just human nature.

We like to see hard-working, God-fearing people minting raw knowledge from Mount Olympus itself, whereby each shard of crystalline insight is carved meticulously by the Apprentice over the course of a productive and morally pure career.

The reality is it's some skill plus the occasional drive-by of an unknown force of nature, hitting you on the head with a shattered fragment of insight whose provenance you'll remain completely ignorant of. I'd say we just revert back to invoking the muses. It was a fine explanation.

fyredge
6 hours ago
That's the nature of research. You try every idea that may be a good avenue and only a handful work out, if at all. That's why quantifying research credibility via publication and citation counts inherently leads to toxic work cultures. The best ideas must be given time to be discovered, not forced out and contorted to fit the requirements of a journal.
bobmarleybiceps
5 hours ago
this is part of why I think most researchers get less productive over time... Someone gets some big result during grad school or early career, gets some big job from it, and then struggles to get new results of similar quality :shrug:

With ML in particular, there's also the sheer volume of people basically all looking at (essentially) the same problems... so it's kind of like monkeys with typewriters spamming ideas until some work.

jack_pp
5 hours ago
In spirituality it is believed that ideas and inspirations aren't our own. That our mind is like an LLM that gets prompted by higher beings. In research everyone has high param count minds, trained for many years by studying. But just like LLMs by themselves are useless at creating new original work, no matter the compute you have available, so the mind cannot create anything new without "inspiration".
59nadir
5 hours ago
Wow, this makes ML sound even more like voodoo than I thought. Can you give examples of what the nature of these ideas is?
stared
4 hours ago
It revolves around the sentiment of "go deeper" - but I think it is a double-edged sword. Sure, entropy, tensors and gradients are important - and yes, they are pretty much requirements.

But from what I see, it is the opposite - a lot of (if not virtually all) progress in the last decade of deep learning came not from a fundamental idea but from incremental, experimentally verified practice. Even though I think there is good intuition for why ReLU is better than sigmoid (tl;dr: last layer is log(sigmoid) ~ ReLU, putting anything different inside kills the gradient), the original paper by Hinton himself was more or less "because it trains 3x faster".
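One way to read that tl;dr, as a rough sketch rather than anything taken from the paper: the negative log of a sigmoid is the softplus function, which closely tracks ReLU away from zero, and a sigmoid's local gradient is at most 0.25, so chaining many of them shrinks the signal. The helper names below (`softplus`, `chained_grad`, and so on) are hypothetical, and multiplying identical local gradients is only a toy stand-in for backpropagation through a deep stack.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to stay stable for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x: float) -> float:
    return max(0.0, x)


def softplus(x: float) -> float:
    """log(1 + e^x), which equals -log(sigmoid(-x))."""
    return max(0.0, x) + math.log1p(math.exp(-abs(x)))


def sigmoid_grad(x: float) -> float:
    s = sigmoid(x)
    return s * (1.0 - s)  # never exceeds 0.25


def relu_grad(x: float) -> float:
    return 1.0 if x > 0 else 0.0


def chained_grad(local_grad, x: float, depth: int) -> float:
    """Toy model: product of `depth` identical local gradients."""
    return local_grad(x) ** depth


for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"x={x:6.1f}  softplus={softplus(x):.5f}  relu={relu(x):.5f}")

print("sigmoid, 10 layers:", chained_grad(sigmoid_grad, 0.0, 10))
print("relu,    10 layers:", chained_grad(relu_grad, 1.0, 10))
```

Even at its best point (x = 0) the sigmoid chain decays geometrically with depth, while the ReLU chain stays at 1 for active units. That is the "kills the gradient" intuition in miniature.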

Re-thinking fundamentals might help, but "let's change the fundamentals" is rarely how it works. Even the most seminal papers, e.g. AlexNet and "Attention Is All You Need", are refinements of existing ideas, and show how they help.

Machine learning is an experimental science. Many mathematically cool ideas do not work. Many engineering ones do.

> I've tweeted before that one of the most important traits in a researcher is healthy paranoia. Be paranoid!

I have seen so many PhDs burned out to cinders; I don't think it is any better a piece of advice than "depression is good for philosophers". Sure, be a relentless explorer.

> In short, holding on to ideas for too long can actually be counterproductive. Stay open-minded and refuse to let ego cloud your judgement.

Which I think is true.

nathaah3
5 hours ago
This is gold!!!!