Do LLMs identify fonts?
64 points | 1 month ago | 12 comments | maxhalford.github.io
StellarScience
26 days ago
With the latest Microsoft Word, if you open a PDF that is a scanned image of a document and convert it to Word format, it does a pretty decent job of not only OCR (optical character recognition) but also picking matching fonts for various sections.

I just tested this with my internet connection disabled and it still worked. Since it's doing local processing, I suspect it uses traditional OCR algorithms rather than LLMs.

As the article concludes, LLMs aren't magic; they're just one useful tool to include in your toolbox.

aaroninsf
26 days ago
It's pretty easy to imagine an evolved mess of an open, ad hoc, but broadly adopted ecosystem in which LLMs are surrounded by a bewildering array of Node-like domain-specific extensions.

Security concerns aside (...) that sounds pretty useful.

StellarScience
26 days ago
Right. For example, early LLMs were notoriously bad at math, as they had been trained on language. They'd get simple math right, likely due to "rote memorization", but couldn't do basic arithmetic with 3-digit numbers. The common AI agents seem much better now. I suspect they added separate math processing logic and trained the LLMs to recognize when and how to delegate to it, though I'm not certain of that.

Similarly, coding-focused LLMs can access backend engines that actually run the code and get feedback, either to show the user or to internally iterate.

Having a whole host of such backend processors would be great. Users still only ever have to interact using natural language, but get the power of all these specialized tools in the backend. There are some tasks LLMs can do, but special-purpose algorithms may do better, faster, and/or with less energy usage.
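
To make the delegation idea concrete, here is a minimal sketch of a router that sends pure arithmetic to a calculator backend and everything else to the model. It is an illustration under assumptions, not how any vendor actually does it: `calculate`, `answer`, and the `llm` callable are hypothetical names, and real systems typically let the model itself emit the tool call rather than matching a regex.

```python
import ast
import operator
import re

# Operators the calculator backend is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def calculate(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""

    def walk(node):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


# Only digits, whitespace, dots, parentheses, and + - * / count as arithmetic.
_ARITHMETIC = re.compile(r"[\d\s.+\-*/()]+")


def answer(prompt: str, llm=lambda p: "(model-generated text)"):
    """Hypothetical router: pure arithmetic goes to the calculator, the rest to the model."""
    text = prompt.strip()
    if text and _ARITHMETIC.fullmatch(text) and any(c.isdigit() for c in text):
        try:
            return str(calculate(text))
        except (ValueError, SyntaxError, ZeroDivisionError):
            pass  # not something the calculator handles; fall back to the model
    return llm(prompt)


print(answer("847 * 362"))
print(answer("What font is this?"))
```

The point of the design is that the exact answer comes from ordinary code, while the user still only types natural language or a bare expression.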

micromacrofoot
26 days ago
How close are the wrong guesses? Fonts are fairly incestuous because the shapes of the characters themselves can't be copyrighted (only the code), so there are sometimes dozens of clones of very similar fonts... especially on a free site like dafont.

Rastonbury
26 days ago
I've asked LLMs to suggest matching or similar fonts from screenshots, and the suggestions seem close enough to my untrained eye.

she46BiOmUerPVj
26 days ago
I would never have thought not to use "WhatTheFont":

https://www.myfonts.com/pages/whatthefont

pwython
26 days ago
Yea, I just used WTF to help that guy who was waiting 2 years to find a font.

https://www.dafont.com/forum/read/522670/font-identification

squigs25
26 days ago
WTF is often wrong, and actually, I don't think your answer in the 2-year-old thread is correct.

StrangeDoctor
26 days ago
I agree. I feel like it's just blatantly funneling me into those dubious buy-this-font sites. I usually have somewhat better success with http://www.identifont.com/

I don't think the proposed font is correct either, though I'm not even sure the concept of a font works for that example. Mainly, the arches on the m are wrong: they are too arch-like, whereas the example is more teardrop-shaped.

Lemaxoxo
26 days ago
OP here. I tried WhatTheFont a bit but didn't mention it in the article, because I didn't get good results with it. It's probably still a good idea to ask it for a guess and feed that to the LLM too.

Doohickey-d
26 days ago
I'd be curious how much better a more expensive LLM would do - gpt-4o-mini and gemini-2.5-flash-preview-05-20 are definitely not the most capable LLMs one could have chosen.

double051
26 days ago
Maybe they're just cheap and fast enough for the author to perform an affordable analysis?

I agree that using the frontier models would be much more interesting.

Workaccount2
26 days ago
Maybe I simply don't know how advertising works, but wouldn't it be totally possible that these aren't fonts at all, just one-off lettering drawn for the ad?

smallerize
26 days ago
The benchmark only evaluates responses once the community has identified the font.

E.g. https://www.dafont.com/forum/read/569491/taylor-swift-font-p...
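
I haven't checked the article's code, so treat this as a sketch of what "only evaluate once the community has identified the font" could look like, not as the author's method. The `identified_font` and `guesses` fields, the helper names, and the sample rows are all made up for illustration.

```python
def normalize(name: str) -> str:
    """Lowercase and drop non-alphanumerics so 'Ubuntu Mono' matches 'ubuntu-mono'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def top_k_accuracy(threads, k=5):
    """Score only threads with a confirmed font; return None if there are none."""
    solved = [t for t in threads if t.get("identified_font")]
    if not solved:
        return None
    hits = 0
    for t in solved:
        # Only the first k guesses count toward a hit.
        guesses = {normalize(g) for g in t["guesses"][:k]}
        if normalize(t["identified_font"]) in guesses:
            hits += 1
    return hits / len(solved)


# Illustrative rows only, not data from the benchmark.
threads = [
    {"identified_font": "Ubuntu Mono", "guesses": ["Courier New", "ubuntu-mono"]},
    {"identified_font": "Helvetica", "guesses": ["Arial", "Futura"]},
    {"identified_font": None, "guesses": ["Papyrus"]},  # unsolved, so skipped
]

print(top_k_accuracy(threads))
```

Unsolved threads are dropped from the denominator, so one-off lettering that nobody can name never counts against the model.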

empath75
26 days ago
That isn't the font. You can look at it yourself; it doesn't match.

tjader
26 days ago
To me it looks like the same font, but with letter spacing reduced so the letters don't flow into each other nicely but overlap a bit.

Edit: here's the same effect made in Inkscape: https://i.postimg.cc/TYV6K6bt/taylorswift.png

pessimizer
26 days ago
And no matter what, when preparing for print you're going to mess with all of the kerning until everything looks right or to get effects that you want. You don't just accept the kerning of any font. The only reason to buy expensive fonts is that you have to touch the kerning less often.

rubyn00bie
26 days ago
I would say there’s a good chance they could be one-offs created by whoever was doing the ad. If you’re paying an artist, having them do the lettering could certainly be cheaper than licensing a font for the purpose (or developing a font that’ll never be used outside of one, or a series, of ads).

k3liutZu
26 days ago
My fellow designer friends would often do this. But they would start from actual fonts and do slight (or more than slight) adjustments to them to match what they wanted as an outcome.

bbarnett
26 days ago
Yes, but if you mess with fonts too much, this can happen:

https://www.youtube.com/watch?v=snjCj0ntG8E

elicash
26 days ago
Results here are bad, obviously, but it'll be interesting when LLMs can not only identify fonts but also unredact pieces of documents where just a few words are removed, by analyzing the length of the redaction, the combinations of letters that fit into it, and the context.

lblume
26 days ago
Why would you even need LLMs for that? Notwithstanding context, finding text that fits into a given bounding box is already perfectly doable via a classical algorithm (in this case e.g. based on dynamic programming).
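
A rough sketch of what I mean, under loud assumptions: widths are integer font units, the per-character advance table and the vocabulary are invented, kerning is ignored, and `text_width` and `fits` are hypothetical helpers. It is a subset-sum style dynamic program keyed on total width, extended one word at a time.

```python
def text_width(text, advance, default=500):
    """Sum per-character advance widths (font units); kerning is ignored."""
    return sum(advance.get(ch, default) for ch in text)


def fits(vocabulary, advance, target, tolerance=0, max_words=3):
    """Return word sequences whose total width is within tolerance of target."""
    space = advance.get(" ", 250)
    widths = {word: text_width(word, advance) for word in vocabulary}
    # layer maps a total width to every word sequence reaching exactly that width.
    layer = {0: [()]}
    results = []
    for n in range(1, max_words + 1):
        nxt = {}
        gap = space if n > 1 else 0  # a space separates consecutive words
        for total, seqs in layer.items():
            for word, w in widths.items():
                new_total = total + gap + w
                if new_total > target + tolerance:
                    continue  # widths are positive, so this branch can never fit
                nxt.setdefault(new_total, []).extend(s + (word,) for s in seqs)
        for total, seqs in nxt.items():
            if abs(total - target) <= tolerance:
                results.extend(" ".join(s) for s in seqs)
        layer = nxt
    return sorted(results)


# Invented advance widths and vocabulary, for illustration only.
advance = {"i": 200, "m": 800, "a": 500, " ": 250}
vocabulary = ["im", "a", "mi", "aa"]

print(fits(vocabulary, advance, target=1750))
```

Even this toy run returns several equally wide candidates, and width alone cannot rank them.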

elicash
26 days ago
> Notwithstanding context

This is actually quite important! Especially when you're not talking about a single word/name but a group of several words/names.

mopsi
26 days ago
I recently tried to identify a font from a screenshot of an ad and used everything I could find, from WhatTheFont to LLMs. The LLMs were hopeless at identifying the font from the screenshot, but ChatGPT eventually led me to the correct result after I threw away the image and started describing the font in plain text: monospacing, a dot in the middle of the 0, and (presumably) wide usage. It turned out to be Ubuntu Mono. It was surprising that so many obscure fonts were suggested, none of which were even a reasonably close match, while Ubuntu Mono was completely overlooked.

larodi
26 days ago
What makes us think font information made it into the training set at all, rather than something more along the lines of "all chars that look like this one are to be interpreted as 'a'"? The training data doesn't need to provide a font name for that.

cormullion
26 days ago
I suspect that few professional (paid for) adverts use any fonts from dafont.com, and many fonts would anyway be unavailable to ordinary users. The current font recogniser programs are usually trained on commercially available fonts

gdudeman
26 days ago
Missing from the methodology:
- Was thinking on or off? (At least for Gemini.)
- Was web search allowed?
- Was tool use allowed?

It’s quite likely LLMs don’t “know” the fonts in the dataset, but they could figure many of them out.

HocusLocus
26 days ago
Why don't you just ask the document creator?

Every time I turn around these days I encounter someone ready to use an infinite amount of energy that is being paid for by other people, to 'simulate' some analog process by temporarily taking the reins of some data center that is burning megawatts of energy. We are being given the reins for 0.5 seconds but very soon the horse will gallop away unless we have a lot of money to spend.

qezz
26 days ago
It's not always possible to ask the creator, especially for older pieces.

mjburgess
26 days ago
In this case, it seems highly likely to be possible: https://www.studioheavenly.com/our-work/mira-wellness (one Google search)