Comparing GPT-4o vs. GPT-4o-Mini: How Different AI Models Rank the Same Content
3 points
12 hours ago
| 1 comment
| lightcapai.medium.com
| HN
hadiai
12 hours ago
Built a quick experiment to see how AI models differ in judging writing quality. I fed the same Medium article titles to both GPT-4o-mini and GPT-4o and compared their rankings. The interesting bit isn't just the rankings themselves but how the models diverge in their evaluation criteria: the "mini" model seems to have subtly different preferences despite being from the same family. Code is included (Python scraper + API calls), along with full logs showing each model's ranking rationale. This started as a voice-dictated script on a cold walk home. Sometimes the best experiments come from random "what if" thoughts.
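For anyone who wants to put a number on how far two rankings diverge, here is a minimal sketch using Kendall's tau, which counts how many pairs of items the two rankings order the same way versus the opposite way. This is not taken from the linked code; the function name `kendall_tau` and the placeholder titles are assumptions made up for illustration, and the two example rankings are invented, not results from the experiment.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau for two rankings of the same items (best first, no ties).

    Returns 1.0 for identical order, -1.0 for fully reversed order.
    """
    if len(rank_a) != len(set(rank_a)) or len(rank_b) != len(set(rank_b)):
        raise ValueError("rankings must not contain duplicate items")
    if set(rank_a) != set(rank_b):
        raise ValueError("both rankings must contain the same items")
    if len(rank_a) < 2:
        raise ValueError("need at least two items to compare")

    # Map each item to its position in each ranking.
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}

    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # Positive product: both rankings order the pair the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1

    return (concordant - discordant) / (concordant + discordant)


# Hypothetical placeholder rankings, NOT output from either model.
ranking_gpt4o = ["title A", "title B", "title C", "title D"]
ranking_mini = ["title B", "title A", "title C", "title D"]

tau = kendall_tau(ranking_gpt4o, ranking_mini)
print(f"Kendall tau: {tau:.3f}")
```

A value near 1 would mean the two models mostly agree on order, while a value near 0 would mean their rankings are close to unrelated. Tau only measures disagreement in order; it says nothing about why the models disagree, which is what the rationale logs are for.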