Are VC's just that lazy about making investment decisions? Is this yet another side-effect of ZIRP[2] and too much money chasing a return? Is nobody looking too hard in the hope of catching the next rocket to the moon?
From the outside, investing based on GitHub stars seems insane. Like, this can't be a serious way of investing money. If you told me you were going to invest my money based on GitHub stars, I'd laugh, and then we'd have an awkward silence while I realize there isn't a punchline coming.
[0] I'm from Cleveland. I get to pick on them.
[1] https://en.wikipedia.org/wiki/List_of_Cleveland_Browns_seaso... I think their record speaks for itself.
Here are the things I look at in order:
* last commit date. Newer is better
* age. old is best if still updating. New is not great but tolerable if commits aren't rapid
* issues. Not the count, mind you, just looking at them. How are they handled, what kind of issues are lingering open.
* some of the code. No one is evaluating all of the code of libraries they use. You can certainly check some!
What does stars tell me? They are an indirect variable caused by the above things (driving real engagement and third interest) or otherwise fraud. Only way to tell is to look at the things I listed anyway.
I always treated stars like a bookmark "I'll come back to this project" and never thought of it as a quality metric. Years ago when this problem first surfaced I was surprised (but should not have been in retrospect) they had become a substitute for quality.
I hope the FTC comes down hard on this.
Edit:
* commit history: just browse the history to see what's there. What kind of changes are made and at what cadence.
You might not have but the makers of dependencies that you use might so still problematic.
I have limited time on this Earth and at my employer. My job is not critical to life. I am comfortable with this level of pragmatism.
Build a SaaS and you'll have "journalists" asking if they can include you in their new "Top [your category] Apps in [current year]", you just have to pay $5k for first place, $3k for second, and so on (with a promotional discount for first place, since it's your first interaction).
You'll get "promoters" offering to grow your social media following, which is one reason companies may not even realize that some of their own top accounts and GitHub stars are mostly bots.
You'll get "talent scouts" claiming they can find you experts exactly in your niche, but in practice they just scrape and spam profiles with matching keywords on platforms like LinkedIn once you show interest, while simultaneously telling candidates that they work with companies that want them.
And in hiring, you'll see candidates sitting in interview farms quite clearly in East Asia, connecting through Washington D.C. IPs, present themselves with generic European names, with synthetic camera backgrounds, who somehow ace every question, and list experience with every technology you mention in the job post in their CVs already (not hyperbole, I've seen exactly this happen).
If a metric or signal matters, there is already an ecosystem built to fake it, and faking it starts to be operational and just another part of doing business.
I'm increasingly convinced the issue isn't feedback itself, but centralized, global, aggregated feedback that becomes game-able without stronger identity signals.
Right now the incentives are tied (correctly or not) to these global metrics, so you get a market for faking them, with money flowing to whoever is best at juicing that signal.
If instead the signal was based on actual usage and attributions by actual developers, the incentives shift. With localized insight (think "Yeah, I like Golang") it becomes both harder to fake and harder to get at the metric rollup.
Useful reputation on the web is actually much more localized and personal. I gladly receive updates on and would support the repos I've starred. If I could chose where to put my dollars (not an investors), it would likely include the list of repos I've personally curated.
This suggests a different direction: instead of asking "how many stars does this have?", ask "who is actually depending on this, and in what context?" or better retroactively compare your top-n repos to mine and we'll get a metric seen through our lenses. If you want to include everyone in that aggregation you'll end up where we are now, but if in stead you chose the list, well, the stars could align as a good metric once more.
The interesting part is that the web already contains most of that information, we just don't treat identity as a part of the signal (yet? universally?).
Short term, you pay the cost of fake signaling, which is simply deadweight loss. People spend resources to inflate signals instead of improving the actual thing.
Medium term, I suppose you could see how it increases consumption, since users would probably try something with 100k stars instead of 2, GitHub wants to seem that it's used more than it really is, repo owner is also benefiting.
Long term, the correspondence between how important a (distorted) system is perceived (Github, OSS, IT in general) vs how important it really is collapses quite abruptly and unnecessarily, and you end up with a lemon market [0] where signals stop being reliable at all.
We've recently decided to complicate life of AI bots in our repo https://archestra.ai/blog/only-responsible-ai, hoping they will just choose those AI startups who are easier to engage with.
Specifically someone submitted a library that was only several days old, clearly entirely AI generated, and not particularly well built.
I noted my concerns with listing said library in my reply declining to do so, among them that it had "zero stars". The author was very aggressive and in his rant of a reply asked how many stars he needed. I declined to answer, that's not how this works. Stars are a consideration, not the be all end all.
You need real world users and more importantly real notability. Not stars. The stars are irrelevant.
This conversation happened on GitHub and since then I have had other developers wander into that conversation and demand I set a star count definition for my "vague notability requirement". I'm not going to, it's intentionally vague. When a metric becomes a target it ceases to be a good metric as they say.
I don't want the page to get overly long, and if I just listed everything with X star count I'd certainly list some sort of malware.
I am under no obligation to list your library. Stop being rude.
I've been thinking about this a lot. These metrics are all just marketing signals to draw people's attention, trying to make some kind of deals. So the fix should be: make the cost of the signal match what it claims to represent. I'm obsessed with something called DUKI /djuːki/ (Decentralized Universal Kindness Income, a form of UBI) — the idea is that instead of stars or reviews, trust comes from deals pledging real money to the world for all as the deal happens. You can't fake that cheaply.
So the metric becomes the money itself — if you fake X amount, it costs you X, and the world will thank you by paying attention...
Why am I not surprised big Capital corrupts everything. Also, Goodhart's law applies again: "When a measure becomes a target, it ceases to be a good measure".
HN Folks: What reliant, diverse signals do you use to quickly eval a repo's quality? For me it is: Maintenance status, age, elegance of API and maybe commit history.
PS: From the article:
> instead tracks unique monthly contributor activity - anyone who created an issue, comment, PR, or commit. Fewer than 5% of top 10,000 projects ever exceeded 250 monthly contributors; only 2% sustained it across six months.
> [...] recommends five metrics that correlate with real adoption: package downloads, issue quality (production edge cases from real users), contributor retention (time to second PR), community discussion depth, and usage telemetry.
I think as a proxy it fails completely: astroturfing aside stars don't guarantee popularity (and I bet the correlation is very weak, a lot of very fundamental system libraries have small number of stars). Stars also don't guarantee the quality.
And given that you can read the code, stars seem to be a completely pointless proxy. I'm teaching myself to skip the stars and skim through the code and evaluate the quality of both architecture and implementation. And I found that quite a few times I prefer a less-"starry" alternative after looking directly at the repo content.
Imagine you're choosing between 3 different alternatives, and each is 100,000 LOC. Is 'reading the code' really an option? You need a proxy.
Stars isn't a good one because it's an untrusted source. Something like a referral would be much better, but in a space where your network doesn't have much knowledge a proxy like stars is the only option.
100k is small, but you're right, it can be millions. I usually skim through the code tho, and it's not that hard. I don't need to fully read and understand the code.
What I look at is: high-level architecture (is there any, is it modular or one big lump of code, how modular it is, what kind of modules and components it has and how they interact), code quality (structuring, naming, aesthetics), bus factor (how many people contribute and understand the code base).
Looking at the commit history, closed vs open issues and pull requests provides a much more useful signal if you can't decide from the code.
(Sometimes still is, but the agents garbage does not help)
It’s more expensive to compute, but the resulting scores would be more trustworthy unless I’m missing something.
If the number of stars are in the thousands, tens of thousands, or hundreds of thousands, that might correlate with a serious project. But that should be visible by real, costly activity such as issues, PRs, discussion and activity.
It is the meaning of having dozens or hundreds of stars that is undermined by the practice described at the linked post.
The fake accounts often star my old repos to look like real users. They are usually very sketchy if you think for a minute, for example starring 5,000 projects in a month and no other GitHub activity. One time I found a GitHub Sponsor ring, which must be a money laundering / stolen credit cards thing?
Two projects could look exactly the same from visible metrics, and one is complete shell and the other a great project.
But they choose not to publish it.
And those same private signals more effectively spot the signal-rich stargazers than PageRank.
That said, I believe the core problem is that GitHub belongs to Microsoft, and so it will still go more towards operating like a social network than not - i.e. engagement matters. It will still take a good will to get rid of Social Network Disease at scale.
There are much better ways of finding those who have good taste.
Even 10 years ago most VCs we spoke to had wisened up and discarded Github stars as a vanity metric.
one VC told me, you'll get more funding and upvotes if u don't put "india" in your username.
Founders need the ability to get traction, so if a VC gets a pitch and the project's repo has 0 stars, that's a strong signal that this specific team is just not able to put themselves out there, or that what they're making doesn't resonate with anyone.
When I mentioned that a small feature I shared got 3k views when I just mentioned it on Reddit, then investors' ears perked right up and I bet you're thinking "I wonder what that is, I'd like to see that!" People like to see things that are popular.
By the way, congrats on 200 stars on your project, I think that is definitely a solid indicator of interest and quality, and I doubt investors would ignore it.
I think VCs just know that there are no reliable systems, so they go with whatever's used.
* https://arxiv.org/abs/2412.13459 (2024/2025) - Six Million (Suspected) Fake Stars in GitHub: A Growing Spiral of Popularity Contests, Spams, and Malware
As a side note it's kind of disheartening that everytime there is a metric related to popularity there would be some among us that will try to game it for profit, basically to manipulate our natural bias.
As a side note it's always a bit sad how the parasocial nature of the modern web make us like machine interfacing via simple widgets, becoming mechanical robot ourselves rationalising IO via simple metrics kind of forgetting that the map is never the territory.
In my opinion, nothing could be more wrong. GitHub's own ratings are easily manipulated and measure not necessarily the quality of the project itself, but rather its Popularity. The problem is that popularity is rarely directly proportional to the quality of the project itself.
I'm building a product and I'm seeing what important is the distribution and comunication instead of the development it self.
Unfortunately, a project's popularity is often directly proportional to the communication "built" around it and inversely proportional to its actual quality. This isn't always the case, but it often is.
Moreover, adopting effective and objective project evaluation tools is quite expensive for VCs.
I'm not supporting this view but it is what it is unfortunately.
VCs that invest based on stars do know something I guess or they are just bad investors.
IMO using projects based on start count is terrible engineering practice.
Hype helps raise funds, of course, and sells, of course.
But it doesn't necessarily lead to long-term sustainability of investments.
We should do a hall of shame!
You'd want to discard a lot of the noise in the bottom 20% of linking power. You want to focus more on the 'trust' factor.
In general, I’ve been dissatisfied with GitHub’s code search. It would be nice to see innovation here.
Unfortunately I still look at them, too, out of habit: The project or repo's star count _was_ a first filter in the past, and we must keep in mind it no longer is.
> Good reminder that everything gets gamed given the incentives.
Also known as Goodhart's law [1]: "When a measure becomes a target, it ceases to be a good measure".
Essentially, VCs screwed this one up for the rest of us, I think?
I agree that it has been a first filter, but should it ever have been? A star only says that someone had a passing interest in a project. Not significantly different from a 'like' on a social media post.
Id suggest the first question to ask is "if the project is an AI project or not?" If it is, dont pay attention to the stars - if it's not, use the stars as a first filter. That's the way I analyse projects on Github now.
Github stars is akin to 'link popularity' or 'pagerank' which is ripe for abuse.
One way around it is to trust well known authors/users more. But it's hard to verify who is who. And accounts get bought/closed/hacked.
Another way is to hand over the algo in a way where individuals and groups can shape it, so there's no universal answer to everyone.
It does feel like everything is a scam nowadays though. All the numbers seem fake; whether it's number of users, number of likes, number of stars, amount of money, number of re-tweets, number of shares issued, market cap... Maybe it's time we focus on qualitative metrics instead?
> When nobody is forking a 157,000-star repository, nobody is using it
that is completely not true, i don't fork a repo when i use it, only when i want to contribute to it (and usually cleanup my forks)
Specifically if those avatars are cute animie girls.
I know you are half joking/not joking, but this is definitely a golden signal.
https://github.com/karakeep-app/karakeep
Sounds useful.
I’ll star it and check it out later ;)
I guess it's like fake followers on other social media platforms.
To me, it just reflects a behaviour that is typical of humans: in many situations, we make decisions in fields we don't understand, so we evaluate things poorly.
We figured out a workaround to limit activity to prior contributors only, and add a CI job that pushes a coauthored commit after passing captcha on our website. It cut the AI slop by 90%. Full write-up https://archestra.ai/blog/only-responsible-ai
I'd give a lot of credit to Microsoft and the Github team if they went on a major ban/star removal wave of affected repos, akin to how Valve occasionally does a major sweep across CSGO2 banning verified cheaters.
For Microsoft this is another kind of sunk cost, so idk how much incentive they have to fix this situation.
I am not successful at all with my current projects (admittedly am not trying to be nowadays), so feel free to dismiss this advice that predates a time before LLM driven development, but in the past, I have had decent success in forums interacting with those with a specific problem my project did address. Less in stars, more in actual exchange of helpful contributions.
My first Open Source project easily got off the ground just by being listed in SourceForge.
On Github stars, I'd argue they are the most suitable comparison, as all the funny business regarding stars should be, if at all, detectable by Github directly and ideally, bans would have the biggest deterrent effect, if they happened in larger waves, allowing the community to see who did engage in fraudulent behaviour.
I paid github for years to keep my repos private...
But then I don't participate in the stars "economy" anyway, I don't star and I don't count stars, so I'm probably irrellevant for this study.
It’s supposed to get people to actually try your product. If they like it, they star it. Simple.
At that point, forcing the action just inflates numbers and strips them of any meaning.
Gaming stars to set it as a positive signal for the product to showcase is just SHIT.
> Runa Capital publishes the ROSS (Runa Open Source Startup) Index quarterly, ranking the 20 fastest-growing open-source startups by GitHub star growth rate. Per TechCrunch, 68% of ROSS Index startups that attracted investment did so at seed stage, with $169 million raised across tracked rounds. GitHub itself, through its GitHub Fund partnership with M12 (Microsoft's VC arm), commits $10 million annually to invest in 8-10 open-source companies at pre-seed/seed stages based partly on platform traction.
This all smells like BS. If you are going to do an analysis you need to do some sound maths on amount of investment a project gets in relation to github starts.
All this says is stars are considered is some ways, which is very far from saying that you get the fake stars and then you have investment.
This smells like bait for hating on people that get investment
> As one commenter put it: "You can fake a star count, but you can't fake a bug fix that saves someone's weekend."
I'm curious what the research says here---can you actually structurally undermine the gamification of social influence scores? And I'm pretty sure fake bugfixes are almost trivial to generate by LLMs.
“gstack is not a hypothetical. It’s a product with real users:
75,000+ GitHub stars in 5 weeks
14,965 unique installations (opt-in telemetry, so real number is at least 2x higher)
305,309 skill invocations recorded since January 2026
~7,000 weekly active users at peak”
GitHub stars are a meaningless metric but I don’t think a high star count necessarily indicates bought stars. I don’t think Garry is buying stars for his project.
People star things because they want to be seen as part of the in-crowd, who knows about this magical futuristic technology, not because they care to use it.
Some companies are buying stars, sure, but the methodology for identifying it in this article is bad.