Who could possibly have predicted that happening?
Predictably, everyone started talking in Slack like their jobs depended on it. Everyone was responding to everything. Instead of writing out a complete message and pressing enter, they'd send each fragment of the sentence as a new line.
The Slack leaderboard was never shown again. Unfortunately the habit remained because people were afraid they were going to be secretly judged by how much Slack activity they generated.
I expect the same thing is going to happen at companies that had token leaderboards. Once you've instilled that fear in people, they internalize the expectation.
No amount of "this isn't used for anything" will change that. It's inherent in human nature in the 21st century for people to believe that any and all metrics will be used against them, and therefore must be gamed.
It's why you also have to set UNBELIEVABLY clear goals and have incentives tied to those goals. Incentives meaning money. If you want to measure things, measure them. But have clear, consistent, and meaningful goals tied to bonuses or something if you want a thing done correctly.
The answer is simpler on the surface: focus.
Generally the problem is the larger the firm’s operations, the harder it is to focus.
Apple is the only firm that has done well on this consistently and doesn’t have a huge graveyard of failures to show for it.
Insanity
> Oh wow! If I paid for this myself I would have spent a lot of money! Are other people spending as much as me? I’m going to create a leaderboard!
> Oh no, my misinformed manager is using the leaderboard as a sleight of hand for work. I need to game this now.
Then the leaderboard is banned… I can’t see how this ever really goes up the chain beyond director.
Charles Goodhart :-)
Everyone except the executives who get paid millions to predict exactly that.
It's a hard job, someone has to not pay consequences for bad decisions.
people who make it to managers tend to have bozo tendencies & are yes men.
before it was lines of code, Jira tickets closed. Now it's tokens spent.
IMO Claude, ChatGPT/Codex, etc. should be able to optimize the PDF use case to be extremely token efficient, as it's a very obvious use case. But when I start to explain to my wife/friends why it burns through so much quota, I find myself thinking "why should they have to understand this aspect of it?" To me, the fact that the details of PDF parsing and extraction matter to users (instead of being solved so that you don't have to pay attention to them) shows that these tools are not nearly as "ready" as they are made out to be. I may be preaching to the choir on this one, but just my 2c.
Source: my last job working with accessibility and that nightmare.
This discussion was about measures, goals and incentives. Follow the incentives.
This workflow is highly optimized.
You can rack up token consumption extremely quickly when you embed LLMs into automated processes or products.
I'd be very surprised if these numbers are just typical coding usage with no scripting/pipeline/automation stuff
The subscriptions are for personal use not enterprise.
i.e. [1] "This article is about paid Max plans for individual consumers. If you're part of an organization looking to use Claude with your team, refer to Team and Enterprise Plans."
[1]: https://support.claude.com/en/articles/11049741-what-is-the-...
I could believe it, but I'd want to see something a little more concrete.
Just wonder what happens when more and more companies introduce similar restrictions. Will that lead to devaluations of the LLM companies?
It wants to see faster R&D, higher revenues from existing assets, greater operating margins, higher sales to invested capital ratio and so on…
The best way to measure that for a software firm is uptime of services, usage, and project completion duration.
If so, your metric cannot distinguish between a bad engineer and a good one.
If not, you have the same problem you started with: measuring contributions to “uptime”.
You clearly don’t understand valuation - the value of an asset is a function of expected FUTURE cash flows….
Don’t bother replying unless you have a clue about what you’re talking about
This is also not easy. In particular, proactively preventing bugs is not rewarded.
When shit just works for months or years no one is going to come and praise you for stuff you did a while back.
You are better off breaking stuff and then fixing it to show how useful you are.
Just a pristine comment section yap.
it's not that difficult to say it confidently if you use any of their services and applications because exactly nothing has changed.
For reference, most labor productivity increases over the last 50 years have amounted to about 2% per year. If a hypothetical FB engineer had doubled their productivity with their gazillion tokens, that would be roughly 35 years of compounded productivity gains in one year. I'd wager the evidence would be quite obvious if you opened any of their apps.
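For anyone who wants to check the doubling arithmetic, here is a rough sketch. It assumes the 2% compounds annually, which is my assumption and not something stated above; the function name is made up for illustration.

```python
import math


def years_to_multiply(factor: float, annual_growth: float) -> float:
    """Years of compounded growth at `annual_growth` needed to grow by `factor`.

    Assumes steady annual compounding (an illustrative assumption).
    Solves (1 + annual_growth) ** years == factor for years.
    """
    return math.log(factor) / math.log(1.0 + annual_growth)


if __name__ == "__main__":
    # Doubling (factor = 2.0) at 2% per year
    years = years_to_multiply(2.0, 0.02)
    print(f"{years:.1f} years of 2% growth to double productivity")
```

Under that assumption, a doubling works out to about 35 years of ordinary productivity growth. Without compounding it would be 50 years, so either way it is several decades packed into one.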
I'd argue most of the AI value is related to how 'Dead' the internet is.
Ultimately the spend on tokens has to benefit the firm financially or it won’t continue spending on it.