Tag: chatgpt

  • The Flat-Fee Coding Subscription Is Already Dead. You Just Haven’t Gotten the Email Yet.

    Filed under: Things Procurement Will Blame On You In Q3


    Remember when your $20/month ChatGPT Plus subscription felt like getting away with something? Remember when Claude Pro at $20 and then Claude Max at $100 or $200 felt like a steal because you were burning through context windows like a chain smoker at a tax audit? Remember when Cursor was $20/month and Windsurf was $15 and you told your manager “this pays for itself in like fifteen minutes of saved Stack Overflow scrolling”?

    Enjoy it. Take a picture. Frame it next to your “Unlimited Data” cell phone bill from 2011.

    The Pricing Model Was Always A Lie

    Flat-fee AI coding subscriptions exist for exactly one reason: customer acquisition. They are the free shrimp at the casino buffet. They are not a business model. They are a user-acquisition-cost line item that some VP of Growth is going to have to defend on a board call in approximately six weeks.

    Here’s the math nobody at your company wants to do out loud:

    • A moderately active developer using an agentic coding tool burns through 5–20 million tokens per day on a real project. Agents re-read files. They re-read them again. They re-read them a third time because the first two times didn’t count, apparently.
    • At current API rates for a frontier model, that’s roughly $15–$75/day in raw inference cost. Call it $400–$1,500/month per developer in actual compute.
    • Your company is paying $20. Maybe $200 if somebody upgraded to the “Max” tier.

    You don’t need an MBA to see where this goes. You need a calculator and the emotional maturity to accept bad news.
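    Fine, I’ll be the calculator. A minimal sketch of that back-of-the-envelope math, where the $3–$3.75 per million tokens is an assumed blended frontier API rate (my number, not anyone’s published price list; substitute whatever your provider actually charges):

    ```python
    # Raw inference cost per developer, under the article's assumptions:
    # 5-20M tokens/day and an ASSUMED blended rate of $3-$3.75 per
    # million tokens. Workday count per month is also an assumption.
    def inference_cost(tokens_per_day: int, dollars_per_million: float,
                       workdays: int = 22) -> tuple[float, float]:
        """Return (daily, monthly) raw inference cost in dollars."""
        daily = tokens_per_day / 1_000_000 * dollars_per_million
        return daily, daily * workdays

    print(inference_cost(5_000_000, 3.00))    # (15.0, 330.0)  lighter agentic day
    print(inference_cost(20_000_000, 3.75))   # (75.0, 1650.0) heavy agentic day
    ```

    Depending on your workday count and rate assumptions, that lands squarely in the $400–$1,500/month-per-developer band. Against a $20 subscription.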

    The Three-Act Play You’re Currently In The Middle Of

    Act I: The Honeymoon (You Are Here → Six Months Ago) Everything is unlimited. Everyone is vibing. Engineering leadership is posting on LinkedIn about “10x productivity.” A staff engineer somewhere is quietly running a background agent that generates 400,000 tokens every ninety seconds and nobody notices.

    Act II: The Soft Rug Pull (Happening Right Now, Actually) Suddenly there are “fair use limits.” “Weekly caps.” “Priority queues.” “Usage-based pricing on top of your subscription for heavy workloads.” Your tool mysteriously gets slower after 2pm. The context window “has been optimized.” A new tier appears above the one you’re on. Then another one above that. The $20 plan still exists, technically, the way a 1994 Geo Metro still exists.

    Act III: The Enterprise-Only Endgame (Coming To A Q4 Budget Meeting Near You) The consumer tier quietly dies or becomes a toy. The real product is now a minimum $50K/year enterprise contract with a six-week procurement cycle, a SOC 2 questionnaire, a mandatory “AI Center of Excellence” kickoff call, and per-seat-plus-per-token pricing where “per token” is the part nobody read. Your individual developer access? API keys only. Billed by the token. At retail rates. Welcome back to metered computing, it’s 1974 again and your mainframe has opinions.

    The Cost Delta Nobody Is Putting In The Slide Deck

    Let me do the math your finance team is going to do for you in about four months, except I’ll do it now so you can panic on your own schedule.

    Today, flat fee:

    • Individual developer: $20–$200/month
    • Team of 10: $200–$2,000/month
    • Annual, 10 devs: $2,400–$24,000

    Tomorrow, token-metered reality:

    • Light use (1M tokens/day, 20 workdays): ~$300/month/dev at a blended frontier rate of roughly $15 per million tokens
    • Moderate agentic use (5M tokens/day): ~$1,500/month/dev
    • Heavy agentic use (15M tokens/day, which is what your tools actually do when you let them): ~$4,500/month/dev
    • Team of 10, moderate use, annual: ~$180,000
    • Team of 10, heavy use, annual: ~$540,000

    So the pricing delta between what you pay now and what you’re about to pay is somewhere between 7.5x and 22.5x. For the exact same work. That you’re already doing. With the exact same tool. On the exact same codebase.

    And that’s before the enterprise contract markup, which historically runs 20–40% above raw API costs because somebody has to pay for the Customer Success Manager who schedules the quarterly business reviews you never attend.
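    Here’s the same projection as a sketch your finance team could paste into a notebook. The ~$15 per million tokens blended rate, the 20 workdays, and the 20–40% markup band are all assumptions lifted from the numbers above, not quotes from any vendor:

    ```python
    # Annual team cost under token-metered pricing. ASSUMPTIONS:
    # blended rate of $15 per million tokens, 20 workdays/month,
    # optional enterprise markup applied on top of raw API cost.
    BLENDED_RATE = 15.0   # $ per million tokens (assumed)
    WORKDAYS = 20

    def annual_team_cost(tokens_per_day_millions: float, devs: int = 10,
                         markup: float = 0.0) -> float:
        monthly_per_dev = tokens_per_day_millions * BLENDED_RATE * WORKDAYS
        return monthly_per_dev * devs * 12 * (1 + markup)

    print(annual_team_cost(5))               # 180000.0  moderate use, raw API
    print(annual_team_cost(15))              # 540000.0  heavy use, raw API
    print(annual_team_cost(15, markup=0.3))  # 702000.0  heavy use + 30% markup
    ```

    That last line is the one that shows up in the Q4 spreadsheet with a red cell background.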

    “But Grumpy, Won’t Model Costs Come Down?”

    Sure. And then the models will get bigger. And the agents will become more aggressive. And the context windows will grow. And you’ll go from “re-read the file three times” to “re-read the entire monorepo on every turn because the reasoning model decided it needed to be thorough.” Jevons paradox isn’t a theory, it’s a calendar reminder.

    Token costs per million have dropped roughly 10x in eighteen months. Token consumption per task has gone up roughly 50x in the same window. You do the arithmetic. Actually, don’t. It’ll ruin your weekend.
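    For the masochists, here is the arithmetic I just told you not to do, using the post’s own rough 10x and 50x figures (order-of-magnitude estimates, not measured data):

    ```python
    # Jevons paradox, in two variables. Both multipliers are the
    # article's rough estimates over the same ~18-month window.
    price_drop = 10    # per-million-token price fell ~10x
    usage_growth = 50  # tokens consumed per task grew ~50x

    net_cost_multiplier = usage_growth / price_drop
    print(f"Net cost per task: ~{net_cost_multiplier:.0f}x higher")  # ~5x higher
    ```

    Cheaper tokens, pricier tasks. Enjoy your weekend.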

    What This Looks Like In Practice At Your Company

    • March: “We’re standardizing on [Tool X] for the whole org! Productivity revolution!”
    • April: A Slack channel called #ai-tools-feedback gets created. It has 400 members by Friday.
    • June: An email from IT: “We’re transitioning to a new billing model to better align with usage patterns.”
    • July: Finance sends a spreadsheet. The spreadsheet has a column called “Projected Q4 Overage.”
    • August: A new policy: “AI tool usage must be pre-approved by your manager for tasks exceeding 500,000 tokens.” Nobody knows what a token is. The policy is enforced anyway.
    • September: Your skip-level asks in a 1:1, “Are you really getting value from these tools?” It is not a real question.
    • October: The tool is replaced with a cheaper in-house wrapper around a smaller model that hallucinates import statements.
    • November: Leadership announces “disciplined AI spend” in the all-hands. Everybody claps.
    • December: The same VP who launched the rollout gets promoted for “cost optimization.”

    The Grumpy Prediction

    By this time next year, flat-fee AI coding subscriptions will exist in approximately the same way that unlimited mobile data plans exist: on paper, with an asterisk, a throttle, and a paragraph of fine print longer than the Treaty of Westphalia. The real action will be enterprise contracts, API keys, and a new role at your company called “AI FinOps Lead” whose entire job is to tell you to stop using the thing you were told to start using six months ago.

    The developers who saw this coming will be the ones who kept a .env file, a personal API key, and a healthy skepticism toward any business model that includes the word “unlimited.” The developers who didn’t will be the ones writing retrospective blog posts about “lessons learned from our AI tooling journey.”

    Guess which one gets invited to speak at the conference.


    The author is a Grumpy Coworker who has been through exactly this movie with cloud compute, with CI minutes, with observability vendors, with feature flag platforms, and with at least two code search tools. The ending is always the same. The popcorn is always stale.