Tag: claude

  • The Flat-Fee Coding Subscription Is Already Dead. You Just Haven’t Gotten the Email Yet.

    Filed under: Things Procurement Will Blame On You In Q3


    Remember when your $20/month ChatGPT Plus subscription felt like getting away with something? Remember when Claude Pro at $20 and then Claude Max at $100 or $200 felt like a steal because you were burning through context windows like a chain smoker at a tax audit? Remember when Cursor was $20/month and Windsurf was $15 and you told your manager “this pays for itself in like fifteen minutes of saved Stack Overflow scrolling”?

    Enjoy it. Take a picture. Frame it next to your “Unlimited Data” cell phone bill from 2011.

    The Pricing Model Was Always A Lie

    Flat-fee AI coding subscriptions exist for exactly one reason: customer acquisition. They are the free shrimp at the casino buffet. They are not a business model. They are a user-acquisition-cost line item that some VP of Growth is going to have to defend on a board call in approximately six weeks.

    Here’s the math nobody at your company wants to do out loud:

    • A moderately active developer using an agentic coding tool burns through 5–20 million tokens per day on a real project. Agents re-read files. They re-read them again. They re-read them a third time because the first two times didn’t count, apparently.
    • At current API rates for a frontier model, that’s roughly $75–$300/day in raw inference cost. Call it $1,500–$6,000/month per developer in actual compute.
    • Your company is paying $20. Maybe $200 if somebody upgraded to the “Max” tier.

    You don’t need an MBA to see where this goes. You need a calculator and the emotional maturity to accept bad news.
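    For the calculator-owning readers, here is the back-of-envelope version. The blended rate of $15 per million tokens is an illustrative assumption for this sketch, not any vendor’s actual rate card:

```python
# Back-of-envelope inference cost per developer per month.
# The $15/M-token blended rate is an illustrative assumption,
# not any vendor's actual price list.
BLENDED_RATE_PER_M_TOKENS = 15.00  # dollars per million tokens (assumed)
WORKDAYS_PER_MONTH = 20

def monthly_cost(tokens_per_day_m: float) -> float:
    """Raw inference cost per developer per month, in dollars."""
    return tokens_per_day_m * BLENDED_RATE_PER_M_TOKENS * WORKDAYS_PER_MONTH

for label, tokens_m in [("light (1M/day)", 1),
                        ("moderate (5M/day)", 5),
                        ("heavy (15M/day)", 15)]:
    print(f"{label}: ${monthly_cost(tokens_m):,.0f}/month")
```

    Run it before your finance team does. The number on the subscription invoice is not in this script, which is rather the point.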

    The Three-Act Play You’re Currently In The Middle Of

    Act I: The Honeymoon (You Are Here → Six Months Ago) Everything is unlimited. Everyone is vibing. Engineering leadership is posting on LinkedIn about “10x productivity.” A staff engineer somewhere is quietly running a background agent that generates 400,000 tokens every ninety seconds and nobody notices.

    Act II: The Soft Rug Pull (Happening Right Now, Actually) Suddenly there are “fair use limits.” “Weekly caps.” “Priority queues.” “Usage-based pricing on top of your subscription for heavy workloads.” Your tool mysteriously gets slower after 2pm. The context window “has been optimized.” A new tier appears above the one you’re on. Then another one above that. The $20 plan still exists, technically, the way a 1994 Geo Metro still exists.

    Act III: The Enterprise-Only Endgame (Coming To A Q4 Budget Meeting Near You) The consumer tier quietly dies or becomes a toy. The real product is now a minimum $50K/year enterprise contract with a six-week procurement cycle, a SOC 2 questionnaire, a mandatory “AI Center of Excellence” kickoff call, and per-seat-plus-per-token pricing where “per token” is the part nobody read. Your individual developer access? API keys only. Billed by the token. At retail rates. Welcome back to metered computing, it’s 1974 again and your mainframe has opinions.

    The Cost Delta Nobody Is Putting In The Slide Deck

    Let me do the math your finance team is going to do for you in about four months, except I’ll do it now so you can panic on your own schedule.

    Today, flat fee:

    • Individual developer: $20–$200/month
    • Team of 10: $200–$2,000/month
    • Annual, 10 devs: $2,400–$24,000

    Tomorrow, token-metered reality:

    • Light use (1M tokens/day, 20 workdays): ~$300/month/dev at blended frontier pricing
    • Moderate agentic use (5M tokens/day): ~$1,500/month/dev
    • Heavy agentic use (15M tokens/day, which is what your tools actually do when you let them): ~$4,500/month/dev
    • Team of 10, moderate use, annual: ~$180,000
    • Team of 10, heavy use, annual: ~$540,000

    So the pricing delta between what you pay now and what you’re about to pay is somewhere between 7.5x and 22.5x. For the exact same work. That you’re already doing. With the exact same tool. On the exact same codebase.

    And that’s before the enterprise contract markup, which historically runs 20–40% above raw API costs because somebody has to pay for the Customer Success Manager who schedules the quarterly business reviews you never attend.
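    If you’d rather reproduce the panic in code, the whole delta fits in a few lines. The $200/seat flat fee and the per-dev metered figures are the assumptions doing all the work here:

```python
# Metered annual cost for a 10-dev team vs. today's best-case flat fee.
# Assumes $200/seat/month flat and the per-dev metered figures above.
FLAT_ANNUAL_10_DEVS = 200 * 10 * 12              # $24,000/year
metered_annual = {
    "moderate": 1_500 * 10 * 12,                 # $180,000/year
    "heavy":    4_500 * 10 * 12,                 # $540,000/year
}
for usage, cost in metered_annual.items():
    print(f"{usage}: ${cost:,}/year "
          f"(~{cost / FLAT_ANNUAL_10_DEVS:.1f}x the flat fee)")
```

    Feel free to substitute your own team size. The multiple does not get friendlier.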

    “But Grumpy, Won’t Model Costs Come Down?”

    Sure. And then the models will get bigger. And the agents will become more aggressive. And the context windows will grow. And you’ll go from “re-read the file three times” to “re-read the entire monorepo on every turn because the reasoning model decided it needed to be thorough.” Jevons paradox isn’t a theory, it’s a calendar reminder.

    Token costs per million have dropped roughly 10x in eighteen months. Token consumption per task has gone up roughly 50x in the same window. You do the arithmetic. Actually, don’t. It’ll ruin your weekend.
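    If you insist on ruining your weekend anyway, the arithmetic is a single multiplication. Both factors are the rough estimates above, not precise measurements:

```python
# Jevons paradox, quantified with the rough estimates above.
price_factor = 1 / 10      # cost per token: ~10x cheaper than 18 months ago
consumption_factor = 50    # tokens per task: ~50x higher than 18 months ago
spend_factor = price_factor * consumption_factor
print(f"Net spend per task: ~{spend_factor:.0f}x what it was 18 months ago")
```

    Cheaper tokens, multiplied by vastly more of them, is still a bigger bill.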

    What This Looks Like In Practice At Your Company

    • March: “We’re standardizing on [Tool X] for the whole org! Productivity revolution!”
    • April: A Slack channel called #ai-tools-feedback gets created. It has 400 members by Friday.
    • June: An email from IT: “We’re transitioning to a new billing model to better align with usage patterns.”
    • July: Finance sends a spreadsheet. The spreadsheet has a column called “Projected Q4 Overage.”
    • August: A new policy: “AI tool usage must be pre-approved by your manager for tasks exceeding 500,000 tokens.” Nobody knows what a token is. The policy is enforced anyway.
    • September: Your skip-level asks in a 1:1, “Are you really getting value from these tools?” It is not a real question.
    • October: The tool is replaced with a cheaper in-house wrapper around a smaller model that hallucinates import statements.
    • November: Leadership announces “disciplined AI spend” in the all-hands. Everybody claps.
    • December: The same VP who launched the rollout gets promoted for “cost optimization.”

    The Grumpy Prediction

    By this time next year, flat-fee AI coding subscriptions will exist in approximately the same way that unlimited mobile data plans exist: on paper, with an asterisk, a throttle, and a paragraph of fine print longer than the Treaty of Westphalia. The real action will be enterprise contracts, API keys, and a new role at your company called “AI FinOps Lead” whose entire job is to tell you to stop using the thing you were told to start using six months ago.

    The developers who saw this coming will be the ones who kept a .env file, a personal API key, and a healthy skepticism toward any business model that includes the word “unlimited.” The developers who didn’t will be the ones writing retrospective blog posts about “lessons learned from our AI tooling journey.”

    Guess which one gets invited to speak at the conference.


    The author is a Grumpy Coworker who has been through exactly this movie with cloud compute, with CI minutes, with observability vendors, with feature flag platforms, and with at least two code search tools. The ending is always the same. The popcorn is always stale.

  • Best Practices for an AI CLI Code Agent: A Containment Strategy

    Look, I get it. You’ve got a shiny new AI agent that promises to “accelerate your development velocity” (translation: write code so you don’t have to). Congratulations. Now here’s the part nobody talks about: that thing is a liability wrapped in a transformer architecture, and you need to treat it like a biohazard until proven otherwise.

    Here’s the workflow. In order.

    1) Buy a new PC for every session

    Before you even think about unleashing an AI agent on your codebase, you need a clean machine. Not clean like “I ran Windows Update.” Clean like “this device has never touched the internet, has never installed a package, and exists in a state of primordial computational innocence.”

    This isn’t paranoia. This is recognizing that every previous session leaves ghosts: stray pip packages, environment variables polluting the namespace, leftover processes drinking memory. Run the agent in that mess and you get non-deterministic behavior. You want to know if the agent broke something? You need a baseline. A fresh PC is your baseline.

    Will this tank your budget? Absolutely. Is it worth it? Ask yourself: how much is it worth to know exactly what your code does?

    2) Air-gap the development machine from the network

    Unplug the ethernet. Turn off WiFi. Go full Faraday cage if you’re feeling theatrical (you should be).

    Why? Because an AI agent with internet access is an AI agent that can install arbitrary dependencies, call home to telemetry servers, or—and let’s be honest—do things you didn’t explicitly ask it to do. It can’t exfiltrate your data if there’s no network. It can’t surprise you with a midnight API call if the cables are all disconnected.

    Plus, no internet means the agent can’t download the latest version of some npm package that got compromised last Tuesday. Your supply chain is as trustworthy as your local filesystem.

    3) Document every prompt you fed it

    Keep a log. Every. Single. One. Write down the system message, the context window dump, the user prompts, the intermediate questions you asked, all of it.

    This isn’t busywork. When the agent does something weird—and it will—you need the full input state to understand why. “The model just decided to rewrite my entire build system” is not a bug report. “The model was given 50KB of malformed Makefile as context and hallucinated a solution” is actionable.

    Also, someday, an auditor will ask: “How did this code get written?” You’ll have an airtight answer backed by timestamped evidence. That’s worth something.
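    A log this paranoid doesn’t need tooling, just discipline. Here is one minimal sketch of what “every single one” could look like in practice; the file path and record fields are made up for illustration, not any tool’s real format:

```python
# Append-only prompt log: one timestamped JSON object per line.
# The path and field names are illustrative assumptions.
import json
import os
import tempfile
import time

LOG_PATH = os.path.join(tempfile.gettempdir(), "agent_session.jsonl")

def log_prompt(role: str, content: str, path: str = LOG_PATH) -> None:
    """Append one prompt record. Append-only: history is never rewritten."""
    record = {"ts": time.time(), "role": role, "content": content}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_prompt("system", "You are a careful coding agent. Touch nothing outside ./src.")
log_prompt("user", "Refactor the build script and explain every change.")
```

    When the agent “just decides” to rewrite your build system, the last few lines of that file are your bug report.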

    4) Never run it unsupervised

    This is the non-negotiable one. Keep your eyes on the terminal. Have your hand near the kill switch. The moment it tries to rm -rf / or starts writing to files it shouldn’t touch, you Ctrl+C it into oblivion.

    You wouldn’t let a junior dev commit to prod without watching the deploy. Don’t let a probabilistic text completion engine run loose on your codebase without supervision. It’s not smart enough to know when it’s about to do something catastrophic, and neither is your CI/CD pipeline if you didn’t set up guards.

    5) Git commit at every working point

    After each discrete task, commit. Don’t wait for the entire session to finish. Don’t consolidate into one massive commit at the end.

    This serves two purposes: (a) you can perform surgical reverts if the agent pivots into insanity mid-task, and (b) you preserve a narrative of what the agent was thinking at each step. If it goes off the rails at commit 7, commits 1–6 are still usable.

    Also, git bisect becomes your friend when you’re trying to figure out which of the agent’s “improvements” introduced the regression.

    6) Review every diff line-by-line before merging

    I don’t care if the agent’s changes look obviously correct. I don’t care if it’s “just a bug fix.” Read the entire diff. Every. Line.

    LLMs hallucinate. They make logical leaps that seem correct on first pass but introduce subtle bugs three months later. They’ll add a dependency you didn’t ask for. They’ll “optimize” something into oblivion. They’ll introduce a race condition that only manifests under load.

    This is code review with paranoia. Do it anyway.

    7) Have a human sign off on the final commit

    The agent can’t push to main. Period. A carbon-based lifeform—you, preferably, or someone who understands the codebase—has to explicitly approve and merge.

    This isn’t security theater. This is your release gate. You’re saying: “I, a human with a reputation and a salary, reviewed this code and deemed it acceptable for production.” That’s not nothing.

    8) Quarantine the binaries in a sandbox before production

    Compile the code in an isolated VM. Run the test suite there. Observe for suspicious behavior: unexpected disk writes, network calls, zombie processes, memory leaks that only manifest under realistic load.

    You’re doing dynamic analysis on untrusted output. This is what you should be doing with any third-party code anyway. The fact that it came from an AI agent just makes it more important.

    9) Keep a “kill switch” branch

    Maintain a known-good branch. Tag it. Freeze it. If the agent’s changes cause production incidents, you roll back instantly.

    Don’t debate which commit was safe. Don’t try to cherry-pick the “good” changes. You have an escape pod. Use it.

    10) Sacrifice a rubber duck to the testing gods before execution

    Quack once for unit tests. Twice for integration tests. Three times for “please don’t delete my home directory.”

    At this point, you’ve built so many safety layers that you might as well be honest about the remaining uncertainty. There’s chaos, and there’s the chaos you can predict. The duck represents the chaos you can’t. Respect it.

    11) Rotate the PC’s hard drive into a locked evidence locker

    After the session, physically remove the hard drive. Store it in a cabinet. Maybe a Faraday cage if you’re feeling extra.

    Why? Because if your organization ever gets audited, sued, or subpoenaed, you might need forensic evidence of what the agent actually touched. A hard drive is immutable once you stop writing to it. It’s your audit trail.

    12) Burn it afterwards

    Wipe the drive. Use a utility that writes random data three times over. Or just smash it with a hammer if you’re feeling visceral about it.

    At this point, you’ve extracted all value from the machine. It’s served its purpose. Don’t let it become a liability. Don’t let someone else inherit it with “ooh, I can repurpose this.” No. Burn it. Ashes.


    The Meta-Take

    By step 12, you’ve introduced enough overhead that you’ve eliminated most of the time savings the agent provided. You’re now doing: hardware procurement + network isolation + prompt documentation + active supervision + granular commits + paranoid code review + human approval + sandbox testing + branch management + physical archival + divine intervention + hard drive incineration.

    At that point, why not just code it yourself?

    Because the agent still wrote something. Your job wasn’t to eliminate the work; it was to shift the work from “typing code” to “verifying code.” And verification scales better than creation. You can have an agent generate 10,000 lines and verify them in a reasonable time. Typing 10,000 lines yourself takes forever.

    The joke is exposing the uncomfortable truth: AI code agents are useful but not trustworthy enough to leave unsupervised. You’re getting velocity (the agent wrote something), but you’re paying for it with process overhead and justified paranoia.

    The best practices aren’t about enabling the agent. They’re about containing and verifying its output. It’s a productivity tool that requires adult supervision.

    Treat it accordingly.

  • Is Your CTO Dabbling in LLM Cults? Here Are the Signs

    Look, I’m not saying your CTO has been compromised by the Church of the Latter-day Tokens, but if they’ve started using “MBiC” unironically in Slack, we need to talk.

    Here are common acronyms your CTO might start using and their LLM cult meanings:

    MBiC – “My Brother in Copilot/Cursor/Claude”

    • Normal people think: My Brother in Christ, the Gen Z term of endearment deployed regardless of the recipient’s actual religion
    • What they mean: A term of endearment for fellow AI-assisted developers
    • Red flag level: 🚩🚩 (Yellow – concerning but not terminal)

    LGTM – “Let GPT Train Me”

    • Normal people think: Looks Good To Me
    • What they mean: They’ve stopped learning and just accept whatever the spicy autocomplete says
    • Red flag level: 🚩🚩🚩 (Orange – intervention recommended)

    YOLO – “Your Output’s Likely Off”

    • Normal people think: You Only Live Once
    • What they mean: Dismissive response when someone questions AI-generated code that definitely has bugs
    • Red flag level: 🚩🚩🚩🚩 (Red – quarantine immediately)

    SMH – “Seeking More Hallucinations”

    • Normal people think: Shaking My Head
    • What they mean: When the AI’s first answer wasn’t convincing enough, so they’re regenerating
    • Red flag level: 🚩🚩🚩 (Orange – they know it’s wrong but persist)

    IMHO – “In My HuggingFace Opinion”

    • Normal people think: In My Humble Opinion
    • What they mean: About to cite some open-source LLM as an authority on architecture decisions
    • Red flag level: 🚩🚩🚩🚩 (Red – open source models have opinions now)

    TBH – “Tokens Be Hallucinating”

    • Normal people think: To Be Honest
    • What they mean: Acknowledging the AI made something up, but they’re going with it anyway
    • Red flag level: 🚩🚩🚩🚩🚩 (Critical – they’ve accepted hallucinations as reality)

    FWIW – “Fine-tuned With Insufficient Weights”

    • Normal people think: For What It’s Worth
    • What they mean: Excuse for why their custom model is confidently wrong about everything
    • Red flag level: 🚩🚩🚩🚩 (Red – they fine-tuned something)

    IDK – “Inference Definitely Knows”

    • Normal people think: I Don’t Know
    • What they mean: They don’t know, but Claude/GPT probably does, hold on
    • Red flag level: 🚩🚩 (Yellow – at least they’re honest about outsourcing cognition)

    RTFM – “Run The F***ing Model”

    • Normal people think: Read The F***ing Manual
    • What they mean: Why read documentation when you can just ask an AI that was trained on it?
    • Red flag level: 🚩🚩🚩🚩🚩 (Critical – manuals are now deprecated)

    WFH – “Working From HuggingFace”

    • Normal people think: Working From Home
    • What they mean: Entire day spent on model repos instead of actual work
    • Red flag level: 🚩🚩🚩 (Orange – at least they’re still technically working?)

    BRB – “Be Right Back (asking Claude)”

    • Normal people think: Be Right Back
    • What they mean: Every conversation now has a 30-second AI consultation pause
    • Red flag level: 🚩🚩🚩 (Orange – human-to-human communication deprecated)

    AFAIK – “According to Fine-tuned AI Knowledge”

    • Normal people think: As Far As I Know
    • What they mean: They asked an LLM and stopped researching
    • Red flag level: 🚩🚩🚩🚩 (Red – epistemology has left the building)

    TL;DR – “Too Long; Didn’t Rewrite (with AI)”

    • Normal people think: Too Long; Didn’t Read
    • What they mean: Everything must now be AI-summarized, including two-sentence emails
    • Red flag level: 🚩🚩🚩 (Orange – reading comprehension outsourced)

    IIRC – “If I Regenerate Context”

    • Normal people think: If I Recall Correctly
    • What they mean: They’ve lost track of which conversation was with humans vs. chatbots
    • Red flag level: 🚩🚩🚩🚩🚩 (Critical – reality boundaries dissolving)

    FYI – “Feed Your Inference”

    • Normal people think: For Your Information
    • What they mean: Attaching 47 documents to “give the AI context” for a simple question
    • Red flag level: 🚩🚩🚩 (Orange – prompt engineering has become lifestyle)

    NGL – “Not Gonna Lint”

    • Normal people think: Not Gonna Lie
    • What they mean: AI wrote it, AI approved it, linting is for people who don’t trust the silicon
    • Red flag level: 🚩🚩🚩🚩🚩 (Critical – code quality gates removed)

    BTW – “Before Training Weights”

    • Normal people think: By The Way
    • What they mean: Referencing the mythical pre-LLM era when people coded with their actual brains
    • Red flag level: 🚩 (Green – nostalgia is healthy)

    ICYMI – “In Case Your Model Ignored”

    • Normal people think: In Case You Missed It
    • What they mean: Reposting because they think you’re also using AI to read Slack
    • Red flag level: 🚩🚩🚩 (Orange – assumes everyone else is also AI-dependent)

    Warning Signs Your CTO Has Fully Converted:

    1. Begins sentences with “As an AI language model” in standup
    2. Refers to the engineering team as “the training data”
    3. Insists all PRs include a “prompt” section explaining what was asked
    4. Says “regenerate that thought” when they don’t like someone’s opinion
    5. Measures performance reviews in “tokens per second”
    6. Has replaced their profile picture with a neural network diagram
    7. Sends meeting agendas as “system prompts”
    8. Refers to coffee breaks as “context window refreshes”
    9. Calls the office “the inference cluster”
    10. Has started ending emails with “Stop sequence: [END]”

    What To Do If Your CTO Is Converting:

    Stage 1 (Early): Gentle reminders that humans still write code sometimes

    Stage 2 (Moderate): Intervention involving unplugged coding exercises and whiteboard sessions

    Stage 3 (Advanced): Emergency contact with former CTO’s mentors from the pre-LLM era

    Stage 4 (Terminal): Accept your new AI overlords and start learning prompt engineering

    The Reality Check:

    Look, AI coding assistants are genuinely useful tools. I use them. You probably use them. But when your leadership starts communicating primarily in LLM-cult acronyms and treating the AI as a team member with voting rights in architecture decisions, we’ve crossed from “productivity tool” to “cargo cult.”

    The warning sign isn’t that they’re using AI. It’s that they’ve stopped being able to tell where the AI stops and their own judgment begins.

    If your CTO asks you to “vibe check the embeddings” one more time, it might be time to update your LinkedIn.

    MBiC (My Buddy in Coding, the normal way),

    Grumpy


    Is your CTO showing signs of LLM cult membership? Drop a 👇 in the comments with the weirdest AI-related acronym you’ve heard in your workplace.

    Disclaimer: No CTOs were harmed in the making of this post. Several were mildly roasted. All AI assistants cited gave their consent to be satirized. Probably. I didn’t actually ask them. They’re just autocomplete.

  • Building Fast in the Wrong Direction: An AI Productivity Fairy Tale

    Oh good, another breathless LinkedIn post about how AI just 10x’d someone’s development velocity. Fantastic. You know what else moves fast? A semi truck in the mountains of Tennessee with brakes that have failed. Speed is great until you realize your only hope for survival is a runaway truck ramp.

    [Image: a runaway truck ramp, from Public Domain Pictures]

    Here’s the thing nobody wants to admit at their AI productivity [ahem… self-congratulatory gathering]: AI doesn’t matter if you don’t have a clue what to build.

    I’ve watched teams use ChatGPT to crank out five different implementations of features nobody wanted in the time it used to take them to build one feature nobody wanted. Congratulations, you’ve quintupled your output of garbage. Your CEO must be so proud. Maybe you can have ChatGPT restyle your resume to look like VS Code or the AWS Console, but it’s not going to change the experience you have listed on it.

    Going fast in the wrong direction gets you to the wrong place faster. But it’s still the wrong place. You’re just confidently incorrect at scale now.

    Agile Saves You From Your Own Stupidity (Sometimes)

    You know why Agile actually works when it works? Not because of the stand-ups or the planning poker or whatever cult ritual your scrum master insists on. It works because it forces you to pause every couple of weeks and ask “wait, is this actually the right thing?”

    Short iterations exist to limit the blast radius of your terrible decisions. When you inevitably realize you’ve been building the wrong thing, you’ve only wasted two weeks instead of six months. It’s damage control, not strategy.

    But sure, let’s use AI to speedrun through our sprints so we can discover we built the wrong thing in three days instead of ten. Efficiency!

    Product Strategy: The Thing You Skipped

    Here’s a wild idea: what if you actually figured out what to build before you built it?

    I know, I know. Product strategy and user research are boring. They don’t give you that dopamine hit of shipping code. They require talking to actual users, which is terrifying because they might tell you your brilliant idea is stupid.

    But you know what product strategy and research actually do? They narrow down your options. They give you constraints. They help you make informed bets instead of random guesses.

    Because here’s the math that AI evangelists keep missing: Improving your odds of success by building the right thing will always beat building the wrong things 10 times faster.

    Building the wrong feature in three days instead of two weeks doesn’t make you 5x more productive. It makes you 5x more wrong. You’ve just accelerated your march into irrelevance.

    AI as a Validation Tool, Not a Strategy Replacement

    Now, I’m not saying AI is useless. It’s actually pretty good at helping you validate ideas faster. Rapid prototyping, quick mockups, testing assumptions—yeah, that stuff is genuinely helpful.

    But AI can’t tell you what to validate. It can’t tell you which customer problem is worth solving. It can’t tell you if your market actually exists or if you’re just building another solution in search of a problem.

    That still requires thinking. Remember thinking? That thing we used to do before we decided to outsource our brains to autocomplete?

    The Uncomfortable Truth

    The dirty secret of software development has always been that most of our productivity problems aren’t technical. (See the reprint of the “No Silver Bullet” essay from 1986 in a collection of timeless project management essays, The Mythical Man-Month.) They’re strategic. We build the wrong things, for the wrong reasons, at the wrong time. (Ok, yes, they’re also communication and coordination problems… fortunately, we have Slack for that <insert eye roll emoji here>)

    AI speeds up the building part. Great. But if you’re speeding toward the wrong destination, you’re just failing faster.

    Maybe instead of celebrating how quickly you can ship features, you should figure out which features are worth shipping in the first place. Crazy thought, I know.

    But hey, what do I know? I’m just a grumpy coworker who thinks you should know where you’re going before you hit the gas.


    Now get back to work. And for the love of god, talk to your users and other humans instead of spending all day chatting with a chatbot that declares you a deity when you correct it.