Cloud costs vs revenue: The profitability gap in AI products

Aug 16, 2025

The most dangerous thing that can happen to an AI product is that people actually use it.

That sounds wrong. It is not. In most software businesses, more users means more revenue, and marginal costs stay flat or decline. A SaaS product serving ten thousand users costs roughly the same to run as one serving five thousand, give or take a few database queries. But AI products do not work that way. Every inference call, every model query, every generated response costs real money. More users means more cost. Proportionally. Sometimes more than proportionally.

And yet the pricing models most teams are using were designed for a world where marginal costs were negligible.

That is the gap. And it is widening.

The usage penalty

I spent a few months earlier this year advising a team building an AI-powered document analysis tool. Smart people. Good product instincts. They had built something genuinely useful: upload a contract, get a structured summary with risk flags in under a minute. Users loved it.

But we tracked inference costs per API call with an obsessiveness that bordered on paranoia. Every query to the model, every token processed, every response generated had a price tag. We built a spreadsheet that modelled costs at a thousand users, ten thousand, a hundred thousand. The maths did not work. At their current pricing (a flat monthly subscription of forty dollars per seat), every heavy user was costing more to serve than they were paying. The light users subsidised the heavy ones, but the heaviest users were growing fastest. Which is exactly what you want in a product. Except it was also exactly what would destroy the margin.
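The shape of that spreadsheet is easy to reproduce. Here is a minimal sketch of the per-seat unit economics, using a flat forty-dollar subscription as in the story; the per-query inference cost and the usage tiers are illustrative assumptions, not the team's actual figures.

```python
# Per-seat unit economics under flat pricing with usage-scaled inference cost.
# All numbers except the $40 subscription are illustrative assumptions.

FLAT_PRICE = 40.00      # monthly subscription per seat (from the article)
COST_PER_QUERY = 0.12   # assumed average inference cost per document analysed

def monthly_margin(queries_per_month: float) -> float:
    """Revenue minus inference cost for one seat in one month."""
    return FLAT_PRICE - COST_PER_QUERY * queries_per_month

for label, usage in [("light", 50), ("median", 200), ("heavy", 600)]:
    print(f"{label:>6} user ({usage:>3} queries): margin "
          f"${monthly_margin(usage):+.2f}")
```

At these assumed numbers, the light user throws off a healthy margin, the median user a thin one, and the heavy user is served at a loss. The model breaks exactly where the product succeeds.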

The team had a term for it. They called it the usage penalty. The better your product performs, the more people use it. The more people use it, the more it costs. But revenue stays flat because the pricing model was built for a world where serving one more user cost almost nothing.

This is not a new problem in the abstract. Telecoms dealt with something similar decades ago: bandwidth costs scaled with usage while customers expected flat-rate plans. But the AI version is sharper, because inference costs are not just high. They are unpredictable. A simple query might cost a fraction of a penny. A complex one might cost fifty times that. And you often cannot tell which it will be until the model has already processed it.

A bakery that loses money on every loaf

Think about a bakery. Every loaf of bread requires flour, energy, labour. If the bakery sells more bread, it buys more flour. That is obvious. Nobody expects a bakery to absorb unlimited flour costs while charging the same price per loaf.

But that is precisely what most AI product pricing does. It says: pay us a fixed monthly fee, and use the product as much as you want. The team absorbs the flour costs. For light users, the margin works. For heavy users, every additional loaf is a loss.

The bakery analogy breaks down in one important way, though. A baker knows roughly how much flour each loaf needs. An AI product team often does not know how much inference a given user session will consume until it has already been consumed. It is like running a bakery where some customers order a baguette and others order a seven-tier wedding cake, and you have to serve both at the same price. You would go broke. That is what is happening.

The profitability inversion

I saw a version of this at Freshworks, years before AI costs were part of the conversation. As the product grew and moved from small-team customers into larger deployments, infrastructure costs scaled in ways nobody had modelled accurately. The assumption had been that costs would grow linearly. They did not. Growth brought complexity: heavier usage patterns, more data, higher support load, bigger compute requirements. Revenue grew. But costs grew faster than anyone had predicted during the early, scrappy phase.

The team adjusted. Pricing changed. Architecture changed. But the lesson stuck with me: the economics that work at a thousand users often break at a hundred thousand, and the break is not gentle. It is sudden. You cross a threshold and the model inverts. Revenue is still growing, and somehow profitability is shrinking. I have come to think of it as the profitability inversion, and it is one of the least discussed structural risks in product.

With AI products, the profitability inversion arrives earlier and hits harder. Traditional SaaS had the luxury of relatively flat marginal costs, so the inversion usually showed up at extreme scale. AI products can hit it within the first year. Sometimes within the first quarter after launch. Success is the thing that breaks the model, which is a sentence I never expected to write about a product doing well.

What the spreadsheet misses

Most teams model AI costs as a line item. They estimate average inference cost per query, multiply by projected usage, and compare against projected revenue. The spreadsheet looks manageable. But the spreadsheet misses three things.

First, usage is not average. Power users consume disproportionately. And power users are often the ones you most want to keep, because they are the ones who tell other people about the product. Penalising them with usage caps or throttling feels like punishing your best customers for liking you too much.
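That first point can be made concrete. A quick simulation, with made-up parameters, shows how a skewed usage distribution lets the "average cost per user" look comfortable while a small tail of power users is served at a loss and dominates the total bill.

```python
# Illustrative only: Pareto-distributed usage (most users light, a heavy tail)
# with made-up price and cost parameters, to show what the average hides.
import random

random.seed(0)
PRICE = 40.0            # assumed flat monthly price
COST_PER_QUERY = 0.12   # assumed average inference cost per query

# Monthly query counts per user; capped to keep outliers bounded.
usage = [min(50 * random.paretovariate(1.5), 5000) for _ in range(10_000)]

avg_cost = COST_PER_QUERY * sum(usage) / len(usage)
losing = [q for q in usage if COST_PER_QUERY * q > PRICE]
loss_share = (COST_PER_QUERY * sum(losing)) / (COST_PER_QUERY * sum(usage))

print(f"average cost per user: ${avg_cost:.2f}")  # well under the $40 price
print(f"users costing more than they pay: {len(losing) / len(usage):.1%}")
print(f"share of total cost from those users: {loss_share:.1%}")
```

The average stays under the price, so the spreadsheet looks fine. But the handful of unprofitable users account for a far larger share of the cost than of the user base, and they are precisely the users you least want to throttle.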

Second, model costs change. Sometimes they drop (which is great). Sometimes the product team wants to upgrade to a more capable model, and costs jump. The roadmap and the cost model are in tension, and the cost model usually loses because nobody wants to ship a worse product to save on inference.

Third, the competitive pressure is real. If your competitor offers a similar product at a flat rate and absorbs the cost (funded by venture capital or cross-subsidised by another business line), you are in a pricing war where the floor is set by someone who is willing to lose money longer than you are. That is not a product problem. That is a survival problem.

The uncomfortable question

There is no clean answer here. Usage-based pricing feels fair but alienates users who want predictability. Flat pricing feels friendly but breaks at scale. Tiered models split the difference but create complexity that small teams struggle to manage.

I have watched three teams wrestle with this in the past six months alone. The ones making progress share a common trait: they treat pricing as a product problem, not a finance problem. They test pricing the way they test features. They instrument cost per session with the same rigour they apply to conversion funnels. They accept that the pricing model will change multiple times in the first two years, and they build for that flexibility instead of locking in a structure too early.
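Instrumenting cost per session does not need heavy tooling to start. A sketch of the idea, assuming the model API returns token counts per call; the per-token rates and the `record_call` helper are hypothetical, so substitute your provider's actual pricing.

```python
# Sketch of per-session cost attribution. Rates are hypothetical placeholders;
# use your provider's actual per-token pricing.
from collections import defaultdict

INPUT_RATE = 3.00 / 1_000_000    # assumed $ per input token
OUTPUT_RATE = 15.00 / 1_000_000  # assumed $ per output token

session_cost = defaultdict(float)

def record_call(session_id: str, input_tokens: int, output_tokens: int) -> None:
    """Attribute the cost of one model call to the session that triggered it."""
    session_cost[session_id] += (input_tokens * INPUT_RATE
                                 + output_tokens * OUTPUT_RATE)

# One session, two model calls:
record_call("sess-1", input_tokens=12_000, output_tokens=1_500)
record_call("sess-1", input_tokens=40_000, output_tokens=3_000)
print(f"sess-1 cost: ${session_cost['sess-1']:.4f}")
```

Once every model call is tagged with a session, cost per session becomes a chartable metric like any funnel step, which is what makes pricing testable as a product problem rather than a quarterly finance surprise.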

Nobody has solved this yet. The teams that will figure it out are the ones honest enough to look at their own spreadsheets and say: this does not work at scale. And then redesign before scale arrives, not after.

That takes a specific kind of courage. The kind where you look at a product people love, with metrics going in the right direction, and admit that the economics underneath are quietly rotting. Most teams wait until the rot is visible. By then, the options are worse.

The best time to fix the model is when everything looks fine. That is also when it is hardest to convince anyone it needs fixing.
