Why most AI ROI numbers fall apart
The skeptics are right about the average AI pilot: it shows a demo, claims a vague productivity gain, and never proves a return. The fix is not a better story; it is a narrower scope you can actually measure. AI ROI falls apart when it is calculated across a whole program, because the costs hide in a dozen budgets and the value is asserted rather than counted. The way to get a number you can defend is to pick one use case, draw a tight boundary around it, and account for everything inside that boundary honestly. A small, real, measured return beats a large, imaginary one every time you have to explain it to a CFO.
Step 1: pick one narrow, measurable use case
Choose a single workflow where you can name the unit of value: hours saved per case, tickets resolved, errors caught, revenue influenced. If you cannot state the value in one sentence with a number in it, the use case is too broad to measure and you should narrow it until you can. This discipline is not a limitation; it is the whole method. Programs that try to prove ROI on AI as a category fail. Programs that prove it on one workflow, then the next, build a track record nobody can argue with.
Step 2: count the full cost, including the hidden parts
Most AI ROI math undercounts cost badly. The model or platform fee is the visible part. The hidden parts are the ones that sink the number: the integration work to wire it into real systems, the data preparation, the human review time you designed in, the ongoing cost per call at production volume, and the cost of the failures you have to catch. Add all of it. The hidden cost of a pilot is usually infrastructure and integration friction, not the model price, and a return calculated without those costs is fiction that will not survive contact with finance.
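To make the cost list concrete, here is a minimal sketch of a full-cost tally for one use case. The class name, the field names, and every figure are hypothetical and exist only to illustrate the arithmetic. The sketch also assumes one-time work such as integration has already been spread over a monthly figure.

```python
from dataclasses import dataclass


@dataclass
class UseCaseCost:
    """Monthly cost of ONE use case, in one currency (all fields illustrative)."""
    platform_fee: float        # the visible part: model or platform fee
    integration: float         # integration work, amortized per month (assumption)
    data_preparation: float    # data preparation, amortized per month (assumption)
    review_hours: float        # human review time designed into the workflow
    review_hourly_rate: float  # loaded hourly cost of the reviewers
    calls_per_month: int       # production volume, not pilot volume
    cost_per_call: float       # ongoing cost per call
    failure_handling: float    # cost of catching and fixing failures

    def visible(self) -> float:
        """The number most ROI math stops at."""
        return self.platform_fee

    def total(self) -> float:
        """Everything inside the boundary of the use case."""
        return (
            self.platform_fee
            + self.integration
            + self.data_preparation
            + self.review_hours * self.review_hourly_rate
            + self.calls_per_month * self.cost_per_call
            + self.failure_handling
        )


# Made-up figures, purely to show how far apart the two numbers can sit.
cost = UseCaseCost(
    platform_fee=2000.0,
    integration=1500.0,
    data_preparation=500.0,
    review_hours=40.0,
    review_hourly_rate=50.0,
    calls_per_month=100_000,
    cost_per_call=0.01,
    failure_handling=300.0,
)
print(f"visible: {cost.visible():.2f}  full: {cost.total():.2f}")
```

The point of keeping visible() next to total() is that the gap between them is the part of the cost that tends to be left out of the ROI claim.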
Step 3: measure the value against a real baseline
You cannot claim a gain you never baselined. Before the AI goes live, record how the workflow performs today: how long it takes, how much it costs, how often it errs. Then measure the same numbers with AI in place, over a long enough window to be real rather than a launch-week spike. The improvement over the baseline, expressed in money, minus the full cost from Step 2, is your return. Be conservative about attribution: if other things changed at the same time, do not credit all the gain to the AI. A defensible ROI is one where you can show your work.
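The baseline arithmetic can be sketched in a few lines. This is one possible way to write it down, not a standard formula: the function names, the attribution factor, and the example figures are all assumptions made for illustration. Note that with_ai_cost_per_case means what the workflow itself costs to run once the AI is in place (staff time, rework) and excludes the AI costs, which are counted once in full_ai_cost.

```python
def net_return(
    baseline_cost_per_case: float,
    with_ai_cost_per_case: float,
    cases: int,
    full_ai_cost: float,
    attribution: float = 1.0,
) -> float:
    """Attributed gain over the baseline, minus the full cost of the AI.

    attribution is the share of the measured gain credited to the AI
    (hypothetical parameter): 1.0 if nothing else changed, lower if
    other changes landed in the same window.
    """
    if not 0.0 <= attribution <= 1.0:
        raise ValueError("attribution must be between 0 and 1")
    gross_gain = (baseline_cost_per_case - with_ai_cost_per_case) * cases
    return gross_gain * attribution - full_ai_cost


def roi(net: float, full_ai_cost: float) -> float:
    """Net return per unit of full cost."""
    if full_ai_cost <= 0:
        raise ValueError("full_ai_cost must be positive")
    return net / full_ai_cost


# Made-up example: 2,000 cases in the window, per-case cost drops from
# 12.00 to 7.00, and only 80% of the gain is credited to the AI.
net = net_return(12.0, 7.0, 2000, full_ai_cost=7300.0, attribution=0.8)
print(f"net return: {net:.2f}  ROI: {roi(net, 7300.0):.1%}")
```

With these invented inputs the gross gain is 10,000, the attributed gain is 8,000, and the net return is 700. Setting attribution to 1.0 would report 2,700 instead, which is exactly the kind of number that does not survive the question "what else changed?".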
Step 4: make cost and value visible going forward
A one-time ROI study ages out the moment prices or usage change. The durable version is instrumentation: attribute cost and value per use case continuously, so you always know which deployments earn their keep and which to cut. This is the instrument most AI programs are missing, and it is why they cannot answer the ROI question on demand. Treat AI spend as capital allocation, reviewed like any other investment, and you turn ROI from an annual argument into a number you can read off a dashboard. That visibility is what lets you scale the winners and stop the losers with confidence instead of politics.
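What per-use-case instrumentation might look like, reduced to its core: a ledger that accumulates cost and value under a use-case label and reports the net for each. This is a minimal in-memory sketch, assuming every cost and value event can be tagged with a use case at the moment it is recorded. The class, its methods, and the use-case names are hypothetical; a real deployment would feed the same shape of data from billing and workflow systems.

```python
from collections import defaultdict


class UseCaseLedger:
    """Running cost and value totals, keyed by use case (illustrative only)."""

    def __init__(self) -> None:
        self._cost: dict[str, float] = defaultdict(float)
        self._value: dict[str, float] = defaultdict(float)

    def record_cost(self, use_case: str, amount: float) -> None:
        self._cost[use_case] += amount

    def record_value(self, use_case: str, amount: float) -> None:
        self._value[use_case] += amount

    def report(self) -> list[tuple[str, float, float, float]]:
        """One (use_case, cost, value, net) row per use case, sorted by name."""
        rows = []
        for name in sorted(set(self._cost) | set(self._value)):
            cost = self._cost.get(name, 0.0)
            value = self._value.get(name, 0.0)
            rows.append((name, cost, value, value - cost))
        return rows

    def losers(self) -> list[str]:
        """Use cases whose recorded value does not cover their recorded cost."""
        return [name for name, _cost, _value, net in self.report() if net < 0]


# Hypothetical use cases and made-up amounts.
ledger = UseCaseLedger()
ledger.record_cost("ticket-triage", 7300.0)
ledger.record_value("ticket-triage", 8000.0)
ledger.record_cost("contract-summaries", 4000.0)
ledger.record_value("contract-summaries", 1500.0)

for name, cost, value, net in ledger.report():
    print(f"{name:20s} cost={cost:9.2f} value={value:9.2f} net={net:9.2f}")
print("candidates to cut:", ledger.losers())
```

Keying everything by use case is the design choice that matters here: it keeps the boundary from Step 1 intact, so the question "which deployments earn their keep" becomes a lookup rather than a study.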