Most AI conversations focus on building AI systems. Far fewer address running them. But the operational model (how you monitor, maintain, improve, and govern AI in production) determines whether your AI investment delivers sustained value or becomes another piece of neglected enterprise software.
What You Need to Know
- Building an AI system is a project. Running it is a capability. The transition from project to operations is where most enterprise AI investments lose momentum.
- AI systems require different operational practices than traditional software. Model performance degrades, data distributions shift, and outputs need ongoing quality monitoring that traditional software doesn't require.
- The three pillars of AI operations are monitoring, maintenance, and improvement. Miss any one of them and your AI system will decay.
- AI operations isn't just a technical function. It requires business ownership, user feedback, and governance oversight alongside technical monitoring and maintenance.
60% of enterprise AI systems show measurable performance degradation within 12 months without active operational management.
Source: McKinsey, Managing AI at Scale, 2025
Why AI Operations Is Different
Traditional software is deterministic. Given the same input, it produces the same output. Once tested and deployed, it behaves predictably (bugs aside) until the code changes.
AI systems are probabilistic and context-dependent. Several factors cause AI performance to change without any code change:
Data drift. The data flowing into your AI system today may differ from the data it was built on. New document formats, new terminology, seasonal patterns, organisational changes: any of these can shift the data distribution and degrade AI performance.
Model updates. If you're using hosted models (Claude, GPT), the provider may update the model. These updates usually improve general performance but can change behaviour on your specific tasks. An update that improves average performance might degrade performance on your edge cases.
User behaviour changes. How people use the AI changes over time. They find workarounds, ask different types of questions, or use the system for tasks it wasn't designed for. These behavioural shifts can expose weaknesses the original testing didn't cover.
Knowledge staleness. For RAG systems, the knowledge base needs to stay current. Policies change. Regulations update. Products evolve. A knowledge base that isn't maintained becomes a source of wrong answers.
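Of these factors, data drift is the most amenable to automated detection. As a minimal sketch (assuming you log a numeric feature of incoming requests, such as document length; the thresholds and figures are illustrative, not prescriptive), a two-sample Kolmogorov-Smirnov test can flag when the current window of inputs no longer looks like the baseline:

```python
# Minimal data-drift check: compare a logged numeric feature
# (e.g. document length in tokens) between a baseline window
# and the current window with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

def drift_detected(baseline: np.ndarray, current: np.ndarray,
                   p_threshold: float = 0.01) -> bool:
    """True if the current distribution differs significantly."""
    statistic, p_value = ks_2samp(baseline, current)
    return p_value < p_threshold

# Illustrative data: baseline documents average 500 tokens;
# this week's inputs trend noticeably longer.
rng = np.random.default_rng(seed=42)
baseline = rng.normal(loc=500, scale=100, size=5_000)
current = rng.normal(loc=650, scale=120, size=1_000)

if drift_detected(baseline, current):
    print("Data drift detected: review inputs before trusting outputs.")
```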
The Operational Model
Monitoring
Performance monitoring. Track accuracy, latency, cost, and throughput continuously. Set baselines from the first month of production and alert on deviations.
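As a sketch of what that can look like in practice (the metric names, the 15% tolerance band, and the figures below are illustrative assumptions, not a standard):

```python
# Minimal performance-monitoring sketch: freeze a baseline from
# the first month of production and alert when a metric drifts
# outside a tolerance band.
from dataclasses import dataclass

@dataclass
class Baseline:
    accuracy: float          # fraction of sampled outputs judged correct
    p95_latency_ms: float
    cost_per_request: float

def check_against_baseline(base: Baseline, *, accuracy: float,
                           p95_latency_ms: float, cost_per_request: float,
                           tolerance: float = 0.15) -> list[str]:
    """Return alerts for any metric more than `tolerance` worse
    than baseline (15% is an assumption; tune it to your risk)."""
    alerts = []
    if accuracy < base.accuracy * (1 - tolerance):
        alerts.append(f"Accuracy degraded: {accuracy:.1%}")
    if p95_latency_ms > base.p95_latency_ms * (1 + tolerance):
        alerts.append(f"Latency regressed: {p95_latency_ms:.0f} ms")
    if cost_per_request > base.cost_per_request * (1 + tolerance):
        alerts.append(f"Cost spiked: {cost_per_request:.4f}/request")
    return alerts

base = Baseline(accuracy=0.92, p95_latency_ms=1800, cost_per_request=0.031)
for alert in check_against_baseline(base, accuracy=0.75,
                                    p95_latency_ms=2600,
                                    cost_per_request=0.030):
    print(alert)
```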
Quality monitoring. A regular cadence of human review of AI outputs. Not every output: a statistically significant sample, reviewed by domain experts. Weekly for high-stakes applications, monthly for lower-stakes ones.
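One way to size that sample, sketched below, is the standard formula for estimating a proportion at a 95% confidence level; the ±5% margin of error is an assumption you should set to match the stakes:

```python
# Weekly quality-review sampling sketch. Sample size uses the
# standard formula for estimating a proportion at 95% confidence;
# the margin of error is a choice, not a constant.
import math
import random

def sample_size(margin_of_error: float = 0.05, z: float = 1.96,
                p: float = 0.5) -> int:
    """Worst-case (p = 0.5) sample size for a proportion estimate."""
    return math.ceil((z ** 2) * p * (1 - p) / margin_of_error ** 2)

def review_sample(output_ids: list[str],
                  margin_of_error: float = 0.05) -> list[str]:
    n = min(sample_size(margin_of_error), len(output_ids))
    return random.sample(output_ids, n)

# Illustrative: 12,000 outputs this week -> 385 reviewed for ±5%.
outputs = [f"output-{i}" for i in range(12_000)]
print(len(review_sample(outputs)))  # 385
```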
Usage monitoring. Who's using the system? How often? Which features? Where do they drop off? Usage patterns tell you whether the AI is delivering value and where it's falling short.
Cost monitoring. AI inference costs can spike unexpectedly: higher-than-expected volume, model pricing changes, inefficient prompts. Track cost per transaction and total AI spend.
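A minimal sketch of cost tracking built from token counts follows; the per-token prices and the 50% spike threshold are placeholders, not real rates:

```python
# Cost-per-transaction sketch built from token counts. The
# per-1k-token prices below are placeholders, not real quotes;
# substitute your provider's current rates.
INPUT_PRICE_PER_1K = 0.003   # assumed rate
OUTPUT_PRICE_PER_1K = 0.015  # assumed rate

def transaction_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1000 * INPUT_PRICE_PER_1K
            + output_tokens / 1000 * OUTPUT_PRICE_PER_1K)

def daily_spend(transactions: list[tuple[int, int]]) -> float:
    return sum(transaction_cost(i, o) for i, o in transactions)

# Alert if today's spend exceeds the trailing average by 50%.
trailing_avg = 112.40  # example figure
today = daily_spend([(1200, 350)] * 20_000)  # a high-volume day
if today > trailing_avg * 1.5:
    print(f"Cost alert: {today:,.2f} today vs {trailing_avg:,.2f} average")
```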
Maintenance
Data pipeline maintenance. Data sources change. APIs break. File formats evolve. The data pipelines feeding your AI system need ongoing attention.
Knowledge base updates. For RAG systems, schedule regular knowledge base reviews. How often depends on how fast your domain knowledge changes. Weekly for fast-moving domains, monthly for stable ones.
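Staleness checks are easy to automate. A minimal sketch, assuming each document records when it was last reviewed (the field names and 30-day interval are illustrative):

```python
# Knowledge-base staleness sketch: flag documents whose last
# review is older than the domain's review interval.
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=30)  # monthly, for a stable domain

documents = [
    {"id": "policy-leave", "last_reviewed": date(2025, 1, 10)},
    {"id": "policy-expenses", "last_reviewed": date(2024, 6, 2)},
]

stale = [d["id"] for d in documents
         if date.today() - d["last_reviewed"] > REVIEW_INTERVAL]
print("Due for review:", stale)
```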
Prompt maintenance. Prompts need updating as use cases evolve, edge cases emerge, and model versions change. Maintain a version-controlled prompt library.
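A sketch of what version control can look like at its simplest: prompts stored as one file per version, so every change travels through git and production pins an explicit version. The directory layout and names are assumptions:

```python
# Prompt-library sketch: one directory per prompt, one file per
# version (e.g. prompts/claims_summary/v3.txt), so every change
# is a git commit and production pins an explicit version.
from pathlib import Path

PROMPT_DIR = Path("prompts")  # assumed layout

def load_prompt(name: str, version: str) -> str:
    return (PROMPT_DIR / name / f"{version}.txt").read_text(encoding="utf-8")

def latest_version(name: str) -> str:
    versions = sorted((PROMPT_DIR / name).glob("v*.txt"),
                      key=lambda p: int(p.stem[1:]))  # v2 < v10
    return versions[-1].stem

# Production pins a version; experiments may take the latest.
template = load_prompt("claims_summary", "v3")
```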
Model evaluation. When providers release new model versions, evaluate them against your task-specific benchmarks before switching. Don't assume newer is better for your workload.
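A sketch of that gate, with `call_model` as a placeholder for your provider's client and a deliberately tiny keyword-match benchmark standing in for your real evaluation set:

```python
# Pre-switch evaluation sketch. `call_model` is a placeholder
# for your provider's API client; the benchmark here is a toy
# keyword check standing in for your real edge-case suite.
def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError("wire up your provider's client here")

BENCHMARK = [
    {"prompt": "Summarise clause 4.2 of the sample contract.",
     "expected_keyword": "liability"},
    # ... your real task-specific cases go here
]

def score(model: str) -> float:
    passed = sum(
        case["expected_keyword"].lower()
        in call_model(model, case["prompt"]).lower()
        for case in BENCHMARK
    )
    return passed / len(BENCHMARK)

def safe_to_switch(current: str, candidate: str) -> bool:
    """Adopt the candidate only if it matches or beats the current
    model on your benchmark, not the provider's."""
    return score(candidate) >= score(current)
```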
Improvement
Feedback integration. User feedback (flagged errors, suggestions, support tickets) should feed directly into improvement cycles. The best AI systems get better over time because they learn from their mistakes.
Expansion planning. As the AI system proves value, users will want more from it. New use cases, new data sources, new integrations. Improvement means deliberate expansion, not scope creep.
Cost optimisation. As you understand usage patterns better, optimise: route simpler tasks to cheaper models, cache common queries, reduce unnecessary processing. AI cost optimisation is ongoing, not one-time.
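A sketch of the first two tactics; the model names, the 500-character routing heuristic, and `call_model` are illustrative placeholders:

```python
# Cost-optimisation sketch: route short, simple requests to a
# cheaper model and cache repeated queries.
from functools import lru_cache

CHEAP_MODEL = "small-model"    # placeholder name
CAPABLE_MODEL = "large-model"  # placeholder name

def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError("wire up your provider's client here")

def choose_model(prompt: str) -> str:
    # Naive routing heuristic; replace with a task classifier.
    return CHEAP_MODEL if len(prompt) < 500 else CAPABLE_MODEL

@lru_cache(maxsize=10_000)
def cached_completion(model: str, prompt: str) -> str:
    return call_model(model, prompt)

def complete(prompt: str) -> str:
    return cached_completion(choose_model(prompt), prompt)
```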
The Handover Checklist
Before an AI system transitions from build to operations, confirm:
- Monitoring dashboards are live.
- The quality review cadence is defined.
- The knowledge base update schedule is set.
- The operations team has been trained.
- Escalation paths are documented.
Missing any of these is a setup for operational failure.
The People Model
AI operations requires a combination of roles:
Technical operations. Monitoring, pipeline maintenance, model evaluation. This can be a dedicated AI ops team or part of a broader engineering team, depending on scale.
Domain oversight. Regular review of AI outputs by domain experts. This is business responsibility, not IT responsibility. The claims team reviews the claims AI. The compliance team reviews the compliance AI.
Governance. Ongoing governance review: compliance with AI policies, audit trail integrity, bias monitoring results. This maps to existing governance structures (risk committee, compliance team) with AI-specific additions.
Product ownership. Someone who owns the AI capability's roadmap, balancing improvement, maintenance, and new requests. Without product ownership, AI systems either stagnate or grow chaotically.
Starting Before You're Ready
You don't need a mature AI ops function before deploying your first AI system. But you do need the basics:
- Define who owns operations before you deploy. Not after.
- Set up monitoring during the build, not after launch.
- Schedule the first quality review for week two of production. Don't wait for a problem.
- Budget for operations separately from the build budget. Ongoing operations should be 15-25% of the initial build cost annually, so a 500k build implies 75k-125k a year for operations.
The organisations that treat AI operations as an afterthought end up with AI systems that deliver diminishing value. The ones that plan for operations from the start build AI that compounds.
A well-built system with no operational model decays within months; a modestly built system with strong operations improves continuously. That's where sustained value lives.
Tim Hatherley-Greene
Chief Operating Officer

