On Kaggle, the winning model is the most accurate. In enterprise, the winning model is the one that's accurate enough, fast enough, cheap enough, explainable enough, and maintainable by the team you have. These constraints change the algorithm selection calculus entirely.
## What You Need to Know
- Enterprise algorithm selection is a multi-objective optimisation problem, not a single-metric competition
- The constraints that matter most: latency, cost, explainability, data volume, maintainability, and sovereignty
- Simpler models that meet all constraints often outperform complex models that excel on accuracy but fail on deployment
- The selection process should be empirical: test candidates against your constraints, not theoretical benchmarks
## The Constraint Matrix
Before evaluating any algorithm, map your constraints:
| Constraint | Question | Impact on Selection |
|---|---|---|
| Latency | How fast must the response be? | Sub-second rules out large models without caching or distillation |
| Cost | What's the per-query budget at production volume? | Eliminates expensive API models at high volume |
| Explainability | Do humans need to understand why the model made its decision? | Favours interpretable models or models with built-in explanation |
| Data Volume | How much labelled data exists? | Small datasets favour few-shot LLMs; large datasets enable fine-tuning |
| Sovereignty | Must data and processing stay in-region? | Limits cloud-only models; favours locally deployable options |
| Maintainability | Who will maintain this in 2 years? | Complex ensemble models need specialist staff |
| Regulatory | Are there compliance requirements on automated decisions? | May require audit trails, human oversight, or specific model types |
> In optimisation theory, the feasible region is defined by constraints. The optimal solution exists within that region, not outside it. An algorithm that's 98% accurate but violates your latency constraint is infeasible. An algorithm that's 91% accurate and meets every constraint is optimal. The mathematics is clear on this.
>
> Dr Vincent Russell, Machine Learning (AI) Engineer
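The feasible-region framing can be sketched in a few lines of code. The candidate names, measurements, and thresholds below are illustrative assumptions, not real benchmark results; the point is the shape of the logic: filter by constraints first, optimise accuracy second.

```python
# Illustrative candidates; in practice these numbers come from your own prototypes.
candidates = [
    {"name": "hosted-llm",     "accuracy": 0.98, "p95_ms": 2400, "cost_per_1k": 2.00, "on_prem": False},
    {"name": "finetuned-bert", "accuracy": 0.94, "p95_ms": 45,   "cost_per_1k": 0.10, "on_prem": True},
    {"name": "logreg-tfidf",   "accuracy": 0.89, "p95_ms": 3,    "cost_per_1k": 0.01, "on_prem": True},
]

# The feasible region: every constraint must hold. Thresholds are examples.
constraints = {
    "latency":     lambda c: c["p95_ms"] <= 500,
    "cost":        lambda c: c["cost_per_1k"] <= 0.50,
    "sovereignty": lambda c: c["on_prem"],
}

feasible = [c for c in candidates if all(ok(c) for ok in constraints.values())]

# Optimise the primary metric only within the feasible region.
best = max(feasible, key=lambda c: c["accuracy"])
print(best["name"])  # the 98%-accurate model never enters the comparison
```

Note that the hosted LLM, despite the highest accuracy, is excluded before the accuracy comparison ever happens.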
## Selection by Use Case Type
### Document Classification
Enterprise context: Routing incoming documents, emails, or requests to the right team or process.
Recommended approach: Start with a fine-tuned BERT-class model if you have 1,000+ labelled examples. Fall back to zero-shot classification with an LLM if labelled data is sparse.
Why not the latest LLM for everything? Cost. At 10,000 documents per day, classifying each document with an LLM costs 10-50x more than running a fine-tuned transformer. The accuracy difference is typically 1-3 percentage points, which doesn't justify the cost gap at scale.
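A back-of-envelope check of that cost argument. The per-document prices below are purely illustrative assumptions, not vendor quotes; plug in your own numbers:

```python
# Illustrative back-of-envelope cost comparison at production volume.
DOCS_PER_DAY = 10_000

llm_cost_per_doc = 0.002         # assumed: ~500 tokens per call via a hosted LLM API
finetuned_cost_per_doc = 0.0001  # assumed: amortised inference for a BERT-class model

llm_daily = DOCS_PER_DAY * llm_cost_per_doc
ft_daily = DOCS_PER_DAY * finetuned_cost_per_doc

print(f"LLM: ${llm_daily:.2f}/day, fine-tuned: ${ft_daily:.2f}/day, "
      f"ratio: {llm_daily / ft_daily:.0f}x")
```

Even with these conservative assumptions the gap compounds quickly: a 20x daily ratio is thousands of dollars per year for a 1-3 point accuracy gain.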
### Information Extraction
Enterprise context: Pulling structured data from unstructured documents (invoices, contracts, reports).
Recommended approach: LLM with structured output (JSON mode) for complex, variable documents. Rules-based extraction with regex/NLP for structured, predictable documents.
The hybrid wins: Use rules for the fields that are always in the same place (invoice number, date). Use an LLM for the fields that vary (line item descriptions, special terms). The hybrid approach is cheaper and more reliable than either alone.
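A minimal sketch of the hybrid, assuming a hypothetical `llm_extract` helper that stands in for a real JSON-mode LLM call; the regex patterns and field names are illustrative:

```python
import re

def extract_fixed_fields(text: str) -> dict:
    """Rules for fields that always appear in a predictable format."""
    fields = {}
    m = re.search(r"Invoice\s*#?\s*:?\s*(\w[\w-]*)", text, re.IGNORECASE)
    if m:
        fields["invoice_number"] = m.group(1)
    m = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", text)  # ISO dates
    if m:
        fields["date"] = m.group(1)
    return fields

def llm_extract(text: str, fields: list[str]) -> dict:
    # Hypothetical placeholder: in production this would be an LLM call
    # with a JSON-mode prompt constrained to the requested fields.
    return {f: None for f in fields}

def extract(text: str) -> dict:
    result = extract_fixed_fields(text)
    # Only pay for the LLM on the fields rules cannot handle reliably.
    result.update(llm_extract(text, ["line_items", "special_terms"]))
    return result

doc = "Invoice #: INV-2024-0042\nDate: 2024-03-15\nConsulting services..."
print(extract(doc))
```

The design point: every field the rules capture is a field you never send to the LLM, which cuts both cost and the surface area for hallucinated values.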
### Retrieval (RAG)
Enterprise context: Answering questions from a corporate knowledge base.
Recommended approach: Bi-encoder for initial retrieval (fast, scalable). Cross-encoder for re-ranking the top results (accurate, but slower). LLM for answer generation from the retrieved context.
The key decision: Embedding model selection. Test at least three embedding models on your specific data. Performance varies significantly across domains, and the default choice is rarely the best for your content.
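The two-stage retrieval pipeline can be sketched with toy stand-ins: a bag-of-words "embedding" replaces the bi-encoder and a token-overlap score replaces the cross-encoder. Neither stand-in is production-worthy; the shape of the pipeline (cheap scoring over everything, expensive scoring over a shortlist) is the point:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a bi-encoder: real systems use a learned embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cross_score(query: str, doc: str) -> float:
    # Stand-in for a cross-encoder, which scores the (query, doc) pair jointly.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def retrieve(query: str, corpus: list[str], k: int = 10, top: int = 3) -> list[str]:
    q_emb = embed(query)
    # Stage 1: cheap bi-encoder scoring over the whole corpus.
    shortlist = sorted(corpus, key=lambda d: cosine(q_emb, embed(d)), reverse=True)[:k]
    # Stage 2: expensive re-ranking over the shortlist only.
    return sorted(shortlist, key=lambda d: cross_score(query, d), reverse=True)[:top]

corpus = [
    "how to reset your password",
    "expense policy for travel",
    "password complexity requirements",
]
print(retrieve("reset my password", corpus, k=2, top=1))
```

In a real deployment the Stage 1 sort would be an approximate nearest-neighbour index over precomputed document embeddings, which is what makes the first stage cheap at corpus scale.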
### Anomaly Detection
Enterprise context: Fraud detection, compliance monitoring, quality assurance.
Recommended approach: Isolation Forest or Autoencoder for unsupervised anomaly detection. Gradient Boosted Trees (XGBoost, LightGBM) for supervised classification when labelled fraud examples exist.
Why not deep learning? For tabular enterprise data, gradient boosted trees consistently match or outperform deep learning approaches while being faster to train, easier to explain, and simpler to maintain. Published tabular-data benchmarks repeatedly confirm this.
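Before reaching for Isolation Forest or an autoencoder, it is worth having a trivial unsupervised baseline to beat. This sketch uses robust z-scores via the median absolute deviation (MAD), a deliberately simpler technique than those recommended above, applied to a single feature; the sample values are made up:

```python
import statistics

def mad_scores(values: list[float]) -> list[float]:
    """Robust z-scores via median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0] * len(values)
    # 0.6745 scales MAD to be comparable with a standard deviation
    # under an assumed normal distribution.
    return [0.6745 * (v - med) / mad for v in values]

def flag_anomalies(values: list[float], threshold: float = 3.5) -> list[bool]:
    return [abs(s) > threshold for s in mad_scores(values)]

# Illustrative transaction amounts with one obvious outlier.
amounts = [102.0, 98.5, 101.2, 99.9, 100.4, 5000.0]
print(flag_anomalies(amounts))
```

If Isolation Forest cannot clearly beat a baseline like this on your data, the extra model is not earning its maintenance cost, which is exactly the selection logic this chapter argues for.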
## The Empirical Selection Process
- Define constraints using the matrix above
- Identify 3-4 candidate approaches that appear feasible
- Build minimal prototypes of each (1-2 days per candidate)
- Evaluate on your data with your metrics against your constraints
- Select the candidate that meets all constraints with the best primary metric
- Document the decision including why alternatives were rejected
This process takes 1-2 weeks. It prevents the months of deployment trouble that come from choosing an algorithm on theoretical benchmarks alone.
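The final documentation step can be as lightweight as a structured record checked into the repository alongside the model code. One possible shape, with illustrative field names and values:

```python
from dataclasses import dataclass, field

@dataclass
class SelectionRecord:
    """Lightweight decision record for the selection process; fields are illustrative."""
    use_case: str
    chosen: str
    primary_metric: str
    score: float
    constraints_met: list[str] = field(default_factory=list)
    rejected: dict[str, str] = field(default_factory=dict)  # candidate -> reason

record = SelectionRecord(
    use_case="document routing",
    chosen="finetuned-bert",
    primary_metric="macro-F1",
    score=0.94,
    constraints_met=["latency", "cost", "sovereignty"],
    rejected={"hosted-llm": "violates latency and sovereignty constraints"},
)
print(record.chosen)
```

Two years later, the `rejected` map is what stops a new team member from re-proposing the same infeasible alternative.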
Algorithm selection in enterprise AI is engineering, not science. The goal isn't to find the best possible model. It's to find the best model that works within your specific constraints. Start with the constraints, evaluate empirically, and prefer simplicity over complexity when performance is comparable. Your future self, maintaining this system at 3am during an incident, will thank you.

