DeepSeek and the Open-Source AI Question for Enterprise

DeepSeek R1 just proved open-source AI can match proprietary models on reasoning. Here's what that means for your enterprise AI strategy - and when open-source actually makes sense.
10 January 2025·9 min read
Mak Khan
Chief AI Officer
Isaac Rolfe
Managing Director
DeepSeek R1 dropped in January 2025 and the AI world took notice. An open-source reasoning model, developed by a Chinese AI lab, matching frontier proprietary models on key benchmarks. It's not just a technical milestone. It's a strategic inflection point for every enterprise making AI platform decisions.

What You Need to Know

  • DeepSeek R1 is an open-source reasoning model that performs competitively with GPT-4o and Claude 3.5 Sonnet on reasoning-heavy tasks, at a fraction of the inference cost.
  • Open-source AI doesn't mean free. The model weights are free; the infrastructure, security, talent, and ongoing operations are not.
  • For most enterprises, the right answer isn't "open-source or proprietary." It's a deliberate mix based on the sensitivity, scale, and requirements of each use case.
  • The competitive pressure from open-source models is driving down proprietary model pricing, which benefits everyone regardless of which models you use.
67%
of enterprises were evaluating open-source LLMs for production use by late 2024
Source: Andreessen Horowitz, Enterprise AI Survey 2024

Why DeepSeek Matters

DeepSeek R1 matters not because it's the best model available (it's competitive, not dominant) but because of what it proves about the trajectory of open-source AI.
Eighteen months ago, open-source models were a generation behind proprietary ones. Llama 2 was useful but clearly inferior to GPT-4. The gap suggested that only well-funded labs with massive compute budgets could produce frontier AI.
DeepSeek challenges that narrative. A relatively small team, using efficient training techniques, produced a model that competes on reasoning tasks that were previously the exclusive territory of OpenAI and Anthropic. The gap between open-source and proprietary is narrowing, and it's narrowing faster than most enterprise leaders expected.
The implication: Building your entire AI strategy around a single proprietary provider is increasingly risky. Not because proprietary models are bad (they're excellent), but because the competitive field is shifting in ways that reward flexibility.

The Real Trade-Offs

When Open-Source Makes Sense

Data sovereignty. If your data cannot leave your infrastructure (regulated industries, government, defence), open-source models deployed on your own infrastructure are the only option that provides complete control. No API calls, no third-party processing, no data residency questions.
High-volume, cost-sensitive workloads. Running thousands of document classifications per day through a proprietary API adds up. Self-hosted open-source models have a higher setup cost but lower marginal cost at scale. The crossover point is typically 50,000-100,000 API calls per month.
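That crossover can be sketched with a simple cost model. The figures below are illustrative assumptions (a ~3K-token call at typical proprietary rates, a fixed monthly GPU/ops budget), not quotes; substitute your own numbers.

```python
def monthly_cost_api(calls: int, tokens_per_call: int = 3000,
                     price_per_1k_tokens: float = 0.02) -> float:
    """API model: pure pay-per-token, no fixed overhead."""
    return calls * (tokens_per_call / 1000) * price_per_1k_tokens

def monthly_cost_self_hosted(calls: int, fixed_gpu_cost: float = 4000.0,
                             marginal_per_call: float = 0.0005) -> float:
    """Self-hosted model: fixed GPU/ops cost plus a small marginal cost."""
    return fixed_gpu_cost + calls * marginal_per_call

# Scan volumes to find where self-hosting becomes cheaper than the API.
crossover = next(c for c in range(0, 2_000_001, 10_000)
                 if monthly_cost_self_hosted(c) < monthly_cost_api(c))
print(f"Crossover near {crossover:,} calls/month")  # → Crossover near 70,000 calls/month
```

With these assumed inputs the crossover lands in the tens of thousands of calls per month, consistent with the range above; heavier calls or pricier proprietary models pull it lower, while higher GPU costs push it higher.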
Customisation depth. Fine-tuning proprietary models is possible but constrained. Open-source models can be fine-tuned, quantised, distilled, and modified without restrictions. If your use case requires deep model customisation, open-source gives you more control.
Vendor independence. Model weights you hold can't be re-priced, rate-limited, or deprecated out from under you, and their licence terms can't change retroactively. The model you deploy today is the model you have tomorrow.

When Proprietary Makes Sense

Frontier capability. For tasks requiring the absolute best performance (complex reasoning, detailed analysis, sophisticated generation), proprietary models from OpenAI and Anthropic still lead. The gap is smaller than it was, but it exists.
Speed to value. An API call is faster to implement than a self-hosted deployment. If you need AI capabilities in weeks rather than months, proprietary APIs are the pragmatic choice.
Operational simplicity. Self-hosting models means managing GPU infrastructure, handling scaling, monitoring performance, and maintaining security. Proprietary APIs outsource all of this. For most enterprises, this operational simplicity is worth the premium.
$0.002
per 1K tokens for DeepSeek R1 vs $0.01-0.03 for comparable proprietary models
Source: DeepSeek API Pricing, January 2025
[Chart: Cost Per 1K Tokens, Open-Source vs Proprietary. Source: DeepSeek API Pricing; OpenAI/Anthropic published pricing, January 2025]
Enterprise support. When your AI system breaks at 2am, who do you call? Proprietary providers offer SLAs, support channels, and incident response. Open-source means your team owns every problem.

The Platform Approach

The enterprises getting this right aren't choosing sides. They're building model-agnostic foundations that can use different models for different workloads.
A practical architecture:
| Workload | Model choice | Rationale |
| --- | --- | --- |
| Customer-facing chat | Proprietary (Claude, GPT-4o) | Best quality, enterprise SLAs |
| Internal document classification | Open-source (fine-tuned) | High volume, cost-sensitive, internal data |
| Complex analysis and reasoning | Proprietary (frontier) | Highest capability required |
| Data extraction pipelines | Open-source or smaller proprietary | Structured task, cost at scale matters |
| Sensitive data processing | Open-source (self-hosted) | Data cannot leave infrastructure |
This isn't about ideology. It's about matching the right model to the right workload based on cost, capability, control, and compliance requirements.
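The routing logic in that table can be made concrete. The sketch below is illustrative: the thresholds and tier names are assumptions, and the precedence order (control, then capability, then cost) is one reasonable policy rather than a prescription.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    sensitive_data: bool   # must data stay on our own infrastructure?
    monthly_calls: int     # expected volume
    needs_frontier: bool   # requires best-available reasoning?

def choose_model(w: Workload) -> str:
    """Route a workload to a model tier: control first, then capability, then cost."""
    if w.sensitive_data:
        return "open-source (self-hosted)"    # data cannot leave infrastructure
    if w.needs_frontier:
        return "proprietary (frontier)"       # capability outweighs cost
    if w.monthly_calls >= 100_000:
        return "open-source (fine-tuned)"     # volume favours low marginal cost
    return "proprietary (API)"                # operational simplicity wins at low volume

print(choose_model(Workload("doc classification", False, 500_000, False)))
# → open-source (fine-tuned)
```

The point of encoding the policy is that it becomes reviewable and auditable: governance can sign off on the rules once, rather than debating each workload from scratch.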

What This Means for NZ/AU Enterprises

The NZ and Australian markets have specific considerations:
Data sovereignty is a real concern. For government and regulated industries, the ability to process data entirely within NZ/AU infrastructure matters. Open-source models deployed on local cloud regions address this directly.
Talent is the constraint. Self-hosting and fine-tuning open-source models requires ML engineering talent that's scarce in NZ/AU. The cost of hiring and retaining this talent often exceeds the savings from avoiding proprietary API fees.
Start with APIs, evolve to hybrid. For most NZ/AU enterprises, the pragmatic path is: start with proprietary APIs to prove value and build your data and integration layers. Once you have proven use cases at scale, evaluate whether self-hosted open-source models reduce cost or improve control for specific workloads.

The Strategic Takeaway

DeepSeek R1 is a signal, not a destination. The signal is: the AI model layer is commoditising. Proprietary models will continue to lead on frontier capability, but the gap is shrinking and the cost is falling.
Your competitive advantage doesn't come from which model you use. It comes from your data, your integration with business workflows, your governance framework, and the compound effect of shared AI infrastructure. Those are the layers that are hard to replicate, and they work regardless of whether the model underneath is open-source or proprietary.
Build your foundation to be model-agnostic. Use the best model for each job. And watch the open-source space closely, because it's moving faster than most enterprise roadmaps account for.
Should we switch from GPT-4 to DeepSeek R1?
Not as a wholesale replacement. DeepSeek R1 is competitive on reasoning tasks but not universally superior. Evaluate it for specific workloads, particularly high-volume or data-sensitive ones, where its cost and deployment advantages matter. Keep proprietary models for tasks where frontier capability is essential.
Is it safe to use a model developed in China for enterprise workloads?
The model weights are open-source and auditable. When self-hosted, no data leaves your infrastructure. The geopolitical concern is real but manageable: audit the model, deploy on your own infrastructure, and apply the same security controls you would to any software component. The risk profile is comparable to using any open-source software.
How much does it cost to self-host an open-source model?
Initial setup for a production-grade deployment runs $15-40K (infrastructure, engineering time, security). Ongoing costs depend on scale: $2-8K/month for GPU compute on cloud infrastructure. Compare this to your projected API spend to find the crossover point. For most enterprises, self-hosting only makes financial sense above 50,000+ monthly API calls per model.